What are microarrays?

As we move further into the new millennium, the great promise of molecular technological innovations continues to be realised. Exploration and analysis of the human genome are now ubiquitous in the healthcare and research sectors. One technology that has become standard in the diagnostic molecular laboratory is the DNA microarray. At its most basic, the technology takes advantage of the propensity of nucleic acids to hybridise to complementary sequences. Nucleic acid probes representing sections of genes of interest are attached to a solid medium in an array, either by “spotting” the material onto the medium or by synthesis in situ. These probes can be derived from several sources, and the size and density of the probes targeting a region of interest in the genome influence the resolving capability of the assay. Older array technologies used bacterial artificial chromosomes (BACs); human sequences were cloned into these large DNA constructs and then spotted onto the array slides, providing resolutions down to ~100 kb. Today the highest resolutions are achieved with oligonucleotide probes, tens of bases in length, that can be manufactured directly on the slides.
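The hybridisation principle described above can be illustrated with a minimal Python sketch. It assumes perfect Watson-Crick pairing only (real hybridisation tolerates some mismatches and depends on temperature and salt conditions), and the sequences and function names are hypothetical, chosen purely for illustration.

```python
# Watson-Crick base pairs for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_matches(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to a stretch of the target strand.

    Simplifying assumption: only exact complementarity counts as binding.
    """
    return reverse_complement(probe) in target.upper()


# Hypothetical sequences, for illustration only.
target = "GGATCCATGCCGTTAGC"
# A probe designed against the stretch ATGCCGTT of the target strand.
probe = reverse_complement("ATGCCGTT")

print(probe, probe_matches(probe, target))
```

In this simplified model, a probe "binds" only when its reverse complement occurs somewhere in the labelled target strand, which is the idea the array exploits at each spot.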

The procedure typically begins with nucleic acid extraction from the sample of interest. This can be whole genomic DNA (gDNA) or total RNA, depending on the research question being asked (Figure 1.) (Doug Chung & Le Roch, 2013). The sample then undergoes a labelling step with a fluorescent dye or tag, either directly or after first being converted to complementary DNA or RNA (cDNA or cRNA). The labelled sample is then hybridised to the array, which contains probes complementary to genomic regions found within the sample. Probe-target binding results in light emission from the fluorescent tag, which is recorded by the scanning instrument (e.g. Agilent SureScan Microarray Scanner).

Figure 1. Typical microarray procedure. Reproduced from Chung et al, Encyclopedia of Biological Chemistry (Doug Chung & Le Roch, 2013)

Agilent offers one of the most versatile microarray solutions on the market today, with a plethora of analyses possible on the platform. At the core of Agilent’s microarray solutions is their SurePrint™ inkjet technology, allowing for 60-mer oligonucleotides to be synthesised in situ and in high density on their array slides (Figure 2.). Multiple array formats as well as customisation tools are available to create project-specific arrays.

Figure 2. Evolution of Agilent microarray slides and typical layouts available. The 11k, 22k, 180k etc. refer to the number of probes manufactured on to the array

Types of Microarrays

The Agilent microarray platform allows for several broad types of analyses with numerous possibilities for researchers and clinicians. The platform is highly compelling for transcriptomic analyses, allowing for a thorough look at the entirety of gene regulation (Figure 3.).

Figure 3. A complete picture of gene regulation, at all levels of the flow of genetic information, is possible using Agilent's microarray platform

Gene expression and exon arrays

RNA expression analysis from samples such as tissues or cultured cells is possible on the Agilent microarray platform, with expression levels of thousands of genes measurable simultaneously. An excellent example of the use of Agilent gene expression arrays was in the impactful “WINTHER” trial, an open-label study that enrolled 303 patients, with results published in 2019 (Rodon et al., 2019). Transcriptomic data obtained was able, for the first time, to inform treatment for cancer patients based on RNA expression levels in tumours as compared to normal tissue (Figure 4.) (Agilent, 2019).

Figure 4. Schematic summary of WINTHER trial workflow

Another use of gene expression arrays is in a genome editing workflow. Following editing, outcomes can be measured, including any off-target editing that may have occurred. Gene expression arrays are an accurate, low-cost, and high-throughput way to do this. As an example, Cromer et al. used the Agilent SurePrint G3 Human Gene Expression arrays to measure gene expression in human CD34+ hematopoietic stem and progenitor cells that underwent CRISPR/Cas9-AAV6 genome editing (Cromer et al., 2018).

‘MammaPrint’ is a custom microarray-based assay that analyses gene expression signatures in early-stage breast cancer, developed by the company Agendia and utilising Agilent microarray technology (van ‘t Veer et al., 2002; van de Vijver et al., 2002). The assay determines the expression profile of 70 genes to help predict the likelihood of metastases within a five-year period and, ultimately, whether a patient is likely to benefit from chemotherapy. The assay received FDA clearance in 2007 and was the first multi-gene expression assay to do so (Ledford, 2007). It continued to show its utility and robustness in the MINDACT trial, a multicentre, randomised, phase 3 trial involving over 100 academic and community hospitals in several European countries. This large study showed the clinical utility of the MammaPrint assay as an addition to standard clinical–pathological criteria in selecting patients for adjuvant chemotherapy (Cardoso et al., 2016; Piccart et al., 2021). NGS versions of MammaPrint using RNA-seq have been developed in recent years and show equivalent performance in the detection of targets, which in turn confirms the continued robustness and reliability of the microarray assay (Mittempergher et al., 2019; Schuler et al., 2022).

Further interrogation of gene expression is possible using Agilent’s Exon arrays. These allow for analyses at the gene and exon level in the same experiment, making the detection of alternative splice forms possible. Pre-mRNA transcripts can undergo splicing whereby certain sections are removed, resulting in alternative mRNA transcripts for the same gene and, ultimately, different protein products (Figure 5.). The broad dynamic range of these Exon arrays also ensures that expression can be detected for genes of interest whether they are expressed at low or high levels. Agilent gene expression and exon arrays base their designs on a number of well-known, publicly curated genomics databases including RefSeq, Ensembl, GenBank, LNCipedia and UniGene.

Figure 5. Alternative Splicing. A single gene can produce multiple related proteins, or isoforms, by means of alternative splicing. Reproduced from Guttmacher et al, NEJM (Guttmacher & Collins, 2002)

miRNA arrays

Gene expression can be regulated in several ways, one of the most important being through microRNAs (miRNAs). These are small, non-coding RNA molecules (~22 nucleotides) that can bind directly to mRNA transcripts. Binding can result in target cleavage or deadenylation, leading to mRNA degradation, or in direct inhibition of translation (Figure 6.). Specific miRNAs can be used as biomarkers of disease and are often detected in plasma, serum and even exosomes (cell-derived vesicles produced by all cell types and used to carry DNA, RNA, lipids, and proteins around the body). Fehlmann et al. undertook a multicentre cohort study that enrolled 3102 patients and, using Agilent SurePrint G3 Human miRNA (8 × 60k) microarray slides, identified patterns of circulating miRNAs that could be used as biomarkers in liquid biopsies to complement other forms of diagnostic testing for lung cancer (Fehlmann et al., 2020). Agilent miRNA microarray designs (for humans, mice and rats) are based on the miRBase public database, a searchable resource of published miRNA sequences and annotations.

Figure 6. Canonical miRNA biogenesis pathway. Reproduced from Winter et al, Nature (Winter, Jung, Keller, Gregory, & Diederichs, 2009)

CGH + SNP Arrays

Agilent is a market leader in array comparative genomic hybridisation (aCGH), which combines microarray and cytogenetic principles to achieve high-resolution detection of copy number variations (CNVs), i.e. losses and gains in the genome that are responsible for many genetic diseases. Traditional comparative genomic hybridisation (CGH), developed several decades ago, is a cytogenetic technique that entails mixing fluorescently labelled test DNA and reference whole gDNA in equal ratios, followed by competitive hybridisation to a reference metaphase chromosomal spread on slides (Kallioniemi et al., 1992).

Figure 7. A typical aCGH workflow. Losses and gains in gene regions are determined by the log2 ratio of fluorescence intensity of the test sample vs. the reference. Reproduced from Frost et al, Genomics Education Programme (Frost, 2022)

The relative fluorescence of the bound test sample vs. reference DNA is then compared across the reference chromosomes to determine losses or gains in each chromosomal region. The newer aCGH method takes this a step further by using arrays of nucleic acid probes designed in silico rather than whole chromosomal spreads, and the arrays can then be analysed on a DNA microarray scanner (Figure 7.). Depending on the configuration of the array and the gene segments of interest, certain regions of the genome may be assigned more or fewer probes, spaced closer together or further apart. aCGH allows detection of disorders associated with an abnormal chromosome number (aneuploidies), such as trisomies and monosomies (e.g. Down and Turner syndromes), as well as microdeletions and microduplications, which are smaller deletions or duplications of under ~5 Mb that are not detectable by traditional karyotyping. aCGH microarrays play an important role in pre- and post-natal testing, with organisations such as the American College of Medical Genetics and Genomics (ACMG) recognising the technique as a first-tier test for pre- and post-natal conditions (Shao et al., 2021). Agilent CGH arrays are based on human genome builds hg18 (NCBI36), hg19 (GRCh37) or hg38 (GRCh38), depending on the array being used.
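The log2 ratio calculation mentioned in the Figure 7 caption can be sketched as follows. This is a simplified, per-probe illustration only: the thresholds are assumed values chosen for the example, and real analysis software typically normalises the data and segments neighbouring probes before making a call.

```python
import math

# Assumed calling thresholds (illustrative, not vendor-specified values).
LOSS_THRESHOLD = -0.5
GAIN_THRESHOLD = 0.4


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2 of test vs. reference fluorescence intensity for one probe."""
    return math.log2(test_intensity / reference_intensity)


def call_copy_number(ratio: float) -> str:
    """Classify a log2 ratio as a loss, a gain, or normal copy number."""
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    return "normal"


# Idealised expectations for a diploid region:
#   one copy lost   -> log2(1/2) = -1
#   one copy gained -> log2(3/2), roughly 0.58
# The intensities below are hypothetical.
for test, ref in [(1000.0, 2000.0), (1500.0, 1000.0), (1000.0, 1000.0)]:
    ratio = log2_ratio(test, ref)
    print(f"{ratio:+.2f} {call_copy_number(ratio)}")
```

Working in log2 space makes losses and gains roughly symmetric around zero, which is why the ratio rather than the raw intensity difference is plotted along the chromosome.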

Agilent also offers CGH arrays that include additional probes targeting single nucleotide polymorphisms (SNPs), i.e. CGH + SNP arrays. SNPs are single nucleotide variations in the human genome that occur at a frequency greater than 1% in a population. Using SNPs allows for the detection of aberrations such as the loss of one copy of an allele, resulting in the cell no longer being heterozygous at that locus; this is known as loss of heterozygosity (LOH) and is often associated with the development of cancers or with inherited disorders. LOH can occur with a copy number loss (CNL-LOH), which typical aCGH arrays can detect because aCGH measures losses or gains of a gene segment relative to a reference, or with no change in copy number, known as copy number neutral LOH (CNN-LOH), which typical aCGH arrays struggle to detect and which requires SNP analysis.
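The distinction between CNL-LOH and CNN-LOH can be sketched by combining the two signals a CGH + SNP array provides: the copy number log2 ratio and the SNP genotype calls. The function below is a hypothetical illustration, not Agilent's algorithm; the parameter names and both thresholds are assumptions, and copy number gains are deliberately left out of scope.

```python
def classify_region(log2_ratio: float, het_fraction: float,
                    loss_threshold: float = -0.5,
                    max_het_fraction: float = 0.05) -> str:
    """Classify a genomic region from two array-derived summaries.

    log2_ratio   -- mean test/reference log2 ratio across the region's CGH probes
    het_fraction -- fraction of the region's SNP probes called heterozygous
    Both thresholds are assumed values for illustration only.
    Copy number gains are not handled in this sketch.
    """
    if het_fraction > max_het_fraction:
        # Heterozygous SNP calls are present, so heterozygosity is retained.
        return "no LOH"
    if log2_ratio <= loss_threshold:
        # Heterozygosity lost together with a copy number loss.
        return "CNL-LOH"
    # Heterozygosity lost while the copy number is unchanged.
    return "CNN-LOH"


print(classify_region(-1.0, 0.00))  # deletion of one copy
print(classify_region(0.0, 0.01))   # copy-neutral event
print(classify_region(0.0, 0.30))   # ordinary heterozygous region
```

The second case is the one a CGH-only array would miss: its log2 ratio looks normal, and only the absence of heterozygous SNP calls reveals the event.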

A specific type of CNN-LOH is uniparental disomy (UPD). Typically, one chromosome of each of the 23 pairs in humans is inherited from each parent; UPD occurs when both chromosomes of a pair (or parts of them) are inherited from one parent only. This often happens due to errors during meiosis (the cell division that produces germ cells) (Figure 8). UPD can result in rare but grave disorders such as Prader-Willi syndrome, which is characterised by developmental delays, intellectual impairment, hypogonadism, short stature, low muscle tone and behavioural problems. A substantial proportion of Prader-Willi cases are associated with a maternal UPD of chromosome 15 (del Gaudio et al., 2020). Another related developmental disorder, characterised by intellectual disability and seizures and also involving UPD of chromosome 15, is Angelman syndrome. This disorder can occur when two copies of segments of the paternal chromosome 15 are inherited, in contrast to the maternal segments in Prader-Willi syndrome. SNP arrays also allow for the detection of large regions of homozygosity in the genome, which can indicate not only UPD but also consanguinity, i.e. conception involving two closely related parents (Gonzales et al., 2022). This increased homozygosity raises the progeny’s likelihood of developing an autosomal recessive disorder and is therefore of high clinical significance.
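Detecting large regions of homozygosity from SNP calls can be sketched as a scan for long runs of consecutive homozygous genotypes along a chromosome. The genotype encoding ("AA", "AB", "BB"), the function name and the 5 Mb minimum length are all assumptions made for this example; reporting thresholds in practice are set by laboratory policy and professional guidance.

```python
from typing import List, Tuple


def homozygous_runs(positions: List[int], genotypes: List[str],
                    min_length_bp: int = 5_000_000) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of consecutive homozygous SNP
    calls that span at least min_length_bp on a single chromosome.

    positions must be sorted; genotypes use the assumed codes AA, AB, BB.
    """
    runs: List[Tuple[int, int]] = []
    run_start = None
    run_end = None
    for pos, genotype in zip(positions, genotypes):
        if genotype in ("AA", "BB"):
            if run_start is None:
                run_start = pos  # a new homozygous run begins here
            run_end = pos
        else:
            # A heterozygous call ends the current run, if any.
            if run_start is not None and run_end - run_start >= min_length_bp:
                runs.append((run_start, run_end))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and run_end - run_start >= min_length_bp:
        runs.append((run_start, run_end))
    return runs


# Hypothetical SNP positions (bp) and genotype calls.
positions = [1_000_000, 2_000_000, 4_000_000, 8_000_000, 9_000_000, 10_000_000]
genotypes = ["AB", "AA", "BB", "AA", "AB", "AA"]
print(homozygous_runs(positions, genotypes))
```

A single long run confined to one chromosome would point towards UPD, whereas multiple runs spread across several chromosomes would be more suggestive of consanguinity.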

Figure 8. Different mechanisms that can lead to UPD. Often UPD occurs due to nondisjunction events during meiosis I of oocytes or sperm (a and b). Crossover events during mitosis in somatic cells can also result in segmental UPD (c). Reproduced from del Gaudio et al, Genetics in Medicine (del Gaudio et al., 2020)

Microarray customisation

Oligonucleotide-based microarrays offer advantages such as improved resolution over microarrays that use other types of nucleic acid probes. A clear advantage is their customisability, made possible by the in silico design and in situ manufacturing of the technology. Agilent facilitates the custom design of microarrays for gene expression, miRNA and CGH analyses through its eArray and SureDesign web platforms. These platforms allow for collaborative work, sharing of designs, and the creation of custom probes.

Custom designs can be shared through Agilent’s Community Designs initiative. These are microarrays that were designed by customers and used successfully in their laboratories, with the designs then made commercially available for others to purchase and use. A highly successful community design is the SurePrint Community Design PathoChip (8×60k), a multi-pathogen microarray designed by Erle Robertson, MD, PhD, of the Perelman School of Medicine at the University of Pennsylvania. The design allows the detection of over 6,000 different pathogens, both RNA and DNA based, whilst remaining customisable to target new pathogens. Sensitivity is high enough that DNA contamination from the host does not have a significant impact on the analysis (Agilent, 2022; Baldwin, Feldman, Alwine, & Robertson, 2014).

Figure 9. Types of pathogens targeted by the PathoChip

In conclusion

Microarrays continue to be an important tool in the modern molecular biology laboratory, complementing other powerful genomics technologies such as next-generation sequencing (NGS). Agilent’s high-quality gene expression and aCGH arrays are powerful tools both now and for the foreseeable future. Gene expression arrays continue to be important for researchers detecting biomarkers of diseases such as cancers, and aCGH remains an important first-tier test for pre- and post-natal disorders and diseases.

References:

Agilent. (2019). Highlights from the WINTHER Trial [White Paper] (PR7000-2316). Retrieved from Agilent.com: https://www.agilent.com/cs/library/casestudies/public/Winther%20Trial%20Highlights_5994-1367EN%201.3.pdf
Agilent. (2022). SurePrint Community Design PathoChip 8x60k. Retrieved from https://www.agilent.com/cs/library/flyers/public/flyer-sureprint-pathoChip-8x60k-5994-3062en-agilent.pdf
Baldwin, D. A., Feldman, M., Alwine, J. C., & Robertson, E. S. (2014). Metagenomic assay for identification of microbial pathogens in tumor tissues. mBio, 5(5), e01714-14. doi:10.1128/mBio.01714-14
Cardoso, F., van’t Veer, L. J., Bogaerts, J., Slaets, L., Viale, G., Delaloge, S., . . . Piccart, M. (2016). 70-Gene Signature as an Aid to Treatment Decisions in Early-Stage Breast Cancer. New England Journal of Medicine, 375(8), 717-729. doi:10.1056/NEJMoa1602253
Cromer, M. K., Vaidyanathan, S., Ryan, D. E., Curry, B., Lucas, A. B., Camarena, J., . . . Steinfeld, I. (2018). Global transcriptional response to CRISPR/Cas9-AAV6-based genome editing in CD34+ hematopoietic stem and progenitor cells. Molecular Therapy, 26(10), 2431-2442.
del Gaudio, D., Shinawi, M., Astbury, C., Tayeh, M. K., Deak, K. L., & Raca, G. (2020). Diagnostic testing for uniparental disomy: a points to consider statement from the American College of Medical Genetics and Genomics (ACMG). Genetics in Medicine, 22(7), 1133-1141. doi:10.1038/s41436-020-0782-9
Doug Chung, D. W., & Le Roch, K. G. (2013). Genome-Wide Analysis of Gene Expression. In W. J. Lennarz & M. D. Lane (Eds.), Encyclopedia of Biological Chemistry (Second Edition) (pp. 369-374). Waltham: Academic Press.
Fehlmann, T., Kahraman, M., Ludwig, N., Backes, C., Galata, V., Keller, V., . . . Keller, A. (2020). Evaluating the Use of Circulating MicroRNA Profiles for Lung Cancer Detection in Symptomatic Patients. JAMA Oncology, 6(5), 714-723. doi:10.1001/jamaoncol.2020.0001
Frost, A., & van Campen, J. (2022, 09/06/2022). Microarray (array CGH). Retrieved from https://www.genomicseducation.hee.nhs.uk/genotes/knowledge-hub/microarray-array-cgh/
Gonzales, P. R., Andersen, E. F., Brown, T. R., Horner, V. L., Horwitz, J., Rehder, C. W., . . . on behalf of the ACMG Laboratory Quality Assurance Committee. (2022). Interpretation and reporting of large regions of homozygosity and suspected consanguinity/uniparental disomy, 2021 revision: A technical standard of the American College of Medical Genetics and Genomics (ACMG). Genetics in Medicine, 24(2), 255-261. doi:10.1016/j.gim.2021.10.004
Guttmacher, A. E., & Collins, F. S. (2002). Genomic Medicine — A Primer. New England Journal of Medicine, 347(19), 1512-1520. doi:10.1056/NEJMra012240
Kallioniemi, A., Kallioniemi, O. P., Sudar, D., Rutovitz, D., Gray, J. W., Waldman, F., & Pinkel, D. (1992). Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science, 258(5083), 818-821. doi:10.1126/science.1359641
Ledford, H. (2007). Genetic test gets approval. Nature. doi:10.1038/news070205-7
Mittempergher, L., Delahaye, L. J. M. J., Witteveen, A. T., Spangler, J. B., Hassenmahomed, F., Mee, S., . . . Glas, A. M. (2019). MammaPrint and BluePrint Molecular Diagnostics Using Targeted RNA Next-Generation Sequencing Technology. The Journal of Molecular Diagnostics, 21(5), 808-823. doi:10.1016/j.jmoldx.2019.04.007
Nichols, C. A., Gibson, W. J., Brown, M. S., Kosmicki, J. A., Busanovich, J. P., Wei, H., . . . Beroukhim, R. (2020). Loss of heterozygosity of essential genes represents a widespread class of potential cancer vulnerabilities. Nature Communications, 11(1), 2517. doi:10.1038/s41467-020-16399-y
Piccart, M., van ‘t Veer, L. J., Poncet, C., Lopes Cardozo, J. M. N., Delaloge, S., Pierga, J.-Y., . . . Rutgers, E. J. T. (2021). 70-gene signature as an aid for treatment decisions in early breast cancer: updated results of the phase 3 randomised MINDACT trial with an exploratory analysis by age. The Lancet Oncology, 22(4), 476-488. doi:10.1016/S1470-2045(21)00007-3
Rodon, J., Soria, J.-C., Berger, R., Miller, W. H., Rubin, E., Kugel, A., . . . Kurzrock, R. (2019). Genomic and transcriptomic profiling expands precision cancer medicine: the WINTHER trial. Nature Medicine, 25(5), 751-758. doi:10.1038/s41591-019-0424-4
Schuler, E., Uygun, S., Mittempergher, L., Pronin, D., Mee, S., Bao, S., . . . Glas, A. (2022). 234P Equivalence of NGS-based MammaPrint 70-gene signature risk of recurrence and BluePrint 80-gene signature of molecular subtyping tests to the centralized microarray tests. Annals of Oncology, 33, S232. doi:10.1016/j.annonc.2022.03.256
Shao, L., Akkari, Y., Cooley, L. D., Miller, D. T., Seifert, B. A., Wolff, D. J., & Mikhail, F. M. (2021). Chromosomal microarray analysis, including constitutional and neoplastic disease applications, 2021 revision: a technical standard of the American College of Medical Genetics and Genomics (ACMG). Genetics in Medicine, 23(10), 1818-1829. doi:10.1038/s41436-021-01214-w
van ‘t Veer, L. J., Dai, H., van de Vijver, M. J., He, Y. D., Hart, A. A., Mao, M., . . . Friend, S. H. (2002). Gene expression profiling predicts clinical outcome of breast cancer. Nature, 415(6871), 530-536. doi:10.1038/415530a
van de Vijver, M. J., He, Y. D., van’t Veer, L. J., Dai, H., Hart, A. A., Voskuil, D. W., . . . Bernards, R. (2002). A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine, 347(25), 1999-2009. doi:10.1056/NEJMoa021967
Winter, J., Jung, S., Keller, S., Gregory, R. I., & Diederichs, S. (2009). Many roads to maturity: microRNA biogenesis pathways and their regulation. Nature Cell Biology, 11(3), 228-234. doi:10.1038/ncb0309-228