NGS Tips & Tricks

Part 1: Set-up and some of the basics

Performing NGS experiments? This guide will help you make sure you have the required reagents and equipment to get started, and it breaks down a few of the lesser-known topics in NGS.

NGS workflows are becoming more widely used by research and diagnostic labs throughout the world as advances in the technology bring down the cost of sequencing. Whether you are performing whole-genome sequencing (WGS) of viruses or bacteria or carrying out targeted sequencing for human genetic testing, all NGS workflows share many of the same features.

Below you will find a checklist of critical equipment and reagents required before starting any NGS workflow in your lab. Always check the specific protocol you’re using to make sure you have access to all that is required prior to starting off.

EQUIPMENT | PURPOSE
Thermal cycler | Perform the necessary incubation steps during library prep
Qubit Fluorometer | Quantify the concentration of DNA in the sample or library
Bioanalyzer or TapeStation | Qualify the size of libraries before sequencing; also used for nucleic acid integrity QC
Microcentrifuge | Spin down reagent mixes (e.g. PCR master mix)
Magnetic stand/96-well plate magnet | Perform bead-based DNA clean-up or hybridizations
Calibrated pipettes, incl. multi-channel | Accurate pipetting during library prep
DNA LoBind 1.5 mL Eppendorf tubes | Make up reagent mixes
96-well PCR plates or strip tubes | Hold all library prep and amplification reactions
Shearing instrument (e.g. Bioruptor) | Fragment the DNA
NanoDrop | Nucleic acid purity QC

REAGENTS | PURPOSE
Cleanup beads (AMPure XP) | DNA purification and size-selection
Streptavidin capture beads (capture-based workflow only) | Bind to biotinylated capture probes
Molecular grade H2O | Dilute samples or reagent mixes
Low-TE buffer | Elute DNA and dilute samples
100% ethanol | DNA purification wash steps

Importance of sample and Library QC:

One of the famous sayings in NGS is “rubbish in, rubbish out”, which applies not only to the quality of the input material, whether DNA or RNA, but also to the final libraries that are loaded onto the sequencing instrument. Because NGS workflows can be costly, protocols include several important QC steps to ensure that low-quality samples are identified before they are sequenced, limiting unnecessary expenditure.

Sample (DNA/RNA) QC:

The best way to ensure a successful NGS run is to start off with good-quality nucleic acid material. A NanoDrop spectrophotometer is used to assess the purity of your nucleic acid, which is expressed as a ratio of absorbance values. DNA and RNA have absorbance maxima at 260 nm while protein contaminants absorb at 280 nm; therefore, an A260/A280 ratio of ~1.8 is generally accepted as “pure” for DNA and a ratio of ~2.0 is generally accepted as “pure” for RNA. Nucleic acid integrity can be evaluated via several different fragment analyzer systems (e.g. Bioanalyzer, TapeStation, Fragment Analyzer, FemtoPulse). You want as much of the nucleic acid intact as possible, so DNA should appear as high molecular weight fragments (>10 kb), while the RNA integrity number (RIN) for RNA should be as high as possible, indicating minimal degradation of the 18S and 28S ribosomal RNA bands.
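The purity check described above comes down to one division and a comparison. The sketch below is only an illustration: the function names and the ±0.1 tolerance are assumptions made for the example, not values from any instrument manual or protocol, so use the acceptance range your own protocol specifies.

```python
# Target A260/A280 ratios quoted in the text above.
TARGET_RATIOS = {"DNA": 1.8, "RNA": 2.0}


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio from two absorbance readings."""
    if a280 <= 0:
        raise ValueError("A280 must be a positive absorbance value")
    return a260 / a280


def looks_pure(a260: float, a280: float, sample_type: str,
               tolerance: float = 0.1) -> bool:
    """Check whether the ratio is close to the target for DNA or RNA.

    `tolerance` is a hypothetical acceptance window chosen for this
    example; it is not a published cut-off.
    """
    target = TARGET_RATIOS[sample_type.upper()]
    return abs(purity_ratio(a260, a280) - target) <= tolerance
```

For example, `looks_pure(0.9, 0.5, "DNA")` evaluates a ratio of 1.8, while readings of 0.75 and 0.5 give a ratio of 1.5, which would flag likely protein contamination in a DNA sample.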

Library QC:

The final QC step is performed to determine the molarity of your pooled library, as this is required to dilute the pool down to the recommended loading concentration. A qPCR assay is the preferred method of quantification because the primers used are specific to adapter-ligated libraries rather than to any DNA in the sample. To convert from ng/µl to nM, you also need the average fragment size (bp) of the library, which is measured using a fragment analysis system such as a Bioanalyzer, TapeStation or Fragment Analyzer instrument (see Figure 1 below).

Figure 1: Bioanalyzer electropherogram of an NGS library. The fragment size of this library ranges from 200-600 bp with a major peak of 291 bp
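The ng/µl to nM conversion and the dilution that follows can be sketched in a few lines. This is a rough illustration, not a protocol: it assumes the commonly used approximation of 660 g/mol per base pair of double-stranded DNA, and the function names and example numbers are hypothetical. Always use the loading concentration recommended for your sequencing instrument.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used approximation and an assumption of this sketch.
AVG_BP_MASS_G_PER_MOL = 660.0


def ng_per_ul_to_nm(conc_ng_per_ul: float, avg_fragment_bp: float) -> float:
    """Convert a library concentration from ng/µl to nM.

    nM = (ng/µl * 1e6) / (660 g/mol/bp * average fragment size in bp)
    """
    if conc_ng_per_ul < 0 or avg_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * avg_fragment_bp)


def dilution_volumes(stock_nm: float, target_nm: float,
                     final_volume_ul: float) -> tuple[float, float]:
    """Return (library µl, diluent µl) needed to reach target_nm.

    Uses C1 * V1 = C2 * V2. The target concentration is supplied by the
    caller; no loading concentration is assumed here.
    """
    if stock_nm <= 0 or target_nm <= 0 or final_volume_ul <= 0:
        raise ValueError("all inputs must be positive")
    if target_nm > stock_nm:
        raise ValueError("cannot dilute to a higher concentration")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul
```

As a worked example, a library at 2 ng/µl with the 291 bp peak from Figure 1 comes out at roughly 10.4 nM. Diluting a hypothetical 10 nM pool to 4 nM in 20 µl would take 8 µl of library and 12 µl of diluent.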

Types of beads used in NGS workflows:

Solid Phase Reversible Immobilization (SPRI) beads are paramagnetic particles coated with carboxyl groups that can bind DNA non-specifically and reversibly. They are used during important steps in any NGS workflow, mostly for DNA cleanup after amplification reactions to remove contaminants such as left-over primers, adapters and dNTPs. The beads can also be used for size-selection of DNA, and a modified type of bead coated with streptavidin is used in hybrid-capture NGS assays.

Clean-up beads (e.g. AMPure XP):

AMPure XP beads, made by Beckman Coulter, are the most commonly used commercial version of SPRI beads. They bind non-specifically to DNA in the presence of polyethylene glycol (PEG) and salt (NaCl), both of which are contained in the buffer they ship in. Beads are mixed with amplified product, and their magnetic properties allow them to be held in place on a magnetic stand while unbound molecules are washed away. They are ideal for low-concentration DNA cleanups and, because the reagent is handled as a liquid, they can be used in high-throughput, automated workflows. The binding capacity of the beads is impressive: over 3 µg of DNA can bind to just 1 µl of beads.

Capture beads (e.g. Dynabeads™ MyOne™ Streptavidin T1):

These beads are similar to those described above with regard to their magnetic properties; however, they are coated with the protein streptavidin. They are used primarily in hybrid-capture NGS assays and rely on the very high binding affinity of the streptavidin-biotin interaction. Biotin (also known as vitamin B7 or vitamin H) is coupled to the capture probes used in these reactions. The probes bind to complementary DNA sequences of interest, which are then ‘captured’ using the streptavidin-coated beads.

Figure 2: A representation of the hybrid-capture reaction showing streptavidin beads binding to the biotin on the capture probe

Target enrichment methods for NGS – Capture vs Amplicon

Some NGS workflows require the enrichment of specific genomic sequences (targets). This is done to reduce ‘wasted’ data and maximize the value of the sequencing output. Two different methods are used to enrich for the region of interest: amplicon-based enrichment and capture-based enrichment. Here is a quick breakdown of the two:

Amplicon-based:

One of the most common and well-known methods of target enrichment is the use of primers to amplify the specific region of interest via PCR. This is usually performed at the start of the library prep workflow, after DNA has been isolated from your sample, whether that be blood, tissue or microbial culture. This method can be used to enrich for whole genes or exons using a pool of primers in a single multiplex PCR reaction. The downside of this approach is that PCR errors may be introduced into your libraries prior to sequencing, and PCR bias causes some regions to be preferentially amplified, leading to non-uniform coverage in the data.

Capture-based:

Also known as hybrid-capture, or hybridization-based target enrichment, this method differs from the amplicon-based method in that PCR is not used to enrich for the region of interest. Instead, oligonucleotide probes, also called baits, specific to a certain region are hybridized to a DNA library and bind to their complementary sequences. The probes are generally 100-120 bp long and are biotinylated, which allows them to bind to the Streptavidin-coated capture beads. After probes are hybridized to the DNA, Streptavidin beads are added and bind to the probes, allowing the unbound DNA to be washed away. Large gene panels or exome sequencing approaches generally use the hybrid-capture method to ensure highly uniform coverage over these large genomic regions.

FEATURE | CAPTURE-BASED | AMPLICON-BASED
Example | SureSelect | HaloPlexHS and SureMASTR
Input amount | 1–250 ng for library prep | 10–100 ng
Workflow time | Usually more steps | Fewer steps
Number of genes per panel | Unlimited by panel size (up to 20,000 genes) | Smaller gene content (~50 genes)
VAF sensitivity | 1% without UMIs | 5%
Library complexity | Very high | Low
Coverage uniformity | High | Low
Cost per sample | Varies by panel size | Generally lower cost per sample
Applications | Exome sequencing; large gene panels for molecular pathology (e.g. cardiomyopathy, epilepsy); whole-genome sequencing of pathogens | Genotyping by sequencing; small gene panels (e.g. BRCA1/2); prenatal genetics (NIPT)

Frequently asked questions

Library preparation is the first step in any NGS workflow and is the process whereby DNA or RNA is prepared for sequencing. It begins with fragmentation, either mechanical or enzymatic, to shear the DNA into the appropriate insert size. The ends of the DNA are then repaired and primed for the ligation of adapters, which are added to the 3’ and 5’ ends. Adapters contain crucial primer binding sites used during sequencing as well as complementary sequences to bind to the flow cell or bead surface.

Figure 3: This is the basic structure of an NGS library ready for sequencing. A variant in the DNA is indicated by a red block

Coverage, also referred to as read depth, is defined as the number of times a base is read during sequencing. A higher number of reads will lead to higher coverage and therefore increased confidence in the sequencing data. A minimum depth is usually set as a threshold when performing variant calling.

Figure 4: An exon that was sequenced showing 10X coverage, i.e. 10 reads mapping to that location with a variant detected in 5 of those reads (heterozygous)
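The arithmetic behind Figure 4 can be written out as a small sketch. It assumes you already have the base called by each read at one position (real pipelines extract this from aligned reads with dedicated tools); the function names and the minimum-depth value passed in by the caller are hypothetical.

```python
def depth_and_variant_fraction(base_calls: list[str],
                               ref_base: str) -> tuple[int, float]:
    """Return (depth, fraction of reads not matching the reference base).

    `base_calls` holds one base per read covering the position.
    """
    depth = len(base_calls)
    if depth == 0:
        return 0, 0.0
    variant_reads = sum(1 for base in base_calls if base != ref_base)
    return depth, variant_reads / depth


def passes_min_depth(depth: int, min_depth: int) -> bool:
    """Apply a minimum-depth threshold before trusting a variant call.

    The threshold is set by the caller; no default is assumed here.
    """
    return depth >= min_depth
```

Ten reads at a position, five carrying the reference base and five carrying a different base, give a depth of 10 and a variant fraction of 0.5, which is the heterozygous case shown in Figure 4.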

The complexity of an NGS library is measured by the number of unique sequencing reads mapped to the reference genome. Sequencing libraries should be as complex as possible, with a high number of unique reads and very few duplicate reads, as this reflects the true nature of the starting material. Generally, the greater the amount of input material used for library prep, the higher the library complexity, which provides confidence that a detected variant is real.

Figure 5: Library complexity visualized with different colours representing unique reads. The high complexity library has 5 unique reads with the variant detected while the variant seen in the low complexity library is from the same duplicate read
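One simple way to put a number on the comparison in Figure 5 is the fraction of reads that are unique. The sketch below makes a simplifying assumption: two reads count as duplicates when they share the same chromosome, start, end and strand. Dedicated duplicate-marking tools apply more detailed rules, so treat this as an illustration of the idea only; the function names are hypothetical.

```python
# A read is modelled as (chromosome, start, end, strand). Treating reads
# with identical coordinates as duplicates is a simplification made for
# this example.
Read = tuple[str, int, int, str]


def unique_read_fraction(reads: list[Read]) -> float:
    """Return the fraction of reads that are unique (1.0 = no duplicates)."""
    if not reads:
        return 0.0
    return len(set(reads)) / len(reads)


def duplicate_count(reads: list[Read]) -> int:
    """Return how many reads are extra copies of a read already seen."""
    return len(reads) - len(set(reads))
```

Five reads with five different start positions give a unique fraction of 1.0, as in the high-complexity library. Five copies of the same read give 0.2 with four duplicates, as in the low-complexity library, where the variant is supported by only one original molecule.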

The on-target rate is the percentage of sequencing reads that cover a region of interest (the target region) and should be as high as possible. Reads outside of this region are referred to as off-target reads, and these reduce the depth of coverage over the target. To compensate for a low on-target rate, more sequencing is required for each library.

Figure 6: The example on the left shows 100% on-target rate where none of the reads mapped to a non-specific genomic region. The example with 80% on-target has 20% of its total reads mapping to a region that was not meant to be targeted (off-target)
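The on-target rate, and the extra sequencing needed to make up for off-target reads, can be sketched as follows. The example assumes reads and targets are given as half-open intervals (chromosome, start, end) and counts a read as on-target if it overlaps any target; real tools may apply stricter definitions, and all names here are hypothetical.

```python
import math

Interval = tuple[str, int, int]  # (chromosome, start, end), end exclusive


def overlaps(read: Interval, target: Interval) -> bool:
    """True if the read and the target share at least one base."""
    r_chrom, r_start, r_end = read
    t_chrom, t_start, t_end = target
    return r_chrom == t_chrom and r_start < t_end and t_start < r_end


def on_target_rate(reads: list[Interval], targets: list[Interval]) -> float:
    """Return the fraction of reads overlapping any target region."""
    if not reads:
        return 0.0
    hits = sum(1 for read in reads
               if any(overlaps(read, target) for target in targets))
    return hits / len(reads)


def total_reads_needed(on_target_reads_wanted: int, rate: float) -> int:
    """Total reads to sequence so that the wanted number land on target."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be in the range (0, 1]")
    return math.ceil(on_target_reads_wanted / rate)
```

With the 80% on-target library from Figure 6, reaching the same on-target read count as the 100% library requires about 1.25 times as many total reads.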

Contact us on ross@diagnostech.co.za or michelle@diagnostech.co.za for more information and assistance with your NGS projects!