Spike-in RNA Variant Control Mixes
Spike-in controls are essential in RNA-Seq experiments to assess workflow and platform properties. However, existing external RNA controls are generally mono-exonic and non-variant, which significantly limits their ability to reflect the true nature of eukaryotic transcriptomes. Eukaryotic transcriptomes are characterized by extensive splicing, alternative and antisense transcription, overlapping genes, and rare events like the formation of fusion genes. Consequently, the performance of RNA preparation, library generation, sequencing, and bioinformatics algorithms cannot be assessed adequately without known transcript spike-in controls of representative complexity.
To address this gap, Lexogen has conceived Spike-In RNA Variants (SIRVs) for the quantification of mRNA isoforms in Next Generation Sequencing (NGS). The SIRVs are a set of 69 artificial transcript variants which mimic 7 human model genes. They are complemented by additional isoforms to comprehensively reflect variations of alternative splicing, alternative transcription start- and end-sites, overlapping genes, and antisense transcripts. The accuracy of mapping, isoform assembly and quantification can be assessed, making isoform-quantification based experiments comparable.
Validate your RNA-Seq quantification pipeline and its annotation-robustness
Because the SIRV transcript sequences and concentrations are known a priori, the isoform-specific performance of an RNA-Seq experiment can be assessed. In addition to the correct annotation of the SIRVs, one insufficient annotation and one over-annotation are supplied so that NGS data evaluation algorithms can be tested for their robustness towards "real life", imperfect annotations.
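The comparison of measured against known abundances can be sketched as follows. This is a minimal illustration, not a Lexogen tool: the function name `log2_deviation`, the transcript IDs, and all numbers are hypothetical placeholders, and the real SIRV concentrations must be taken from the product documentation.

```python
import math

def log2_deviation(expected, measured):
    """Return log2(measured share / expected share) for each transcript.

    Both inputs map transcript ID -> abundance in arbitrary units.
    Each is normalised to a share of its own total, so the units
    of the two inputs do not need to match.
    """
    exp_total = sum(expected.values())
    meas_total = sum(measured.values())
    result = {}
    for tx, exp_conc in expected.items():
        exp_share = exp_conc / exp_total
        meas_share = measured.get(tx, 0.0) / meas_total
        # An undetected transcript is reported as -inf rather than raising.
        result[tx] = math.log2(meas_share / exp_share) if meas_share > 0 else float("-inf")
    return result

# Hypothetical equimolar spike-in (placeholder IDs and values).
expected = {"tx_a": 1.0, "tx_b": 1.0, "tx_c": 1.0, "tx_d": 1.0}
# Hypothetical abundance estimates from a quantification pipeline.
measured = {"tx_a": 200.0, "tx_b": 100.0, "tx_c": 50.0, "tx_d": 50.0}

deviation = log2_deviation(expected, measured)
for tx, value in sorted(deviation.items()):
    print(f"{tx}: {value:+.2f}")
```

A value of 0 means the transcript was recovered at its expected share; positive and negative values indicate over- and underestimation. Running the same comparison with each of the supplied annotations would show how strongly the pipeline depends on annotation quality.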
The SIRVs are designed to cover 7 synthetic genes, modeled on human counterparts, with up to 18 transcript variants each, representing alternative splicing, differential promoter and poly(A) site usage, overlapping genes, and antisense transcription. Naturally occurring, annotated variants were supplemented by rationally designed ones to enhance complexity and comprehensively cover splicing and transcription variation aspects in all 7 SIRV genes. The SIRVs are provided as three mixes, with molar ratios of the SIRV RNAs differing by up to two orders of magnitude.
The SIRVs can be analyzed with any RNA-Seq protocol starting from cell extracts or purified RNA and on any NGS platform (Illumina, Life Technologies, PacBio, Oxford Nanopore ...). Since the SIRV RNAs are polyadenylated, library preparation can start from poly(A) selected fractions as well as from total RNA, depleted RNA, etc. The SIRV mixes are also suitable for quantification on microarray platforms and in qPCR assays.
The SIRVs can be used with crude cell extract, purified total RNA, rRNA-depleted RNA, or poly(A)-enriched RNA, and can also be combined with the mono-exonic ERCC controls. Because their sequences are not identical to genomic and transcriptomic database entries, they can be combined with RNA from almost any organism.
3 SIRV mixes for differential expression
The SIRVs are provided as a set of 3 SIRV mixes, E0, E1, and E2, with each mix containing all 69 SIRVs but at different concentration ratios. E0 contains the RNAs in equimolar ratio, E1 covers one order of magnitude (up to 1:8), and in E2 the SIRV concentrations range over more than two orders of magnitude (up to 1:128). The comparison of samples spiked with these different mixes enables the quantitative assessment of differential expression workflows on the transcript level.
The SIRVs are provided as three mixtures containing the 69 transcripts of the 7 SIRV genes in different concentrations.
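Because the concentration of every transcript in each mix is known, the expected fold change between two samples spiked with different mixes can be computed and compared with what a differential expression workflow reports. The sketch below is illustrative only: the transcript IDs, the per-transcript concentrations, and the observed values are hypothetical placeholders chosen to span a 1:128 range, not the actual composition of E0 or E2.

```python
import math

def expected_log2_fc(mix_a, mix_b):
    """Expected log2 fold change (mix_b relative to mix_a) per transcript."""
    return {tx: math.log2(mix_b[tx] / mix_a[tx]) for tx in mix_a}

# Hypothetical relative concentrations (placeholders, not the real mixes).
mix_e0 = {"tx_a": 1.0, "tx_b": 1.0, "tx_c": 1.0}     # equimolar
mix_e2 = {"tx_a": 0.125, "tx_b": 1.0, "tx_c": 16.0}  # 0.125 to 16 spans 1:128

expected = expected_log2_fc(mix_e0, mix_e2)

# Hypothetical log2 fold changes reported by a differential expression tool.
observed = {"tx_a": -2.5, "tx_b": 0.25, "tx_c": 3.5}

# Per-transcript error and a simple summary statistic.
errors = {tx: observed[tx] - expected[tx] for tx in expected}
max_abs_error = max(abs(e) for e in errors.values())

for tx in sorted(expected):
    print(f"{tx}: expected {expected[tx]:+.2f}, observed {observed[tx]:+.2f}")
print(f"largest absolute error: {max_abs_error:.2f}")
```

In this toy example the observed fold changes are compressed towards zero at both ends of the range, which is the kind of systematic bias such a comparison is meant to reveal.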