Richter, F; et al.
Pass. Report from WMF copyvios tool. The abstract overlaps with multiple other internet pages; however, those pages are reusing the content from Wikipedia (often outside the CC BY-SA license). No genuine plagiarism detected. T.Shafee(Evo﹠Evo)talk 01:30, 21 August 2019 (UTC)
First peer review
Till Adhikary, Institute for Medical Bioinformatics and Biostatistics, Centre for Tumour Biology and Immunology, Philipps University of Marburg
This review was submitted on , and refers to this previous version of the article
The article by Richter et al. gives a comprehensive and balanced overview of RNA-seq. It summarises technical principles, in silico tools, and common approaches, the advantages and disadvantages of which are discussed. In my opinion, only minor additions are needed.
- Library generation, step 2
I suggest mentioning that poly(A) selection not only misses noncoding RNA but also several protein-coding transcripts, such as those encoding core histones, which are not polyadenylated. Furthermore, many cytokine-encoding transcripts are regulated via the length of their poly(A) tail, which can lead to failure of detection when oligo(dT) is used for enrichment.
Simple, streamlined approaches such as QuantSeq (https://doi.org/10.1038/nmeth.f.376) omit enrichment/depletion steps and generate libraries with a strong 3' bias.
- Single cell RNA sequencing, Experimental procedures
It is correct that each droplet carries a different barcode, but this may be confusing for readers with a basic level of knowledge. I think it should be mentioned that this is mediated by the presence of a bead which carries the barcoded oligonucleotides on its surface. Labeling of single cells is usually achieved by supplying limiting amounts of both cells and beads so that co-occupancy of a droplet by more than one cell, more than one bead, or more than one of each is a very rare event. Since the use of beads is a standard approach, I also suggest including this in the workflow presented in figure 5. Another option is to mechanically separate cells into single wells (e.g. Takara ICELL8 or similar devices) and to process them individually.
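To illustrate why limiting amounts keep co-occupancy rare, here is a minimal sketch. It assumes that cells (or beads) are distributed over droplets according to a Poisson distribution, which is a common simplification and is not stated in the article; the function name and the loading values are hypothetical.

```python
import math


def multiplet_fraction(mean_per_droplet: float) -> float:
    """Fraction of occupied droplets that hold two or more items.

    Assumes Poisson loading with the given mean number of items
    (cells or beads) per droplet. This is an illustrative model only.
    """
    if mean_per_droplet <= 0:
        raise ValueError("mean_per_droplet must be positive")
    p_empty = math.exp(-mean_per_droplet)      # P(0 items)
    p_single = mean_per_droplet * p_empty      # P(exactly 1 item)
    p_occupied = 1.0 - p_empty                 # P(1 or more items)
    return (p_occupied - p_single) / p_occupied


if __name__ == "__main__":
    # Hypothetical loading densities, chosen only for illustration.
    for lam in (0.05, 0.1, 0.5, 1.0):
        print(f"mean {lam:>4}: {multiplet_fraction(lam):.1%} of occupied droplets are multiplets")
```

Under this assumption, lowering the mean occupancy reduces the share of droplets with more than one cell or bead, at the cost of leaving most droplets empty.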
It may also be worth including information on unique molecular identifiers (UMIs), which enable the identification of artefacts (overrepresentation of sequences derived from a single RNA molecule) introduced during library preparation.
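The idea behind unique molecular identifiers can be sketched as follows. This is a simplified illustration, not the method of any particular tool: it assumes each read has already been assigned a cell barcode, a gene, and an identifier sequence, and it collapses exact duplicates only (real pipelines typically also tolerate sequencing errors in the identifier). All names and reads are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct identifiers per (cell, gene) pair.

    `reads` is an iterable of (cell_barcode, gene, umi) tuples.
    Reads sharing all three values are treated as amplification
    copies of one original RNA molecule and are counted once.
    """
    umis_seen = defaultdict(set)
    for cell, gene, umi in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Hypothetical reads: three copies of one GAPDH molecule, a second
# GAPDH molecule, and a single MYC molecule, all from the same cell.
reads = [
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "TTGC"),
    ("cellA", "MYC", "CCAT"),
]
counts = count_molecules(reads)
print(counts)
```

Here four GAPDH reads collapse to two molecules, so the overrepresentation caused by amplification does not inflate the count.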
- Gene expression quantification
I suggest changing the sentence "Gene expression is often used as a proxy for protein abundance" to "Transcript levels are often used as a proxy for protein abundance", since gene expression encompasses processes beyond transcription.
The use of spike-ins not only allows for absolute quantification but also enables detection of genome-wide effects, e.g. after depletion of global regulators such as MYC, chromatin remodelers, and acetyltransferase complexes. Here, standard normalisation procedures strongly distort the outcome of the analysis (see e.g. DOI:10.1128/MCB.00970-14), which can easily go unnoticed if spike-ins are omitted.
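A toy calculation may help to show how such a distortion arises. The counts below are invented for illustration. The sketch assumes that the same amount of spike-in RNA was added per sample and that all endogenous transcripts dropped two-fold in the treated sample; "library size" here simply means the sum of endogenous counts, which stands in for standard normalisation procedures in general.

```python
SPIKE = "spike_in"  # hypothetical name for the spike-in control

# Invented counts: in the treated sample every endogenous transcript is
# halved relative to the spike-in, but deeper sequencing hides this in
# the raw numbers.
control = {"gene_a": 1000, "gene_b": 2000, "gene_c": 3000, SPIKE: 600}
treated = {"gene_a": 900, "gene_b": 1800, "gene_c": 2700, SPIKE: 1080}


def endogenous_total(sample):
    """Sum of counts over all genes except the spike-in."""
    return sum(count for name, count in sample.items() if name != SPIKE)


def fold_change(gene, scale_control, scale_treated):
    """Treated/control ratio after dividing each sample by its scale."""
    return (treated[gene] / scale_treated) / (control[gene] / scale_control)


genes = [name for name in control if name != SPIKE]

# Normalising to library size: the global decrease disappears.
by_library_size = {
    g: fold_change(g, endogenous_total(control), endogenous_total(treated))
    for g in genes
}

# Normalising to the spike-in: the two-fold decrease is recovered.
by_spike_in = {g: fold_change(g, control[SPIKE], treated[SPIKE]) for g in genes}

print("library size:", by_library_size)
print("spike-in:    ", by_spike_in)
```

With these numbers, library-size normalisation reports no change for any gene, whereas spike-in normalisation reports a fold change of 0.5 for all of them.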
- Minor comments
RNA-seq is sometimes spelled RNA-Seq, RNAseq, or RNASeq. If I am not mistaken, "RNA-seq" is the common spelling. Several figures are not referred to in the main text, and some of them (3, 4) represent specific workflows, which might confuse entry-level readers.