RNA sequencing

File:Summary of RNA-Seq.svg

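Overview of RNA-Seq, covering sample preparation, sequencing, and bioinformatic data processing and analysis.
In vivo: within the organism, genes are transcribed and (in eukaryotes) spliced to produce mature mRNA transcripts (red).
In vitro: the mRNA is then extracted from the organism, fragmented, and copied into stable double-stranded cDNA (ds-cDNA, blue) by reverse transcription and second-strand synthesis. The ds-cDNA is then sequenced using high-throughput, short-read sequencing methods.
In silico: finally, these sequences are aligned to a reference genome sequence to reconstruct which genome regions were being transcribed. These data can be used to annotate where expressed genes are located, their relative expression levels, and any alternative splice variants^[1].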

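RNA sequencing (RNA-seq^[2]^[3]), or ribonucleic acid sequencing, refers to determining the base sequence of specific RNA fragments, that is, the order of adenine (A), uracil (U), cytosine (C), and guanine (G).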

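Whole transcriptome sequencing (WTS^[4]^[5]), the counterpart of whole genome sequencing (WGS), uses high-throughput sequencing to detect and obtain the sequences of all RNA products transcribed by a cell or tissue in a particular functional state^[6]; it is RNA sequencing in the narrow sense. WTS is a transcriptomics method built on second-generation sequencing: it uses the capabilities of that technology to reveal a snapshot of the presence and quantity of RNA from a genome at a given moment^[7].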

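First, all transcribed RNA is extracted from a biological sample and reverse-transcribed into cDNA, which is then subjected to second-generation high-throughput sequencing. Overlapping fragments are assembled on this basis to recover individual transcripts. This gives a global view of gene expression in the sample at its current developmental state. Furthermore, comparing this RNA-Seq transcriptome with that of the sample at the next stage reveals all genes that are up- or down-regulated at the transcriptional level. This forms an expression profile, and pathways can then be constructed around the key genes of interest.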
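The comparison between two stages described above can be sketched as a log2 fold change per gene. This is only a minimal illustration: the gene names, counts, pseudocount, and the threshold of 1.0 are hypothetical, the counts are assumed to be already normalized, and real analyses typically rely on dedicated statistical models rather than a bare ratio.

```python
import math


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """Return {gene: log2((b + pc) / (a + pc))} for genes present in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    shared = counts_a.keys() & counts_b.keys()
    return {
        gene: math.log2((counts_b[gene] + pseudocount) / (counts_a[gene] + pseudocount))
        for gene in shared
    }


def classify(lfc, threshold=1.0):
    """Label a gene as up-regulated, down-regulated, or unchanged."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical normalized counts for two developmental stages.
stage1 = {"geneA": 7, "geneB": 63, "geneC": 31}
stage2 = {"geneA": 31, "geneB": 15, "geneC": 31}

lfc = log2_fold_change(stage1, stage2)
calls = {gene: classify(value) for gene, value in lfc.items()}
```

With these made-up numbers, geneA is called up-regulated, geneB down-regulated, and geneC unchanged; the set of such calls across all genes is what the text refers to as an expression profile.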

Introduction

Unlike the static genome, the transcriptome inside a cell is dynamic and constantly changing. Advances in next-generation sequencing (NGS) have increased both the base coverage obtainable for DNA and the sample throughput. This facilitates sequencing of the RNA transcripts in a cell, providing details such as alternatively spliced transcripts, post-transcriptional modifications, gene fusions, mutations/SNPs, and changes in gene expression^[8]. Besides mRNA transcripts, RNA-Seq can also profile different populations of RNA, including total RNA and small RNA (miRNA, tRNA, and ribosomal RNA)^[9]. RNA-Seq can also be used to determine exon/intron boundaries and to correct previously annotated 5' and 3' gene boundaries. Ongoing RNA-Seq research also includes observing changes in cellular pathways during infection^[10] and differences in gene expression levels in cancer^[11]^[12]. Before next-generation sequencing, studies of transcriptomics and gene expression relied mainly on gene expression microarrays, which contain thousands of DNA probes for target sequences and yield an expression profile of all expressed transcripts. After microarrays, serial analysis of gene expression (SAGE) became the main technique for gene expression analysis.

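Compared with RNA-Seq, gene expression microarrays have narrow coverage: they capture only common-allele SNPs (roughly 500,000 to 2,000,000) out of the more than 10 million SNPs in the genome. As a result, existing databases generally hold data for common SNPs only and lack rare alleles, which is a major drawback for researchers. Many cancers arise from mutations with a frequency below 1%, which are therefore hard to detect. Nevertheless, microarrays remain important for detecting known alleles, which makes them well suited to regulator-approved diagnostics, such as those for cystic fibrosis.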

Analysis

File:RNASeqWorkflow2016.png

Diagram outlining the RNASeq analyses described in this section

Transcriptome assembly

Two methods are used to assign raw sequence reads to genomic features (i.e., to assemble the transcriptome):

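De novo: This approach does not require a reference genome to reconstruct the transcriptome, and is typically used when the genome is unknown, incomplete, or substantially different from the reference^[13]. Challenges when using short reads for de novo assembly include (1) determining which reads should be joined into contiguous sequences (contigs), (2) robustness to sequencing errors and other artifacts, and (3) computational efficiency. The primary algorithm used for de novo assembly has shifted from overlap graphs to de Bruijn graphs, which break reads into sequences of length k and collapse all k-mers into a hash table^[14]. Assemblers based on de Bruijn graphs include Velvet^[15], Trinity^[13], Oases^[16], and Bridger^[17]. Paired-end and long-read sequencing of the same sample can mitigate the shortcomings of short reads by serving as a template or scaffold. Metrics for assessing the quality of a de novo assembly include median contig length, number of contigs, and N50^[18].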
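To make the de Bruijn idea concrete, the following toy sketch splits reads into k-mers, stores the graph in a hash table (a Python dict), and spells out a contig by following unambiguous edges. The reads, the value of k, and the function names are hypothetical, and the sketch ignores sequencing errors, reverse complements, and coverage, all of which real assemblers such as those listed above have to handle.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge.

    The graph is a dict (hash table) mapping a (k-1)-mer prefix to the set of
    (k-1)-mer suffixes that follow it in at least one read.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while exactly one successor exists."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on cycles instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Three hypothetical overlapping cDNA reads from one transcript.
reads = ["ATGGCT", "GGCTAC", "CTACGA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
```

Here the three reads collapse into a single path through the graph, so the walk recovers one contig spanning all of them. A branch in the graph (a node with two successors) is where the walk stops; in transcriptome data such branches can reflect alternative splicing as well as errors.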

File:RNA-Seq-alignment.png

RNA-seq mapping of short reads in exon-exon junctions. The final mRNA is sequenced, which is missing the intronic sections of the pre-mRNA.

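Genome-guided: This approach relies on the same methods used for DNA alignment, with the additional complexity of aligning reads that cover non-contiguous portions of the reference genome^[19]. These non-contiguous reads result from sequencing spliced transcripts (see figure). Alignment algorithms typically have two steps: (1) align short portions of the read (seeds), and (2) use dynamic programming to find an optimal alignment, sometimes in combination with known annotations. Tools that use genome-guided alignment include Bowtie^[20], TopHat (which builds on Bowtie results to align splice junctions)^[21]^[22], Subread^[23], STAR^[19], HISAT2^[24], Sailfish^[25], Kallisto^[26], and GMAP^[27]. The quality of a genome-guided assembly can be measured with both (1) de novo assembly metrics (e.g., N50) and (2) comparisons to known transcript, splice junction, genome, and protein sequences using precision, recall, or their combination (e.g., the F1 score)^[18]. In addition, in silico assessment can be performed using simulated reads^[28]^[29].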
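As an illustration of the precision, recall, and F1 comparison mentioned above, the sketch below scores a set of predicted splice junctions against a set of annotated ones. The junction coordinates and the exact-match criterion are assumptions made for this example; published benchmarks may match junctions or transcripts under different rules.

```python
def precision_recall_f1(predicted, reference):
    """Compare two collections of items by exact match.

    precision = TP / |predicted|, recall = TP / |reference|,
    F1 = harmonic mean of precision and recall.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical junctions as (chromosome, donor position, acceptor position).
annotated = {
    ("chr1", 100, 200),
    ("chr1", 300, 400),
    ("chr1", 500, 600),
    ("chr2", 50, 150),
}
predicted = {
    ("chr1", 100, 200),
    ("chr1", 300, 400),
    ("chr1", 700, 800),  # not in the annotation
}

precision, recall, f1 = precision_recall_f1(predicted, annotated)
```

In this made-up case two of three predictions are annotated (precision 2/3) and two of four annotated junctions are found (recall 1/2), giving an F1 score of 4/7. Note that an unannotated junction is not necessarily wrong; it only counts against precision relative to the chosen reference.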

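A note on assembly quality: The current consensus is that (1) assembly quality can vary depending on which metric is used, (2) assemblers that score well in one species do not necessarily perform well in other species, and (3) combining different approaches might be the most reliable.^[30]^[31]^[32]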
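The contig-level metrics named in this section (number of contigs, median contig length, and N50) can be computed as in the sketch below, which also shows how different metrics can give quite different pictures of the same assembly. The contig lengths are invented for illustration, and the N50 definition used is the common one: the length L such that contigs of length at least L together cover at least half of the total assembled bases.

```python
import statistics


def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one contig with non-zero length")


# Hypothetical contig lengths (in bases) from one assembly.
contig_lengths = [80, 70, 50, 40, 30, 20]

summary = {
    "contigs": len(contig_lengths),
    "median_length": statistics.median(contig_lengths),
    "n50": n50(contig_lengths),
}
```

For these lengths the median is 45 while the N50 is 70, because N50 weights contigs by the bases they contribute. This is one simple way in which the choice of metric changes how an assembly appears to perform.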

References

[1] Lowe R, Shirley N, Bleackley M, Dolan S, Shafee T. Transcriptomics technologies. PLOS Computational Biology. May 2017, 13 (5). Bibcode:2017PLSCB..13E5457L. PMC 5436640. PMID 28545146. doi:10.1371/journal.pcbi.1005457.
[2] Ayturk U. RNA-seq in Skeletal Biology. Curr Osteoporos Rep. 2019;17(4):178-185. doi:10.1007/s11914-019-00517-x
[3] Simoneau J, Dumontier S, Gosselin R, Scott MS. Current RNA-seq methodology reporting limits reproducibility. Brief Bioinform. 2021;22(1):140-145. doi:10.1093/bib/bbz124
[4] Esmeray Sönmez E, Hatipoğlu T, Kurşun D, et al. Whole Transcriptome Sequencing Reveals Cancer-Related, Prognostically Significant Transcripts and Tumor-Infiltrating Immunocytes in Mantle Cell Lymphoma. Cells. 2022;11(21):3394. Published 2022 Oct 27. doi:10.3390/cells11213394
[5] Meggendorfer M, Walter W, Haferlach T. WGS and WTS in leukaemia: A tool for diagnostics?. Best Pract Res Clin Haematol. 2020;33(3):101190. doi:10.1016/j.beha.2020.101190
[6] Transcriptome sequencing (转录物组测序). Termonline (术语在线). China National Committee for Terminology in Science and Technology (全国科学技术名词审定委员会). (in Simplified Chinese)
[7] Chu Y, Corey DR. RNA sequencing: platform selection, experimental design, and data interpretation. Nucleic Acid Ther. August 2012, 22 (4): 271–4. PMC 3426205. PMID 22830413. doi:10.1089/nat.2012.0367.
[8] Maher CA, Kumar-Sinha C, Cao X, et al. Transcriptome sequencing to detect gene fusions in cancer. Nature. March 2009, 458 (7234): 97–101. PMC 2725402. PMID 19136943. doi:10.1038/nature07638.
[9] Ingolia NT, Brar GA, Rouskin S, McGeachy AM, Weissman JS. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat Protoc. August 2012, 7 (8): 1534–50. PMC 3535016. PMID 22836135. doi:10.1038/nprot.2012.086.
[10] Qian F, Chung L, Zheng W, et al. Identification of Genes Critical for Resistance to Infection by West Nile Virus Using RNA-Seq Analysis. Viruses. 2013, 5 (7): 1664–81. PMID 23881275. doi:10.3390/v5071664.
[11] Beane J, Vick J, Schembri F. Characterizing the impact of smoking and lung cancer on the airway transcriptome using RNA-Seq. Cancer Prev Res (Phila). June 2011, 4 (6): 803–17. PMC 3694393. PMID 21636547. doi:10.1158/1940-6207.CAPR-11-0212.
[12] Beane J, Vick J, Schembri F. Characterizing the impact of smoking and lung cancer on the airway transcriptome using RNA-Seq. Cancer Prev Res (Phila). June 2011, 4 (6): 803–17. PMC 3694393. PMID 21636547. doi:10.1158/1940-6207.CAPR-11-0212.
[13] Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology. May 2011, 29 (7): 644–52. PMC 3571712. PMID 21572440. doi:10.1038/nbt.1883.
[14] De Novo Assembly Using Illumina Reads (PDF). Retrieved 22 October 2016. (Archived (PDF) from the original on 2020-09-24.)
[15] Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research. May 2008, 18 (5): 821–9. PMC 2336801. PMID 18349386. doi:10.1101/gr.074492.107.
[16] Oases: a transcriptome assembler for very short reads. Retrieved 2019-02-16. (Archived from the original on 2018-11-29.)
[17] Chang Z, Li G, Liu J, Zhang Y, Ashby C, Liu D, Cramer CL, Huang X. Bridger: a new framework for de novo transcriptome assembly using RNA-seq data. Genome Biology. February 2015, 16 (1): 30. PMC 4342890. PMID 25723335. doi:10.1186/s13059-015-0596-2.
[18] Li B, Fillmore N, Bai Y, Collins M, Thomson JA, Stewart R, Dewey CN. Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biology. December 2014, 15 (12): 553. PMC 4298084. PMID 25608678. doi:10.1186/s13059-014-0553-5.
[19] Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. January 2013, 29 (1): 15–21. PMC 3530905. PMID 23104886. doi:10.1093/bioinformatics/bts635.
[20] Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009, 10 (3): R25. PMC 2690996. PMID 19261174. doi:10.1186/gb-2009-10-3-r25.
[21] Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. May 2009, 25 (9): 1105–11. PMC 2672628. PMID 19289445. doi:10.1093/bioinformatics/btp120.
[22] Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols. March 2012, 7 (3): 562–78. PMC 3334321. PMID 22383036. doi:10.1038/nprot.2012.016.
[23] Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research. May 2013, 41 (10): e108. PMC 3664803. PMID 23558742. doi:10.1093/nar/gkt214.
[24] Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. April 2015, 12 (4): 357–60. PMC 4655817. PMID 25751142. doi:10.1038/nmeth.3317.
[25] Patro R, Mount SM, Kingsford C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nature Biotechnology. May 2014, 32 (5): 462–4. PMC 4077321. PMID 24752080. arXiv:1308.3700. doi:10.1038/nbt.2862.
[26] Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology. May 2016, 34 (5): 525–7. PMID 27043002. doi:10.1038/nbt.3519.
[27] Wu TD, Watanabe CK. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics. May 2005, 21 (9): 1859–75. PMID 15728110. doi:10.1093/bioinformatics/bti310.
[28] Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. February 2017, 14 (2): 135–139. PMC 5792058. PMID 27941783. doi:10.1038/nmeth.4106.
[29] Engström PG, Steijger T, Sipos B, Grant GR, Kahles A, Rätsch G, et al. Systematic evaluation of spliced alignment programs for RNA-seq data. Nature Methods. December 2013, 10 (12): 1185–91. PMC 4018468. PMID 24185836. doi:10.1038/nmeth.2722.
[30] Lu B, Zeng Z, Shi T. Comparative study of de novo assembly and genome-guided assembly strategies for transcriptome reconstruction based on RNA-Seq. Science China Life Sciences. February 2013, 56 (2): 143–55. PMID 23393030. doi:10.1007/s11427-013-4442-z.
[31] Bradnam KR, Fass JN, Alexandrov A, Baranay P, Bechner M, Birol I, Boisvert S, Chapman JA, Chapuis G, Chikhi R, Chitsaz H, Chou WC, Corbeil J, Del Fabbro C, Docking TR, Durbin R, Earl D, Emrich S, Fedotov P, Fonseca NA, Ganapathy G, Gibbs RA, Gnerre S, Godzaridis E, Goldstein S, Haimel M, Hall G, Haussler D, Hiatt JB, Ho IY, Howard J, Hunt M, Jackman SD, Jaffe DB, Jarvis ED, Jiang H, Kazakov S, Kersey PJ, Kitzman JO, Knight JR, Koren S, Lam TW, Lavenier D, Laviolette F, Li Y, Li Z, Liu B, Liu Y, Luo R, Maccallum I, Macmanes MD, Maillet N, Melnikov S, Naquin D, Ning Z, Otto TD, Paten B, Paulo OS, Phillippy AM, Pina-Martins F, Place M, Przybylski D, Qin X, Qu C, Ribeiro FJ, Richards S, Rokhsar DS, Ruby JG, Scalabrin S, Schatz MC, Schwartz DC, Sergushichev A, Sharpe T, Shaw TI, Shendure J, Shi Y, Simpson JT, Song H, Tsarev F, Vezzi F, Vicedomini R, Vieira BM, Wang J, Worley KC, Yin S, Yiu SM, Yuan J, Zhang G, Zhang H, Zhou S, Korf IF. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. GigaScience. July 2013, 2 (1): 10. Bibcode:2013arXiv1301.5406B. PMC 3844414. PMID 23870653. arXiv:1301.5406. doi:10.1186/2047-217X-2-10.
[32] Hölzer M, Marz M. De novo transcriptome assembly: A comprehensive cross-species comparison of short-read RNA-Seq assemblers. GigaScience. May 2019, 8 (5). PMC 6511074. PMID 31077315. doi:10.1093/gigascience/giz039.

External links

RNA-Seq for Everyone (archived at the Internet Archive): a high-level guide to designing and implementing an RNA-Seq experiment.
ChIPBase database: provides expression profiles of protein-coding genes and long non-coding RNAs (lncRNAs, lincRNAs) from RNA-Seq data of 22 tissues.
Martin A. Perdacher (September 2011). Next-Generation Sequencing and its Applications in RNA-Seq [permanent dead link]. Theory part of the bachelor's thesis, Hagenberg.

