Wednesday, January 1, 2014

An Introduction to Commonly Used microRNA Target Prediction Software

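The era in which RNA was regarded as merely an intermediary between DNA and protein is over; the discovery of microRNAs has put RNA back in the research spotlight.
microRNAs (miRNAs) are endogenous small noncoding RNAs about 22 nt long, found widely in animals, plants, and viruses. In 1993, Lee et al. identified the first miRNA, lin-4, in Caenorhabditis elegans; further work showed that lin-4 RNA lowers LIN-14 protein levels by binding specifically to the 3′ UTR of the lin-14 gene. miRNA genes usually lie in intergenic or intronic regions and are transcribed by RNA polymerase into pri-miRNAs, which carry a cap structure and a poly(A) tail. The nuclease Drosha cleaves the pri-miRNA into a pre-miRNA of about 70 nt, and the nuclease Dicer then cleaves the pre-miRNA, ultimately yielding the mature single-stranded miRNA of about 22 nt. The mature miRNA assembles with Argonaute and other proteins into the RNA-induced silencing complex (RISC), which represses target gene expression.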

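miRNAs repress target gene expression at the post-transcriptional level through partial complementary pairing with the target mRNA. Studies have shown that miRNAs take part in many biological processes, including cell proliferation, apoptosis, differentiation, metabolism, development, and tumor metastasis. However, miRNAs are not perfectly complementary to their targets, which makes identifying miRNA targets difficult. By analyzing known miRNAs and their targets, researchers found the following key features: the 3′ UTR of a target gene contains a region that pairs perfectly with at least 7 contiguous nucleotides at the miRNA 5′ end (positions 2-8); this part of the miRNA is called the seed sequence, and the mRNA region complementary to the seed is often conserved across species. Building on these features of miRNAs and their target mRNAs, researchers have developed computer programs that infer miRNA targets. The rest of this post briefly introduces several of them.
Software for miRNA target prediction generally follows these principles: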
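1 Sequence complementarity
Watson-Crick pairing between the seed sequence at the miRNA 5′ end (core positions 2-7) and the 3′ UTR of the target gene is the most important factor in all miRNA target prediction. Pairing takes several forms, as shown in the figure:
Most sites are 7-nt matches: either positions 2-7 pair with the target and the target carries an A opposite miRNA position 1 (7mer-A1 site), or miRNA positions 2-8 pair perfectly with the target (7mer-m8 site). Sites in which positions 2-8 pair perfectly and the target also carries an A opposite position 1 (8mer site) are more specific. Sites in which only positions 2-7 pair perfectly (6mer site) give higher sensitivity when searching for targets, at the cost of lower specificity. Beyond the seed, there are two further forms: the 3′ supplementary site and the 3′ compensatory site.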
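The four canonical site types described above can be told apart with a simple string search. The following is a minimal sketch, not the algorithm of any program mentioned in this post; the function names are hypothetical, and it assumes both sequences are given 5′ to 3′ and ignores 3′ supplementary and compensatory pairing.

```python
# Illustrative sketch only: classify canonical seed-match sites in a 3' UTR.
# All names are hypothetical and not taken from any published tool.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(mirna, utr):
    """Return a list of (utr_index, site_type) for every canonical site.

    utr_index is the 0-based UTR position where the match to miRNA
    positions 2-7 begins; only the strongest type is reported per site.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    core = reverse_complement(mirna[1:7])  # pairs with miRNA positions 2-7
    m8 = COMPLEMENT[mirna[7]]              # pairs with miRNA position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        # In the 5'->3' UTR, the partner of miRNA position 8 sits just
        # upstream of the core match; the base opposite position 1 sits
        # just downstream of it.
        has_m8 = start > 0 and utr[start - 1] == m8
        end = start + 6
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            site_type = "8mer"
        elif has_m8:
            site_type = "7mer-m8"
        elif has_a1:
            site_type = "7mer-A1"
        else:
            site_type = "6mer"
        sites.append((start, site_type))
        start = utr.find(core, start + 1)
    return sites
```

For example, with the let-7a sequence `UGAGGUAGUAGGUUGUAUAGUU`, the UTR fragment `AAACUACCUCAGGG` is reported as a single 8mer site. Real predictors add scoring on top of this kind of match.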

[Figure: forms of miRNA seed pairing]

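2 Sequence conservation and other factors
Besides sequence complementarity, target prediction also weighs sequence conservation, thermodynamics, site accessibility, and UTR base composition, among other factors.
Sequence conservation: a miRNA binding site that is conserved across multiple species is more likely to be a genuine miRNA target site.
Thermodynamics: the free energy of formation of the miRNA:target duplex; the lower the free energy, the more likely the interaction.
Site accessibility: the secondary structure of the mRNA affects its ability to form a duplex with the miRNA.
UTR base composition: the position of the miRNA binding site within the UTR and the base composition around it also affect miRNA binding to the target site and the efficiency of RISC.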
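Two of these factors, conservation and local base composition, are simple enough to sketch in a few lines. The code below is only an illustration under loud assumptions: the function names, the 30-nt flank, and the "present anywhere in the UTR" conservation test are all hypothetical simplifications. Real tools check conservation at aligned positions, and the thermodynamic and accessibility terms require RNA folding software, which is not attempted here.

```python
# Illustrative sketch only; names, window size and inputs are hypothetical.

def site_is_conserved(seed_match, orthologous_utrs, min_species=2):
    """True if the seed match occurs in the UTRs of at least min_species species.

    orthologous_utrs maps a species name to a UTR sequence. This crude test
    accepts a match anywhere in the UTR; a real tool would require the site
    at the corresponding position of a multiple-species alignment.
    """
    hits = [sp for sp, utr in orthologous_utrs.items() if seed_match in utr]
    return len(hits) >= min_species


def local_au_content(utr, site_start, site_len, flank=30):
    """Fraction of A/U bases in the flanks on both sides of a site."""
    left = utr[max(0, site_start - flank):site_start]
    right = utr[site_start + site_len:site_start + site_len + flank]
    context = left + right
    if not context:
        return 0.0  # no flanking sequence available
    return sum(base in "AU" for base in context) / len(context)
```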
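In addition, factors such as the correlation between the expression pattern of a miRNA and the tissue distribution of its target genes are also important to consider when predicting targets.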
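One way to picture the expression-correlation idea is to rank candidate targets by how strongly their expression runs opposite to that of the miRNA across tissues. The sketch below assumes that repression shows up as negative Pearson correlation, in line with the hypothesis in the HOCTAR abstract quoted later in this post; the function names and input layout are hypothetical.

```python
# Illustrative sketch only; names and data layout are hypothetical.
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rank_by_anticorrelation(mirna_expr, candidate_expr):
    """Sort candidate genes from most negatively to most positively correlated.

    mirna_expr is a list of miRNA expression values, one per tissue;
    candidate_expr maps gene name -> expression values in the same tissues.
    """
    scored = [(gene, pearson(mirna_expr, expr))
              for gene, expr in candidate_expr.items()]
    return sorted(scored, key=lambda item: item[1])
```

Genes at the top of the returned list would be the candidates whose expression most strongly opposes that of the miRNA.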
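Many programs are available for miRNA target prediction, including miRanda, EMBL, PicTar, TargetScan(S), DIANA-microT 3.0, PITA, ElMMo, rna22, GenMiR++, TarBase, miRBase, and miRGen-Targets. They emphasize different features, and each has its own strengths. When choosing tools, rely mainly on two and supplement them with one or two others. In general, the intersection of predictions from different programs has better specificity. Several commonly used programs and their sources are listed below; to use a program, visit its website, and for its features and details, see the cited reference:
miRBase is the most widely used miRNA database; its target prediction function has since been handed over to microCosm (EBI).
For miRBase, see: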
Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. 2006. miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 34:D140-4.
Abstract: The miRBase database aims to provide integrated interfaces to comprehensive microRNA sequence data, annotation and predicted gene targets. miRBase takes over functionality from the microRNA Registry and fulfils three main roles: the miRBase Registry acts as an independent arbiter of microRNA gene nomenclature, assigning names prior to publication of novel miRNA sequences. miRBase Sequences is the primary online repository for miRNA sequence data and annotation. miRBase Targets is a comprehensive new database of predicted miRNA target genes. miRBase is available at http://microrna.sanger.ac.uk/.
TargetScan was developed by Lewis et al.; see:
Lewis, B.P., Shih, I.H., Jones-Rhoades, M.W., Bartel, D.P., Burge, C.B. 2003. Prediction of mammalian microRNA targets. Cell. 115, 787-798.
Abstract: MicroRNAs (miRNAs) can play important gene regulatory roles in nematodes, insects, and plants by base-pairing to mRNAs to specify posttranscriptional repression of these messages. However, the mRNAs regulated by vertebrate miRNAs are all unknown. Here we predict more than 400 regulatory target genes for the conserved vertebrate miRNAs by identifying mRNAs with conserved pairing to the 5’ region of the miRNA and evaluating the number and quality of these complementary sites. Rigorous tests using shuffled miRNA controls supported a majority of these predictions, with the fraction of false positives estimated at 31% for targets identified in human, mouse, and rat and 22% for targets identified in pufferfish as well as mammals. Eleven predicted targets (out of 15 tested) were supported experimentally using a HeLa cell reporter system. The predicted regulatory targets of mammalian miRNAs were enriched for genes involved in transcriptional regulation but also encompassed an unexpectedly broad range of other functions.
PicTar was developed by Krek et al.; for details, see:
Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, Epstein EJ, MacMenamin P, da Piedade I, Gunsalus KC, Stoffel M, Rajewsky N. 2005. Combinatorial microRNA target predictions. Nat Genet. 37(5):495-500.
Abstract: MicroRNAs are small noncoding RNAs that recognize and bind to partially complementary sites in the 3’ untranslated regions of target genes in animals and, by unknown mechanisms, regulate protein production of the target transcript. Different combinations of microRNAs are expressed in different cell types and may coordinately regulate cell-specific target genes. Here, we present PicTar, a computational method for identifying common targets of microRNAs. Statistical tests using genome-wide alignments of eight vertebrate genomes, PicTar’s ability to specifically recover published microRNA targets, and experimental validation of seven predicted targets suggest that PicTar has an excellent success rate in predicting targets for single microRNAs and for combinations of microRNAs. We find that vertebrate microRNAs target, on average, roughly 200 transcripts each. Furthermore, our results suggest widespread coordinate control executed by microRNAs. In particular, we experimentally validate common regulation of Mtpn by miR-375, miR-124 and let-7b and thus provide evidence for coordinate microRNA control in mammals.
For the prediction tool miRDB, see the following two papers:
Wang X. 2008. miRDB: a microRNA target prediction and functional annotation database with a wiki interface. RNA. 14(6):1012-7.
Abstract: MicroRNAs (miRNAs) are short noncoding RNAs that are involved in the regulation of thousands of gene targets. Recent studies indicate that miRNAs are likely to be master regulators of many important biological processes. Due to their functional importance, miRNAs are under intense study at present, and many studies have been published in recent years on miRNA functional characterization. The rapid accumulation of miRNA knowledge makes it challenging to properly organize and present miRNA function data. Although several miRNA functional databases have been developed recently, this remains a major bioinformatics challenge to miRNA research community. Here, we describe a new online database system, miRDB, on miRNA target prediction and functional annotation. Flexible web search interface was developed for the retrieval of target prediction results, which were generated with a new bioinformatics algorithm we developed recently. Unlike most other miRNA databases, miRNA functional annotations in miRDB are presented with a primary focus on mature miRNAs, which are the functional carriers of miRNA-mediated gene expression regulation. In addition, a wiki editing interface was established to allow anyone with Internet access to make contributions on miRNA functional annotation. This is a new attempt to develop an interactive community-annotated miRNA functional catalog. All data stored in miRDB are freely accessible at http://mirdb.org.
Wang X, El Naqa IM. 2008. Prediction of both conserved and nonconserved microRNA targets in animals. Bioinformatics. 24(3):325-32.
Abstract:
Motivation: MicroRNAs (miRNAs) are involved in many diverse biological processes and they may potentially regulate the functions of thousands of genes. However, one major issue in miRNA studies is the lack of bioinformatics programs to accurately predict miRNA targets. Animal miRNAs have limited sequence complementarity to their gene targets, which makes it challenging to build target prediction models with high specificity.
Results: Here we present a new miRNA target prediction program based on support vector machines (SVMs) and a large microarray training dataset. By systematically analyzing public microarray data, we have identified statistically significant features that are important to target downregulation. Heterogeneous prediction features have been non-linearly integrated in an SVM machine learning framework for the training of our target prediction model, MirTarget2. About half of the predicted miRNA target sites in human are not conserved in other organisms. Our prediction algorithm has been validated with independent experimental data for its improved performance on predicting a large number of miRNA down-regulated gene targets.
Availability: All the predicted targets were imported into an online database miRDB, which is freely accessible at http://mirdb.org.
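miRanda was designed and developed by Enright et al. It screens 3′ UTRs on three main criteria: sequence matching, the thermal stability of the miRNA:mRNA duplex, and conservation of the target site. For the original paper, see:
Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS. 2003. MicroRNA targets in Drosophila. Genome Biol. 5(1):R1.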
Abstract:
Background
The recent discoveries of microRNA (miRNA) genes and characterization of the first few target genes regulated by miRNAs in Caenorhabditis elegans and Drosophila melanogaster have set the stage for elucidation of a novel network of regulatory control. We present a computational method for whole-genome prediction of miRNA target genes. The method is validated using known examples. For each miRNA, target genes are selected on the basis of three properties: sequence complementarity using a position-weighted local alignment algorithm, free energies of RNA-RNA duplexes, and conservation of target sites in related genomes. Application to the D. melanogaster, Drosophila pseudoobscura and Anopheles gambiae genomes identifies several hundred target genes potentially regulated by one or more known miRNAs.
Results
These potential targets are rich in genes that are expressed at specific developmental stages and that are involved in cell fate specification, morphogenesis and the coordination of developmental processes, as well as genes that are active in the mature nervous system. High-ranking target genes are enriched in transcription factors two-fold and include genes already known to be under translational regulation. Our results reaffirm the thesis that miRNAs have an important role in establishing the complex spatial and temporal patterns of gene activity necessary for the orderly progression of development and suggest additional roles in the function of the mature organism. In addition the results point the way to directed experiments to determine miRNA functions.
Conclusions
The emerging combinatorics of miRNA target sites in the 3' untranslated regions of messenger RNAs are reminiscent of transcriptional regulation in promoter regions of DNA, with both one-to-many and many-to-one relationships between regulator and target. Typically, more than one miRNA regulates one message, indicative of cooperative translational control. Conversely, one miRNA may have several target genes, reflecting target multiplicity. As a guide to focused experiments, we provide detailed online information about likely target genes and binding sites in their untranslated regions, organized by miRNA or by gene and ranked by likelihood of match. The target prediction algorithm is freely available and can be applied to whole genome sequences using identified miRNA sequences.
For the prediction tool HOCTAR, see:
Gennarino VA, Sardiello M, Avellino R, Meola N, Maselli V, Anand S, Cutillo L, Ballabio A, Banfi S. 2009. MicroRNA target prediction by expression analysis of host genes. Genome Res. 19(3):481-90.
Abstract: MicroRNAs (miRNAs) are small noncoding RNAs that control gene expression by inducing RNA cleavage or translational inhibition. Most human miRNAs are intragenic and are transcribed as part of their hosting transcription units. We hypothesized that the expression profiles of miRNA host genes and of their targets are inversely correlated and devised a novel procedure, HOCTAR (host gene oppositely correlated targets), which ranks predicted miRNA target genes based on their anti-correlated expression behavior relative to their respective miRNA host genes. HOCTAR is the first tool for systematic miRNA target prediction that utilizes the same set of microarray experiments to monitor the expression of both miRNAs (through their host genes) and candidate targets. We applied the procedure to 178 human intragenic miRNAs and found that it performs better than currently available prediction softwares in pinpointing previously validated miRNA targets. The high-scoring HOCTAR predicted targets were enriched in Gene Ontology categories, which were consistent with previously published data, as in the case of miR-106b and miR-93. By means of overexpression and loss-of-function assays, we also demonstrated that HOCTAR is efficient in predicting novel miRNA targets and we identified, by microarray and qRT-PCR procedures, 34 and 28 novel targets for miR-26b and miR-98, respectively. Overall, we believe that the use of HOCTAR significantly reduces the number of candidate miRNA targets to be tested compared to the procedures based solely on target sequence recognition. Finally, our data further confirm that miRNAs have a significant impact on the mRNA levels of most of their targets.
The TarBase database is notable chiefly for containing experimentally validated miRNA targets; see:
Sethupathy P, Corda B, Hatzigeorgiou AG. 2006. TarBase: A comprehensive database of experimentally supported animal microRNA targets. RNA. 12(2):192-7.
MicroRNAs (miRNAs) are ~22-nt RNA segments that are involved in the regulation of protein expression primarily by binding to one or more target sites on an mRNA transcript and inhibiting translation. MicroRNAs are likely to factor into multiple developmental pathways, multiple mechanisms of gene regulation, and underlie an array of inherited disease processes and phenotypic determinants. Several computational programs exist to predict miRNA targets in mammals, fruit flies, worms, and plants. However, to date, there is no systematic collection and description of miRNA targets with experimental support. We describe a database, TarBase, which houses a manually curated collection of experimentally tested miRNA targets, in human/mouse, fruit fly, worm, and zebrafish, distinguishing between those that tested positive and those that tested negative. Each positive target site is described by the miRNA that binds it, the gene in which it occurs, the nature of the experiments that were conducted to test it, the sufficiency of the site to induce translational repression and/or cleavage, and the paper from which all these data were extracted. Additionally, the database is functionally linked to several other useful databases such as Gene Ontology (GO) and UCSC Genome Browser. TarBase reveals significantly more experimentally supported targets than even recent reviews claim, thereby providing a comprehensive data set from which to assess features of miRNA targeting that will be useful for the next generation of target prediction programs. TarBase can be accessed at http://www.diana.pcbi.upenn.edu/tarbase. 
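The earlier advice, that targets predicted by several programs at once tend to be more specific, can be sketched as a simple vote across tools. This is a minimal illustration: the function name, the `min_tools` parameter, and the input layout are hypothetical, and each tool's predictions are assumed to have already been exported as a list of gene identifiers in a shared naming scheme.

```python
# Illustrative sketch only; names and input layout are hypothetical.
from collections import Counter


def consensus_targets(predictions, min_tools=2):
    """Return the genes predicted by at least min_tools different tools.

    predictions maps a tool name to an iterable of predicted gene identifiers.
    Setting min_tools to the number of tools gives the strict intersection.
    """
    counts = Counter()
    for genes in predictions.values():
        counts.update(set(genes))  # count each gene once per tool
    return {gene for gene, n in counts.items() if n >= min_tools}
```

Raising `min_tools` trades sensitivity for specificity, which mirrors the suggestion to rely on two main tools and use one or two more as support.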
URLs for some of the prediction tools are given in the table below:
[Table: URLs of miRNA target prediction tools]

Table source: Sethupathy P, Corda B, Hatzigeorgiou AG. 2006. TarBase: A comprehensive database of experimentally supported animal microRNA targets. RNA. 12(2):192-7.
