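1. Sequencing and Annotating Genomes
Sequencing methods: Sanger dideoxy chain-termination sequencing, 454 pyrosequencing, and sequencing by overlapping PCR
Finding ORFs: look for start and stop codons and for a ribosome-binding site (RBS) sequence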
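Codon bias (codon usage): some codons are used more frequently than others
An ORF whose codon bias differs markedly from the rest of the genome may be nonfunctional or may have been acquired by horizontal gene transfer.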
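The ORF search and codon-usage ideas above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions: it scans the forward strand only, treats ATG as the sole start codon, and ignores the RBS. The function names `find_orfs` and `codon_usage` and the toy sequence are hypothetical, not taken from any annotation tool.

```python
from collections import Counter

START = "ATG"                      # assumed single start codon
STOPS = {"TAA", "TAG", "TGA"}      # standard stop codons

def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    An ORF runs from an ATG to the first in-frame stop codon;
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)

def codon_usage(orf_seq):
    """Relative frequency of each codon in an ORF (final stop codon excluded)."""
    codons = [orf_seq[i:i + 3] for i in range(0, len(orf_seq) - 3, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

toy = "CCATGAAAGGGAAATAACC"        # invented sequence for illustration
for start, end in find_orfs(toy):
    print(start, end, codon_usage(toy[start:end]))
```

Comparing the `codon_usage` profile of one ORF with the genome-wide average is one simple way to flag ORFs with unusual codon bias.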
2. Nanoarchaeum equitans has one of the smallest known cellular genomes
The smallest cellular genomes belong to parasitic or endosymbiotic prokaryotes
The largest prokaryotic genomes can be larger than those of some eukaryotes
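3. As the number of ORFs increases, the share of genes devoted to transcription (regulation) and signal transduction rises, while the share devoted to DNA replication and translation falls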
4. Parasites do not necessarily have few genes, e.g., Trichomonas
5. Gene Duplications and Deletions: major events in the evolution of genomes.
Homologous genes: genes sharing evolutionary ancestry → gene family.
Homologs: 1) Paralogs: arise from gene duplication within the same organism;
2) Orthologs: found in different species, derived from a common ancestral gene;
Other major events: horizontal gene transfer (HGT) and transposons
6. Integrons: genetic elements that collect and express genes carried on mobile segments of DNA (gene cassettes).
7. Core Genome: chromosome fragments present in all strains of a species;
Pan Genome: the core genome plus the optional parts present in some but not all strains of the species.
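As a small illustration of these definitions, the shared and optional gene sets can be computed with plain set operations. The sketch below assumes each strain is already reduced to a list of gene names; the strain and gene names and the function name `core_and_optional` are invented for the example.

```python
def core_and_optional(strain_genes):
    """Split the genes of several strains into shared and optional sets.

    strain_genes maps a strain name to an iterable of gene names.
    Returns (core, optional, union):
      core     - genes present in every strain
      optional - genes present in some but not all strains
      union    - core plus optional (the pan genome in the usual definition)
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set(), set()
    core = set.intersection(*gene_sets)
    union = set.union(*gene_sets)
    return core, union - core, union

# Hypothetical strains with made-up gene content
strains = {
    "strain_A": ["dnaA", "rpoB", "lacZ"],
    "strain_B": ["dnaA", "rpoB", "tetA"],
    "strain_C": ["dnaA", "rpoB"],
}
core, optional, union = core_and_optional(strains)
print(sorted(core), sorted(optional), len(union))
```

In practice, deciding that two genes from different strains are "the same gene" requires ortholog clustering, which this sketch skips by assuming identical names.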
8. Genome: the entire genetic information of a cell or virus
Metagenome: the total genetic complement of all the cells present in a particular environment
Transcriptome: the total RNA produced in an organism under a specific set of conditions
9. Metagenomics: study of the total gene content of the organisms inhabiting an environment → phylogenetic analysis, functional gene analysis, and direct sequencing.
When should a metagenomic approach be used?
Examples of Metagenomic Studies
– Several environments surveyed
– Extreme environments (e.g., highly acidic mine runoff waters) have low diversity, so community DNA can be assembled into individual genomes
– Complex environments are much more challenging, so complete genome assemblies are difficult
– Most genes from natural habitats are viral
– Can analyze for presence/distribution of specific microbial groups, e.g., in water bodies, extreme environments, and the gut microbiota
Method:
Sample → lyse cells and extract DNA → sequence DNA → assemble genomes → genomic analysis
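To make the "assemble genomes" step more concrete, here is a toy greedy overlap assembler. It is only a sketch of the idea of joining reads by their overlaps: it assumes error-free reads from a single strand, and real assemblers use far more elaborate graph-based methods. The function names and the reads are invented.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the longest overlap."""
    contigs = list(dict.fromkeys(reads))      # drop exact duplicate reads
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:                       # nothing left to merge
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]     # made-up reads
print(greedy_assemble(reads))
```

With a low-diversity community most reads come from a few genomes, so overlaps are rarely ambiguous. In a complex community, similar sequences from different organisms produce misleading overlaps, which is one reason complete assemblies become difficult.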
10. Transcriptomics: study of a cell's global transcription; monitors total RNA (the transcriptome) under a given growth condition.
– two main approaches: microarrays and RNA-Seq
- Microarrays and the DNA Gene Chip
– microarrays (gene chips): small solid supports to which genes or oligonucleotides are fixed and arrayed spatially in a known pattern.
Transcriptome; functional gene chip
– measure DNA or RNA that hybridizes (single strands forming double-stranded molecules by complementary or almost complementary base pairing) with known nucleic acid probes, detected by fluorescence
Drawbacks: only known genes can be measured, and the cost is relatively high
- RNA-Seq Analysis
– all RNA is converted (reverse transcribed) into cDNA and sequenced
– shows which genes are transcribed and how many copies of each RNA are made
– measures mRNA expression levels
– identifies long untranslated regions
– discovers noncoding RNAs
– requires high-throughput (second-generation) sequencing
– rRNA must be removed or mRNA enriched: rRNA makes up more than 80% of the RNA and would otherwise dominate the signal
Example: RNA-Seq analysis of the heterocyst-forming cyanobacterium Anabaena during nitrogen starvation, showing changes in the expression levels of some genes
Example: Transcriptomic analysis of sporulation genes in Clostridium
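Comparing expression between two conditions, as in the examples above, usually starts from per-gene read counts. The sketch below shows one common and simple normalization, counts per million (CPM), followed by a log2 fold change. The choice of CPM is an assumption for illustration, not the method used in the cited studies, and the gene labels and counts are invented.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}

def log2_fold_change(cpm_a, cpm_b, pseudocount=1.0):
    """log2(condition B / condition A) per gene.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return {
        gene: math.log2((cpm_b[gene] + pseudocount) / (cpm_a[gene] + pseudocount))
        for gene in cpm_a
    }

# Made-up read counts for two hypothetical genes under two conditions
with_nitrogen = {"geneX": 50, "geneY": 950}
starved = {"geneX": 600, "geneY": 400}

changes = log2_fold_change(counts_per_million(with_nitrogen),
                           counts_per_million(starved))
for gene, lfc in changes.items():
    print(gene, round(lfc, 2))
```

A positive value means the gene has relatively more transcripts in the second condition. Normalizing by library size matters because two sequencing runs rarely produce the same total number of reads.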
11. Proteomics
genome-wide study of the structure, function, and activity of an organism's proteins
- Proteome: all proteins encoded by the genome, or only those present at a given time (translatome: proteins made under specific conditions)
- Methods in Proteomics
– Mass spectrometry allows unambiguous determination of molecular formula, and can be used to identify peptides
– HPLC is used to separate proteins by differences in chemical properties
– Proteins are collected after HPLC and digested by proteases; the resulting peptides are identified by mass spectrometry and compared with the translated genome
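The last step, comparing identified peptides with the translated genome, can be sketched as an in-silico digestion followed by a lookup. The example assumes the protease is trypsin (cleavage after K or R unless the next residue is P), which the notes do not specify, and it assumes the peptide sequences have already been inferred from the mass spectra. All protein names and sequences are invented.

```python
def tryptic_digest(protein):
    """Cut a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):          # C-terminal peptide without K/R
        peptides.append(protein[start:])
    return peptides

def identify_proteins(observed_peptides, proteome):
    """Map each predicted protein to the observed peptides it explains."""
    observed = set(observed_peptides)
    hits = {}
    for name, sequence in proteome.items():
        matched = observed & set(tryptic_digest(sequence))
        if matched:
            hits[name] = sorted(matched)
    return hits

# Hypothetical proteins predicted from a translated genome
proteome = {"protA": "MKTAYRPLKGG", "protB": "MSSRLLK"}
observed = ["TAYRPLK", "LLK", "XXXX"]
print(identify_proteins(observed, proteome))
```

Real search engines match measured peptide masses and fragment spectra rather than exact sequences, so this only shows the logic of the comparison.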
– Matrix-assisted laser desorption ionization (MALDI): advanced mass spectrometry method in which the sample is fixed to a matrix, ionized, and vaporized, and the molecular formula is determined (MALDI-TOF)
12. Interactome: the complete set of interactions among the macromolecules within a cell
13. Metabolome:
complete set of metabolic intermediates and small molecules produced in an organism
– Reflects enzymatic pathways
– Confirms that reactions occurred
- Advances in Metabolomic Techniques: NIMS
– Technically challenging due to immense chemical diversity
– Nanostructure-initiator mass spectrometry (NIMS) can directly analyze samples without special preparation
14. Single-cell genomics:
sequencing the genomes of individual cells
– Studying metabolic potential of microorganisms in natural communities
– Transcriptome and proteome analyses can also be performed on single cells
- Cell Isolation and Sample Preparation
– Dilution in microwells, encapsulation, fluorescence-activated cell sorting (FACS)
– Multiple displacement amplification (MDA): a PCR-like amplification method that yields enough DNA from one cell for sequencing. Contamination must be strictly avoided.
Cells for single-cell sequencing can be sorted from hard-to-culture environmental microorganisms or taken from isolated pure cultures.
Isolation and sequencing of single cells:
✓ Metabolic genes assigned to particular species
✓ Microbial dark matter
When isolating from soil, many microorganisms are missed because the culture and screening conditions are unsuitable, yet these microorganisms carry out important functions