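Guizhou lies in southwest China, within the center of origin of the tea plant, and hosts abundant germplasm of Camellia section Thea. Because of the province's long-standing economic underdevelopment and its shortage of usable land, these resources have seen only limited use. That same lack of exploitation makes them valuable material for studying genetic diversity, population structure, and evolutionary relationships. In this study, the authors used 415 tea accessions collected across Guizhou. By ecotype, 159 were wild and 256 were cultivated (174 cultivated ancient tea trees, 77 modern landraces, and 5 bred cultivars); by taxonomy, 251 were C. sinensis, 100 were C. tachangensis, 59 were C. remotiserrata, and 5 were putative C. taliensis. The accessions were first assigned to six regions according to Guizhou's tea climatic zoning. SNPs were then discovered with genotyping-by-sequencing (GBS), a reduced-representation sequencing strategy, and the resulting high-quality SNPs were used for all downstream analyses.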
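First, the genetic diversity assessment showed that the percentage of polymorphic loci was clearly higher in cultivated accessions than in wild ones, and that accessions from the regions most suitable for tea growth were more diverse than those from the less suitable and unsuitable regions. Wild germplasm is generally expected to be more diverse than cultivated germplasm, because it has not passed through artificial domestication, yet the authors found the opposite. One possible explanation is that most of the cultivated accessions in this study are long-preserved local resources of diverse origin that were propagated mainly by seed, so that unintentional human intervention enriched their genetic background. The wild accessions, by contrast, mostly came from mountain forests far from human activity, and most of those sampled were wild ancient tea trees; Guizhou's relatively isolated geography limited their gene flow with outside populations, leaving their genetic background comparatively narrow.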
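The diversity statistics mentioned here (percentage of polymorphic loci, gene diversity, PIC) can be illustrated with a minimal sketch for biallelic SNPs. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for missing calls; the function names and the toy matrix are hypothetical and do not reproduce the authors' pipeline.

```python
def alt_allele_freq(genotypes):
    """Alternate-allele frequency at one SNP; genotypes are 0/1/2 or None."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable calls at this locus
    return sum(called) / (2 * len(called))


def gene_diversity(p):
    """Expected heterozygosity for a biallelic locus: 1 - (p^2 + q^2)."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def pic(p):
    """Polymorphism information content for a biallelic locus."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


def percent_polymorphic(matrix):
    """Percentage of loci (rows) where both alleles are observed."""
    freqs = [alt_allele_freq(row) for row in matrix]
    called = [f for f in freqs if f is not None]
    polymorphic = [f for f in called if 0.0 < f < 1.0]
    return 100.0 * len(polymorphic) / len(called)


# Toy data: 4 SNPs (rows) x 3 accessions (columns), purely illustrative.
toy = [
    [0, 0, 0],     # monomorphic (reference allele only)
    [0, 1, 2],     # polymorphic
    [2, 2, 2],     # monomorphic (alternate allele only)
    [1, 1, None],  # polymorphic, one missing call
]
```

Computing these values separately for the wild and cultivated subsets is one way to make the kind of comparison described above.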
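Population structure was then analyzed using SNPs that had been filtered for linkage disequilibrium (LD). Wild and cultivated accessions were clearly genetically distinct, and cultivated ancient tea trees could also be clearly separated from modern landraces, although the genetic distance between these two cultivated groups was smaller than that between the wild and cultivated types. Cluster analysis further placed C. tachangensis and C. remotiserrata accessions from climate region Ia in the same group. This may reflect gene flow among section Thea species; it may also reflect the fact that the taxonomy of section Thea still lacks clear delimitation, so that the current classification criteria cannot fundamentally separate these inherently complex taxa.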
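Further analysis of genetic variation showed high Fst values between wild and cultivated accessions and among wild accessions from different regions, but low values among the cultivated groups. The wild germplasm is therefore strongly differentiated from the cultivated germplasm, whereas cultivated ancient tea trees and modern landraces are only weakly differentiated from each other. Together with the relatively complex genetic structure found in the cultivated ancient tea trees, this suggests that Guizhou's cultivated ancient tea trees have high value for direct use in breeding.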
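As a rough illustration of what an Fst value measures, the sketch below computes Wright's Fst for two populations as (HT - HS) / HT, summed over loci, from per-locus allele frequencies. It assumes equal population weights and biallelic SNPs; the summary does not say which estimator the authors used, so this simple form and the name `wright_fst` are assumptions made for illustration only.

```python
def wright_fst(freqs_pop1, freqs_pop2):
    """Wright's Fst for two populations from per-locus allele frequencies.

    Uses a ratio of sums over loci: (sum HT - sum HS) / sum HT, where
    HT is the expected heterozygosity of the pooled population and
    HS is the mean expected heterozygosity within populations.
    Both populations are weighted equally (an assumption of this sketch).
    """
    ht_sum = 0.0
    hs_sum = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        p_bar = (p1 + p2) / 2.0
        ht_sum += 2.0 * p_bar * (1.0 - p_bar)
        hs_sum += (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if ht_sum == 0.0:
        return 0.0  # no variation anywhere, so no measurable differentiation
    return (ht_sum - hs_sum) / ht_sum
```

With this definition, populations with identical allele frequencies give 0, and populations fixed for opposite alleles give 1, which matches the reading of "high" and "low" Fst in the paragraph above.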
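The study also analyzed LD in these accessions. LD decayed quickly, which is consistent with the rapid LD decay generally seen in outcrossing crops. The r² values were relatively low, however, which may be related to the low genome coverage of the SNPs in this study.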
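For readers unfamiliar with r², one common approximation for unphased data such as GBS calls is the squared Pearson correlation between genotype dosages at two SNPs. The sketch below assumes dosages coded 0, 1, 2 with `None` for missing calls; the function name is hypothetical, and the authors' actual LD software and settings are not stated in this summary.

```python
def genotype_r2(dosage_a, dosage_b):
    """Squared Pearson correlation of genotype dosages at two SNPs.

    Accessions with a missing call (None) at either SNP are skipped.
    Returns None when r^2 is undefined (too few samples or no variation).
    """
    pairs = [(a, b) for a, b in zip(dosage_a, dosage_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None  # a monomorphic SNP carries no LD information
    return (cov * cov) / (var_a * var_b)
```

An LD decay curve is then typically obtained by averaging such pairwise r² values within bins of physical distance between SNPs and plotting the bin means against distance.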
Abstract
Background
To efficiently protect and exploit germplasm resources for marker development and breeding purposes, we must accurately depict the features of the tea populations. This study focuses on the Camellia sinensis (C. sinensis) population and aims to (i) identify single nucleotide polymorphisms (SNPs) at the genome level, (ii) investigate the genetic diversity and population structure, and (iii) characterize the linkage disequilibrium (LD) pattern to facilitate subsequent genome-wide association mapping and marker-assisted selection.
Results
We collected 415 tea accessions from the Origin Center and analyzed the genetic diversity, population structure and LD pattern using the genotyping-by-sequencing (GBS) approach. A total of 79,016 high-quality SNPs were identified; the polymorphism information content (PIC) and genetic diversity (GD) based on these SNPs showed a higher level of genetic diversity in the cultivated type than in the wild type. The 415 accessions were clustered into three groups (wild type, cultivated type, and admixed wild type) by STRUCTURE software, and this grouping was confirmed by principal component analysis (PCA). However, unweighted pair group method with arithmetic mean (UPGMA) trees indicated that the accessions should be grouped into more clusters. Further analyses with STRUCTURE identified four groups: the Pure Wild Type, the Admixed Wild Type, ancient landraces and modern landraces; these results were confirmed by PCA and the UPGMA tree method. A higher level of genetic diversity was detected in ancient landraces and the Admixed Wild Type than in the Pure Wild Type and modern landraces. The highest differentiation was between the Pure Wild Type and modern landraces. A relatively fast LD decay over a short range (kb) was observed, and LD decay differed among the four inferred populations.
Conclusions
This study is, to our knowledge, the first population genetic analysis of tea germplasm from the Origin Center, Guizhou Plateau, using GBS. The LD pattern, population structure and genetic differentiation of the tea population revealed by our study will benefit further genetic studies, germplasm protection, and breeding.
Editor in charge: 千鹤茶苗