Establishing the Pangenome

Reference genomes are widely used to guide efficient analyses of genetic diversity, evolutionary relationships, and other emerging questions. However, a reference genome is typically based on a single individual and therefore poorly represents the diversity of its species. Pangenomes (literally “everything”-genomes) are one proposed solution: they integrate many high-quality assemblies, and thus all their diversity, into one structure. These pangenomes can then be used for many genetic analyses while minimizing the bias inherent in a single-individual reference genome. Our recent work establishes that pangenomes can be constructed from diverse assembly inputs that differ in sequencing technology, assembler, and coverage. We have also shown that pangenome approaches based on haplotype-resolved assemblies can resolve variation that is challenging or inaccessible to conventional short- or long-read analyses. Our ongoing work will continue to explore the potential of bovine pangenomes in multiple contexts.

Projects

Establishing the bovine pangenome

We produced reference-quality assemblies for different breeds of cattle using trio-binning and aim to integrate them into a bovine “super”-pangenome. An increasing number of high-quality bovine assemblies are available, and we investigate the risk/reward tradeoff of integrating highly divergent species into a unified reference. We are also investigating the optimal pangenome structure. Our efforts are coordinated with the international bovine pan-genome consortium.

Towards pangenome-wide association testing

Our aim is to characterize sequence variation beyond SNPs and small insertion and deletion polymorphisms in the Brown Swiss cattle population. To this end, we are producing multiple haplotype-resolved assemblies from representative Brown Swiss trios. As we continue to produce Brown Swiss assemblies, the pangenome is expected to grow and make structural variation amenable to association testing. We aim to conduct association studies using the pangenome supplemented with short-read or RNA-sequencing data to discover trait-relevant structural variants.

Publications

Leonard, A., Crysnanto, D., Fang, Z. H. F., Heaton, M. P., Vander Ley, B. L., Herrera, C., Bollwein, H., Bickhart, D. M., Kuhn, K. L., Smith, T. P. L., Rosen, B. D., & Pausch, H. (2021). Bovine pangenome reveals trait-associated structural variation from diverse assembly inputs. Preprint at bioRxiv.

Crysnanto, D., Leonard, A. S., Fang, Z. H., & Pausch, H. (2021). Novel functional sequences uncovered through a bovine multiassembly graph. Proceedings of the National Academy of Sciences, 118(20). DOI: 10.1073/pnas.2101056118

Crysnanto, D., & Pausch, H. (2020). Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery. Genome Biology, 21(1), 1-27. DOI: 10.1186/s13059-020-02105-0
