An organism-wide ATAC-seq peak catalog for the bovine and its use to identify regulatory variants [RESOURCES]

Can Yuan1, Lijing Tang1, Thomas Lopdell2, Vyacheslav A. Petrov1, Claire Oget-Ebrad1, Gabriel Costa Monteiro Moreira1, José Luis Gualdrón Duarte1, Arnaud Sartelet3, Zhangrui Cheng4, Mazdak Salavati4,8, D. Claire Wathes4, Mark A. Crowe5, GplusE Consortium5,7, Wouter Coppieters6, Mathew Littlejohn2, Carole Charlier1, Tom Druet1, Michel Georges1 and Haruko Takeda1

1 Unit of Animal Genomics, GIGA-R and Faculty of Veterinary Medicine, University of Liège, 4000 Liège, Belgium
2 Research and Development, Livestock Improvement Corporation, Hamilton 3240, New Zealand
3 Clinical Department of Ruminant, University of Liège, 4000 Liège, Belgium
4 Royal Veterinary College, Hatfield, Herts AL9 7TA, United Kingdom
5 School of Veterinary Medicine, University College Dublin, Dublin 4, Ireland
6 GIGA Genomics platform, GIGA Institute, University of Liège, 4000 Liège, Belgium

8 Present address: Dairy Research and Innovation Centre, Scotland's Rural College, Barony Campus, Dumfries DG1 3NE, UK

Corresponding author: michel.georges@uliege.be

Abstract

We report the generation of an organism-wide catalog of 976,813 cis-acting regulatory elements for the bovine detected by the assay for transposase accessible chromatin using sequencing (ATAC-seq). We regroup these regulatory elements in 16 components by nonnegative matrix factorization. Correlation between the genome-wide density of peaks and transcription start sites, correlation between peak accessibility and expression of neighboring genes, and enrichment in transcription factor binding motifs support their regulatory potential. Using a previously established catalog of 12,736,643 variants, we show that the proportion of single-nucleotide polymorphisms mapping to ATAC-seq peaks is higher than expected and that this is owing to an approximately 1.3-fold higher mutation rate within peaks. Their site frequency spectrum indicates that variants in ATAC-seq peaks are subject to purifying selection. We generate eQTL data sets for liver and blood and show that variants that drive eQTL fall into liver- and blood-specific ATAC-seq peaks more often than expected by chance. We combine ATAC-seq and eQTL data to estimate that the proportion of regulatory variants mapping to ATAC-seq peaks is approximately one in three and that the proportion of variants mapping to ATAC-seq peaks that are regulatory is approximately one in 25. We discuss the implications of these findings for the utility of ATAC-seq information to improve the accuracy of genomic selection.
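As an illustration of the grouping step mentioned in the abstract, the following is a minimal sketch of nonnegative matrix factorization applied to a peak-by-sample accessibility matrix. It is not the authors' pipeline: the abstract does not state which NMF algorithm or software was used, so the sketch assumes the standard multiplicative-update scheme, and the function name `nmf`, the toy matrix `X`, and the choice of two components (rather than the 16 reported) are all hypothetical.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (peaks x samples) as W @ H.

    Uses multiplicative updates for the Frobenius-norm objective.
    This algorithm choice is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_peaks, n_samples = X.shape
    W = rng.random((n_peaks, k)) + eps    # peak loadings on each component
    H = rng.random((k, n_samples)) + eps  # component activity in each sample
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

# Toy data (invented for illustration): 6 hypothetical peaks x 4 hypothetical samples.
X = np.array([
    [9.0, 8.0, 0.0, 1.0],
    [8.0, 9.0, 1.0, 0.0],
    [7.0, 8.0, 0.0, 0.0],
    [0.0, 1.0, 9.0, 8.0],
    [1.0, 0.0, 8.0, 9.0],
    [0.0, 0.0, 7.0, 8.0],
])

W, H = nmf(X, k=2)

# Assign each peak to the component on which it loads most strongly.
component = W.argmax(axis=1)

# Reconstruction error (Frobenius norm) as a rough measure of fit.
error = np.linalg.norm(X - W @ H)
print(component, round(float(error), 3))
```

In a real analysis the rows of `X` would be the catalog's peaks and the columns the profiled tissues or samples, and the number of components would be chosen by a model-selection criterion rather than fixed in advance.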

Footnotes

7 A complete list of the GplusE Consortium authors appears at the end of this paper.

[Supplemental material is available for this article.]

Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.277947.123.

9 School of Veterinary Medicine, University College Dublin, Dublin 4, Ireland

Received April 1, 2023. Accepted September 19, 2023.
