Peak identification for ChIP-seq data with no controls
ZHANG, Yanfeng & SU, Bing
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is increasingly used for genome-wide profiling
of transcriptional regulation because it enables dissection of gene regulatory networks. With input as a control, a variety of
statistical methods have been proposed for identifying enriched regions in the genome, i.e., transcription factor binding sites
and chromatin modifications. However, whether peak calling remains reliable when no control is available has not been
systematically evaluated. To address this question, we used a Bayesian framework to assess the effectiveness of peak calling
without controls (PCWC). Using several different types of ChIP-seq data, we demonstrated that PCWC achieves relatively high
accuracy, with a false discovery rate (FDR) below 5%. Compared with previously published methods, e.g., model-based analysis
of ChIP-seq (MACS), PCWC is reliable and yields a lower FDR. Furthermore, to interpret the biological significance of the called
peaks, we combined them with microarray gene expression data, gene ontology annotation and subsequent motif discovery; the
results indicate that PCWC is highly efficient. Additionally, only a small number of peaks were identified in in silico data,
further suggesting a very low FDR for PCWC.
Keywords
ChIP-seq; Bayesian; Peak calling; Gene regulation