Before you use the EDACC BED alignment files for anything, including fragment length estimation and MACS peak/signal calls, you need to make some changes to them. The EDACC BED files artificially extend reads to 200 bp, so make sure you (1) trim reads to 36 bp and (2) filter reads uniformly using a 36 bp mappability track.

(1) Here is my code for trimming and correcting EDACC files (they do not use 0-based starts as required by the BED format):
http://www.broadinstitute.org/~anshul/projects/roadmap/src/autocorrect_adjustReadLen.sh
Use it as follows:
    ./autocorrect_adjustReadLen.sh [EDACCBed.gz file] 36 | gzip -c > newFile.tagAlign.gz

(2) Here is the read filtering code. You will need the following files from this directory:
http://www.broadinstitute.org/~anshul/projects/roadmap/src/

filterUniqueReads: the main utility that does the filtering. It needs the MATLAB run-time library installed (this is free, see below) and requires mappability and sequence tracks for filtering.
The MATLAB run-time library installation instructions are here:
https://code.google.com/p/align2rawsignal/#1._MCR_Installation
After you install it, you also need to set the appropriate paths as explained here:
https://code.google.com/p/align2rawsignal/#2._Setting_paths
The mappability tracks are here (copy over this directory exactly as is and maintain the directory name, i.e. globalmap_k20tok54):
http://www.broadinstitute.org/~anshul/projects/encode/rawdata/umap/encodeHg19Male/globalmap_k20tok54/
The sequence tracks are here (copy over this directory exactly as is):
http://www.broadinstitute.org/~anshul/projects/encode/rawdata/sequence/encodeHg19Male/

mapFilterTagAlignFiles.sh: a wrapper shell script around the filterUniqueReads code. See how filterUniqueReads is called in this script. The script also uses slopBed, which is part of the BedTools suite (https://github.com/arq5x/bedtools2), and bedGraphToBigWig, which is part of the UCSC source tools.
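
To make the trimming in step (1) concrete, here is a minimal Python sketch of the kind of correction involved. It is not the logic of autocorrect_adjustReadLen.sh, which I have not reproduced here. It assumes that the strand is in column 6, that the start is 1-based and fixed by subtracting 1, and that trimming keeps the 36 bp at the 5' end of each read. The function names, the "N" sequence placeholder and the score of 1000 in the tagAlign-style output are illustrative choices.

```python
def trim_read(chrom, start, end, strand, read_len=36, one_based_start=True):
    """Return a 0-based, half-open interval covering the 5' read_len bases."""
    if one_based_start:
        start -= 1  # assumed correction: 1-based start -> 0-based BED start
    if strand == "+":
        # 5' end is the left edge, so keep the first read_len bases
        end = min(end, start + read_len)
    elif strand == "-":
        # 5' end is the right edge, so keep the last read_len bases
        start = max(start, end - read_len)
    else:
        raise ValueError("unexpected strand: %r" % strand)
    return chrom, start, end, strand


def trim_bed_line(line, read_len=36):
    """Convert one extended BED line into a trimmed tagAlign-style line."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[5]
    chrom, start, end, strand = trim_read(chrom, start, end, strand, read_len)
    # Columns: chrom, start, end, sequence placeholder, score, strand
    return "\t".join([chrom, str(start), str(end), "N", "1000", strand])
```

For example, a 200 bp extended read on the minus strand keeps only its rightmost 36 bp, while a plus-strand read keeps its leftmost 36 bp.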
Also, there is a step where we merge multiple replicates/datasets as required and subsample reads to a fixed depth. This uses standard Unix commands such as zcat and shuf.

(3) You then use the cross-correlation code to estimate fragment lengths and QC scores. Use the code here:
https://code.google.com/p/phantompeakqualtools/

(4) The script for MACS2 narrow peak calling and signal generation is here:
http://www.broadinstitute.org/~anshul/softwareRepo/macs2.signal.lsf.submitscript.sh
You should be able to find the specific macs2 commands with their parameters in that file. The strand shift is 1/2 the estimated fragment length.

(5) My ChromHMM scripts are here:
/broad/compbio/anshul/projects/encode/preprocessing/segmentations/chromhmm/scripts
chmm_binarize.sh: will binarize the tagAlign files
chmm_predict.sh: will infer states based on an existing model
chmm_browser.sh: will create browser tracks
denseBedtoBigBed.sh: will create bigBed tracks from the dense.bed state tracks
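
The merge/subsample step and the strand-shift rule from step (4) can be sketched in Python as follows. This is only an illustration of the idea, not the commands used in the actual pipeline (which relies on zcat and shuf). The function names, the fixed random seed, holding all reads in memory, and rounding the shift down to an integer are my assumptions.

```python
import gzip
import random


def read_tagalign(path):
    """Yield the lines of a gzipped tagAlign file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            yield line


def merge_and_subsample(read_sets, depth, seed=0):
    """Pool reads from several replicates and draw `depth` of them at random.

    read_sets is an iterable of iterables of lines (one per replicate).
    If the pool holds no more than `depth` reads, all of them are returned.
    """
    reads = []
    for read_set in read_sets:
        reads.extend(line for line in read_set if line.strip())
    if len(reads) <= depth:
        return reads
    # Sampling without replacement, comparable in spirit to `shuf -n depth`
    return random.Random(seed).sample(reads, depth)


def strand_shift(fragment_length):
    """Half the estimated fragment length (rounded down: an assumption)."""
    return fragment_length // 2
```

A call might look like merge_and_subsample([read_tagalign("rep1.tagAlign.gz"), read_tagalign("rep2.tagAlign.gz")], depth), where the file names and the depth value are placeholders.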