Splitting .bam files by chromosome in a loop

Some bioinformatics tips I shared with my client.

The task is to split .bam files by chromosome (1-22, X, and Y for humans). I referred to this site:

https://www.biostars.org/p/9130/

But what if you have many .bam files in a directory? We had ~40 .bams to start with, so you would definitely want to run the job in a single loop command instead of running the command from that page 40 times by hand.

This is what worked:

#split bams (each input .bam needs to be sorted and indexed so samtools can fetch a region)

for file in *_chr.bam; do filename="${file%%.*}"; for chrom in $(seq 1 22) X Y; do samtools view -bh "$file" "chr${chrom}" > "${filename}_${chrom}.bam"; done; done
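
The split loop asks for regions named chr1 ... chr22, chrX, and chrY, so it only works if the .bam files use that naming. Some references use bare names such as 1 or X instead. As a quick check before a long run, here is a minimal sketch that prints the contig names samtools sees. The function name list_contigs and the file name in the usage line are made up for illustration, and the input is assumed to be indexed already.

```shell
#!/usr/bin/env bash
# Sketch: print the contig names recorded for an indexed BAM.
# list_contigs is a hypothetical helper name, not part of samtools.
list_contigs() {
    # samtools idxstats prints tab-separated rows:
    #   name, length, mapped reads, unmapped reads
    # The final "*" row collects unplaced reads, so it is dropped here.
    samtools idxstats "$1" | cut -f 1 | grep -v '^\*$'
}

# Usage (placeholder file name):
#   list_contigs sample1_chr.bam | head
```

If the names come back without the chr prefix, the region in the loop would need to be "${chrom}" rather than "chr${chrom}".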

#make index files
for file in *_chr_*.bam; do samtools index "$file"; done
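
For anyone who prefers a script over one-liners, the two loops above can be folded into one function. This is only a sketch under a few assumptions: contigs are named chr1 to chr22, chrX, and chrY, the function name split_bam_by_chrom is made up, and the output prefix is taken by stripping the trailing .bam (slightly different from the cut on "." above if a file name contains extra dots). It also builds the input index if none is found, since region queries need one.

```shell
#!/usr/bin/env bash
# Sketch: split one BAM into per-chromosome BAMs and index each piece.
# split_bam_by_chrom is a hypothetical helper name.
split_bam_by_chrom() {
    local bam="$1"
    local prefix="${bam%.bam}"   # strip the trailing .bam
    local chrom out

    # Region queries need an index next to the input file.
    if [ ! -e "${bam}.bai" ] && [ ! -e "${prefix}.bai" ]; then
        samtools index "$bam" || return 1
    fi

    for chrom in $(seq 1 22) X Y; do
        out="${prefix}_${chrom}.bam"
        # -b writes BAM, -h keeps the header; stop at the first failure
        samtools view -b -h "$bam" "chr${chrom}" > "$out" || return 1
        samtools index "$out" || return 1
    done
}

# Usage, mirroring the loop above:
#   for bam in *_chr.bam; do split_bam_by_chrom "$bam"; done
```

Stopping at the first failed samtools call makes it easier to notice a problem early in an overnight run instead of finding a directory of broken files in the morning.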

The result: 24 .bam files + 24 .bai files for each of ~40 samples, so about 1,500 files were created overnight! :)
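
With that many files it is easy to miss one that failed. Here is a minimal sketch for checking that every expected output exists and is not empty. It assumes the naming used above (sample_chr_1.bam plus sample_chr_1.bam.bai, and so on), and the function name check_split_outputs is made up. It only looks at file presence and size, not at whether the contents are valid.

```shell
#!/usr/bin/env bash
# Sketch: report per-chromosome BAMs or indexes that are missing or empty.
# check_split_outputs is a hypothetical helper name.
check_split_outputs() {
    local missing=0 bam prefix chrom f
    for bam in *_chr.bam; do
        [ -e "$bam" ] || continue   # the glob matched nothing
        prefix="${bam%.bam}"
        for chrom in $(seq 1 22) X Y; do
            for f in "${prefix}_${chrom}.bam" "${prefix}_${chrom}.bam.bai"; do
                # -s is true only for files that exist and are non-empty
                if [ ! -s "$f" ]; then
                    echo "missing or empty: $f"
                    missing=$((missing + 1))
                fi
            done
        done
    done
    echo "total missing: $missing"
    [ "$missing" -eq 0 ]   # non-zero exit status if anything is missing
}

# Usage: run in the directory that holds the .bam files
#   check_split_outputs
```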

-Yuka