@lakesea
2022-03-08T01:45:18.000000Z
字数 3257
阅读 237
pipelines
In this pipeline, I will describe and introduce some based steps to call gene PAVs.
gff to bed and grep those regions with gene feature
module load bedops
gff2bed <MH63_chr_v2.gene.gff >MH63_chr_v2.gene.bed
Running Mosdepth
mosdepth -n -t 6 --thresholds 1 -b MH63_chr_v2.gene.bed B009_test B009.sorted.bam
###help
-n --no-per-base dont output per-base depth. skipping this output will speed execution
-t --threads <threads> number of BAM decompression threads [default: 0]
-b --by <bed|window> optional BED file or (integer) window-sizes.
other help:
-F --flag <FLAG> exclude reads with any of the bits in FLAG set [default: 1796]
#Sam file flag: 1796
#read unmapped (0x4)
#not primary alignment (0x100)
#read fails platform/vendor quality checks (0x200)
#read is PCR or optical duplicate (0x400)
-q --quantize <segments> write quantized output see docs for description.
-Q --mapq <mapq> mapping quality threshold [default: 0]
-T --thresholds <thresholds> for each interval in --by, write number of bases covered by at
least threshold bases. Specify multiple integer values separated
by ','.
-h --help show help
Look at the results
zcat B009_test.regions.bed.gz |less
Chr01 262169 265985 MH01g0015200 20.14
Chr01 266366 266897 MH01g0015300 0.00
Chr01 272359 275777 MH01g0015400 16.74
Chr01 274349 276712 MH01g0015500 20.18
Chr01 280060 281122 MH01g0015600 20.07
Chr01 284353 284998 MH01g0015700 18.16
Chr01 285481 286360 MH01g0015800 34.02
Chr01 288457 324059 MH01g0015900 18.84
notes:
zcat IRIS_313-8392_gene_pav.regions.bed.gz |awk '{if ($5>= 1) print $6=1; else if ($5 <1) print $6=0;}'
for i in `cat selected.lines.list`;do zcat ${i}.remark.bam_pav.thresholds.bed.gz |grep -v "#"|awk '{$6 = $5/($3 - $2)}1'| awk '{if ($6>= 0.6) print $7=1; else if ($6 <0.6) print $7=0;}' >> ${i}_pav_dec.txt;done
install
git clone --recurse-submodules https://github.com/fbreitwieser/bamcov
cd bamcov
make
make test
Running
$ ./bamcov/bamcov -m B009.sorted.bam
Chr01 (41.19Mbp)
> 88.72% │ ▄████▄▄██████ ██▄██ █▄▄████████████████████▄▄│ Number of reads: 8894694
> 78.87% │███████████████ █ █████ ▄█████████████████████████│ (165808 filtered)
> 69.01% │███████████████ ██████████████████████████████████│ Covered bases: 39.10Mbp
> 59.15% │███████████████▄██████████████████████████████████│ Percent covered: 94.94%
> 49.29% │██████████████████████████████████████████████████│ Average coverage: 17.7x
> 39.43% │██████████████████████████████████████████████████│ Average baseQ: 36
> 29.57% │██████████████████████████████████████████████████│ Average mapQ: 49.1
> 19.72% │██████████████████████████████████████████████████│
> 9.86% │██████████████████████████████████████████████████│ Histo bin width: 823.7Kbp
> 0.00% │██████████████████████████████████████████████████│ Histo max bin: 98.582%
1 8.24M 16.47M 24.71M 32.95M 41.19M
$ ./bamcov/bamcov -mr Chr01:69650-72886 B009.sorted.bam
Chr01 (41.19Mbp)
> 90.00% │ ███ ██ █ █ █ █ █ │ Number of reads: 34
> 80.00% │ ████ ██ █ █ █ █ █ │
> 70.00% │ █████ ██ ▄ █ █ █ █ █ │ Covered bases: 1.1Kbp
> 60.00% │ █████ ██▄ █ █ █ █ █ █ │ Percent covered: 34.72%
> 50.00% │ █████ ███ █▄ █ █ ▄█ █▄ █▄│ Average coverage: 0.819x
> 40.00% │ █████ ███ ██ █ █ ██ ██ ██│ Average baseQ: 34.5
> 30.00% │█ █████ ███ ██ █ █ ██ ██ ██│ Average mapQ: 3.12
> 20.00% │█ █████ ▄███ ██ █ █ ██ ██▄██│
> 10.00% │█ █████ ████ ██ ▄██ ██ ██ █████│ Histo bin width: 64bp
> 0.00% │█ █████ ████ ██ ███ ███ ██▄▄█████│ Histo max bin: 100%
69.7K 70.3K 70.9K 71.6K 72.2K 72.9K
bedtools genomecov -ibam B009.sorted.bam -bg > bedtools_genomcov.out
###results
Chr01 99 100 3
Chr01 100 104 17
Chr01 104 106 18
Chr01 106 107 19
Chr01 107 108 20
Chr01 108 109 21
Chr01 109 110 22
Chr01 110 112 23
Chr01 112 120 24
Chr01 120 124 25