|
I have a set of illumina paired end reads (rone.fq and rtwo.fq), and I have mapped those reads to a reference (ref.fa).
Now I am trying to get SNP and INDEL information using the following script
I get the following VCF (this is just one line of the VCF I actually get)
I am assuming that samtools thinks this is a SNP, because the REF=A and ALT=C,X. My question is: How can I know the coverage depth of this SNP given this VCF? If this VCF does not have information enough to show the SNP coverage depth then how can I creat a VCF that shows that information? Thanks |
|
right from the VCF spec:
Samtools depth reports raw coverage. In the VCF file DP reports the coverage of all reads at that loci (that pass the quality filter). So 'Samtools depth' gives a different statistics than the DP field. There is also information about the depth on each strand supporting the ref / alt. IT is worth your time to carefully read the documentation for all the Samtools steps I dont quite understand the DP value. For example if DP=14, does that mean that 14 reads support that polymorphism? If so 14 out of how many? I am confused because in position where samtools does not find SNPs the DP value would just mean the read coverage of that position. Thanks
(21 May '12, 17:13)
Customer
Nope DP=14 means you have 14 reads that cover gi|110227054. The way you used samtools it will only report information on those loci that have a suggested mutation. use 'Samtools depth' if you want to know the depth for each loci
(21 May '12, 17:13)
SupportRep
SO I ran "samtools depth aln_sorted.bam>aln_depth" and the info I get about position 32 is "gi|110227054|gb|AE004091.2| 32 33". So from what you are saying I get that the coverage of that base is 32X and the reads that support the SNP is 14? Thanks so much again
(21 May '12, 17:13)
Customer
see my edits above.
(21 May '12, 17:13)
SupportRep
Ok I get it. Why do you say it is an insertion? I thought that was a substitution (because of the comma)
(21 May '12, 17:13)
Customer
|