vcf(5)                      Bioinformatics formats                      vcf(5)

NAME

       vcf - Variant Call Format

DESCRIPTION

       The Variant Call Format (VCF) is a TAB-delimited format with each data
       line consisting of the following fields:

        1    CHROM    CHROMosome name
        2    POS      the left-most POSition of the variant (1-based)
        3    ID       unique variant IDentifier
        4    REF      the REFerence allele
        5    ALT      the ALTernate allele(s) (comma-separated)
        6    QUAL     variant/reference QUALity
        7    FILTER   FILTERs applied
        8    INFO     INFOrmation related to the variant (semicolon-separated)
        9    FORMAT   FORMAT of the genotype fields (optional; colon-separated)
       10+   SAMPLE   SAMPLE genotypes and per-sample information (optional)

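       The field layout above can be illustrated with a minimal Python sketch
       that splits one data line into its columns. The names VcfRecord and
       parse_data_line and the sample line are hypothetical and are not part
       of samtools, bcftools or htslib; header lines and missing values are
       not handled.

```python
from typing import List, NamedTuple, Optional


class VcfRecord(NamedTuple):
    """One VCF data line, split into the fields listed above (illustrative)."""
    chrom: str
    pos: int                 # left-most position of the variant
    variant_id: str
    ref: str
    alt: List[str]           # ALT alleles are comma-separated
    qual: str                # kept as text; may be a number or a placeholder
    filters: str
    info: str                # semicolon-separated, left unparsed here
    fmt: Optional[str]       # colon-separated FORMAT keys, if present
    samples: List[str]       # one entry per sample column, if present


def parse_data_line(line: str) -> VcfRecord:
    """Split a TAB-delimited data line into its fixed and optional fields."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 TAB-delimited fields")
    return VcfRecord(
        chrom=cols[0],
        pos=int(cols[1]),
        variant_id=cols[2],
        ref=cols[3],
        alt=cols[4].split(","),
        qual=cols[5],
        filters=cols[6],
        info=cols[7],
        fmt=cols[8] if len(cols) > 8 else None,
        samples=cols[9:],
    )


# Made-up data line with two sample columns, for illustration only.
example = "20\t14370\trs1\tG\tA,T\t29\tPASS\tDP=14;AF1=0.5\tGT:DP\t0/1:8\t1/2:6\n"
record = parse_data_line(example)
print(record.chrom, record.pos, record.alt, record.samples)
```

       Fields 9 and up are optional, so the sketch returns None for FORMAT
       and an empty sample list when a line has only the eight fixed fields.
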
       The following table gives the INFO tags used by samtools and bcftools.

       AF1    Max-likelihood estimate of the site allele frequency (AF) of the
              first ALT allele (double)

       DP     Raw read depth (without quality filtering) (int)

       DP4    Number of high-quality reference forward bases, reference
              reverse bases, alternate forward bases and alternate reverse
              bases (int[4])

       FQ     Consensus quality. Positive: sample genotypes different;
              negative: otherwise (int)

       MQ     Root-Mean-Square mapping quality of covering reads (int)

       PC2    Phred probability of AF in group1 samples being larger
              (respectively smaller) than in group2 (int[2])

       PCHI2  Posterior weighted chi^2 P-value between group1 and group2
              samples (double)

       PV4    P-values for strand bias, baseQ bias, mapQ bias and tail
              distance bias (double[4])

       QCHI2  Phred-scaled PCHI2 (int)

       RP     Number of permutations yielding a smaller PCHI2 (int)

       CLR    Phred log ratio of genotype likelihoods with and without the
              trio/pair constraint (int)

       UGT    Most probable genotype configuration without the trio constraint
              (string)

       CGT    Most probable configuration with the trio constraint (string)

       VDB    Tests variant positions within reads. Intended for filtering
              RNA-seq artifacts around splice sites (float)

       RPB    Mann-Whitney rank-sum test for tail distance bias (float)

       HWE    Hardy-Weinberg equilibrium test (Wigginton et al.) (float)

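       As an illustration of the tag table above, the following Python sketch
       decodes an INFO column into typed values. It assumes the usual
       KEY=VALUE entry syntax with comma-separated lists, a bare key for a
       flag, and "." for an empty column, as described in the full
       specification. INFO_TYPES and parse_info are hypothetical names, the
       types are copied from this page, and the header of a real file remains
       the authority on each tag's type.

```python
from typing import Callable, Dict

# Element type of each tag, as listed in the table above.
INFO_TYPES: Dict[str, Callable[[str], object]] = {
    "AF1": float, "DP": int, "DP4": int, "FQ": int, "MQ": int,
    "PC2": int, "PCHI2": float, "PV4": float, "QCHI2": int, "RP": int,
    "CLR": int, "UGT": str, "CGT": str, "VDB": float, "RPB": float,
    "HWE": float,
}


def _convert(cast: Callable[[str], object], text: str) -> object:
    """Apply the cast; keep the raw text if the value does not fit the type."""
    try:
        return cast(text)
    except ValueError:
        return text


def parse_info(info: str) -> Dict[str, object]:
    """Decode a semicolon-separated INFO column into a dictionary."""
    result: Dict[str, object] = {}
    if info in ("", "."):
        return result
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, raw = entry.partition("=")
        if not sep:
            result[key] = True          # bare key: treated as a flag
            continue
        cast = INFO_TYPES.get(key, str)  # unknown tags stay as text
        values = [_convert(cast, v) for v in raw.split(",")]
        result[key] = values[0] if len(values) == 1 else values
    return result


# Made-up INFO column, for illustration only.
parsed = parse_info("DP=14;AF1=0.5;DP4=3,4,5,2;MQ=40;FQ=-30")
ref_fwd, ref_rev, alt_fwd, alt_rev = parsed["DP4"]
print(parsed["DP"], parsed["AF1"], alt_fwd + alt_rev)
```

       Multi-valued tags such as DP4 and PV4 come back as lists, while
       single-valued tags are returned as scalars.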

SEE ALSO

       https://github.com/samtools/hts-specs
              The full VCF/BCF file format specification

       A note on exact tests of Hardy-Weinberg equilibrium
              Wigginton JE et al. PMID:15789306


htslib                            August 2013                           vcf(5)