recipe vcfsamplecompare

This script sorts and (optionally) filters the rows/variants of a VCF file (containing data for 2 or more samples) based on the differences in the variant data between samples or sample groups. The degree of “difference” is determined either by the best possible degree of separation of sample groups by genotype calls, or by the difference in average allelic frequency between samples or sample groups (with a gap size threshold). The pair of samples or sample groups used to represent the difference for a variant row is the one with the greatest difference in consistent genotype or in average allelic frequency (i.e. observation ratio, e.g. AO/DP) of the same variant state. If sample groups are not specified, the pair of samples with the greatest difference is greedily discovered and chosen to represent the variant/row.
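The allelic-frequency mode described above can be illustrated with a short Python sketch. This is not the tool's implementation (vcfsamplecompare is a Perl script, and its exact scoring is not reproduced here). The function names, the `rows` structure, and the `min_gap` parameter are hypothetical, and the sketch uses an exhaustive pairwise search over single samples instead of the greedy discovery the tool performs.

```python
from itertools import combinations


def observation_ratio(ao, dp):
    """Allelic frequency as an observation ratio AO/DP; None when there is no coverage."""
    return ao / dp if dp > 0 else None


def best_pair(row):
    """Return (difference, sample_a, sample_b) for the pair with the largest AO/DP gap.

    `row` maps a sample name to an (AO, DP) tuple. This structure is an
    assumption made for illustration, not the tool's internal format.
    Returns None when fewer than two samples have usable coverage.
    """
    usable = {}
    for sample, (ao, dp) in row.items():
        ratio = observation_ratio(ao, dp)
        if ratio is not None:
            usable[sample] = ratio

    best = None
    # Exhaustive search over sample pairs (the real tool searches greedily).
    for a, b in combinations(sorted(usable), 2):
        diff = abs(usable[a] - usable[b])
        if best is None or diff > best[0]:
            best = (diff, a, b)
    return best


def rank_rows(rows, min_gap=0.0):
    """Sort variant rows by their best pairwise difference, largest first.

    Rows whose best difference is below `min_gap` (a hypothetical stand-in
    for the gap size threshold) are filtered out.
    """
    scored = []
    for variant_id, row in rows.items():
        pair = best_pair(row)
        if pair is not None and pair[0] >= min_gap:
            scored.append((variant_id, *pair))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Made-up example data: variant -> sample -> (AO, DP)
rows = {
    "chr1:100": {"s1": (10, 10), "s2": (0, 10), "s3": (5, 10)},
    "chr1:200": {"s1": (5, 10), "s2": (6, 10), "s3": (5, 10)},
    "chr1:300": {"s1": (2, 8), "s2": (6, 8), "s3": (0, 0)},
}
ranked = rank_rows(rows, min_gap=0.25)
for variant_id, diff, a, b in ranked:
    print(variant_id, diff, a, b)
```

With this made-up data, the row at chr1:200 is filtered out because its samples differ by less than the threshold, and the remaining rows are ordered by their largest pairwise difference.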







doi: 10.5281/zenodo.1463080

package vcfsamplecompare



Versions v2.008-0, v2.006-0

Depends perl




With an activated Bioconda channel (see 2. Set up channels), install with:

conda install vcfsamplecompare

and update with:

conda update vcfsamplecompare

or use the docker container:

docker pull <image>:<tag>

(see vcfsamplecompare/tags for valid values for <tag>)