D: 28 Oct 2018

Order of operations

  • If --zst-decompress present, decompress file to stdout and QUIT
  • Load additional commands from --script
  • Apply --rerun
  • If --help present, print requested help entries and QUIT
  • If --version present, print version and QUIT
  • Apply --silent
  • Apply --out, start logging
  • Define chromosome set (--chr-set, --cow...; human if unspecified)
  • Parse remaining command-line flags in lexicographic order
  • Note chromosome filter (--chr, --not-chr, --autosome, --autosome-par)
  • Handle nonstandard input:
    • If --pgen-info with no .pvar file provided, scan header, print basic info, and QUIT
    • Convert VCF (--vcf), Oxford dosage (--bgen, --data/--gen), Oxford haplotype (--haps), or PLINK 1 dosage (--import-dosage) data to PLINK 2 binary, then QUIT if no other commands
    • Generate random dataset (--dummy)
  • Read main sample-info file, if necessary
  • Read main variant-info file, if necessary:
    • Exclude variants with multi-character allele codes (--snps-only)
    • Apply #-of-distinct-alleles filters (--min-alleles, --max-alleles)
    • Apply QUAL/FILTER/INFO variant filters (--var-min-qual, --var-filter, --extract-if-info, --exclude-if-info, --require-info, --require-no-info)
    • Assign chromosome-and-position-based names to variants (--set-all-var-ids, --set-missing-var-ids)
    • Split or merge pseudoautosomal region (--split-par, --merge-par, --merge-x)
  • Read main genotype file's header (--{b}pfile, --bfile, or freshly autoconverted)
  • Transpose PLINK 1 sample-major .bed, if necessary
  • Validate genotype file (--validate), then QUIT if no other commands
  • Print basic information about .pgen (--pgen-info), then QUIT if no other commands
  • Load/create additional phenotypes (--pheno, --within, --family)
  • Select single variant range by ID (--from, --to, --snp, --exclude-snp, --window, --from-bp...)
  • Select multiple variant ranges by ID (--snps, --exclude-snps)
  • Update variant information (--update-map, --update-name)
  • Update allele information (--update-alleles)
  • Extract/exclude variants by ID list(s) (--extract, --exclude)
  • Deduplicate variants (--rm-dup)
  • Filter variants by position (--from-bp, --to-bp, --extract ibed0, ...)
  • Random thinning of variant set (--thin, --thin-count)
  • Update sample information (--update-sex)
  • Keep/remove samples by ID list(s) (--keep, --remove, --keep-fam, --remove-fam)
  • Filter samples on a covariate (--keep-fcol)
  • Filter samples based on phenotype existence (--require-pheno)
  • Filter based on sex and/or founder status (--keep-males...)
  • Random thinning of sample set (--thin-indiv, --thin-indiv-count)
  • Calculate per-sample genotyping rate, remove samples below threshold (--mind)
  • Load covariates (--covar)
  • Filter samples based on covariate existence (--require-covar)
  • Filter samples based on phenotype/covariate conditions (--keep-if, --remove-if, --keep-cats, ...)
  • Report remaining sample/sex/founder counts
  • Split categorical phenotypes/covariates (--split-cat-pheno)
  • Quantile-normalize phenotypes/covariates (--quantile-normalize, --pheno-quantile-normalize, --covar-quantile-normalize)
  • Variance-standardize phenotypes/covariates (--variance-standardize, --covar-variance-standardize)
  • Loop over categories (--loop-cats)
  • Write sample IDs (--write-samples), then QUIT (or advance to next --loop-cats category) if no other commands
  • Main variant filters:
    • Calculate needed allele/genotype frequencies
    • Report overall genotyping rate (--genotyping-rate)
    • Load allele frequencies (--read-freq)
    • Write allele/genotype frequencies to file (--freq, --geno-counts), then QUIT if no other commands
    • Generate missing data reports (--missing), then QUIT if no other commands
    • Remove variants below genotyping rate threshold (--geno)
    • Hardy-Weinberg equilibrium report and/or exact test (--hardy, --hwe), then QUIT if no other commands
    • Apply minor allele frequency and count filters (--maf, --max-maf, --mac, --max-mac)
    • Apply imputation-quality filter (--mach-r2-filter)
    • Enforce minimum spacing (--bp-space)
    • Report remaining variant count
  • Report sample-pair discordances (--sample-diff)
  • Calculate kinship matrix, if necessary
  • Perform kinship-based pruning of samples (--king-cutoff)
  • Write kinship matrix/table to disk (--make-king, --make-king-table)
  • Calculate variance-standardized relationship matrix, if necessary
  • Write relationship matrix to disk (--make-rel, --make-grm-list, --make-grm-bin)
  • Extract principal components (--pca)
  • Write .snplist file (--write-snplist)
  • Change REF/ALT1 alleles (--maj-ref, --ref-allele, --alt1-allele, --ref-from-fa)
  • Left-normalize alleles (--normalize)
  • Write .cov file (--write-covar; also induced by --export)
  • Write PLINK 1 or 2 binary fileset (--make-{b}pgen, --make-bed, --make-just-pvar, --make-just-psam, --make-just-bim, --make-just-fam)
  • Export genotype data to other formats (--export)
  • Perform LD-based pruning (--indep-pairwise)
  • Report LD statistics (--ld)
  • Apply linear scoring system to each sample (--score)
  • Multi-covariate association test (--glm)
  • If --loop-cats, select next category and jump back to "Loop over categories" step, if any categories left
  • Definitely QUIT
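
One practical consequence of the list above is that the position of a flag on the command line does not decide when it takes effect; its position in the list does. For example, --mind removes samples before allele frequencies are calculated, so --geno and --maf act on the remaining samples even if they are written first. The following Python sketch illustrates this with a toy model. The names PIPELINE_ORDER and execution_order are hypothetical, the table is a hand-picked subset of the steps above, and none of it reflects how PLINK is implemented internally.

```python
# Toy model of the fixed order of operations. PIPELINE_ORDER and
# execution_order are hypothetical names invented for this illustration;
# the entries are a hand-picked subset of the steps in the list above,
# kept in the same relative order.
PIPELINE_ORDER = [
    "--snps-only",  # applied while reading the main variant-info file
    "--extract",    # extract/exclude variants by ID list(s)
    "--keep",       # keep/remove samples by ID list(s)
    "--mind",       # remove samples below genotyping-rate threshold
    "--geno",       # remove variants below genotyping-rate threshold
    "--hwe",        # Hardy-Weinberg exact test
    "--maf",        # minor allele frequency filter
    "--make-bed",   # write PLINK 1 binary fileset
]


def execution_order(flags):
    """Return the given flags sorted by their position in PIPELINE_ORDER."""
    rank = {flag: i for i, flag in enumerate(PIPELINE_ORDER)}
    unknown = [f for f in flags if f not in rank]
    if unknown:
        raise ValueError(f"flags not covered by this toy model: {unknown}")
    return sorted(flags, key=rank.__getitem__)


if __name__ == "__main__":
    # The order given by the user is irrelevant to the outcome.
    print(execution_order(["--maf", "--make-bed", "--mind", "--geno"]))
    # prints ['--mind', '--geno', '--maf', '--make-bed']
```

If a particular analysis needs a different order, for instance a --maf filter computed before --mind removes samples, the usual workaround is to split the work into two runs, writing an intermediate fileset with --make-bed or --make-{b}pgen in the first.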
