D: 9 Jan 2023
Order of operations
If --zst-decompress present, decompress file to stdout and QUIT
Load additional commands from --script
Apply --rerun
If --help present, print requested help entries and QUIT
If --version present, print version and QUIT
Apply --silent
Apply --out, start logging
Define chromosome set (--chr-set, --cow ...; human if unspecified)
Parse remaining command line flags in lexicographic order
Note chromosome filter (--chr, --not-chr, --autosome, --autosome-par)
Handle nonstandard input:
If --pgen-info with no .pvar file provided, scan header, print basic info, and QUIT
Convert VCF (--vcf), BCF (--bcf), Oxford dosage (--bgen, --data/--gen), Oxford haplotype (--haps), or PLINK 1 dosage (--import-dosage) data to PLINK 2 binary, then QUIT if no other commands
Generate random dataset (--dummy)
Merge filesets (--pmerge, --pmerge-list)
Read main sample-info file, if necessary
Read main variant-info file, if necessary:
Exclude variants with multi-character allele codes (--snps-only)
Apply #-of-distinct-alleles filters (--min-alleles, --max-alleles)
Apply QUAL/FILTER/INFO variant filters (--var-min-qual, --var-filter, --extract-if-info, --exclude-if-info, --require-info, --require-no-info)
Assign chromosome-and-position-based names to variants (--set-all-var-ids, --set-missing-var-ids)
Split or merge pseudoautosomal region (--split-par, --merge-par, --merge-x)
Determine chromosome-length lower bounds for exported VCF/BCF headers, if necessary
Read main genotype file's header (--[b]pfile, --bfile, or freshly autoconverted)
Transpose PLINK 1 sample-major .bed, if necessary
Validate genotype file (--validate), then QUIT if no other commands
Print basic information about .pgen (--pgen-info), then QUIT if no other commands
Load/create additional phenotypes (--pheno, --within, --family)
Ignore phenotypes (--not-pheno)
Select single variant range by ID (--from, --to, --snp, --exclude-snp, --window, --from-bp ...)
Select multiple variant ranges by ID (--snps, --exclude-snps)
Update variant information (--recover-var-ids, --update-map, --update-name)
Update allele information (--update-alleles)
Extract/exclude variants by ID list(s) (--extract, --exclude, --extract-intersect)
Extract variants based on text column string/substring match or range condition (--extract-fcol)
Deduplicate variants (--rm-dup)
Filter variants by position (--from-bp, --to-bp, --extract bed0, ...)
Random thinning of variant set (--thin, --thin-count)
Update sample information (--update-ids, --update-parents, --update-sex)
Keep/remove samples by ID or ID list(s) (--keep, --remove, --keep-fam, --remove-fam, --indv)
Filter samples based on text column string match (--keep-fcol)
Filter samples based on phenotype existence (--require-pheno)
Filter based on sex and/or founder status (--keep-males ...)
Random thinning of sample set (--thin-indiv, --thin-indiv-count)
Calculate per-sample genotyping rate, remove samples below threshold (--mind)
Load covariates (--covar)
Ignore covariates (--not-covar)
Filter samples based on covariate existence (--require-covar)
Filter samples based on phenotype/covariate conditions (--keep-if, --remove-if, --keep-cats, ...)
Report remaining sample/sex/founder counts
Split categorical phenotypes/covariates (--split-cat-pheno)
Quantile-normalize phenotypes/covariates (--quantile-normalize, --pheno-quantile-normalize, --covar-quantile-normalize)
Variance-standardize phenotypes/covariates (--variance-standardize, --covar-variance-standardize)
Loop over categories (--loop-cats)
Write sample IDs (--write-samples), then QUIT (or advance to next --loop-cats category) if no other commands
Main variant filters:
Load allele frequencies (--read-freq)
Calculate needed allele/genotype frequencies
Report overall genotyping rate (--genotyping-rate)
Write allele/genotype frequencies to file (--freq, --geno-counts), then QUIT if no other commands
Generate missing data reports (--missing), then QUIT if no other commands
Remove variants below genotyping rate threshold (--geno)
Hardy-Weinberg equilibrium report and/or exact test (--hardy, --hwe), then QUIT if no other commands
Apply minor allele frequency and count filters (--maf, --max-maf, --mac, --max-mac)
Apply imputation-quality filter (--mach-r2-filter)
Enforce minimum spacing (--bp-space)
Report remaining variant count
Report sample variant-counts by type (--sample-counts)
Report sample-pair discordances (--sample-diff)
Calculate kinship matrix, if necessary
Perform kinship-based pruning of samples (--king-cutoff)
Write kinship matrix/table to disk (--make-king, --make-king-table)
Calculate variance-standardized relationship matrix, if necessary
Write relationship matrix to disk (--make-rel, --make-grm-list, --make-grm-bin)
Extract principal components (--pca)
Write .snplist file (--write-snplist)
Change REF/ALT1 alleles (--maj-ref, --ref-allele, --alt1-allele, --ref-from-fa)
Left-normalize alleles (--normalize)
Write .cov file (--write-covar; also induced by --export)
Write PLINK 1 or 2 binary fileset (--make-[b]pgen, --make-bed, --make-just-pvar, --make-just-psam, --make-just-bim, --make-just-fam)
Export genotype data to other formats (--export)
Compare two filesets (--pgen-diff)
Perform LD-based pruning (--indep-pairwise)
Report LD statistics (--ld)
F inbreeding coefficient report (--het)
FST fixation index report (--fst)
Apply linear scoring system(s) to each sample (--score)
Apply linear scoring system(s) to each variant (--variant-score)
Multi-covariate association test (--glm)
If --loop-cats, select next category and jump back to "Loop over categories" step, if any categories left
Definitely QUIT
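A practical consequence of the list above is that operations run in the listed order, not in the order the flags appear on the command line. The Python sketch below illustrates this with a small, deliberately incomplete subset of the steps. The names STEP_ORDER and processing_order are hypothetical helpers for this illustration only; they are not part of PLINK, and the table should be extended from the full list before relying on it.

```python
# Illustrative subset of the order-of-operations list, earliest step first.
# Assumption: only a few steps are modeled; every flag shown appears in the list above.
STEP_ORDER = [
    ("Keep/remove samples by ID or ID list(s)", ["--keep", "--remove"]),
    ("Per-sample genotyping rate filter", ["--mind"]),
    ("Variant genotyping rate filter", ["--geno"]),
    ("Hardy-Weinberg report and/or exact test", ["--hardy", "--hwe"]),
    ("Allele frequency and count filters", ["--maf", "--max-maf", "--mac", "--max-mac"]),
    ("Kinship-based pruning of samples", ["--king-cutoff"]),
    ("Write PLINK 1 or 2 binary fileset", ["--make-pgen", "--make-bed"]),
    ("LD-based pruning", ["--indep-pairwise"]),
    ("Multi-covariate association test", ["--glm"]),
]

# Map each flag to the position of its step in the list.
RANK = {flag: i for i, (_, flags) in enumerate(STEP_ORDER) for flag in flags}


def processing_order(flags):
    """Return flags sorted by the step at which they take effect.

    Flags belonging to the same step keep their relative input order,
    because sorted() is stable.
    """
    unknown = [f for f in flags if f not in RANK]
    if unknown:
        raise KeyError(f"flags not in this illustrative table: {unknown}")
    return sorted(flags, key=RANK.__getitem__)


if __name__ == "__main__":
    # Command-line order does not matter; the step order does.
    print(processing_order(["--glm", "--maf", "--mind", "--geno", "--keep"]))
```

In this sketch, --mind sorts before --geno and --maf, matching the list: sample-level missingness filtering happens before the main variant filters, whichever flag is typed first.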