D: 3 Dec 2024

Credits

PLINK 2.0 alpha was developed by Christopher Chang, with support from Human Longevity, Inc. in 2016-17, and substantial input from Stanford's Department of Biomedical Data Science.

  • The .pgen file format is primarily an adaptation of ideas from SNPack (Francisco Sambo, Barbara Di Camillo, Gianna Toffolo, Claudio Cobelli) to more general contexts.
  • The KING-robust kinship estimation method was developed by Ani Manichaikul et al. at the University of Virginia.
  • The extended chrX Hardy-Weinberg equilibrium test was developed by Jan Graffelman (Universitat Politècnica de Catalunya) and Bruce Weir (University of Washington).
  • Christoph Lippert (Human Longevity), Manuel Rivas (Stanford), and Yosuke Tanigawa (Stanford) contributed significantly to development and testing of the Python and R file I/O libraries.
  • Firth logistic regression was added due to discussion with Manuel Rivas. The implementation is ported from Georg Heinze's logistf R package.
  • The Zstandard compression format and libraries used by PLINK 2.0 were developed by Yann Collet and Przemysław Skibiński, with support from Facebook, Inc.
  • The fast PCA approximation is ported from Kevin Galinsky's contribution to EIGENSOFT 6.
  • The array-popcount improvement is based on "Faster Population Counts Using AVX2 Instructions" by Wojciech Muła, Nathan Kurz, and Daniel Lemire, along with Kim Walisch's libpopcnt implementation.
  • The bgzip-decompression improvement is based on libdeflate (Eric Biggers) and on htslib 1.8 integration work by James Bonfield.
  • Thanks to numerous other testers at Human Longevity, Stanford, and elsewhere for bug reports and helpful suggestions.
