Introduction, downloads

S: 11 Dec 2023 (b7.2)

D: 11 Dec 2023


PLINK 1.9 is developed, tested, and documented primarily by Christopher Chang at GRAIL, Inc., Carson Chow and Shashaank Vattikuti at the NIH-NIDDK's Laboratory of Biological Modeling, Laurent Tellier at the BGI Cognitive Genomics Lab, and James Lee at the University of Minnesota, with additional funding from the Purcell Lab at Brigham & Women's Hospital.

  • All previous versions of PLINK are the work of Shaun Purcell at Brigham & Women's Hospital and Harvard University. Since our update started as an independent project, its level of compatibility with PLINK 1.07 would have been all but impossible to achieve if PLINK were not a free and open-source program.
  • GCTA is the work of Jian Yang et al. at the University of Queensland. We also greatly appreciate their release of the GCTA 1.2 source code under GPLv3 terms.
  • Thanks to Stephen Hsu at the BGI-CGL for motivating the initial weighted distance calculation.
  • Thanks to Sanja Franić at VU University Amsterdam for early testing.
  • Thanks to Mike Keehan for additional testing and a bugfix.
  • Thanks to Masahiro Kanai for improving the robustness of the VCF parser, fixing some other plink_data.c bugs, and adding some filtering flags.
  • The SSE2 population count algorithm used in many of PLINK 1.9's inner loops is based on work and discussion by Andrew Dalke, Robert Harley, Cédric Lauradoux, Terje Mathisen, and Kim Walisch.
  • The Hardy-Weinberg equilibrium and Fisher exact tests are based on an algorithm developed by Jan Wigginton and Gonçalo Abecasis at the University of Michigan Center for Statistical Genetics.
  • The Hardy-Weinberg equilibrium test 'midp' option was added due to work by Jan Graffelman and Victor Moreno.
  • The parallel gzip implementation was developed by Mark Adler at the Caltech/NASA Jet Propulsion Laboratory.
  • The BGZF library was developed by Bob Handsaker, Petr Danecek, Heng Li, and John Marshall.
  • PLINK 1.9's permutation procedures extend work by Brian Browning (PRESTO) and Roman Pahl (PERMORY).
  • PLINK 1.9's fast epistasis test implements methods developed by Xiang Wan et al. in BOOST and Masao Ueki, Heather Cordell, and Richard Howey in CASSI.
  • The logistic regression algorithm is based on the winning submission of Pascal Pons in the GWAS Speedup crowdsourcing contest run in April 2013 by Babbage Analytics & Innovation and TopCoder, who have donated the results to be used in PLINK 2. The contest was designed by Po-Ru Loh; subsequent analysis and code preparation were performed by Andrew Hill, Ragu Bharadwaj, and Scott Jelinsky. A manuscript is in preparation by these authors and Iain Kilty, Kevin Boudreau, Karim Lakhani and Eva Guinan.
  • Thanks to David Fischer for GitHub hygiene improvements.
  • Thanks to numerous PLINK 1.9 alpha testers for bug reports and helpful suggestions.
