Introduction, downloads

D: 30 Apr 2017

Recent version history

What's new?

Coming next

General usage

Citation instructions

Standard data input

PLINK 1 binary (.bed)

PLINK 2 binary (.pgen)

Autoconversion behavior

VCF (.vcf{.gz})

Oxford genotype (.bgen)

Oxford haplotype (.haps)

Dosage import settings

Unusual chromosome IDs




Output file list


File formats

PLINK 2.00 alpha

PLINK 2.0 alpha was developed by Christopher Chang and the Human Longevity, Inc. Data Science team, with substantial input from Stanford's Department of Biomedical Data Science. (More detailed credits.)

Binary downloads

Operating systemDevelopment (30 Apr)
Linux 64-bit Intel1download
Linux 32-bitdownload
OS X (64-bit)download
Windows 64-bitdownload
Windows 32-bitdownload

1: This build can still run on AMD processors, but it's statically linked to Intel MKL, so some linear algebra operations will be slow. We will try to provide an AMD Zen-optimized build as soon as supporting libraries are available.

Source code and build instructions are available on Github.

Recent version history

30 April 2017: VCF dosage import/export, --vcf-min-gq, and --read-freq implemented. --score can now work with standard errors. --autosome{-par} now works properly. SNPHWE2 and SNPHWEX functions relicensed as GPL2+, to enable inclusion in the HardyWeinberg R package.

20 April: .sample export bugfix (didn't work if file was over 256 KB and no phenotypes were present). --dummy implemented (can now generate dosages).

19 April: --hardy/--hwe chrX bugfix (thanks to Jan Graffelman for catching the problem and validating the fix). --new-id-max-allele-len now has three modes ('error', 'missing', and 'truncate'), and the default mode is now 'error' (i.e. --set-missing-var-ids and --set-all-var-ids now error out when an allele code longer than 23 characters is encountered, instead of silently truncating). --score implemented, and extended to support variance-normalization and multiple score columns (these two features provide a simple way to project new samples onto previously computed principal components).

11 April: --pca var-wts bugfix, and --pca eigenvalue ordering bugfix. --glm linear regression and --condition{-list} support added. --geno/--mind/--missing/--genotyping-rate can now refer to missing dosages instead of just missing hardcalls (note that, when importing dosage data, dosages in (0.1, 0.9) and (1.1, 1.9) are saved but there usually won't be associated hardcalls).

20 March 2017: Initial public release.

What's new?

  • Preservation of reference alleles (without requiring constant use of --keep-allele-order), phase information, and the VCF QUAL, FILTER, and INFO fields. Use --make-pgen instead of --make-bed when importing a VCF; the fileset can then be referenced with --pfile. We will provide 1000 Genomes phase 3 downloads in the new fileset format as soon as multiallelic variants are also supported.
  • The new .pgen file format incorporates SNPack-style genotype compression, frequently reducing file sizes by 80+% with negligible computational cost. To allow users to take advantage of genotype compression without sacrificing compatibility with scripts expecting old-style .bim and .fam text files, PLINK 2.0 supports a hybrid .pgen + .bim + .fam usage mode (--make-bpgen/--bpfile). We've also provided a Python library for reading and writing .pgen files, to simplify migration to the new format. (PLINK 1 .bed files are valid .pgen files, so code written on top of the library is backward-compatible.)
  • Firth regression ('--glm firth-fallback', '--glm firth'). Standard logistic regression fails to converge, yielding 'NA' or nonsense results, when the 2x2 allele/phenotype contingency table has an empty cell ("quasi-complete separation"); this is common, and especially likely to happen with the strongest associations. Firth regression can prevent you from missing these associations. The fast 'firth-fallback' mode (only use Firth regression when there's either an empty contingency table cell or regular-logistic-regression convergence failure) gets you most of the benefit for a fraction of the computational cost.
  • '--pca approx' (equivalent to EIGENSOFT 6 fastmode with default parameters). If you have more than ten thousand samples, only need the top principal components, and can tolerate ~0.1% error in the last PC, this can save you a ton of compute time.
  • The 64-bit Linux build can handle linear algebra on matrices with more than 231 elements (so regular --pca is no longer limited to ~46000 samples), as long as your system has enough memory.
  • KING-robust kinship coefficients (--make-king, --make-king-table, --king-cutoff). These remain accurate when good population allele frequency estimates are unavailable. We have found --king-cutoff to be much more reliable than the PLINK 1.9 --rel-cutoff flag for removal of close relations.
  • Proper support for dosages (decimal allele count expected values). When .gen/.bgen files are imported, hardcalls and dosages are saved to the .pgen. Operations which naturally extend to decimals (e.g. --pca, --glm, --freq, --maf/--mac) use the dosage information when it's present, while methods that can only make use of hardcalls (e.g. KING-robust, Hardy-Weinberg exact test) simply ignore the dosages. --hard-call-threshold can now be used to change the saved hardcalls without changing the dosages.
  • Much more multithreaded code.
  • Most commands let you control which columns appear in the main output file(s).
  • Broad support for both gzipped and Zstd-compressed text input files.
  • Graffelman and Weir's extended chrX Hardy-Weinberg exact test, which takes male allele frequencies into account. We've found that this tends to identify quite a few obviously miscalled chrX variants which were not caught by the usual QC filters.
  • Oxford-style haplotype filesets can now be imported and exported (--haps, '--export haps'/'--export hapslegend').
  • Sample-major PLINK binary files can now be efficiently exported ('--export ind-major-bed'). This is close to 3 orders of magnitude faster than the previous implementation (PLINK 1.07 --make-bed + --ind-major).

Coming next

  1. BGEN v1.2 and v1.3 import/export.
  2. Multiallelic variant support.
  3. BCF2 import/export.
  4. Merge. (Once this is operational, a stable version of the .pgen specification will be provided, and PLINK 2.0 beta testing will begin.)

General usage >>