Introduction, downloads

S: 13 Feb 2017 (b3.46)

D: 13 Feb 2017

This page is under construction. If there's something you consider to be an essential PLINK resource which is not mentioned on this page, contact us and/or comment in the plink2-users Google group.

Genotype data

1000 Genomes phase 1 (hosted by GigaDB, Aspera download available there)

Refer to the 1000 Genomes website for additional sample information, data usage rules, and citation instructions.

HapMap phase 2

See the PLINK 1.07 resources page.

Teaching materials and example dataset

These files were created by Shaun Purcell for PLINK 1.02 (+ gPLINK + Haploview), but everything except for the haplotypic analysis will still work with 1.90.

  • Tutorial data: (BWH mirror), which contains the following six files:
    • wgas1.ped (sample whole-genome .ped data file)
    • (corresponding .map file)
    • extra.ped (sample follow-up regional genotyping .ped file)
    • (corresponding .map file)
    • pop.cov (population membership variable)
    • command-list.txt (command list for 2nd part of practical)
    • The BWH mirror file also contains an old Windows plink.exe, and gPLINK/Haploview .jar files.
  • Teaching materials: (BWH mirror), which contains the following two files:
    • practical-1-slides.ppt
    • practical-2-notes.doc

Everything should be fairly self-explanatory after looking through the PowerPoint file and Word document.

Gene range lists

These lists are valid input for flags such as --make-set, '--extract range', '--annotate ranges', and --gene-report.

They contain one gene per row, with the following four columns:

  1. Chromosome code
  2. Start of gene (base-pair units, 1-based)
  3. End of gene (this position is included in the interval)
  4. Gene ID
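
As a rough illustration of the four-column layout above, the following sketch writes a tiny gene range list and checks each row with awk. The file name glist-example, the gene IDs, the coordinates, and the check_glist helper are all made up for this example; none of them come from the distributed glist files.

```shell
# Write a tiny, made-up gene range list (hypothetical genes and coordinates).
# Columns: chromosome code, start (1-based), end (inclusive), gene ID.
cat > glist-example <<'EOF'
1 1000 5000 GENE_A
1 7000 9000 GENE_B
X 200 800 GENE_C
EOF

# Hypothetical helper: report rows that do not fit the four-column layout.
# Returns nonzero if any row fails a check.
check_glist() {
  awk '
    NF != 4 {
      printf "line %d: expected 4 columns, got %d\n", NR, NF; bad = 1; next
    }
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
      printf "line %d: non-numeric position\n", NR; bad = 1; next
    }
    ($2 + 0) < 1 || ($2 + 0) > ($3 + 0) {
      printf "line %d: invalid interval %s-%s\n", NR, $2, $3; bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

check_glist glist-example && echo "glist-example looks well-formed"
```

This only checks the shape of the file. It does not verify that the chromosome codes or gene IDs are ones PLINK would accept.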

Our files were generated from UCSC Table Browser RefSeq track data in May 2014 with the following pipeline:

tail -n +2 ucscdl-hgxx | awk '{print $3 " " $5 " " $6 " " $13}' | cut -c 4- | grep -E '^.{1,2}\ ' | awk '{print $4 " " $1 " " $2 " " $3}' | nsort | interval_merge > glist-hgxx


In this pipeline:

  • nsort is a variant of the Unix sort utility which implements 'natural sort'; and
  • interval_merge merges overlapping intervals associated with the same gene ID, inserts XY pseudoautosomal region entries when appropriate, and reorders the fields.

(Source code for both of these auxiliary programs is in the GitHub repository.)
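
To make the merging step more concrete, here is a minimal sketch of how overlapping intervals for the same gene ID could be merged. It is not the actual interval_merge program. It assumes input rows of the form "gene ID, chromosome, start, end" (the field order produced by the second awk call in the pipeline), uses plain sort rather than nsort, so row order can differ from the distributed files, and does not insert XY pseudoautosomal region entries. The function name merge_gene_intervals is hypothetical.

```shell
# Hypothetical stand-in for the merge step; NOT the real interval_merge.
# Input rows:  GENE_ID CHROM START END
# Output rows: CHROM START END GENE_ID (the gene range list column order)
merge_gene_intervals() {
  # Group rows by gene and chromosome, ordered by start position.
  sort -k1,1 -k2,2 -k3,3n -k4,4n |
  awk '
    # Print the interval currently being accumulated, if there is one.
    function emit() { if (have) print chrom, lo, hi, gene }

    # Same gene and chromosome, and this interval starts at or before the
    # current (inclusive) end: extend the current interval.
    have && $1 == gene && $2 == chrom && ($3 + 0) <= hi {
      if (($4 + 0) > hi) hi = $4 + 0
      next
    }

    # Otherwise flush the previous interval and start a new one.
    { emit(); gene = $1; chrom = $2; lo = $3 + 0; hi = $4 + 0; have = 1 }

    END { emit() }
  '
}

# Made-up input: two overlapping GENE_A intervals, one separate, one GENE_B.
printf '%s\n' \
  'GENE_A 1 100 200' \
  'GENE_A 1 150 300' \
  'GENE_A 1 500 600' \
  'GENE_B 1 120 180' | merge_gene_intervals
```

With this made-up input, the two overlapping GENE_A intervals collapse into 1 100 300 GENE_A. The 500-600 interval stays separate, and GENE_B is left untouched even though it overlaps GENE_A, because merging is done per gene ID.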

Functional SNP attributes

This file contains nonsense, missense, frameshift, and splice annotations from dbSNP build 129, and is designed to be used with the --annotate and --attrib flags.

SNP attributes (dbSNP build 129): snp129.attrib.gz (BWH mirror)

We plan to assemble an updated version of this file; let us know if there's anything you want us to add, or if you have thoughts on filtering out probable low-quality dbSNP entries.
