S: 21 Feb 2018 (b5.3)

D: 21 Feb 2018

R plugin functions

--R [R script filename] <debug>

(Not supported on Windows.)

PLINK is designed to interoperate well with R: almost all built-in commands generate tabular reports that are easy to load and postprocess in it. With the Rserve package (preferably version 1.7 or later) and PLINK's --R flag, you can also apply R functions directly to PLINK binary data, without the need to write your own I/O code.

--R loads the given R script, which must define a function of the form

Rplink <- function(PHENO,GENO,CLUSTER,COVAR)

where

  • PHENO is a vector of phenotypes (length N),
  • GENO is a matrix of genotypes (N rows, m columns; 0/1/2/'NA' additive coding, like '--recode A'),
  • CLUSTER is a vector of numeric cluster IDs (length N, all-zero when no clusters are defined), and
  • COVAR is a matrix of covariates (N rows, C columns).

(N is the number of samples (after filtering); C, which can be zero, is the number of covariates; and m is the number of variants in the current data block, which is usually smaller than the total number in the dataset.)
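To make the argument shapes concrete, here is a minimal sketch that builds toy stand-ins for the four arguments, so an Rplink function can be tried out in a plain R session before it is run through PLINK. All values are made up, and shapes_ok is a hypothetical helper written for this illustration; it is not part of PLINK or Rserve.

```r
# Toy stand-ins for the four Rplink arguments (all values are made up).
N <- 5   # samples after filtering
m <- 3   # variants in the current data block
C <- 2   # covariates (may be zero)

PHENO <- c(1.2, 0.7, 2.5, 1.9, 0.3)          # length N

# N rows, m columns; additive 0/1/2 coding with NA for missing calls.
GENO <- matrix(c(0, 1, 2, NA, 1,
                 2, 2, 1, 0, 0,
                 1, NA, 1, 1, 2), nrow = N, ncol = m)

CLUSTER <- rep(0, N)                          # all-zero: no clusters defined

# N rows, C columns.
COVAR <- matrix(c(34, 51, 29, 45, 60,
                  0, 1, 1, 0, 1), nrow = N, ncol = C)

# Hypothetical helper: TRUE when the arguments have the shapes described above.
shapes_ok <- function(PHENO, GENO, CLUSTER, COVAR) {
  n <- length(PHENO)
  is.matrix(GENO) && nrow(GENO) == n &&
    all(GENO %in% c(0, 1, 2, NA)) &&
    length(CLUSTER) == n &&
    is.matrix(COVAR) && nrow(COVAR) == n
}

print(shapes_ok(PHENO, GENO, CLUSTER, COVAR))
```

With no covariates, a zero-column matrix such as matrix(numeric(0), nrow = N, ncol = 0) passes the same check.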

For each variant, PLINK expects this function to return a numeric vector of values of the form

c(length(r), r)

where r holds the values computed for that variant; the vectors for different variants are permitted to have different lengths. The PLINK 1.07 documentation contains several detailed examples. If this basic interface is insufficient for your needs, you may find the PLINK/SEQ R package to be more helpful.
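As a rough illustration of the c(length(r), r) convention, the sketch below defines an Rplink function that reports two values per variant: the frequency of the counted allele and the genotype call rate. The choice of statistics and the toy GENO matrix are assumptions made for this example. Flattening the per-variant pieces into a single numeric vector with unlist is likewise an assumption about what PLINK reads back, not something stated above.

```r
# Sketch of an Rplink function that returns two values per variant.
Rplink <- function(PHENO, GENO, CLUSTER, COVAR) {
  # Per-variant worker: s is one column of GENO (0/1/2 counts, NA = missing).
  per_variant <- function(s) {
    called <- !is.na(s)
    # r holds this variant's results: allele frequency, then call rate.
    r <- c(mean(s[called]) / 2, mean(called))
    # Prefix the results with their count, as the interface expects.
    c(length(r), r)
  }
  pieces <- lapply(seq_len(ncol(GENO)), function(j) per_variant(GENO[, j]))
  # Flatten to one numeric vector (assumed to be the form PLINK consumes).
  as.numeric(unlist(pieces))
}

# Toy data: 4 samples, 3 variants (made-up values).
GENO <- matrix(c(0, 1, 2, NA,
                 1, 1, 0, 2,
                 2, 2, 2, 2), nrow = 4)
PHENO   <- c(1, 2, 2, 1)
CLUSTER <- rep(0, 4)
COVAR   <- matrix(numeric(0), nrow = 4, ncol = 0)

out <- Rplink(PHENO, GENO, CLUSTER, COVAR)
print(out)
```

Each variant contributes a length followed by its values, so the result can be walked left to right even if different variants return different numbers of values.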

On a normal --R run, results are written to a report file. If you want to look at the R commands PLINK sends, add the 'debug' modifier; this causes them to be logged to plink.debug.R (without being executed).

Connecting elsewhere

--R-port [port number]
--R-host [host]
--R-socket [socket]

By default, --R tries to connect to a local Rserve instance on port 6311. You can change this as follows:

  • --R-port sets the port number.
  • --R-host lets you connect to a remote host, while --R-socket specifies a socket name.
