Introduction, downloads

S: 11 Dec 2023 (b7.2)

D: 11 Dec 2023

Command-line help

--help [flag name/prefix...]

When invoked with no parameters, --help prints a summary of all PLINK flags, starting with the main functions. The output is long (over 1000 lines), so we recommend piping it through a terminal pager such as Unix less or more, or redirecting it to a file, e.g.

plink --help > plink-help.txt

Alternatively, you can provide one or more flag names/prefixes, in which case PLINK displays information on only the referenced flags, e.g.

[chrchang:~/plink-ng]$ plink --help abcd Z-gneome
PLINK v1.90b6.9 64-bit (4 Mar 2019)            www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3

--genome ['gz'] ['rel-check'] ['full'] ['unbounded'] ['nudge']
  Generate an identity-by-descent report.
  * It is usually best to perform this calculation on a marker set in
    approximate linkage equilibrium.
  * The 'rel-check' modifier excludes pairs of samples with different FIDs
    from the final report.
  * 'full' adds raw pairwise comparison data to the report.
  * The P(IBD=0/1/2) estimator employed by this command sometimes yields
    numbers outside the range [0,1]; by default, these are clipped.  The
    'unbounded' modifier turns off this clipping.
  * Then, when PI_HAT^2 < P(IBD=2), 'nudge' adjusts the final P(IBD=0/1/2)
    estimates to a theoretically possible configuration.
  * The computation can be subdivided with --parallel.

--ppc-gap <val>    : Minimum number of base pairs, in thousands, between
                     informative pairs of markers used in --genome PPC test.
                     500 if unspecified.
--min <cutoff>     : Specify minimum PI_HAT for inclusion in --genome report.
--max <cutoff>     : Specify maximum PI_HAT for inclusion in --genome report.

No help entry for 'abcd'.

More precisely, for each parameter you pass to --help, PLINK will first search for an exact flag name match; if it fails to find one, it will then search for exact prefix matches; and if it also fails to find any of those, it will search for Damerau-Levenshtein distance 1 matches (note the 'Z-gneome' misspelling above). (The "Quick index search" on this webpage's sidebar uses the same logic.)
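The three-tier lookup described above can be sketched in Python as follows. This is an illustrative sketch, not PLINK's actual implementation: the function names, the small NAMES index, and the stripping of leading dashes are all assumptions made for the example. In particular, the index is assumed to contain a 'Z-genome' alias so that the 'Z-gneome' misspelling has something to land on.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insertion, deletion,
    substitution, or transposition of two adjacent characters."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    # Skip the common prefix; any single edit must start right after it.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    ra, rb = a[i:], b[i:]
    if len(a) == len(b):
        # One substitution: everything after the first mismatch agrees.
        if ra[1:] == rb[1:]:
            return True
        # One adjacent transposition: the next two characters are swapped.
        return (len(ra) >= 2 and ra[0] == rb[1] and ra[1] == rb[0]
                and ra[2:] == rb[2:])
    # Lengths differ by one: drop one character from the longer string.
    if len(a) < len(b):
        return ra == rb[1:]
    return ra[1:] == rb


def lookup(query: str, names: list[str]) -> list[str]:
    """Exact match first, then prefix matches, then distance-1 matches."""
    query = query.lstrip("-")  # assumption: '--pca' and 'pca' are equivalent
    exact = [n for n in names if n == query]
    if exact:
        return exact
    prefix = [n for n in names if n.startswith(query)]
    if prefix:
        return prefix
    return [n for n in names if within_one_edit(query, n)]


# Hypothetical index of help-entry names, for illustration only.
NAMES = ["genome", "Z-genome", "ppc-gap", "min", "max",
         "pca", "pca-clusters", "pca-cluster-names"]

if __name__ == "__main__":
    for q in ["abcd", "Z-gneome", "pca-c"]:
        hits = lookup(q, NAMES)
        print(q, "->", hits if hits else "no help entry")
```

Each tier is tried only if the previous one found nothing, so a query that is both a complete flag name and a prefix of longer names returns just the exact hit in this sketch. The real program may group several flag names under one help entry, which this sketch does not model.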

If --help is used with other flags (other than --script and --rerun), it causes everything before it on the command line to be ignored, and everything after it to be treated as --help parameters. This is convenient when you've forgotten exactly how a flag works while in the middle of typing a long command: you can put your help request at the end of the unfinished command, and then retrieve your unfinished command line with the up arrow (in most shells, anyway).

[chrchang:~/plink-ng]$ plink --bfile test_data --hwe 1e-5 midp --help --pca
PLINK v1.90b6.9 64-bit (4 Mar 2019)            www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
--help present, ignoring other flags.

--pca {count} ['header'] ['tabs'] ['var-wts']
  Calculates a variance-standardized relationship matrix (use
  --make-rel/--make-grm-gz/--make-grm-bin to dump it), and extracts the top
  20 principal components.
  * It is usually best to perform this calculation on a marker set in
    approximate linkage equilibrium.
  * You can change the number of PCs by passing a numeric parameter.
  * The 'header' modifier adds a header line to the .eigenvec output file.
    (For compatibility with the GCTA flag of the same name, the default is no
    header line.)
  * The 'tabs' modifier causes the .eigenvec file(s) to be tab-delimited.
  * The 'var-wts' modifier requests an additional .eigenvec.var file with PCs
    expressed as variant weights instead of sample weights.

--pca-cluster-names <...> : These can be used individually or in combination
--pca-clusters <fname>      to define a list of clusters to use in the basic
                            --pca computation.  (--pca-cluster-names expects
                            a space-delimited sequence of cluster names,
                            while --pca-clusters expects a file with one
                            cluster name per line.)  All samples outside
                            those clusters will then be projected on to the
                            calculated PCs.
[chrchang:~/plink-ng]$ plink --bfile test_data --hwe 1e-5 midp --pca ...
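The way --help partitions a command line, as described above, can be illustrated with a short Python sketch. The function name is hypothetical, the special handling of --script and --rerun is deliberately left out, and splitting at the first occurrence of --help is an assumption; this shows only the documented "ignore everything before, treat everything after as help parameters" rule.

```python
from typing import Optional


def split_at_help(args: list[str]) -> Optional[tuple[list[str], list[str]]]:
    """Return (ignored_args, help_params) if --help is present, else None.

    Assumption: the split happens at the first '--help' on the line.
    --script and --rerun are not modeled here.
    """
    if "--help" not in args:
        return None
    idx = args.index("--help")
    return args[:idx], args[idx + 1:]


if __name__ == "__main__":
    unfinished = ["--bfile", "test_data", "--hwe", "1e-5", "midp"]
    result = split_at_help(unfinished + ["--help", "--pca"])
    if result is not None:
        ignored, help_params = result
        print("ignored:", ignored)
        print("help parameters:", help_params)
```

Because the unfinished part of the command is ignored rather than validated, it does not have to be a complete or correct command for the help request to succeed.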
