Introduction, downloads

S: 11 Dec 2023 (b7.2)

D: 11 Dec 2023

Family-based association analysis

Transmission disequilibrium test

--tdt [{exact | exact-midp | poo}] ['perm' | 'mperm='<value>] ['perm-count'] [{parentdt1 | parentdt2 | pat | mat}] ['set-test']

Given case/control phenotypes and pedigree information, --tdt normally computes parenTDT (see the PLINK 1.07 documentation for details), transmission disequilibrium test, and combined test statistics, writing results to plink.tdt.

  • A Mendel error check is performed before the main tests; offending genotypes are treated as missing by this analysis. --mendel-multigen extends this check in the usual way; however, --mendel-duos is not currently supported.
  • The basic TDT p-value is based on a chi-square test unless you request the exact binomial test with 'exact' or 'exact-midp'. (The parenTDT and combined tests are always based on chi-square statistics for now, since the corresponding exact tests are more complex. We do know how to implement them; contact us if you want us to do so.)
  • 'perm'/'mperm=<value>' requests a family-based adaptive or max(T) permutation test. 'perm-count' has the usual meaning. By default, the permutation test statistic is the basic TDT p-value; 'parentdt1'/'parentdt2' cause parenTDT or combined test p-values, respectively, to be considered instead.
  • 'set-test' tests the significance of variant sets. This cannot be used with exact tests.
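As a rough illustration of the difference between the chi-square and exact binomial forms of the basic TDT, the sketch below computes both from a pair of counts: b transmissions and c non-transmissions of an allele from heterozygous parents. The function names are hypothetical, the mid-p adjustment shown is one common convention, and none of this is taken from PLINK's source, so its output may not match plink.tdt digit for digit.

```python
import math

def tdt_chisq(b, c):
    """Basic TDT chi-square: (b - c)^2 / (b + c), 1 degree of freedom.

    b = transmitted count, c = untransmitted count (hypothetical inputs).
    Returns (statistic, p-value), or (nan, nan) when b + c == 0.
    """
    n = b + c
    if n == 0:
        return float("nan"), float("nan")
    stat = (b - c) ** 2 / n
    # Upper tail of a 1-df chi-square distribution.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

def tdt_exact(b, c, midp=False):
    """Two-sided exact binomial test of b successes in b + c trials, p = 0.5.

    Sums the probabilities of all outcomes no more likely than the observed
    one. With midp=True, half the probability of the outcomes exactly as
    likely as the observed one is subtracted (assumed mid-p convention).
    Intended for small counts only; no overflow protection.
    """
    n = b + c
    if n == 0:
        return float("nan")
    probs = [math.comb(n, k) * 0.5 ** n for k in range(n + 1)]
    obs = probs[b]
    eps = obs * 1e-9  # tolerance for floating-point ties
    p = sum(q for q in probs if q <= obs + eps)
    if midp:
        p -= 0.5 * sum(q for q in probs if abs(q - obs) <= eps)
    return min(p, 1.0)
```

For example, tdt_chisq(30, 10) gives a statistic of 10.0, while tdt_exact(10, 0) gives 2/1024 and tdt_exact(10, 0, midp=True) gives 1/1024.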

The 'poo' modifier causes a parent-of-origin analysis to be performed instead, with transmissions from heterozygous fathers and heterozygous mothers considered separately. Results are reported to plink.tdt.poo in this case.

  • The parent-of-origin analysis does not currently support exact tests.
  • By default, the permutation test statistic is the absolute parent-of-origin test Z score; 'pat'/'mat' cause paternal or maternal TDT chi-square statistics, respectively, to be considered instead.

--dfam ['no-unrelateds'] ['perm' | 'mperm='<value>] ['perm-count'] ['set-test']

--dfam executes an extended version of the sib-TDT (see Spielman RS, Ewens WJ (1998) A sibship test for linkage in the presence of association: the sib transmission/disequilibrium test) which includes clusters of unrelated individuals, writing results to plink.dfam.

  • For backward compatibility, all missing-phenotype samples are treated as controls. You can use --prune to remove them from the analysis instead.
  • If clusters are not defined via --within, all unrelated individuals are treated as belonging to the same cluster.
  • --mendel-multigen, 'perm', 'mperm=<value>', 'perm-count', and 'set-test' have the usual effects.
  • To remove unrelated individuals from the test (reducing it to the original sib-TDT), add the 'no-unrelateds' modifier.

Quantitative trait

--qfam ['perm' | 'mperm='<value>] ['perm-count'] ['emp-se']
--qfam-parents ['perm' | 'mperm='<value>] ['perm-count'] ['emp-se']
--qfam-between ['perm' | 'mperm='<value>] ['perm-count'] ['emp-se']
--qfam-total ['perm' | 'mperm='<value>] ['perm-count'] ['emp-se']

For family-based quantitative trait association analysis, PLINK offers the QFAM procedure, which combines a simple linear regression of phenotype on genotype with a special permutation test which corrects for family structure. Genotypes are divided into between-family and within-family components, the components are permuted separately, and then association analysis is performed on the within-family component (--qfam, --qfam-parents), between-family component (--qfam-between), or their sum (--qfam-total). Refer to the PLINK 1.07 documentation for a detailed discussion of the method.

  • A Mendel error check is performed before the main tests; offending genotypes are treated as missing by this analysis. --mendel-multigen extends this check in the usual way; however, --mendel-duos is not currently supported.
  • Permutation is required. 'perm' and 'perm-count' have the usual meanings. However, 'mperm=<value>' just specifies a fixed number of permutations; due to the way within-family genotype components are flipped, the method does not support a proper max(T) test.
  • The output files differ slightly from PLINK 1.07's ('P' in the header line has been renamed to 'RAW_P' to reduce the likelihood of misinterpretation, and the .perm file doesn't have a 'STAT' field).
  • The 'emp-se' modifier adds BETA and EMP_SE (empirical standard error for beta) fields to the .perm output file.
  • Zero genotype variance now consistently yields 'NA' results.
  • We've also made minor changes to how overlapping trios are handled, so NIND values may occasionally differ from PLINK 1.07's.
  • Covariates are not currently supported.
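
The between-family/within-family split described above can be sketched in a few lines. This is a simplified illustration only: it assumes the between-family component is the plain mean genotype of the family members present, and that a permutation shuffles between-family components across families while flipping the sign of within-family components one family at a time. The function names and data layout are hypothetical, and the actual QFAM procedure (see the PLINK 1.07 documentation) handles parents and overlapping trios in more detail.

```python
import random
from collections import defaultdict

def split_between_within(genotypes, family_of):
    """Split allele counts into between- and within-family components.

    genotypes: sample ID -> allele count (0, 1, or 2), no missing values.
    family_of: sample ID -> family ID.
    Assumption: between = family mean genotype, within = genotype - between.
    """
    members = defaultdict(list)
    for sid, g in genotypes.items():
        members[family_of[sid]].append(g)
    between, within = {}, {}
    for sid, g in genotypes.items():
        fam = members[family_of[sid]]
        b = sum(fam) / len(fam)
        between[sid] = b
        within[sid] = g - b
    return between, within

def permute_components(between, within, family_of, rng):
    """One permutation: shuffle between-family values across families and
    flip the sign of within-family values per family (assumed scheme)."""
    fam_b = {family_of[sid]: b for sid, b in between.items()}
    fams = sorted(fam_b)
    shuffled = fams[:]
    rng.shuffle(shuffled)
    new_fam_b = {fam: fam_b[src] for fam, src in zip(fams, shuffled)}
    sign = {fam: rng.choice((-1, 1)) for fam in fams}
    new_b = {sid: new_fam_b[family_of[sid]] for sid in between}
    new_w = {sid: sign[family_of[sid]] * w for sid, w in within.items()}
    return new_b, new_w
```

Because the within-family components only ever change sign together within a family, each permutation keeps the family structure intact, which is consistent with the note above that 'mperm=<value>' sets a fixed permutation count rather than running a proper max(T) test.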

Generate pseudo-controls

--tucc ['write-bed']

--tucc generates a new dataset where, for each trio, a case sample is created with the child's genotypes, and a pseudo-control sample is created with all the untransmitted alleles.

  • The new dataset only contains autosomal diploid variants.
  • For backward compatibility, this flag defaults to writing plink.tucc.ped, with no accompanying .map file. This behavior is deprecated; add the 'write-bed' modifier to generate a complete binary fileset instead.
  • When either parental genotype is missing, the corresponding case and pseudo-control genotypes are set to missing. This also happens for Mendel errors; --mendel-multigen extends the Mendel error check in the usual way.
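
For a single autosomal diploid variant, the case/pseudo-control construction can be sketched as below. Genotypes are represented as counts of one allele (0, 1, or 2), with None standing for a missing call. The function names are hypothetical, and treating a missing child genotype as missing output is an assumption not stated above.

```python
def gametes(g):
    """Allele counts (0 or 1) a parent with genotype g can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[g]

def tucc_genotypes(father, mother, child):
    """Return (case, pseudo_control) allele counts for one trio at one variant.

    The case copies the child's genotype; the pseudo-control carries the two
    parental alleles that were not transmitted, i.e. father + mother - child.
    Missing input or a Mendel error yields (None, None).
    """
    if father is None or mother is None or child is None:
        return None, None
    # A child genotype is Mendel-consistent if it is the sum of one
    # transmissible allele count from each parent.
    possible = {a + b for a in gametes(father) for b in gametes(mother)}
    if child not in possible:
        return None, None
    return child, father + mother - child
```

For instance, two heterozygous parents with a homozygous child, tucc_genotypes(1, 1, 2), give a case genotype of 2 and a pseudo-control genotype of 0.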
