D: 19 May 2022

This page is under construction. If there's something you consider to be an essential PLINK resource which is not mentioned on this page, contact us and/or comment in the plink2-users Google group.

The linked files are currently hosted by Dropbox. If you are unable to download them, contact us for access to an alternate source; we understand that Dropbox is blocked in some locations.

Genotype data

1000 Genomes phase 3, phased and (optionally) annotated

Callsets: hg38 (main source, chrY/chrM/contigs source) and phase3 (source)

Each callset is offered in several variants, distinguished by filename:

  • Split by chromosome? "all_" files cover the whole callset in one fileset; "chr1_", "chr2_", etc. files hold one chromosome each.
  • Keep singleton variants? Files with "_ns" in the name omit singleton variants.
  • INFO annotations? "_noannot" .pvar.zst files omit the INFO annotations.
  • KING-based pedigree corrections? "_corrected" .psam files include them; "_orig" .psam files do not.

Merged filesets (all chromosomes):

  • .pgen:
    all_hg38.pgen.zst (3.16 GiB, requires --allow-extra-chr)
    all_hg38_ns.pgen.zst (3.10 GiB, requires --allow-extra-chr)
    all_phase3.pgen.zst (2.25 GiB)
    all_phase3_ns.pgen.zst (2.13 GiB)
  • .pvar:
    all_hg38.pvar.zst (4.41 GiB; >90% of this is annotations)
    all_hg38_noannot.pvar.zst (359 MiB; rename to "all_hg38.pvar.zst" before use)
    all_hg38_ns.pvar.zst (3.89 GiB; >90% of this is annotations)
    all_hg38_ns_noannot.pvar.zst (296 MiB; rename to "all_hg38_ns.pvar.zst" before use)
    all_phase3.pvar.zst (1.26 GiB)
    all_phase3_noannot.pvar.zst (614 MiB; rename to "all_phase3.pvar.zst" before use)
    all_phase3_ns.pvar.zst (812 MiB)
    all_phase3_ns_noannot.pvar.zst (362 MiB; rename to "all_phase3_ns.pvar.zst" before use)
  • .psam:
    hg38_corrected.psam or hg38_orig.psam (rename to "all_hg38.psam" or "all_hg38_ns.psam" before use, matching the .pgen)
    phase3_corrected.psam or phase3_orig.psam (rename to "all_phase3.psam" or "all_phase3_ns.psam" before use, matching the .pgen)
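The renaming rules for the merged filesets can be scripted. The following is a minimal sketch, not an official tool: the function name stage_merged and its three arguments are hypothetical, and it assumes the downloaded files sit in the current directory. It copies the .psam instead of renaming it, so the same sample file can serve both the regular and "_ns" filesets.

```sh
# stage_merged <build> <prefix> <flavor>   (hypothetical helper, not part of PLINK)
#   build:  hg38 or phase3
#   prefix: all_hg38, all_hg38_ns, all_phase3 or all_phase3_ns
#   flavor: corrected or orig (which .psam to use)
stage_merged() {
  build=$1
  prefix=$2
  flavor=$3

  # If the annotation-free .pvar was downloaded, give it the name PLINK 2 expects.
  if [ -e "${prefix}_noannot.pvar.zst" ] && [ ! -e "$prefix.pvar.zst" ]; then
    mv "${prefix}_noannot.pvar.zst" "$prefix.pvar.zst" || return 1
  fi

  # The .psam must share the fileset prefix; copy so the original stays reusable.
  cp "${build}_${flavor}.psam" "$prefix.psam"
}

# Example: stage_merged hg38 all_hg38 corrected
```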

Common sample information file (not for the phase3 chrY/chrM files, which have their own .psam files): hg38_corrected.psam or hg38_orig.psam for hg38; phase3_corrected.psam or phase3_orig.psam for phase3. Create symlinks from chr1_hg38.psam, chr2_hg38.psam, etc. (or chr1_phase3.psam, chr2_phase3.psam, etc.) to the chosen file, or make copies.

Remove "_noannot" from the .pvar.zst filenames before use.
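For the per-chromosome files, the symlinking and "_noannot" renaming can be done in one pass. This is a sketch under assumptions: the function name link_split is hypothetical, the files are in the current directory, and chrY_phase3/chrM_phase3 are skipped because the listing gives them their own .psam files.

```sh
# link_split <build> <flavor>   (hypothetical helper, not part of PLINK)
#   build:  hg38 or phase3
#   flavor: corrected or orig (which common .psam to link to)
link_split() {
  build=$1
  flavor=$2

  # Strip "_noannot" from any annotation-free .pvar.zst files.
  for pvar in chr*_"$build"_noannot.pvar.zst contigs_"$build"_noannot.pvar.zst; do
    [ -e "$pvar" ] || continue
    target="${pvar%_noannot.pvar.zst}.pvar.zst"
    [ -e "$target" ] || mv "$pvar" "$target"
  done

  # Point each per-chromosome fileset at the common sample file.
  for pvar in chr*_"$build".pvar.zst contigs_"$build".pvar.zst; do
    [ -e "$pvar" ] || continue
    stem=${pvar%.pvar.zst}
    case $stem in
      chrY_phase3|chrM_phase3) continue ;;  # these have their own .psam files
    esac
    ln -sf "${build}_${flavor}.psam" "$stem.psam"
  done
}

# Example: link_split hg38 corrected
```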

chr1_hg38.pgen.zst (236 MiB), chr1_hg38.pvar.zst (347 MiB) chr1_hg38_noannot.pvar.zst (27.1 MiB)
chr2_hg38.pgen.zst (247 MiB), chr2_hg38.pvar.zst (365 MiB) chr2_hg38_noannot.pvar.zst (29.4 MiB)
chr3_hg38.pgen.zst (204 MiB), chr3_hg38.pvar.zst (298 MiB) chr3_hg38_noannot.pvar.zst (23.8 MiB)
chr4_hg38.pgen.zst (196 MiB), chr4_hg38.pvar.zst (290 MiB) chr4_hg38_noannot.pvar.zst (23.3 MiB)
chr5_hg38.pgen.zst (183 MiB), chr5_hg38.pvar.zst (271 MiB) chr5_hg38_noannot.pvar.zst (21.6 MiB)
chr6_hg38.pgen.zst (178 MiB), chr6_hg38.pvar.zst (259 MiB) chr6_hg38_noannot.pvar.zst (20.6 MiB)
chr7_hg38.pgen.zst (176 MiB), chr7_hg38.pvar.zst (252 MiB) chr7_hg38_noannot.pvar.zst (19.8 MiB)
chr8_hg38.pgen.zst (159 MiB), chr8_hg38.pvar.zst (232 MiB) chr8_hg38_noannot.pvar.zst (18.5 MiB)
chr9_hg38.pgen.zst (140 MiB), chr9_hg38.pvar.zst (195 MiB) chr9_hg38_noannot.pvar.zst (15.0 MiB)
chr10_hg38.pgen.zst (150 MiB), chr10_hg38.pvar.zst (213 MiB) chr10_hg38_noannot.pvar.zst (17.5 MiB)
chr11_hg38.pgen.zst (140 MiB), chr11_hg38.pvar.zst (205 MiB) chr11_hg38_noannot.pvar.zst (16.5 MiB)
chr12_hg38.pgen.zst (143 MiB), chr12_hg38.pvar.zst (202 MiB) chr12_hg38_noannot.pvar.zst (16.3 MiB)
chr13_hg38.pgen.zst (106 MiB), chr13_hg38.pvar.zst (152 MiB) chr13_hg38_noannot.pvar.zst (12.3 MiB)
chr14_hg38.pgen.zst (98.4 MiB), chr14_hg38.pvar.zst (139 MiB) chr14_hg38_noannot.pvar.zst (11.5 MiB)
chr15_hg38.pgen.zst (97.0 MiB), chr15_hg38.pvar.zst (131 MiB) chr15_hg38_noannot.pvar.zst (10.6 MiB)
chr16_hg38.pgen.zst (107 MiB), chr16_hg38.pvar.zst (146 MiB) chr16_hg38_noannot.pvar.zst (11.7 MiB)
chr17_hg38.pgen.zst (94.7 MiB), chr17_hg38.pvar.zst (129 MiB) chr17_hg38_noannot.pvar.zst (10.2 MiB)
chr18_hg38.pgen.zst (87.4 MiB), chr18_hg38.pvar.zst (120 MiB) chr18_hg38_noannot.pvar.zst (9.86 MiB)
chr19_hg38.pgen.zst (80.4 MiB), chr19_hg38.pvar.zst (106 MiB) chr19_hg38_noannot.pvar.zst (8.10 MiB)
chr20_hg38.pgen.zst (73.9 MiB), chr20_hg38.pvar.zst (101 MiB) chr20_hg38_noannot.pvar.zst (7.99 MiB)
chr21_hg38.pgen.zst (46.4 MiB), chr21_hg38.pvar.zst (62.4 MiB) chr21_hg38_noannot.pvar.zst (5.01 MiB)
chr22_hg38.pgen.zst (50.4 MiB), chr22_hg38.pvar.zst (67.7 MiB) chr22_hg38_noannot.pvar.zst (5.03 MiB)
chrX_hg38.pgen.zst (95.8 MiB), chrX_hg38.pvar.zst (161 MiB) chrX_hg38_noannot.pvar.zst (9.05 MiB)
chrY_hg38.pgen.zst (7.83 MiB), chrY_hg38.pvar.zst (8.07 MiB) chrY_hg38_noannot.pvar.zst (734 KiB)
chrM_hg38.pgen.zst (69.1 KiB), chrM_hg38.pvar.zst (188 KiB) chrM_hg38_noannot.pvar.zst (18.0 KiB)
contigs_hg38.pgen.zst (63.3 MiB), contigs_hg38.pvar.zst (137 MiB) contigs_hg38_noannot.pvar.zst (5.16 MiB)

chr1_phase3.pgen.zst (172 MiB), chr1_phase3.pvar.zst (100 MiB) chr1_phase3_noannot.pvar.zst (47.5 MiB)
chr2_phase3.pgen.zst (185 MiB), chr2_phase3.pvar.zst (110 MiB) chr2_phase3_noannot.pvar.zst (52.0 MiB)
chr3_phase3.pgen.zst (153 MiB), chr3_phase3.pvar.zst (90.6 MiB) chr3_phase3_noannot.pvar.zst (42.9 MiB)
chr4_phase3.pgen.zst (150 MiB), chr4_phase3.pvar.zst (89.1 MiB) chr4_phase3_noannot.pvar.zst (42.2 MiB)
chr5_phase3.pgen.zst (136 MiB), chr5_phase3.pvar.zst (81.6 MiB) chr5_phase3_noannot.pvar.zst (38.8 MiB)
chr6_phase3.pgen.zst (136 MiB), chr6_phase3.pvar.zst (78.4 MiB) chr6_phase3_noannot.pvar.zst (36.9 MiB)
chr7_phase3.pgen.zst (131 MiB), chr7_phase3.pvar.zst (73.5 MiB) chr7_phase3_noannot.pvar.zst (34.6 MiB)
chr8_phase3.pgen.zst (121 MiB), chr8_phase3.pvar.zst (71.2 MiB) chr8_phase3_noannot.pvar.zst (33.7 MiB)
chr9_phase3.pgen.zst (103 MiB), chr9_phase3.pvar.zst (55.6 MiB) chr9_phase3_noannot.pvar.zst (26.2 MiB)
chr10_phase3.pgen.zst (111 MiB), chr10_phase3.pvar.zst (62.3 MiB) chr10_phase3_noannot.pvar.zst (29.3 MiB)
chr11_phase3.pgen.zst (107 MiB), chr11_phase3.pvar.zst (62.7 MiB) chr11_phase3_noannot.pvar.zst (29.7 MiB)
chr12_phase3.pgen.zst (106 MiB), chr12_phase3.pvar.zst (59.8 MiB) chr12_phase3_noannot.pvar.zst (28.2 MiB)
chr13_phase3.pgen.zst (78.4 MiB), chr13_phase3.pvar.zst (44.6 MiB) chr13_phase3_noannot.pvar.zst (21.0 MiB)
chr14_phase3.pgen.zst (73.6 MiB), chr14_phase3.pvar.zst (41.4 MiB) chr14_phase3_noannot.pvar.zst (19.5 MiB)
chr15_phase3.pgen.zst (71.6 MiB), chr15_phase3.pvar.zst (38.0 MiB) chr15_phase3_noannot.pvar.zst (17.9 MiB)
chr16_phase3.pgen.zst (79.8 MiB), chr16_phase3.pvar.zst (42.0 MiB) chr16_phase3_noannot.pvar.zst (19.8 MiB)
chr17_phase3.pgen.zst (68.2 MiB), chr17_phase3.pvar.zst (36.4 MiB) chr17_phase3_noannot.pvar.zst (17.1 MiB)
chr18_phase3.pgen.zst (65.2 MiB), chr18_phase3.pvar.zst (35.4 MiB) chr18_phase3_noannot.pvar.zst (16.8 MiB)
chr19_phase3.pgen.zst (57.6 MiB), chr19_phase3.pvar.zst (28.9 MiB) chr19_phase3_noannot.pvar.zst (13.5 MiB)
chr20_phase3.pgen.zst (52.5 MiB), chr20_phase3.pvar.zst (28.2 MiB) chr20_phase3_noannot.pvar.zst (13.3 MiB)
chr21_phase3.pgen.zst (34.6 MiB), chr21_phase3.pvar.zst (17.4 MiB) chr21_phase3_noannot.pvar.zst (8.08 MiB)
chr22_phase3.pgen.zst (35.8 MiB), chr22_phase3.pvar.zst (17.4 MiB) chr22_phase3_noannot.pvar.zst (8.20 MiB)
chrX_phase3.pgen.zst (73.0 MiB), chrX_phase3.pvar.zst (44.7 MiB) chrX_phase3_noannot.pvar.zst (18.3 MiB)
chrY_phase3.pgen.zst (325 KiB), chrY_phase3.pvar.zst (605 KiB), chrY_phase3_noannot.pvar.zst (241 KiB), chrY_phase3.psam (1233 samples)
chrM_phase3.pgen.zst (50.4 KiB), chrM_phase3.pvar.zst (15.7 KiB), chrM_phase3_noannot.pvar.zst (10.4 KiB), chrM_phase3_corrected.psam or chrM_phase3_orig.psam (2534 samples; rename to "chrM_phase3.psam" before use)


  • .pgen.zst file(s) must be decompressed before use. (This isn't necessary for .pvar.zst files: see --pfile's 'vzs' modifier.) If you don't have another .zst decompressor installed, you can use PLINK 2 for this purpose:
    plink2 --zst-decompress all_hg38.pgen.zst > all_hg38.pgen
  • hg38: In addition to ~600 trios which were intentionally included, this dataset contains a few close relations which are not described in the .psam file, e.g. sibships where neither parent was sequenced. Use --remove with one of the two provided ID lists when you don't want close relations.
    These lists were generated from the original dataset with "--king-cutoff 0.177" and "--king-cutoff 0.0884", respectively. If you're curious, the --make-king-table + --king-table-filter report listing all 1st/2nd-degree related sample pairs is deg2_hg38.kin0.
  • phase3: This dataset was intended to contain only unrelated samples; unfortunately, a few parent-child pairs, sibships, and second-degree relationships snuck in. Use --remove with one of the two provided ID lists when you don't want close relations.
    These lists were generated from the original dataset with "--king-cutoff 0.177" and "--king-cutoff 0.0884", respectively. If you're curious, the --make-king-table + --king-table-filter report listing all 1st/2nd-degree related sample pairs is deg2_phase3.kin0.
  • hg38: This dataset fuses results from two different pipelines. The primary chr1..chrX genotypes are phased, contain no missing calls, and only have biallelic left-normalized variants (multiallelic variants were "split"). The chrY/chrM/contigs genotypes are unphased and contain some missing calls; multiallelic variants there are unsplit, and a few variants aren't left-normalized.
  • hg38: All relevant information in the original phased chr1..chrX callset is preserved. The chrY/chrM/contigs source material contains per-genotype AD, DP, GQ, and PL fields which cannot be represented by the .pgen file format, and are consequently not preserved.
  • phase3: This dataset contains (unsplit) multiallelic variants, and a few variants which aren't left-normalized.
  • Refer to the 1000 Genomes website for additional sample information, data usage rules, and citation instructions.
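The decompression and relatedness notes above can be combined into a small workflow. The sketch below is illustrative only: load_1kg and list_relatives are hypothetical wrapper names, and it assumes plink2 is on the PATH and the fileset has already been renamed as described. With --out set to the fileset prefix, --king-cutoff is expected to write the excluded sample IDs to "<prefix>.king.cutoff.out.id", which can then be passed to --remove.

```sh
# load_1kg <prefix> [extra plink2 flags...]   (hypothetical wrapper)
# Decompresses <prefix>.pgen.zst once, then runs plink2 on the fileset.
# 'vzs' tells --pfile that the .pvar is still Zstd-compressed.
load_1kg() {
  prefix=$1
  shift
  if [ ! -e "$prefix.pgen" ]; then
    plink2 --zst-decompress "$prefix.pgen.zst" > "$prefix.pgen" \
      || { rm -f "$prefix.pgen"; return 1; }
  fi
  plink2 --pfile "$prefix" vzs "$@"
}

# list_relatives <prefix> <KING cutoff> [extra plink2 flags...]   (hypothetical wrapper)
# Cutoffs 0.177 and 0.0884 are the two values quoted in the notes above.
list_relatives() {
  prefix=$1
  cutoff=$2
  shift 2
  plink2 --pfile "$prefix" vzs --king-cutoff "$cutoff" --out "$prefix" "$@"
}

# Example (hg38 filesets also need --allow-extra-chr):
#   load_1kg all_phase3 --remove all_phase3.king.cutoff.out.id \
#            --make-pgen --out all_phase3_unrel
```

Using the ID lists distributed with the dataset is simpler when they are available; regenerating them is mainly useful after other sample filters have changed the set of samples.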

Reference genomes

These are the reference genomes that the aforementioned 1000 Genomes samples were aligned against. Note that --fa can directly read these compressed files.
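One use for the reference FASTA is left-normalizing the variants that the notes above flag as not left-normalized. This is a hedged sketch: normalize_1kg is a hypothetical wrapper, the FASTA filename is whatever reference file you downloaded (passed in as an argument, not assumed here), and depending on the data further flags may be needed, for example if normalization leaves variants out of order.

```sh
# normalize_1kg <prefix> <reference FASTA> [extra plink2 flags...]   (hypothetical wrapper)
# Writes a left-normalized copy of the fileset to <prefix>_norm.*
# Assumes <prefix>.pgen is already decompressed and the .pvar is still .zst ('vzs').
normalize_1kg() {
  prefix=$1
  fasta=$2
  shift 2
  plink2 --pfile "$prefix" vzs --fa "$fasta" --normalize \
         --make-pgen --out "${prefix}_norm" "$@"
}

# Example: normalize_1kg chr21_phase3 /path/to/reference.fa.zst
```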
