See the PLINK 2 Resources page for 1000 Genomes phase 3. PLINK 2 --make-bed can be used to convert those files to PLINK 1 binary format.
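
As a rough sketch of that conversion, assuming the phase 3 files have already been downloaded and decompressed into a PLINK 2 fileset: the prefix all_phase3 and the output name are placeholders, not names taken from this page, and additional flags may be needed depending on the data.

```shell
# Hypothetical fileset prefix: expects all_phase3.pgen/.pvar/.psam in the current directory.
prefix=all_phase3

if command -v plink2 >/dev/null 2>&1 && [ -f "${prefix}.pgen" ]; then
  # PLINK 1 binary format cannot represent multiallelic variants, so filter them out first.
  plink2 --pfile "$prefix" --max-alleles 2 --make-bed --out "${prefix}_plink1"
else
  echo "plink2 or ${prefix}.pgen not found; nothing converted" >&2
fi
```

The guard simply keeps the sketch harmless on machines where plink2 or the input files are absent.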

If you really want just phase 1, it is available from the following source:

1000 Genomes phase 1 (hosted by GigaDB, Aspera download available there)

Refer to the 1000 Genomes website for additional sample information, data usage rules, and citation instructions.

HapMap phase 2

See the PLINK 1.07 resources page.

Teaching materials and example dataset

These files were created by Shaun Purcell for PLINK 1.02 (+ gPLINK + Haploview), but everything except for the haplotypic analysis will still work with 1.90.

  • Tutorial data: (BWH mirror), which contains the following six files:
    • wgas1.ped (sample whole-genome .ped data file)
    • (corresponding .map file)
    • extra.ped (sample follow-up regional genotyping .ped file)
    • (corresponding .map file)
    • pop.cov (population membership variable)
    • command-list.txt (command list for 2nd part of practical)
    • The BWH mirror file also contains an old Windows plink.exe, and gPLINK/Haploview .jar files.
  • Teaching materials: (BWH mirror), which contains the following two files:
    • practical-1-slides.ppt
    • practical-2-notes.doc

Everything should be fairly self-explanatory after looking through the PowerPoint file and Word document.

Gene range lists

These lists are valid input for flags such as --make-set, "--extract range", "--annotate ranges", and --gene-report.

They contain one gene per row, with the following four columns:

  1. Chromosome code
  2. Start of gene (base-pair units, 1-based)
  3. End of gene (this position is included in the interval)
  4. Gene ID
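
For illustration, here is a minimal sketch of a file in this layout, followed by an awk sanity check. The file name example.glist, the gene IDs, and the coordinates are all made up for the example, and space-delimited fields are assumed.

```shell
# Write a tiny, made-up range list: chromosome, start, end, gene ID.
cat > example.glist <<'EOF'
1 1000 2500 GENE_A
1 4000 4800 GENE_B
X 150 900 GENE_C
EOF

# Count rows that do not have exactly four fields, or whose end precedes the start.
bad_rows=$(awk 'NF != 4 || $3 < $2 { n++ } END { print n + 0 }' example.glist)
echo "malformed rows: $bad_rows"
```

A file like this could then be passed to, for example, plink --bfile mydata --extract range example.glist --make-bed --out mydata_genes, where mydata is a hypothetical fileset prefix.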

Our files were generated from UCSC Table Browser RefSeq track data in May 2014 with the following pipeline:

tail -n +2 ucscdl-hgxx | awk '{print $3 " " $5 " " $6 " " $13}' | cut -c 4- | grep -E '^.{1,2}\ ' | awk '{print $4 " " $1 " " $2 " " $3}' | nsort | interval_merge > glist-hgxx


Here:

  • nsort is a variant of the Unix sort utility that implements "natural sort"; and
  • interval_merge merges overlapping intervals associated with the same gene ID, inserts XY pseudoautosomal region entries when appropriate, and reorders the fields into the four-column layout described above.

(Source code for both of these auxiliary programs is in the GitHub repository.)
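
To make the first stages of the pipeline concrete, the sketch below runs only the standard-tool portion (tail, awk, cut, grep) on two made-up rows. It assumes the input is laid out like a UCSC refGene table dump, with the chromosome in column 3, the start and end in columns 5 and 6, and the gene name in column 13. The file name ucscdl-example and every value in it are invented for illustration, and nsort and interval_merge are not reproduced here.

```shell
# Made-up input in the assumed column layout (one header line plus two rows).
cat > ucscdl-example <<'EOF'
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2
585 NM_000001 chr1 + 1000 2500 1100 2400 2 1000,2000, 1500,2500, 0 GENE_A
585 NM_000002 chr6_apd_hap1 - 300 900 350 850 1 300, 900, 0 GENE_B
EOF

result=$(tail -n +2 ucscdl-example |
  awk '{print $3 " " $5 " " $6 " " $13}' |  # chromosome, start, end, gene name
  cut -c 4- |                               # drop the leading "chr"
  grep -E '^.{1,2} ' |                      # keep only 1-2 character chromosome codes
  awk '{print $4 " " $1 " " $2 " " $3}')    # move the gene name to the front for sorting
echo "$result"
```

Only the chr1 row survives the grep filter, since the alternate-haplotype name is longer than two characters once the "chr" prefix is removed. The unescaped space in the grep pattern is used here in place of the escaped one because some newer grep versions warn about a stray backslash.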

Functional SNP attributes

This file contains nonsense, missense, frameshift, and splice annotations from dbSNP build 129, and is designed to be used with the --annotate and --attrib flags.

SNP attributes (dbSNP build 129): snp129.attrib.gz (BWH mirror)

We plan to assemble an updated version of this file; let us know if there's anything you want us to add, or if you have thoughts on filtering out probable low-quality dbSNP entries.

