 14[CHR] AND "human"[ORGN] => 908,625 SNPs
  • Human dataset

International HapMap

  • 270 people, 4 populations
 Yoruba people of Ibadan, Nigeria: 30 trios
 Japan: 45 unrelated individuals from the Tokyo area
 China: 45 unrelated individuals from Beijing
 U.S.: 30 trios; residents with northern and western European ancestry
 Landmark or Region: chr14:21100000..21199999
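
The population sizes listed above can be checked against the stated total of 270 people. The following is a minimal sketch, assuming each trio contributes three individuals (two parents and one child) and that the browser region uses inclusive coordinates; the variable names are illustrative only.

```shell
#!/bin/sh
# Sample counts per HapMap population, taken from the list above.
# Assumption: a trio is two parents plus one child, so it contributes 3 people.
yoruba=$((30 * 3))    # 30 trios
japan=45              # unrelated individuals
china=45              # unrelated individuals
us=$((30 * 3))        # 30 trios
total=$((yoruba + japan + china + us))
echo "total samples: $total"

# Length of the browser region chr14:21100000..21199999,
# assuming both end coordinates are included.
region_bp=$((21199999 - 21100000 + 1))
echo "region length: $region_bp bp"
```

Under these assumptions the four populations sum to 270, and the example region spans 100,000 bp.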

1000 Genomes

Personal Genome Project

10,000 genomes

  • G10K
  • G10K paper
  • Assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus.
  • 16,203 representative vertebrate species out of ca. 60,000 living species
  • I5K Insects
  • Blog


  • Example: calling SNPs from paired-end reads, first with SOAP2 and SOAPsnp, then with Bowtie and SAMtools
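 # Index the reference, then align the paired-end reads with SOAP2 (insert size 120-240 bp)
 2bwt-builder genome.fasta
 soap2 -a frag_1.fastq -b frag_2.fastq -D genome.fasta.index -o frag.mated.soap2 -2 frag.single.soap2 -m 120 -x 240
 # Sort hits by reference sequence (column 8), then numerically by position (column 9)
 sort -k8,8 -k9,9n frag.mated.soap2 > frag.mated.soap2.sorted
 mv frag.mated.soap2.sorted frag.mated.soap2
 # Call the consensus and SNPs against the reference
 soapsnp -i frag.mated.soap2 -d genome.fasta -o genome.cns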
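
The sorting step matters because the hits must be ordered by reference sequence and then by numeric position before SNP calling. Here is a small sketch of that ordering on made-up data, assuming tab-separated hits with the reference name in column 8 and the position in column 9. The function name `sort_soap_hits` and the three toy records are hypothetical, not real SOAP2 output.

```shell
#!/bin/sh
# Sort SOAP2-style hits by reference name (column 8), then numerically by
# position (column 9). Assumes tab-separated columns; LC_ALL=C keeps the
# ordering of reference names byte-wise and reproducible.
sort_soap_hits() {
    LC_ALL=C sort -t "$(printf '\t')" -k8,8 -k9,9n "$1"
}

# Hypothetical toy data: three hits, deliberately out of order.
toy=$(mktemp)
printf 'r1\tACGT\tIIII\t1\ta\t4\t+\tchr2\t50\t0\n'  >  "$toy"
printf 'r2\tACGT\tIIII\t1\ta\t4\t+\tchr1\t200\t0\n' >> "$toy"
printf 'r3\tACGT\tIIII\t1\ta\t4\t-\tchr1\t30\t0\n'  >> "$toy"

# Collect the read IDs in sorted order on one line.
sorted_ids=$(sort_soap_hits "$toy" | cut -f1 | paste -s -d ' ' -)
echo "$sorted_ids"
rm -f "$toy"
```

Restricting each key to a single column (`-k8,8`, `-k9,9n`) keeps the name comparison textual and the position comparison numeric, so position 30 sorts before position 200 within the same reference sequence.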


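 # Index the reference, then align with Bowtie (insert size 120-240 bp, 8 threads, SAM output)
 bowtie-build genome.fasta genome.fasta
 bowtie genome.fasta -1 frag_1.fastq -2 frag_2.fastq -p 8 -I 120 -X 240 --sam > frag.mated.sam
 samtools view -bS frag.mated.sam > frag.mated.bam
 # Legacy SAMtools 0.1.x syntax: sort takes an output prefix and appends ".bam"
 samtools sort frag.mated.bam frag.mated.bam.sorted
 mv frag.mated.bam.sorted.bam frag.mated.bam

 # Pile up the reads and call variant sites (legacy 0.1.x mpileup/bcftools syntax)
 samtools mpileup -ugf genome.fasta frag.mated.bam | bcftools view -vcg - > genome.snps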
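
A quick sanity check after variant calling is to count the records in the output. The sketch below assumes the output is text VCF, where header lines start with `#` and every other line is one variant record. The function name `count_variants` and the toy file contents are hypothetical and stand in for a real `genome.snps`.

```shell
#!/bin/sh
# Count variant records in a VCF file: every line that is not a "#" header.
count_variants() {
    grep -vc '^#' "$1"
}

# Hypothetical toy VCF standing in for genome.snps.
toy_vcf=$(mktemp)
{
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr14\t21100010\t.\tA\tG\t50\t.\tDP=12\n'
    printf 'chr14\t21100250\t.\tC\tT\t38\t.\tDP=9\n'
} > "$toy_vcf"

n_variants=$(count_variants "$toy_vcf")
echo "variant records: $n_variants"
rm -f "$toy_vcf"
```

On a real run the same function could be pointed at `genome.snps`. The count covers all variant records in the file, which may include indels as well as SNPs.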