Dpuiu snp

From Cbcb
= Links =
* [http://droog.gs.washington.edu/parc/images/iupac.html IUPAC ambiguity codes]
* [ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/ NCBI SNP FTP]
* [http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/data/?build=37 Human Genome Assembly Information]
* [http://genome.ucsc.edu/cgi-bin/hgTracks UC Santa Cruz Genome Browser]
* [http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp Entrez dbSNP]
  Search examples:
  14[CHR] AND "human"[ORGN] => 908,625 SNPs
* Human dataset
  http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?sample=SRS000090#
  http://ccr.coriell.org/Sections/Search/Sample_Detail.aspx?Ref=GM12878
= International Hapmap =
* 270 people, 4 populations
  Nigeria: 30 trios, Yoruba people of Ibadan
  Japan: 45 unrelated individuals from the Tokyo area
  China: 45 unrelated individuals from Beijing
  U.S.: 30 trios, residents with northern and western European ancestry
* http://hapmap.ncbi.nlm.nih.gov
* Example:
  Landmark or Region: chr14:21100000..21199999
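
The region string in the example above has the form chrom:start..end. As a rough sketch, it can be split into its parts in plain shell. The function name parse_region is made up for this page, and the length calculation assumes that both coordinates are inclusive.

```shell
#!/bin/sh
# Split a "chrom:start..end" region string into chromosome, start, end, length.
# parse_region is a hypothetical helper, not part of any HapMap tool.
parse_region() {
    region=$1
    chrom=${region%%:*}          # text before the first ':'
    range=${region#*:}           # text after the first ':'
    start=${range%%..*}          # text before '..'
    end=${range##*..}            # text after '..'
    length=$((end - start + 1))  # assumes both coordinates are inclusive
    printf '%s\t%s\t%s\t%s\n' "$chrom" "$start" "$end" "$length"
}

# prints (tab-separated): chr14 21100000 21199999 100000
parse_region chr14:21100000..21199999
```

Under that assumption the example region spans 100,000 bases.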
= 1000 Genomes =
* 2500 people, 25 populations
* http://www.1000genomes.org/
* http://browser.1000genomes.org/index.html
* [http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-40 Variant Call Format]
= Personal Genome Project =
* 100,000 humans
* http://www.personalgenomes.org/
= 10,000 genomes =
* [http://genome10k.soe.ucsc.edu/ G10K]
* [http://jhered.oxfordjournals.org/content/100/6/659.full.pdf G10K paper]
* Assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species, approximately one for every vertebrate genus.
* 16,203 representative vertebrate species out of ca. 60,000 living species
* [http://www.arthropodgenomes.org/wiki/i5K I5K]: insects
* [http://www.medpedia.com/news_analysis/6-Genomics-Evolution-and-Pseudoscience/entries/8989-10000-genomes-ndash-why Blog]



Latest revision as of 16:33, 14 June 2011

= Soap =

  • Example:
 2bwt-builder genome.fasta
 
 soap2 -a frag_1.fastq -b frag_2.fastq -D genome.fasta.index -o frag.mated.soap2 -2 frag.single.soap2 -m 120 -x 240
 sort -k8,8 -k9,9n frag.mated.soap2 > frag.mated.soap2.sorted
 mv frag.mated.soap2.sorted frag.mated.soap2 
 
 soapsnp -i frag.mated.soap2 -d genome.fasta -o genome.cns
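
Once soapsnp has written genome.cns, a first look at candidate SNPs could use a filter like the sketch below. It assumes the consensus file is whitespace-delimited with the sequence name, position, reference base, consensus genotype (an IUPAC code) and consensus quality in columns 1 to 5; check this against the soapsnp documentation for your version. The function name cns_variants and the default threshold of 20 are made up for illustration.

```shell
#!/bin/sh
# Print positions whose consensus genotype differs from the reference base.
# Assumed columns (not confirmed by this page): 1 = sequence name, 2 = position,
# 3 = reference base, 4 = consensus genotype (IUPAC code), 5 = consensus quality.
# cns_variants and the default threshold are hypothetical.
cns_variants() {
    min_qual=${1:-20}
    awk -v OFS='\t' -v q="$min_qual" \
        '$3 != $4 && $5 + 0 >= q + 0 { print $1, $2, $3, $4, $5 }'
}

# Made-up records for illustration; only the chr1:200 line passes the filter.
printf 'chr1\t100\tA\tA\t40\nchr1\t200\tC\tY\t35\nchr1\t300\tG\tT\t5\n' | cns_variants 20
```

On real data the call would look like: cns_variants 20 < genome.cns > genome.cns.snps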

= Bowtie/Samtools =

 bowtie-build genome.fasta genome.fasta 
 bowtie genome.fasta -1 frag_1.fastq -2 frag_2.fastq -p 8 -I 120 -X 240 --sam > frag.mated.sam
 
 samtools view -bS frag.mated.sam > frag.mated.bam
 
 samtools sort frag.mated.bam frag.mated.bam.sorted
 mv frag.mated.bam.sorted.bam frag.mated.bam

 samtools mpileup -ugf genome.fasta frag.mated.bam | bcftools view -vcg - > genome.snps
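
The bcftools step above writes its calls to genome.snps in VCF, where column 6 holds the QUAL score. A minimal sketch of a quality filter follows. The function name vcf_min_qual and the threshold of 20 are illustrative choices, not something this page or bcftools prescribes.

```shell
#!/bin/sh
# Keep VCF header lines and records whose QUAL (column 6) is at least min_qual.
# Records with a missing QUAL (".") are dropped.
# vcf_min_qual and the default threshold are hypothetical.
vcf_min_qual() {
    min_qual=${1:-20}
    awk -F '\t' -v q="$min_qual" \
        '/^#/ { print; next } $6 != "." && $6 + 0 >= q + 0'
}

# Made-up VCF lines for illustration; the QUAL 3.5 record is filtered out.
printf '##fileformat=VCFv4.0\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t50\t.\tDP=12\nchr1\t200\t.\tC\tT\t3.5\t.\tDP=2\n' | vcf_min_qual 20
```

On real data the call would look like: vcf_min_qual 20 < genome.snps > genome.q20.snps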