NCBI submission: Difference between revisions
Revision as of 14:26, 25 September 2008
NCBI
Sequin: standalone application
Genome submission
TA submission
- Compressed archive containing 3 files: TRACEINFO.xml, MD5, README
- traces/ directory
- SCF format traces under traces/ or traces/*/
- The archives are gzip files of 1-4 GB each; include the center's name and the date in the file names
- Accepted only by uploading to NCBI FTP server.
  server: ftp-trace.ncbi.nih.gov
  login:
  passwd:
  center: UMD
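The packaging rules above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `make_ta_archive`, the staging directory layout, the choice of a gzip-compressed tar file, and the `<center>-<YYYYMMDD>.tar.gz` naming pattern are hypothetical. The page only requires that the file name carry the center's name and the date.

```shell
#!/bin/sh
# Hypothetical helper: package a staged Trace Archive submission directory.
# Assumes the directory already holds TRACEINFO.xml, MD5, README and traces/.
make_ta_archive() {
    center=$1       # center acronym, e.g. UMD
    stage_dir=$2    # directory holding the submission files
    out_dir=${3:-.} # where to write the archive (default: current directory)

    # Refuse to package an incomplete submission.
    for f in TRACEINFO.xml MD5 README; do
        if [ ! -f "$stage_dir/$f" ]; then
            echo "missing $stage_dir/$f" >&2
            return 1
        fi
    done
    if [ ! -d "$stage_dir/traces" ]; then
        echo "missing $stage_dir/traces/" >&2
        return 1
    fi

    # Illustrative naming pattern (an assumption): <center>-<YYYYMMDD>.tar.gz
    archive="$out_dir/$center-$(date +%Y%m%d).tar.gz"
    tar -czf "$archive" -C "$stage_dir" TRACEINFO.xml MD5 README traces || return 1

    # Print the archive path so the caller can check its size or upload it.
    echo "$archive"
}
```

A call such as `make_ta_archive UMD ./staging /tmp` prints the path of the new archive. Checking that the result stays within the 1-4 GB range is left to the caller.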
Scripts:
/nfshomes/dpuiu/Archives/JCVI/bin/phred2xmlTrace.pl
SRA submission
server: ftp-trace.ncbi.nlm.nih.gov
login: cbcb_trc
password: t@@GeaYF
Center_name (acronym): CBCB
Full name: Center for Bioinformatics and Computational Biology, University of Maryland
Short reads: uploaded to short_read/
Sanger reads: uploaded to uploads/
Test submission: uploaded to test/ (~30 reads)
Validation table
AA submission
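- AA: http://www.ncbi.nlm.nih.gov/Traces/assembly/assmbrowser.cgi?cmd=show&f=rfc&m=doc&s=rfc
- AA submission info: http://www.ncbi.nlm.nih.gov/Traces/assembly/assmbrowser.cgi?cmd=show&f=rfc&m=doc&s=rfc#sub
- Compressed archive containing 2 files: ASSEMBLY.xml, MD5
- Accepted only by uploading to NCBI FTP server.
  server: ftp-private.ncbi.nlm.nih.gov
  login: umd_trc
  passwd: t@@GeaYF
  center: UMD
  description: University of Maryland
- ASSEMBLY XML Schema png
- ASSEMBLY XML Schema xsd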
Use XContig package scripts
Files:
.contig      : contigs & underlying reads (use TRACE_NAME's or SEQ_NAME's)
.seq         : read sequences (use TRACE_NAME's or SEQ_NAME's)
.qual        : read qualities (use TRACE_NAME's or SEQ_NAME's)
.ti2seq_name : (TI , TRACE_NAME or SEQ_NAME)
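Before running the XContig scripts it can help to confirm that all four input files are present. The following is a minimal sketch; the function name `check_aa_inputs` is hypothetical, and it assumes the four files share a common prefix (such as `Xoo`) and live in the same directory.

```shell
#!/bin/sh
# Hypothetical helper: confirm that the four input files listed above
# exist and are non-empty for a given prefix (for example "Xoo").
check_aa_inputs() {
    prefix=$1
    missing=0
    for ext in contig seq qual ti2seq_name; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

For example, `check_aa_inputs Xoo && echo ok` prints `ok` only when `Xoo.contig`, `Xoo.seq`, `Xoo.qual` and `Xoo.ti2seq_name` all exist and are non-empty.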
Example:
Xoo: /fs/szasmg/Bacteria/Xanthomonas/XOO/Xoo_PXO99A/FinalAsm_June2007/AA
Steps:
1. makeConinfo ASSEMBLY.coninfo
$ more ASSEMBLY.coninfo
<coninfo>
  <meta name='center'>UMD</meta>
  <meta name='db'>Xoo</meta>
  <meta name='desc'>Xanthomonas oryzae pv. oryzae strain PXO99A</meta>
  <meta name='object'>ASSEMBLY</meta>
  <meta name='species_code'>Xanthomonas oryzae pv. oryzae strain PXO99A</meta>
  <meta name='structure'>Chromosome</meta>
  <meta name='subtype'>NEW</meta>
  <meta name='taxid'>360094</meta>
  <contig id="1106158952778_stitched" conformation="CIRCULAR" subtype="NEW"/>
  <file src="Xoo.contig"/>
  <seq src="Xoo.seq"/>
  <qual src="Xoo.qual"/>
  <idmap src="Xoo.ti2seq_name" direction="FORWARD"/>
</coninfo>
2. buildAssemblyArchive ASSEMBLY.coninfo --prompt --subname umd-20070816-125223
problems:
* submitter_reference="tigr...." : replace tigr with umd
* conformation: always LINEAR : replace LINEAR with CIRCULAR
* taxid: not recognized : replace <taxid>id</taxid> with <organism descriptor="TAXID">id</organism>
$ md5sum umd-20070816-125223/ASSEMBLY.xml
$ edit umd-20070816-125223/MANIFEST # update ASSEMBLY.xml md5sum
$ ls -1 umd-20070816-125223*
umd-20070816-125223.tar.gz
umd-20070816-125223/
  1106158952778_stitched_20070817-141849.con    # Contig consensus
  1106158952778_stitched_20070817-141849.congap # Contig gaps
  ASSEMBLY.xml                                  # Assembly XML
  MANIFEST                                      # MD5 sums
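The manual fixes and the MANIFEST update in step 2 could be scripted roughly as follows. This is a sketch under assumptions, not the XContig package's own behavior: the function names are hypothetical, the `sed` patterns assume the three problems appear in ASSEMBLY.xml in exactly the literal form quoted in the problems list, and the MANIFEST is assumed to hold `md5sum`-style lines (`<md5>  <file name>`). Check a real MANIFEST before relying on the second function.

```shell
#!/bin/sh
# Hypothetical helper: apply the three manual fixes to a generated ASSEMBLY.xml.
# Note: LINEAR is replaced everywhere, which is only right when every contig
# in the file is circular (as in the single-chromosome example on this page).
fix_assembly_xml() {
    xml=$1
    tmp_xml="$xml.tmp.$$"
    sed -e 's/submitter_reference="tigr/submitter_reference="umd/g' \
        -e 's/conformation="LINEAR"/conformation="CIRCULAR"/g' \
        -e 's|<taxid>\([^<]*\)</taxid>|<organism descriptor="TAXID">\1</organism>|g' \
        "$xml" > "$tmp_xml" && mv "$tmp_xml" "$xml"
}

# Hypothetical helper: refresh the ASSEMBLY.xml checksum inside MANIFEST.
# Assumes md5sum-style lines whose last field is the file name.
update_manifest_md5() {
    dir=$1
    sum=$(cd "$dir" && md5sum ASSEMBLY.xml | cut -d' ' -f1)
    [ -n "$sum" ] || return 1
    tmp_manifest="$dir/MANIFEST.tmp.$$"
    # Rewrite only the ASSEMBLY.xml line; copy every other line unchanged.
    awk -v sum="$sum" '
        $NF == "ASSEMBLY.xml" { printf "%s  %s\n", sum, $NF; next }
        { print }
    ' "$dir/MANIFEST" > "$tmp_manifest" && mv "$tmp_manifest" "$dir/MANIFEST"
}
```

With the example submission name this would be run as `fix_assembly_xml umd-20070816-125223/ASSEMBLY.xml` followed by `update_manifest_md5 umd-20070816-125223`. The checksum has to be refreshed after the XML edits, since any change to ASSEMBLY.xml invalidates the old MD5.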
3. validate:
* oXygen: software used by NCBI; license required
* xmllint: open source
$ xmllint --schema ASSEMBLY.xsd umd-20070816-125223/ASSEMBLY.xml > /dev/null
umd-20070816-125223/ASSEMBLY.xml validates
4. create tarball
$ tar czvf umd-20070816-125223.tar.gz umd-20070816-125223/
5. upload tarball to ftp-private.ncbi.nlm.nih.gov
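Steps 3 and 4 above can be wrapped in two small functions so that a tarball is only built from a directory that looks complete. This is a hedged sketch: the function names are hypothetical, and the completeness check (ASSEMBLY.xml and MANIFEST present) is an assumption based on the directory listing shown in step 2. `xmllint --noout` suppresses the echoed document, which replaces the `> /dev/null` redirect used above.

```shell
#!/bin/sh
# Hypothetical helper: validate ASSEMBLY.xml against the schema with xmllint.
validate_assembly() {
    schema=$1   # path to ASSEMBLY.xsd
    xml=$2      # path to ASSEMBLY.xml
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found" >&2
        return 127
    fi
    xmllint --noout --schema "$schema" "$xml"
}

# Hypothetical helper: build <subname>.tar.gz from the <subname>/ directory
# and list the archive contents as a sanity check.
make_aa_tarball() {
    subname=$1   # e.g. umd-20070816-125223
    for f in ASSEMBLY.xml MANIFEST; do
        if [ ! -f "$subname/$f" ]; then
            echo "missing $subname/$f" >&2
            return 1
        fi
    done
    tar czf "$subname.tar.gz" "$subname/" || return 1
    tar tzf "$subname.tar.gz"
}
```

A typical sequence would be `validate_assembly ASSEMBLY.xsd umd-20070816-125223/ASSEMBLY.xml && make_aa_tarball umd-20070816-125223`, run from the directory that contains the submission directory. The upload in step 5 is left manual.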
TPA submissions
- Use Sequin: /nfshomes/dpuiu/szdevel/sequin.8.10/sequin; e-mail the output file to gb_sub@ncbi.nlm.nih.gov
- Register WGS