Latest revision as of 17:08, 19 May 2010

HTTP Access

Data can be uploaded and downloaded over HTTP using a REST-ish mechanism. To construct a URL, you need to know three things:

The address of any swap server
The base path of the file group containing the data you want to pull
The path within the file group to your file

Once you know these items, the URL is constructed as follows:

http://[server[:port]]/[function]/[group_path]/[file_path]?[function_options]

Let's assume you have a file on server1.university.edu, running on port 8080 (the default). The file is in a file group with the prefix processdata/webcrawls/2004crawl, and its path within the file group is /oct2004/35/crawlfile.arc.gz

The URL would be http://server1.university.edu:8080/get/processdata/webcrawls/2004crawl/oct2004/35/crawlfile.arc.gz
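
The URL construction above can be sketched in Python. This is only an illustration, not part of the swap software: the helper name build_swap_url and its parameter names are made up for this example, and it assumes the path segments are simply joined with single slashes as in the template.

```python
from urllib.parse import quote, urlencode


def build_swap_url(server, function, group_path, file_path,
                   port=8080, options=None):
    """Assemble http://server:port/function/group_path/file_path?options.

    Hypothetical helper for illustration; the name and signature are
    assumptions, not an API provided by the swap server.
    """
    # Strip leading/trailing slashes so each segment is joined by one "/".
    path = "/".join(
        part.strip("/") for part in (function, group_path, file_path)
    )
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    url = f"http://{server}:{port}/{quote(path)}"
    if options:
        url += "?" + urlencode(options)
    return url


url = build_swap_url(
    "server1.university.edu",
    "get",
    "processdata/webcrawls/2004crawl",
    "/oct2004/35/crawlfile.arc.gz",
)
print(url)
```

With the values from the example, this produces the same URL as shown above.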

Downloading Files

Whole File

Partial File

Download Arc Files

This function assumes that any file you pull consists of concatenated arc entries, where each arc entry has been gzip'd.

offset - offset to start reading within the compressed file
contentonly - (optional) set to true to strip out the arc HTTP header information (default: false)

http://server1.university.edu:8080/arc/processdata/webcrawls/2004crawl/oct2004/35/crawlfile.arc.gz?offset=6789&contentonly=true
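
As a rough sketch, a request URL for the arc function could be assembled like this in Python. The helper name build_arc_url and its argument names are hypothetical; only the arc function name and the offset and contentonly options come from the description above.

```python
from urllib.parse import urlencode


def build_arc_url(server, group_path, file_path, offset,
                  content_only=False, port=8080):
    """Build a URL for the arc function (hypothetical helper)."""
    path = "/".join(
        part.strip("/") for part in ("arc", group_path, file_path)
    )
    # offset is the position within the compressed file to start reading.
    params = {"offset": int(offset)}
    # contentonly is optional and defaults to false, so only send it
    # when header stripping is actually requested.
    if content_only:
        params["contentonly"] = "true"
    return f"http://{server}:{port}/{path}?{urlencode(params)}"


arc_url = build_arc_url(
    "server1.university.edu",
    "processdata/webcrawls/2004crawl",
    "oct2004/35/crawlfile.arc.gz",
    offset=6789,
    content_only=True,
)
print(arc_url)
```

The resulting string can then be passed to any HTTP client, for example urllib.request.urlopen from the Python standard library.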


