Webarc:Merge DB Constructor

Latest revision as of 15:25, 11 November 2009

What It Does

This tool constructs a Merge DB for each month. The Merge DB for a month contains the union of the records in the previous month's Merge DB and the current month's Fresh DB, i.e., for month m, <math>MergeDB_m = MergeDB_{m-1} \cup FreshDB_m</math>. Because building the Merge DB for a month requires the Merge DB of the previous month, the tool must be run sequentially from the first month to the last.
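The recurrence can be illustrated with a minimal Python sketch. This is not the tool's implementation (the tool is a Java program working on on-disk databases); each DB is modeled here as an in-memory set of record keys. The function name build_merge_dbs is hypothetical, and the sketch assumes that no Merge DB exists before the first month, so the first Merge DB equals the first Fresh DB.

```python
from typing import Iterable, List, Set


def build_merge_dbs(fresh_dbs: Iterable[Set[str]]) -> List[Set[str]]:
    """Return the Merge DB for each month, given the Fresh DBs in month order."""
    merge_dbs: List[Set[str]] = []
    previous: Set[str] = set()  # assumption: nothing exists before the first month
    for fresh in fresh_dbs:
        current = previous | fresh  # MergeDB_m = MergeDB_{m-1} U FreshDB_m
        merge_dbs.append(current)
        previous = current  # month m+1 depends on month m, hence the sequential loop
    return merge_dbs


# Three months of hypothetical record keys.
fresh = [{"a", "b"}, {"b", "c"}, {"d"}]
merged = build_merge_dbs(fresh)
```

Each iteration reads the result of the previous one, which is why the months cannot be processed in parallel.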

How To Build

In Eclipse, configure a run.

  1. Right-click on colstate in Package Explorer and select Run As --> Run Configurations...
  2. In the left pane, right-click on Java Application --> New.
  3. Enter mdb in the Name field in the right pane.
  4. Select colstate in the Project field.
  5. Select edu.umd.umiacs.temporalscoring.CollectionState in the Main class field.
  6. Click Apply.
  7. Click Close.

In Eclipse, export colstate as a runnable JAR.

  1. Right-click on colstate in Package Explorer and select Export.
  2. Select Runnable JAR file and click Next.
  3. Select mdb - colstate as the Launch configuration.
  4. Enter <your directory>/mdb.jar as the Export destination.
  5. Select Package required libraries into generated JAR.
  6. Click Finish.

How To Run

In a shell terminal (or a command prompt on Windows), change to the directory where mdb.jar is located (<your directory> above) and run:

java -jar mdb.jar <MergeDBNames.lst>

Input File

<MergeDBNames.lst>: A file that lists the locations of the Merge DBs, one per line. The name of each Fresh DB is assumed to be the name of the Merge DB for the same month with '-fresh' appended. For example, for Merge DB 'month-003', the Fresh DB is assumed to be 'month-003-fresh'. Because this tool cannot be run in parallel, the file must contain the entire list for all months.
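The naming convention can be sketched in a few lines of Python. The helper name fresh_db_path is hypothetical, and stripping a trailing slash before appending the suffix is an assumption of this sketch, not documented behavior of the tool.

```python
def fresh_db_path(merge_db_path: str) -> str:
    """Derive the assumed Fresh DB location from a Merge DB location."""
    # Assumption: a trailing slash, if present, is not part of the DB name.
    return merge_db_path.rstrip("/") + "-fresh"


example = fresh_db_path("/fs/webarc3/data/wikipedia/bdb-monthly/month-003")
```

Here example holds the path ending in month-003-fresh, matching the example above.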

Example contents in a list file:

/fs/webarc3/data/wikipedia/bdb-monthly/month-000
/fs/webarc3/data/wikipedia/bdb-monthly/month-001
/fs/webarc3/data/wikipedia/bdb-monthly/month-002
/fs/webarc3/data/wikipedia/bdb-monthly/month-003
/fs/webarc3/data/wikipedia/bdb-monthly/month-004
/fs/webarc3/data/wikipedia/bdb-monthly/month-005
/fs/webarc3/data/wikipedia/bdb-monthly/month-006
/fs/webarc3/data/wikipedia/bdb-monthly/month-007
/fs/webarc3/data/wikipedia/bdb-monthly/month-008
/fs/webarc3/data/wikipedia/bdb-monthly/month-009
/fs/webarc3/data/wikipedia/bdb-monthly/month-010
/fs/webarc3/data/wikipedia/bdb-monthly/month-011
/fs/webarc3/data/wikipedia/bdb-monthly/month-012
/fs/webarc3/data/wikipedia/bdb-monthly/month-013
/fs/webarc3/data/wikipedia/bdb-monthly/month-014
/fs/webarc3/data/wikipedia/bdb-monthly/month-015
/fs/webarc3/data/wikipedia/bdb-monthly/month-016
/fs/webarc3/data/wikipedia/bdb-monthly/month-017
/fs/webarc3/data/wikipedia/bdb-monthly/month-018
/fs/webarc3/data/wikipedia/bdb-monthly/month-019
/fs/webarc3/data/wikipedia/bdb-monthly/month-020
/fs/webarc3/data/wikipedia/bdb-monthly/month-021
/fs/webarc3/data/wikipedia/bdb-monthly/month-022
/fs/webarc3/data/wikipedia/bdb-monthly/month-023
/fs/webarc3/data/wikipedia/bdb-monthly/month-024
/fs/webarc3/data/wikipedia/bdb-monthly/month-025
/fs/webarc3/data/wikipedia/bdb-monthly/month-026
/fs/webarc3/data/wikipedia/bdb-monthly/month-027
/fs/webarc3/data/wikipedia/bdb-monthly/month-028
/fs/webarc3/data/wikipedia/bdb-monthly/month-029
/fs/webarc3/data/wikipedia/bdb-monthly/month-030
/fs/webarc3/data/wikipedia/bdb-monthly/month-031
/fs/webarc3/data/wikipedia/bdb-monthly/month-032
/fs/webarc3/data/wikipedia/bdb-monthly/month-033
/fs/webarc3/data/wikipedia/bdb-monthly/month-034
/fs/webarc3/data/wikipedia/bdb-monthly/month-035
/fs/webarc3/data/wikipedia/bdb-monthly/month-036
/fs/webarc3/data/wikipedia/bdb-monthly/month-037
/fs/webarc3/data/wikipedia/bdb-monthly/month-038
/fs/webarc3/data/wikipedia/bdb-monthly/month-039
/fs/webarc3/data/wikipedia/bdb-monthly/month-040
/fs/webarc3/data/wikipedia/bdb-monthly/month-041
/fs/webarc3/data/wikipedia/bdb-monthly/month-042
/fs/webarc3/data/wikipedia/bdb-monthly/month-043
/fs/webarc3/data/wikipedia/bdb-monthly/month-044
/fs/webarc3/data/wikipedia/bdb-monthly/month-045
/fs/webarc3/data/wikipedia/bdb-monthly/month-046
/fs/webarc3/data/wikipedia/bdb-monthly/month-047
/fs/webarc3/data/wikipedia/bdb-monthly/month-048
/fs/webarc3/data/wikipedia/bdb-monthly/month-049
/fs/webarc3/data/wikipedia/bdb-monthly/month-050
/fs/webarc3/data/wikipedia/bdb-monthly/month-051
/fs/webarc3/data/wikipedia/bdb-monthly/month-052
/fs/webarc3/data/wikipedia/bdb-monthly/month-053
/fs/webarc3/data/wikipedia/bdb-monthly/month-054
/fs/webarc3/data/wikipedia/bdb-monthly/month-055
/fs/webarc3/data/wikipedia/bdb-monthly/month-056
/fs/webarc3/data/wikipedia/bdb-monthly/month-057
/fs/webarc3/data/wikipedia/bdb-monthly/month-058
/fs/webarc3/data/wikipedia/bdb-monthly/month-059
/fs/webarc3/data/wikipedia/bdb-monthly/month-060
/fs/webarc3/data/wikipedia/bdb-monthly/month-061
/fs/webarc3/data/wikipedia/bdb-monthly/month-062
/fs/webarc3/data/wikipedia/bdb-monthly/month-063
/fs/webarc3/data/wikipedia/bdb-monthly/month-064
/fs/webarc3/data/wikipedia/bdb-monthly/month-065
/fs/webarc3/data/wikipedia/bdb-monthly/month-066
/fs/webarc3/data/wikipedia/bdb-monthly/month-067
/fs/webarc3/data/wikipedia/bdb-monthly/month-068
/fs/webarc3/data/wikipedia/bdb-monthly/month-069
/fs/webarc3/data/wikipedia/bdb-monthly/month-070
/fs/webarc3/data/wikipedia/bdb-monthly/month-071
/fs/webarc3/data/wikipedia/bdb-monthly/month-072
/fs/webarc3/data/wikipedia/bdb-monthly/month-073
/fs/webarc3/data/wikipedia/bdb-monthly/month-074
/fs/webarc3/data/wikipedia/bdb-monthly/month-075
/fs/webarc3/data/wikipedia/bdb-monthly/month-076
/fs/webarc3/data/wikipedia/bdb-monthly/month-077
/fs/webarc3/data/wikipedia/bdb-monthly/month-078
/fs/webarc3/data/wikipedia/bdb-monthly/month-079
/fs/webarc3/data/wikipedia/bdb-monthly/month-080
/fs/webarc3/data/wikipedia/bdb-monthly/month-081
/fs/webarc3/data/wikipedia/bdb-monthly/month-082
/fs/webarc3/data/wikipedia/bdb-monthly/month-083
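A list like the one above does not have to be typed by hand. The following Python sketch generates it, assuming the zero-padded three-digit month-NNN naming seen in the example; the function name merge_db_list is hypothetical.

```python
def merge_db_list(base_dir: str, months: int) -> str:
    """Return list-file contents with one Merge DB path per line, in month order."""
    lines = [f"{base_dir}/month-{m:03d}" for m in range(months)]
    return "\n".join(lines) + "\n"


# 84 months, month-000 through month-083, as in the example list.
text = merge_db_list("/fs/webarc3/data/wikipedia/bdb-monthly", 84)
```

Writing text to a file (for example with pathlib.Path("MergeDBNames.lst").write_text(text)) yields an input file in the format shown.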

Output Files

New directories for the Merge DBs are created in the same directory that contains the Fresh DBs. The new directories are named as specified in the input file.

Notes

The tool assumes a naming convention for Fresh DBs: for month m, if the Merge DB is named <month-m>, the Fresh DB is assumed to be named <month-m-fresh>.

Source Code

svn co http://narasvn.umiacs.umd.edu/repository/src/webarc/colstate