Webarc:Tools Developed
From Adapt
Revision as of 20:20, 9 November 2009
Input Data Preprocessors
- LRE Monthly Dumper (Java):
- MediaWiki-to-TREC Converter (Java):
- Merge DB Constructor (Java):
- Carryover DB Constructor (Java):
For our experiments, we ran the tools listed above in sequence to obtain monthly snapshots of Wikipedia revisions from 2001 to 2007. The same run also identifies, for each month, the articles that have no new revisions in that month and therefore need to be carried over and indexed for it.
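The carryover step described above can be illustrated with a small sketch. This is not the code of Carryover DB Constructor; the class name `CarryoverSketch`, the method `carryoverByMonth`, and the in-memory input format are hypothetical and chosen only for illustration. It assumes that an article exists from the month of its first revision onward and that deletions are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the carryover idea; not the actual tool.
class CarryoverSketch {

    /**
     * Input: one set per month (in chronological order) holding the ids of
     * articles that received at least one new revision in that month.
     * Output: one set per month holding the ids of articles that already
     * existed but have no new revision in that month (the carryover set).
     */
    static List<Set<String>> carryoverByMonth(List<Set<String>> revisedByMonth) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> seenSoFar = new TreeSet<>(); // articles known from earlier months
        for (Set<String> revised : revisedByMonth) {
            Set<String> carry = new TreeSet<>(seenSoFar);
            carry.removeAll(revised);            // keep only articles not revised this month
            result.add(carry);
            seenSoFar.addAll(revised);           // these articles now exist for later months
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<String>> months = List.of(
                Set.of("A", "B"),   // month 0: A and B are created
                Set.of("B", "C"),   // month 1: B revised, C created
                Set.of("A"));       // month 2: only A revised
        List<Set<String>> carry = carryoverByMonth(months);
        for (int m = 0; m < carry.size(); m++) {
            System.out.println("month " + m + " carryover: " + carry.get(m));
        }
    }
}
```

In this toy input, article A has no new revision in month 1, so its latest earlier revision would be carried over and indexed for that month.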
Indexers
Lemur Indexer (modified) (C++):
Retrievers
Temporal Okapi Retrieval Method (C++):
Temporal KL Retrieval Method (C++):
Miscellaneous
Berkeley DB Wrapper for Carryover DB (Java): A simple wrapper that lets C/C++ implementations access Carryover DB (which is based on Java Berkeley DB) via JNI.
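As a rough sketch of what the Java side of such a wrapper might look like, the class below exposes static methods that take and return only strings, which keeps the native side simple. Everything here is an assumption made for illustration: the class name `CarryoverDbFacade`, its methods, the key layout, and the in-memory map that stands in for the Berkeley DB store are all hypothetical and do not describe the actual wrapper.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical facade; an in-memory map stands in for the real Carryover DB.
class CarryoverDbFacade {

    private static final Map<String, String> STORE = new ConcurrentHashMap<>();

    // Assumed key layout: "<month>/<articleId>".
    private static String key(String month, String articleId) {
        return month + "/" + articleId;
    }

    /** Records which revision of an article is carried over into a month. */
    static void put(String month, String articleId, String revisionId) {
        STORE.put(key(month, articleId), revisionId);
    }

    /**
     * Returns the carried-over revision id, or an empty string if none exists.
     * Returning a non-null String spares the native caller a null check.
     * JNI method signature:
     * (Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
     */
    static String lookup(String month, String articleId) {
        return STORE.getOrDefault(key(month, articleId), "");
    }
}
```

A C/C++ caller would typically reach static methods like these through the standard JNI functions `FindClass`, `GetStaticMethodID`, and `CallStaticObjectMethod`, after creating or attaching to a JVM.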