Webarc:Tools Developed
Input Data Preprocessors
- LRE Monthly Dumper (Java):
- MediaWiki-to-TREC Converter (Java):
- Merge DB Constructor (Java):
- Carryover DB Constructor (Java):
For our experiments, we ran the tools listed above sequentially to obtain monthly snapshots of Wikipedia revisions from 2001 to 2007. The same run also identifies, for each month, the articles that have no new revisions in that month and therefore need to be carried over and indexed for that month.
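The carryover rule described above can be illustrated with a minimal in-memory sketch. It is not the code of the Carryover DB Constructor, which is backed by Berkeley DB. The class name `CarryoverSketch`, the method `carryoverByMonth`, and the use of plain article titles as identifiers are all hypothetical and chosen only for illustration. The sketch assumes that every month in the range appears as a key in the input, with an empty set when nothing was revised.

```java
import java.time.YearMonth;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only: the real tool stores this data in Berkeley DB.
final class CarryoverSketch {

    /**
     * For each month, returns the articles that already existed in an earlier
     * month but received no new revision in this month, so their latest
     * earlier revision must be carried over.
     *
     * @param revisedByMonth month -> articles with at least one revision in
     *                       that month (assumed to contain every month)
     */
    static SortedMap<YearMonth, Set<String>> carryoverByMonth(
            SortedMap<YearMonth, Set<String>> revisedByMonth) {
        SortedMap<YearMonth, Set<String>> carryover = new TreeMap<>();
        Set<String> seenSoFar = new TreeSet<>(); // articles seen in earlier months

        // A SortedMap iterates in ascending key order, i.e. chronologically.
        for (Map.Entry<YearMonth, Set<String>> entry : revisedByMonth.entrySet()) {
            Set<String> carried = new TreeSet<>(seenSoFar);
            carried.removeAll(entry.getValue()); // drop articles revised this month
            carryover.put(entry.getKey(), carried);
            seenSoFar.addAll(entry.getValue());  // they exist from now on
        }
        return carryover;
    }
}
```

Under these assumptions, if articles A and B are revised in one month and B and C in the next, only A is carried over in the second month. The sketch ignores deleted articles.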
Indexers
- Lemur Indexer (modified) (C++):
Retrievers
- Temporal Okapi Retrieval Method (C++):
- Temporal KL Retrieval Method (C++):
Miscellaneous
- Berkeley DB Wrapper for Carryover DB (Java): A simple and elegant wrapper that lets any C/C++ implementation access Carryover DB (which is based on Java Berkeley DB) through the Java Native Interface (JNI).