Webarc:Tools Developed

From Adapt

== Input Data Preprocessors ==
*[[Webarc:LRE Monthly Dumper|LRE Monthly Dumper]] (Java):
*[[Webarc:MediaWiki-to-TREC Converter|MediaWiki-to-TREC Converter]] (Java):
*[[Webarc:Merge DB Constructor|Merge DB Constructor]] (Java):
*[[Webarc:Carryover DB Constructor|Carryover DB Constructor]] (Java):


For our experiments, we ran the tools listed above in sequence to obtain monthly snapshots of Wikipedia revisions from 2001 to 2007. For each month, they also identify the articles that have no new revision in that month and therefore need to be carried over and indexed for that month.
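
The carryover step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Carryover DB Constructor: the class name <code>CarryoverSketch</code>, the method <code>carryovers</code>, and the in-memory map from article ID to the months in which the article has revisions are all hypothetical. The sketch also assumes that an article is carried over only into months after its first revision.

```java
import java.time.YearMonth;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class CarryoverSketch {

    /**
     * For each month in [first, last], collects the IDs of articles that
     * already exist (first revision in an earlier month) but have no new
     * revision in that month. Hypothetical helper, not the real tool's API.
     */
    public static Map<YearMonth, Set<String>> carryovers(
            Map<String, Set<YearMonth>> revisionMonths,
            YearMonth first, YearMonth last) {
        Map<YearMonth, Set<String>> result = new TreeMap<>();
        for (YearMonth m = first; !m.isAfter(last); m = m.plusMonths(1)) {
            Set<String> carried = new TreeSet<>();
            for (Map.Entry<String, Set<YearMonth>> e : revisionMonths.entrySet()) {
                Set<YearMonth> months = e.getValue();
                if (months.isEmpty()) {
                    continue; // article has no revisions at all
                }
                YearMonth earliest = Collections.min(months);
                // Carry over only articles created before m and unchanged in m.
                if (earliest.isBefore(m) && !months.contains(m)) {
                    carried.add(e.getKey());
                }
            }
            result.put(m, carried);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: article ID -> months with at least one revision.
        Map<String, Set<YearMonth>> revs = new HashMap<>();
        revs.put("A", new HashSet<>(Arrays.asList(
                YearMonth.of(2001, 1), YearMonth.of(2001, 3))));
        revs.put("B", new HashSet<>(Arrays.asList(YearMonth.of(2001, 2))));

        carryovers(revs, YearMonth.of(2001, 1), YearMonth.of(2001, 3))
                .forEach((month, ids) -> System.out.println(month + " -> " + ids));
    }
}
```

Scanning every article for every month keeps the sketch short. A tool working on the full 2001 to 2007 revision history would more plausibly process articles month by month from disk.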


== Indexers ==
[[Webarc:Lemur Indexer (modified)|Lemur Indexer (modified)]] (C++):




== Retrievers ==
*[[Webarc:Temporal Okapi Retrieval Method|Temporal Okapi Retrieval Method]] (C++):
*[[Webarc:Temporal KL Retrieval Method|Temporal KL Retrieval Method]] (C++):




== Miscellaneous ==
[[Webarc:Berkeley DB Wrapper for Carryover DB|Berkeley DB Wrapper for Carryover DB]] (Java): A simple wrapper that lets C/C++ implementations access Carryover DB (which is based on Java Berkeley DB) via JNI.
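
One common shape for the Java side of such a wrapper is a class of static methods with simple signatures, which are easy for native code to look up and call. The sketch below illustrates that shape only and is not the actual wrapper: the class name <code>CarryoverDbFacade</code> and its methods are hypothetical, and an in-memory map stands in for the Berkeley DB store.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical JNI-friendly facade. Static methods that take and return only
 * String and byte[] keep the native call sites short: C/C++ code can use the
 * JNI functions FindClass, GetStaticMethodID (for example with the signature
 * "(Ljava/lang/String;)[B" for get) and CallStaticObjectMethod.
 */
public final class CarryoverDbFacade {

    // Assumption: stand-in for the real Berkeley DB-backed store.
    private static final Map<String, byte[]> STORE = new ConcurrentHashMap<>();

    private CarryoverDbFacade() {
        // static-only class, not meant to be instantiated
    }

    /** Stores a copy of the value under the given key. */
    public static void put(String key, byte[] value) {
        STORE.put(key, value.clone());
    }

    /** Returns a copy of the stored value, or null if the key is absent. */
    public static byte[] get(String key) {
        byte[] value = STORE.get(key);
        return value == null ? null : value.clone();
    }

    /** Reports whether the key is present. */
    public static boolean contains(String key) {
        return STORE.containsKey(key);
    }

    public static void main(String[] args) {
        // Made-up key and value, for illustration only.
        put("example-key", "example-value".getBytes(StandardCharsets.UTF_8));
        System.out.println(contains("example-key"));
        System.out.println(new String(get("example-key"), StandardCharsets.UTF_8));
    }
}
```

Returning <code>null</code> for a missing key lets the native caller detect absence with a plain null check on the returned reference, without handling a Java exception across the JNI boundary.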

Revision as of 20:20, 9 November 2009
