Webarc:Temporally Anchored Scoring Experiments

==Input Dataset==
We preprocessed the entire revision history of the English Wikipedia from 2001 to 2007 (available at http://www.archive.org/details/enwiki-20080425). After preprocessing, we obtained 84 monthly snapshots, from January 2001 through December 2007. Each monthly snapshot contains the latest revision, as of the end of that month, of every article existing at that time. For example, the revision of the Wikipedia article 'Economy of the United States' made on [http://en.wikipedia.org/w/index.php?title=Economy_of_the_United_States&oldid=649100 August 21 2002] was included in the six monthly snapshots of August 2002, September 2002, ..., January 2003, since no newer revision was made until [http://en.wikipedia.org/w/index.php?title=Economy_of_the_United_States&oldid=749002 February 7 2003], whereas the revision of the same article made on [http://en.wikipedia.org/w/index.php?title=Economy_of_the_United_States&oldid=166885 August 16 2002] was not included in any snapshot, because it was superseded before the end of that month. [[Webarc:Input Dataset Statistics |This page]] shows further details on the monthly snapshots.
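
The snapshot rule can be illustrated with a minimal Python sketch. This is not the preprocessing tool we used; the function names <code>months_between</code> and <code>snapshots_containing</code> are hypothetical, and the example assumes the article has only the three revisions mentioned above.

```python
from datetime import date

def months_between(start, end):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        year, month = (year, month + 1) if month < 12 else (year + 1, 1)

def snapshots_containing(revision_dates, first=(2001, 1), last=(2007, 12)):
    """Map each revision date to the monthly snapshots that include it.

    A revision belongs to a month's snapshot if it is the latest revision
    made on or before the last day of that month.
    """
    ordered = sorted(revision_dates)
    result = {d: [] for d in ordered}
    for ym in months_between(first, last):
        # Revisions made in this month or earlier; the most recent one wins.
        visible = [d for d in ordered if (d.year, d.month) <= ym]
        if visible:
            result[visible[-1]].append(ym)
    return result

# Hypothetical input: only the three revisions cited in the example above.
revisions = [date(2002, 8, 16), date(2002, 8, 21), date(2003, 2, 7)]
membership = snapshots_containing(revisions)
```

Under these assumptions, <code>membership[date(2002, 8, 21)]</code> lists the six months from August 2002 through January 2003, and <code>membership[date(2002, 8, 16)]</code> is empty.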


==Queries==
From the AOL query log that consists of


==Further Information==
* [[Webarc:Input Dataset Statistics |[1] Input Dataset Statistics]]
* [[Webarc:Tools Developed |[2] Tools Developed]]

Revision as of 02:42, 18 November 2009
