Webarc:Temporally Anchored Scoring Experiments: Difference between revisions
==Further Information==
* [[Webarc:Input Dataset Statistics |[1] Input Dataset Statistics]]
* [[Webarc:Tools Developed |[2] Tools Developed]]
Revision as of 03:13, 18 November 2009
Input Dataset
We preprocessed the entire revision history of the English Wikipedia from 2001 to 2007 (available at http://www.archive.org/details/enwiki-20080425). Preprocessing yielded 84 monthly snapshots, from January 2001 through December 2007. Each monthly snapshot contains the latest revision, as of the end of that month, of every article existing at that time. For example, the revision of the Wikipedia article 'Economy of the United States' made on August 21, 2002 was included in six monthly snapshots (August 2002, September 2002, ..., January 2003), because no newer revision was made until February 7, 2003. By contrast, the revision of the same article made on August 16, 2002 was not included in any snapshot, because it was superseded before the end of that month. This page shows further details on the monthly snapshots.
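The snapshot rule described above can be illustrated with a minimal Python sketch. It is not the tool used for the experiments; the function names `month_cutoffs` and `snapshot_revisions` are hypothetical, and the example history contains only the three revision dates mentioned in the text.

```python
from bisect import bisect_left
from datetime import date


def month_cutoffs(first_year=2001, last_year=2007):
    """Yield ((year, month), first day of the next month) for every month."""
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            if month == 12:
                cutoff = date(year + 1, 1, 1)
            else:
                cutoff = date(year, month + 1, 1)
            yield (year, month), cutoff


def snapshot_revisions(revision_dates, first_year=2001, last_year=2007):
    """Map each (year, month) to the date of the latest revision made before
    the end of that month. Months before the first revision are omitted."""
    dates = sorted(revision_dates)
    chosen = {}
    for key, cutoff in month_cutoffs(first_year, last_year):
        # Number of revisions dated strictly before the next month begins.
        count = bisect_left(dates, cutoff)
        if count:
            chosen[key] = dates[count - 1]
    return chosen


# Hypothetical history: only the three revision dates mentioned in the text.
revisions = [date(2002, 8, 16), date(2002, 8, 21), date(2003, 2, 7)]
snapshots = snapshot_revisions(revisions)
months_with_aug21 = [k for k, d in snapshots.items() if d == date(2002, 8, 21)]
print(months_with_aug21)
```

With this input, the August 21, 2002 revision is selected for the six months from August 2002 through January 2003, and the August 16, 2002 revision is never selected, which matches the example in the text.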
Queries
Based on the AOL query log, which was publicly available for a short period of time in 2006, we built our temporal query load by extracting multi-term query phrases that led the user to select a Wikipedia article among the search results. We chose only multi-term queries for two reasons: 1. single-term queries do not have ... 2. we wanted to investigate cases where different temporal circumstances weight search terms differently....
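The extraction step described above might look like the following Python sketch. It assumes each log record has already been reduced to a (query, clicked URL) pair; the function names, the whitespace normalization, and the host-name test are illustrative assumptions, not the procedure actually used.

```python
from urllib.parse import urlparse


def is_wikipedia_click(click_url):
    """True if the clicked URL's host is wikipedia.org or one of its subdomains."""
    host = urlparse(click_url or "").hostname or ""
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


def extract_multi_term_queries(rows):
    """Keep queries with two or more terms whose click went to Wikipedia.

    rows is an iterable of (query, click_url) pairs; this layout is assumed.
    """
    selected = []
    for query, click_url in rows:
        terms = query.split()
        if len(terms) >= 2 and is_wikipedia_click(click_url):
            # Collapse runs of whitespace; an illustrative normalization only.
            selected.append(" ".join(terms))
    return selected


# Hypothetical sample records, not taken from the real log.
sample = [
    ("economy of the united states", "http://en.wikipedia.org"),
    ("wikipedia", "http://en.wikipedia.org"),
    ("cheap flights", "http://www.example.com"),
    ("no click query", ""),
]
queries = extract_multi_term_queries(sample)
print(queries)
```

Only the first sample record passes both filters: the second is a single-term query, the third was clicked on a non-Wikipedia site, and the fourth has no click at all.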
However, considering that about 80% of search queries are multi-term,
resulted in a click to the Wikipedia website.
We built a temporally-anchored query load from the AOL query log that consists of