Webarc:Trec Eval (modified)
What It Does
We added support for Kendall's Tau (http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient) to the existing Trec_Eval tool (http://trec.nist.gov/trec_eval/).
How To Build
In the Trec_Eval directory, run:
make
How To Run
The following is copied from the README file that came with trec_eval.8.1:
trec_eval [-h] [-q] [-a] [-o] [-c] [-l<num>] [-N<num>] [-M<num>] [-Ua<num>] [-Ub<num>] [-Uc<num>] [-Ud<num>] [-T] trec_rel_file trec_top_file
Calculate and print various evaluation measures, evaluating the results in trec_top_file against the relevance judgements in trec_rel_file.
There are a fair number of options, of which only the lower case options are normally ever used.
-h: Print full help message and exit
-q: In addition to summary evaluation, give evaluation for each query
-a: Print all evaluation measures calculated, instead of just the main official measures for TREC.
-o: Print everything out in old, nonrelational format (default is relational)
-c: Average over the complete set of queries in the relevance judgements instead of the queries in the intersection of relevance judgements and results. Missing queries will contribute a value of 0 to all evaluation measures (which may or may not be reasonable for a particular evaluation measure, but is reasonable for standard TREC measures.)
-l<num>: Num indicates the minimum relevance judgement value needed for a document to be called relevant. (All measures used by TREC eval are based on binary relevance). Used if trec_rel_file contains relevance judged on a multi-relevance scale. Default is 1.
-N<num>: Number of docs in collection
-M<num>: Max number of docs per topic to use in evaluation (discard rest).
-Ua<num>: Value to use for 'a' coefficient of utility computation.
                      relevant  nonrelevant
      retrieved          a          b
      nonretrieved       c          d
-Ub<num>: Value to use for 'b' coefficient of utility computation.
-Uc<num>: Value to use for 'c' coefficient of utility computation.
-Ud<num>: Value to use for 'd' coefficient of utility computation.
-J: Calculate all values only over the judged (either relevant or nonrelevant) documents. All unjudged documents are removed from the retrieved set before any calculations (possibly leaving an empty set). DO NOT USE, unless you really know what you're doing - very easy to get reasonable looking, but invalid, numbers.
-T: Treat similarity as time that document retrieved. Compute several time-based measures after ranking docs by time retrieved (first doc (lowest sim) retrieved ranked highest). Only done if -a selected.
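The -Ua to -Ud coefficients weight the four cells of the retrieved/relevant table shown under -Ua. The Python sketch below is one reading of that table, not code taken from trec_eval: the function name and parameter names are hypothetical, and the default weights (1, -1, 0, 0) are read off the measure name utility_1.0_-1.0_0.0_0.0 in the example output further down.

```python
def utility(num_ret, num_rel, num_rel_ret, num_docs, a=1.0, b=-1.0, c=0.0, d=0.0):
    """Weighted sum over the four cells of the retrieved/relevant table.

    Hypothetical helper: the names and the default weights are assumptions,
    not identifiers or constants confirmed from the trec_eval source.
    """
    rel_ret = num_rel_ret                      # cell a: relevant, retrieved
    nonrel_ret = num_ret - num_rel_ret         # cell b: nonrelevant, retrieved
    rel_nonret = num_rel - num_rel_ret         # cell c: relevant, not retrieved
    nonrel_nonret = num_docs - num_ret - rel_nonret  # cell d: nonrelevant, not retrieved
    return a * rel_ret + b * nonrel_ret + c * rel_nonret + d * nonrel_nonret


# Per-query counts from the example files: query 3 has 13 documents and
# query 4 has 6, all retrieved and all relevant. num_docs is arbitrary
# here because d is 0.
per_query = [utility(13, 13, 13, num_docs=1000), utility(6, 6, 6, num_docs=1000)]
print(sum(per_query) / len(per_query))  # 9.5
```

Under these assumptions the mean over the two example queries is 9.5, which agrees with the utility_1.0_-1.0_0.0_0.0 line in the example output.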
Input File
Example trec_rel_file:
3 Q0 Politics_of_the_Falkland_Islands04-29-2001 1 11.12743 qts.1_tw.1
3 Q0 Economy_of_the_Falkland_Islands04-29-2001 2 11.12743 qts.1_tw.1
3 Q0 History_of_the_Falkland_Islands04-29-2001 3 11.12743 qts.1_tw.1
3 Q0 Foreign_relations_of_Cyprus04-26-2001 4 8.11532 qts.1_tw.1
3 Q0 Economy_of_Finland04-29-2001 5 7.12460 qts.1_tw.1
3 Q0 Economy_of_the_Faroe_Islands04-29-2001 6 6.77037 qts.1_tw.1
3 Q0 Economy_of_Canada03-26-2001 7 5.95059 qts.1_tw.1
3 Q0 Economy_of_Germany03-10-2001 8 5.74032 qts.1_tw.1
3 Q0 Paul_Robeson04-20-2001 9 4.97028 qts.1_tw.1
3 Q0 Politics_of_Dominica04-26-2001 10 4.09356 qts.1_tw.1
3 Q0 Economy_of_the_Republic_of_the_Congo04-26-2001 11 3.78614 qts.1_tw.1
3 Q0 Economy_of_Ethiopia04-29-2001 12 3.74733 qts.1_tw.1
3 Q0 Politics_of_Cyprus04-26-2001 13 2.08520 qts.1_tw.1
4 Q0 Politics_of_the_Falkland_Islands04-29-2001 1 12.70288 qts.1_tw.1
4 Q0 Christ's_College,_Cambridge05-20-2001 2 12.24870 qts.1_tw.1
4 Q0 Politics_of_Gibraltar05-17-2001 3 12.16519 qts.1_tw.1
4 Q0 History_of_Hong_Kong05-04-2001 4 11.67782 qts.1_tw.1
4 Q0 Vittorio_De_Sica05-13-2001 5 11.17646 qts.1_tw.1
4 Q0 Aphex_Twin05-03-2001 6 11.02514 qts.1_tw.1
Example trec_top_file:
3 Q0 Politics_of_Dominica04-26-2001 1 11.09356 qts.1_tw.1
3 Q0 Paul_Robeson04-20-2001 2 10.97028 qts.1_tw.1
3 Q0 Economy_of_Germany03-10-2001 3 9.74032 qts.1_tw.1
3 Q0 Economy_of_Canada03-26-2001 4 8.95059 qts.1_tw.1
3 Q0 Economy_of_the_Faroe_Islands04-29-2001 5 7.77037 qts.1_tw.1
3 Q0 Economy_of_Finland04-29-2001 6 6.12460 qts.1_tw.1
3 Q0 Foreign_relations_of_Cyprus04-26-2001 7 5.11532 qts.1_tw.1
3 Q0 History_of_the_Falkland_Islands04-29-2001 8 4.37087 qts.1_tw.1
3 Q0 Economy_of_the_Falkland_Islands04-29-2001 9 3.01286 qts.1_tw.1
3 Q0 Politics_of_the_Falkland_Islands04-29-2001 10 2.12743 qts.1_tw.1
3 Q0 Politics_of_Cyprus04-26-2001 13 2.08520 qts.1_tw.1
3 Q0 Economy_of_Ethiopia04-29-2001 12 1.74733 qts.1_tw.1
3 Q0 Economy_of_the_Republic_of_the_Congo04-26-2001 11 0.78614 qts.1_tw.1
4 Q0 Politics_of_the_Falkland_Islands04-29-2001 1 12.70288 qts.1_tw.1
4 Q0 Christ's_College,_Cambridge05-20-2001 2 12.24870 qts.1_tw.1
4 Q0 Politics_of_Gibraltar05-17-2001 3 12.16519 qts.1_tw.1
4 Q0 History_of_Hong_Kong05-04-2001 4 11.67782 qts.1_tw.1
4 Q0 Vittorio_De_Sica05-13-2001 5 11.17646 qts.1_tw.1
4 Q0 Aphex_Twin05-03-2001 6 11.02514 qts.1_tw.1
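Both example files use whitespace-separated records with six fields. The Python sketch below shows one way to load such records and group them by query. The field names (query_id, iteration, doc_id, rank, score, run_id) are labels borrowed from the usual TREC results-file layout and are an assumption here; the function name is hypothetical, and this is not the parser used inside trec_eval.

```python
from collections import defaultdict, namedtuple

# Field names are illustrative labels, not identifiers from trec_eval.
Record = namedtuple("Record", "query_id iteration doc_id rank score run_id")


def parse_ranked_lines(lines):
    """Group six-field records by query id, preserving file order."""
    by_query = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(f"line {line_number}: expected 6 fields, got {len(fields)}")
        query_id, iteration, doc_id, rank, score, run_id = fields
        by_query[query_id].append(
            Record(query_id, iteration, doc_id, int(rank), float(score), run_id)
        )
    return dict(by_query)


sample = [
    "3 Q0 Politics_of_Dominica04-26-2001 1 11.09356 qts.1_tw.1",
    "3 Q0 Paul_Robeson04-20-2001 2 10.97028 qts.1_tw.1",
    "4 Q0 Christ's_College,_Cambridge05-20-2001 2 12.24870 qts.1_tw.1",
]
records = parse_ranked_lines(sample)
print(sorted(records))         # ['3', '4']
print(records["3"][0].doc_id)  # Politics_of_Dominica04-26-2001
```

Document identifiers may contain apostrophes and commas but no whitespace, so splitting on whitespace is enough for files shaped like these examples.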
Output Files
Example output for the example input files above (note the Kendall's Tau values in the middle):
num_q all 2
num_ret all 19
num_rel all 19
num_rel_ret all 19
map all 1.0000
gm_ap all 1.0000
R-prec all 1.0000
bpref all 1.0000
recip_rank all 1.0000
num_nonrel_judged_ret all 0
exact_prec all 1.0000
exact_recall all 1.0000
11-pt_avg all 1.0000
3-pt_avg all 1.0000
avg_doc_prec all 1.0000
exact_relative_prec all 1.0000
avg_relative_prec all 1.0000
exact_unranked_avg_prec all 1.0000
exact_relative_unranked_avg_prec all 1.0000
map_at_R all 0.8782
int_map all 1.0000
exact_int_R_rcl_prec all 1.0000
int_map_at_R all 0.8782
bpref_allnonrel all 1.0000
bpref_retnonrel all 1.0000
bpref_topnonrel all 1.0000
bpref_top5Rnonrel all nan
bpref_top10Rnonrel all nan
bpref_top10pRnonrel all 1.0000
bpref_top25pRnonrel all 1.0000
bpref_top50pRnonrel all 1.0000
bpref_top25p2Rnonrel all 1.0000
bpref_retall all 1.0000
bpref_5 all 1.0000
bpref_10 all 1.0000
bpref_num_all all 0.0000
bpref_num_ret all 0.0000
bpref_num_correct all 0
bpref_num_possible all 0
old_bpref all 1.0000
old_bpref_top10pRnonrel all 1.0000
infAP all 1.0000
gm_bpref all 1.0000
ircl_prn.0.00 all 1.0000
ircl_prn.0.10 all 1.0000
ircl_prn.0.20 all 1.0000
ircl_prn.0.30 all 1.0000
ircl_prn.0.40 all 1.0000
ircl_prn.0.50 all 1.0000
ircl_prn.0.60 all 1.0000
ircl_prn.0.70 all 1.0000
ircl_prn.0.80 all 1.0000
ircl_prn.0.90 all 1.0000
ircl_prn.1.00 all 1.0000
P5 all 1.0000
P10 all 0.8000
P20 all 0.4750
P30 all 0.3167
P50 all 0.1900
P100 all 0.0950
recall5 all 0.6090
recall10 all 0.8846
recall20 all 1.0000
recall30 all 1.0000
recall50 all 1.0000
recall100 all 1.0000
0.20R-prec all 1.0000
0.40R-prec all 1.0000
0.60R-prec all 1.0000
0.80R-prec all 1.0000
1.00R-prec all 1.0000
1.20R-prec all 0.7812
1.40R-prec all 0.6754
1.60R-prec all 0.6095
1.80R-prec all 0.5436
2.00R-prec all 0.5000
relative_prec5 all 1.0000
relative_prec10 all 1.0000
relative_prec20 all 1.0000
relative_prec30 all 1.0000
relative_prec50 all 1.0000
relative_prec100 all 1.0000
unranked_avg_prec5 all 0.6090
unranked_avg_prec10 all 0.6846
unranked_avg_prec20 all 0.4750
unranked_avg_prec30 all 0.3167
unranked_avg_prec50 all 0.1900
unranked_avg_prec100 all 0.0950
relative_unranked_avg_prec5 all 1.0000
relative_unranked_avg_prec10 all 0.6800
relative_unranked_avg_prec20 all 0.2562
relative_unranked_avg_prec30 all 0.1139
relative_unranked_avg_prec50 all 0.0410
relative_unranked_avg_prec100 all 0.0102
Kendall's_tau5 all 0.0000
Kendall's_tau10 all 0.0000
Kendall's_tau20 all 0.3846
Kendall's_tau30 all 0.3846
Kendall's_tau50 all 0.3846
Kendall's_tau100 all 0.3846
utility_1.0_-1.0_0.0_0.0 all 9.5000
rcl_at_142_nonrel all 1.0000
fallout_recall_0 all 0.0000
fallout_recall_14 all 1.0000
fallout_recall_28 all 1.0000
fallout_recall_42 all 1.0000
fallout_recall_56 all 1.0000
fallout_recall_71 all 1.0000
fallout_recall_85 all 1.0000
fallout_recall_99 all 1.0000
fallout_recall_113 all 1.0000
fallout_recall_127 all 1.0000
fallout_recall_142 all 1.0000
int_0.20R-prec all 1.0000
int_0.40R-prec all 1.0000
int_0.60R-prec all 1.0000
int_0.80R-prec all 1.0000
int_1.00R-prec all 1.0000
int_1.20R-prec all 0.7812
int_1.40R-prec all 0.6754
int_1.60R-prec all 0.6095
int_1.80R-prec all 0.5436
int_2.00R-prec all 0.5000
micro_prec all 1.0000
micro_recall all 1.0000
micro_bpref all nan
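As a sanity check on the Kendall's_tau lines above, the Python sketch below computes a plain Kendall tau-a (concordant minus discordant pairs, divided by the number of pairs) at a rank cutoff. It is not taken from the modified source. It assumes that the reference order is the rank listed in the example trec_rel_file, that the system order is the example trec_top_file sorted by descending score, and that the per-query values are averaged; the function names are hypothetical. Documents are identified here by their reference rank to keep the example short.

```python
from itertools import combinations


def kendall_tau(system_order, reference_rank, cutoff=None):
    """Tau-a of the top `cutoff` documents of system_order against reference ranks."""
    docs = [doc for doc in system_order if doc in reference_rank][:cutoff]
    if len(docs) < 2:
        return 0.0  # assumption: no pairs to compare, report 0
    concordant = discordant = 0
    # combinations() yields (earlier, later) in system order.
    for earlier, later in combinations(docs, 2):
        if reference_rank[earlier] < reference_rank[later]:
            concordant += 1
        elif reference_rank[earlier] > reference_rank[later]:
            discordant += 1
    total_pairs = len(docs) * (len(docs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Query 3: trec_top_file sorted by score, each document written as its
# rank in trec_rel_file (Politics_of_Dominica is 10, Paul_Robeson is 9, ...).
system_q3 = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 13, 12, 11]
reference_q3 = {rank: rank for rank in range(1, 14)}
# Query 4: both files list the same six documents in the same order.
system_q4 = [1, 2, 3, 4, 5, 6]
reference_q4 = {rank: rank for rank in range(1, 7)}


def mean_tau(cutoff):
    """Average the per-query tau values at the given cutoff."""
    values = [
        kendall_tau(system_q3, reference_q3, cutoff),
        kendall_tau(system_q4, reference_q4, cutoff),
    ]
    return sum(values) / len(values)


for cutoff in (5, 10, 20):
    print(f"tau{cutoff} {mean_tau(cutoff):.4f}")
# tau5 0.0000
# tau10 0.0000
# tau20 0.3846
```

Under these assumptions the sketch reproduces the Kendall's_tau5, Kendall's_tau10 and Kendall's_tau20 values in the example output. That agreement is consistent with the tool computing tau-a per query and averaging, but it does not by itself confirm how the modified trec_eval handles ties or missing documents.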
Notes
Use the -a option to generate the Kendall's Tau values; they are not printed otherwise.
Source Code
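Because -a prints every measure, it can help to pick out only the Kendall's Tau rows. The Python sketch below assumes the relational output (measure name, query id, value) has been saved to text, for example by redirecting the output of trec_eval -a trec_rel_file trec_top_file to a file; the function name is hypothetical and the sample text is a short excerpt of the example output.

```python
def kendall_tau_rows(output_text):
    """Return {(measure, query_id): value} for the Kendall's Tau rows only."""
    rows = {}
    for line in output_text.splitlines():
        fields = line.split()
        # Relational output has three fields: measure name, query id, value.
        if len(fields) == 3 and fields[0].startswith("Kendall's_tau"):
            measure, query_id, value = fields
            rows[(measure, query_id)] = float(value)
    return rows


sample_output = """\
relative_unranked_avg_prec100 all 0.0102
Kendall's_tau5 all 0.0000
Kendall's_tau20 all 0.3846
utility_1.0_-1.0_0.0_0.0 all 9.5000
"""
for (measure, query_id), value in kendall_tau_rows(sample_output).items():
    print(measure, query_id, value)
```

With -q the same filter would also keep per-query rows, since the query id is part of the key.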
svn co http://narasvn.umiacs.umd.edu/repository/src/webarc/trec_eval.8.1