Results with all metrics:
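
Each table reports eight metrics at cutoffs k = 1, 2, 5 and 10. As a rough guide to the less common rows (MRR, R_cap, Hole, Accuracy), here is a minimal per-query sketch. It assumes BEIR-style definitions, which the evaluation scripts may or may not follow exactly. The function names and the toy document IDs are hypothetical and are not taken from the benchmark code.

```python
def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant document within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in ranked[:k]) else 0.0


def recall_cap_at_k(ranked, relevant, k):
    """Relevant hits in the top k, divided by min(number of relevant docs, k)."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / min(len(relevant), k)


def hole_at_k(ranked, judged, k):
    """Share of the top k that has no relevance judgement at all."""
    return sum(1 for doc_id in ranked[:k] if doc_id not in judged) / k


# Toy data (hypothetical): one query, five retrieved documents.
ranked = ["d7", "d2", "d9", "d4", "d1"]   # best-scoring first
relevant = {"d2", "d4", "d5"}             # judged relevant
judged = {"d2", "d4", "d5", "d7"}         # judged at all, relevant or not

for k in (1, 2, 5):
    print(
        k,
        reciprocal_rank_at_k(ranked, relevant, k),
        accuracy_at_k(ranked, relevant, k),
        recall_cap_at_k(ranked, relevant, k),
        hole_at_k(ranked, judged, k),
    )
```

Under these assumptions the value reported in a table row would be the mean of the per-query value over all test queries.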

  1. For Manticore Search (MS):

a. Default settings:

`python -m benchmark.manticore.evaluate data/trec-covid test trec_covid`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.29 | 0.28226 | 0.29505 | 0.29494 |
| MAP | 0.00083 | 0.00143 | 0.00275 | 0.00478 |
| Recall | 0.00083 | 0.00164 | 0.00394 | 0.00773 |
| P | 0.36 | 0.34 | 0.36 | 0.352 |
| MRR | 0.36 | 0.44 | 0.51 | 0.51872 |
| R_cap | 0.36 | 0.34 | 0.36 | 0.352 |
| Hole | 0.04 | 0.05 | 0.072 | 0.114 |
| Accuracy | 0.36 | 0.52 | 0.78 | 0.86 |

`python -m benchmark.manticore.evaluate data/nfcorpus test nfcorpus`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.41176 | 0.36619 | 0.31409 | 0.28791 |
| MAP | 0.05493 | 0.07349 | 0.0934 | 0.10805 |
| Recall | 0.05493 | 0.07689 | 0.10895 | 0.14141 |
| P | 0.42105 | 0.34985 | 0.25759 | 0.20402 |
| MRR | 0.42724 | 0.4613 | 0.49138 | 0.49786 |
| R_cap | 0.42724 | 0.36223 | 0.29551 | 0.26369 |
| Hole | 0.05573 | 0.06502 | 0.07988 | 0.09226 |
| Accuracy | 0.42724 | 0.49536 | 0.60681 | 0.65635 |

b. ES-like settings:

`python -m benchmark.manticore.evaluate data/trec-covid test trec_covid_es_like`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.72 | 0.69292 | 0.64028 | 0.59764 |
| MAP | 0.00206 | 0.00401 | 0.00837 | 0.01405 |
| Recall | 0.00206 | 0.00425 | 0.00916 | 0.01664 |
| P | 0.78 | 0.78 | 0.7 | 0.652 |
| MRR | 0.78 | 0.86 | 0.86667 | 0.87333 |
| R_cap | 0.78 | 0.78 | 0.7 | 0.652 |
| Hole | 0.02 | 0.01 | 0.04 | 0.048 |
| Accuracy | 0.78 | 0.94 | 0.96 | 1 |

`python -m benchmark.manticore.evaluate data/nfcorpus test nfcorpus_es_like`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.4257 | 0.39989 | 0.35145 | 0.31715 |
| MAP | 0.05581 | 0.07796 | 0.10248 | 0.11704 |
| Recall | 0.05581 | 0.0823 | 0.12377 | 0.14913 |
| P | 0.43963 | 0.39474 | 0.29969 | 0.22941 |
| MRR | 0.44582 | 0.4969 | 0.52183 | 0.5269 |
| R_cap | 0.44582 | 0.40402 | 0.3371 | 0.29062 |
| Hole | 0.06811 | 0.06502 | 0.07616 | 0.08421 |
| Accuracy | 0.44582 | 0.54799 | 0.63777 | 0.67802 |
  2. For Elasticsearch (ES):

`python -m benchmark.es.evaluate_bm25 data/trec-covid test trec_covid`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.82 | 0.79679 | 0.72491 | 0.68803 |
| MAP | 0.00234 | 0.0044 | 0.00961 | 0.01698 |
| Recall | 0.00234 | 0.00443 | 0.01027 | 0.01907 |
| P | 0.88 | 0.84 | 0.768 | 0.734 |
| MRR | 0.88 | 0.9 | 0.92167 | 0.92167 |
| R_cap | 0.88 | 0.83 | 0.768 | 0.734 |
| Hole | 0.02 | 0.03 | 0.052 | 0.054 |
| Accuracy | 0.88 | 0.92 | 1 | 1 |

`python -m benchmark.es.evaluate_bm25 data/nfcorpus test nfcorpus`

| Metric | k=1 | k=2 | k=5 | k=10 |
|---|---|---|---|---|
| NDCG | 0.44968 | 0.4197 | 0.37705 | 0.34281 |
| MAP | 0.05936 | 0.08833 | 0.11329 | 0.12969 |
| Recall | 0.05936 | 0.09328 | 0.13313 | 0.16603 |
| P | 0.46753 | 0.41396 | 0.32273 | 0.24708 |
| MRR | 0.44892 | 0.49536 | 0.52023 | 0.52954 |
| R_cap | 0.44892 | 0.40712 | 0.34711 | 0.30188 |
| Hole | 0.06192 | 0.07276 | 0.07802 | 0.08359 |
| Accuracy | 0.44892 | 0.5418 | 0.62848 | 0.70279 |
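
To compare the three configurations at a glance, the NDCG@10 values from the tables above can be put side by side and expressed relative to the ES run. The sketch below is only illustrative: the dictionary labels (`MS default`, `MS ES-like`, `ES`) and the helper names are hypothetical, and the numbers are copied by hand from the k=10 column of each NDCG row.

```python
# NDCG@10 copied from the result tables (k=10 column of the NDCG row).
ndcg_at_10 = {
    "trec-covid": {"MS default": 0.29494, "MS ES-like": 0.59764, "ES": 0.68803},
    "nfcorpus": {"MS default": 0.28791, "MS ES-like": 0.31715, "ES": 0.34281},
}


def relative_gap(value, baseline):
    """Signed difference from the baseline, as a fraction of the baseline."""
    return (value - baseline) / baseline


def gaps_vs_es(scores_by_dataset, baseline_key="ES"):
    """For every dataset, map each configuration to its gap versus the ES run."""
    return {
        dataset: {
            config: relative_gap(score, scores[baseline_key])
            for config, score in scores.items()
        }
        for dataset, scores in scores_by_dataset.items()
    }


gaps = gaps_vs_es(ndcg_at_10)
for dataset, by_config in gaps.items():
    for config, gap in by_config.items():
        print(f"{dataset:<11} {config:<11} {gap:+.1%}")
```

On these figures, the ES-like settings bring Manticore's NDCG@10 on trec-covid from roughly 57% below ES to roughly 13% below, and on nfcorpus from roughly 16% below to roughly 7.5% below.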