Abstract
System-oriented IR evaluations rely on rather abstract models of real user behavior. Simulating user interactions offers a cost-efficient way to ground system-oriented experiments in more realistic behavior when no interaction logs are available. While several user models exist for simulated clicks or result list interactions, few attempts have been made at query simulation, and it has not been investigated whether simulated queries can reproduce the properties of real queries. In this work, we validate simulated user query variants against real user queries that were made for the corresponding topics of TREC test collections. In addition, we introduce a simple yet effective method that reproduces real queries better than the established methods. Our evaluation framework validates the simulations with regard to retrieval performance, reproducibility of topic score distributions, shared task utility, effort and effect, and query term similarity in comparison with real user query variants. While the retrieval effectiveness, the statistical properties of the topic score distributions, and the economic aspects are close to those of real queries, it remains challenging to simulate exact term matches and later query reformulations.
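The query term similarity criterion mentioned in the abstract can be illustrated with a toy measure. The sketch below uses the Jaccard overlap of term sets, which is only an illustrative assumption and not necessarily the measure used in the paper; the example queries are invented. Note how a morphological variant ("implant" vs. "implants") fails to match exactly, reflecting the difficulty of simulating exact term matches.

```python
# Illustrative only: quantify term overlap between a simulated query and a
# real user query variant via the Jaccard coefficient of their term sets.
def jaccard(query_a: str, query_b: str) -> float:
    a, b = set(query_a.lower().split()), set(query_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

# "implant" and "implants" do not match exactly, so only 2 of 5 terms overlap.
print(jaccard("dental implant risks", "risks of dental implants"))  # 0.4
```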
Notes
- 1.
- 2.
- 3.
- 4. S1 and S3, as well as S2 and S\(3^\prime \), do not differ when averaging over the first queries.
- 5. Applying the Bonferroni correction adjusts the alpha level to \(\alpha =\frac{0.05}{64}\approx 0.0008\) (considering eight users and eight query simulators for an alpha level of 0.05).
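The adjustment in note 5 can be verified with a few lines of Python (a minimal sketch; the counts of eight users and eight query simulators are taken directly from the note):

```python
# Bonferroni correction: divide the nominal alpha level by the number of
# comparisons, here 8 users x 8 query simulators = 64.
n_comparisons = 8 * 8
alpha = 0.05
alpha_adjusted = alpha / n_comparisons
print(round(alpha_adjusted, 4))  # 0.0008
```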
A Appendix
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Breuer, T., Fuhr, N., Schaer, P. (2022). Validating Simulations of User Query Variants. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13185. Springer, Cham. https://doi.org/10.1007/978-3-030-99736-6_6
Print ISBN: 978-3-030-99735-9
Online ISBN: 978-3-030-99736-6
eBook Packages: Computer Science, Computer Science (R0)