"Incorporating Word Dependencies Into Structured Document Retrieval Models " by Fedor Nikolaev

Access Type

Open Access Thesis

Date of Award

January 2021

Degree Type

Thesis

Degree Name

M.S.

Department

Computer Science

First Advisor

Alexander Kotov

Abstract

Information retrieval models offer different ways to model documents and queries probabilistically, and these models are used to score documents effectively with respect to users' queries. In recent years, two trends in information retrieval modeling have emerged. The first, called structured retrieval, includes models such as MLM, BM25F, and PRMS. These models represent a document as a combination of several pre-defined fields, such as title and abstract, and model it as a mixture of field-specific models, which are usually assigned different importance weights. The second, which includes models such as SDM, WSDM, and PQE, extends the standard bag-of-words approach by considering dependencies between terms in queries and documents. In this work, we address three questions: whether these two ways of utilizing additional information can be combined in a unified framework for more effective retrieval, what the best way to combine these two types of modeling is, and what the strengths and weaknesses are of different ways to incorporate term dependencies into structured retrieval models.
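
As a rough illustration of the structured retrieval idea described in the abstract, the sketch below scores a query against a document modeled as a weighted mixture of per-field language models. It follows the general textbook form of a mixture of language models and is not taken from the thesis. The field names, field weights, smoothing parameter `mu`, constant `collection_prob`, and toy documents are all assumptions made for this example, and the sketch covers only unigram (bag-of-words) matching, not term dependencies.

```python
import math
from collections import Counter

def field_prob(term, field_tokens, collection_prob, mu=10.0):
    """Dirichlet-smoothed probability of a term under one field's language model.

    collection_prob is a hypothetical constant standing in for the term's
    probability in the whole collection; mu is an assumed smoothing parameter.
    """
    counts = Counter(field_tokens)
    return (counts[term] + mu * collection_prob) / (len(field_tokens) + mu)

def mixture_score(query, doc, weights, collection_prob=1e-3):
    """Log-likelihood of the query under a weighted mixture of field models."""
    score = 0.0
    for term in query:
        # Mix the per-field probabilities using the (assumed) field weights.
        p = sum(w * field_prob(term, doc[f], collection_prob)
                for f, w in weights.items())
        score += math.log(p)
    return score

# Toy documents with two hypothetical fields.
doc_a = {"title": ["word", "dependencies"],
         "abstract": ["structured", "retrieval", "models"]}
doc_b = {"title": ["image", "segmentation"],
         "abstract": ["word", "dependencies", "in", "text"]}

weights = {"title": 0.7, "abstract": 0.3}  # assumed field importance
query = ["word", "dependencies"]

score_a = mixture_score(query, doc_a, weights)
score_b = mixture_score(query, doc_b, weights)
```

With these assumed weights, a match in the title contributes more than the same match in the abstract, so `doc_a` receives the higher score. A dependence model would additionally score ordered or unordered term pairs from the query, which this sketch omits.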
