Structured representation for Information Retrieval

Yuxuan Zong and Benjamin Piwowarski

Generative information retrieval uses transformer neural networks as differentiable search indexes, representing each document as a sequence of identifier tokens. Some works propose arbitrary identifiers, while others use metadata (text, URL, title) as identifiers. We propose a new generative approach, named REFERENTIAL, that combines these two directions by using prefix-biased identifiers and by removing the one-to-one relationship between an identifier and a document. This paper gives a brief introduction to my thesis and to the work I have done during the past year.