Learning Information Retrieval Functions and Parameters on Unlabeled Collections

License: Creative Commons

October 6, 2014
Duration: 00:42:25
Parantapa Goswami / LIG

The present study focuses on (a) predicting the parameters of existing standard IR models and (b) learning new IR functions.

We first explore several statistical methods for estimating the collection parameter of the family of information-based models. This parameter characterizes the behavior of a term in the collection. Earlier studies set it to the average number of documents in which the term appears, without full justification. We introduce a fully formalized estimation method that yields versions of these models that improve on the originals. However, this method applies only to estimating the collection parameter within the information-based model framework.
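To make the role of the collection parameter concrete, here is a minimal sketch of one member of the information-based family, assuming the log-logistic form in which the probability of a normalized term frequency of at least t is lam / (t + lam). The function name `lgd_score`, its arguments, and the smoothing constant `c` are illustrative assumptions. The parameter `lam` uses the default setting mentioned above (the fraction of documents containing the term), not the estimation method presented in the talk.

```python
import math

def lgd_score(query_terms, doc_tf, doc_len, avg_doc_len, doc_freq, num_docs, c=1.0):
    """Score one document with an assumed log-logistic information-based model."""
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        df = doc_freq.get(term, 0)
        if tf == 0 or df == 0:
            continue  # terms absent from the document contribute nothing
        # Length-normalized term frequency.
        t = tf * math.log(1.0 + c * avg_doc_len / doc_len)
        # Default collection parameter: fraction of documents containing the term.
        lam = df / num_docs
        # Information content -log P(T >= t | lam), with P = lam / (t + lam).
        score += math.log((t + lam) / lam)
    return score

# Toy statistics, made up for illustration only.
doc_tf = {"rare": 2, "common": 2}
doc_freq = {"rare": 5, "common": 500}
rare = lgd_score(["rare"], doc_tf, 100, 100, doc_freq, 1000)
common = lgd_score(["common"], doc_tf, 100, 100, doc_freq, 1000)
```

With equal term frequencies, the rarer term receives the larger score because a smaller `lam` makes the same occurrence count more informative. A different estimate of the collection parameter would replace only the line that sets `lam`.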

To overcome this limitation, we propose a transfer learning approach that can predict the value of any parameter of any IR model. The approach uses relevance judgments from a past collection to learn a regression function that infers parameter values for each individual query on a new, unlabeled target collection. The proposed method not only outperforms standard IR models with their default parameter values, but also performs better than or on par with popular parameter tuning methods that use relevance judgments on the target collection.
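The idea of regressing parameter values from a labeled source collection can be sketched as follows. This is only an illustration under strong assumptions: a single hypothetical query feature, ordinary least squares as the regression function, and toy numbers. The features and regression model actually used in the work are not stated in this abstract.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predict a parameter value for a query with feature value x."""
    slope, intercept = model
    return slope * x + intercept

# Source collection (toy values): one hypothetical feature per query, paired with
# the parameter value that worked best according to source relevance judgments.
source_features = [1.0, 2.0, 3.0, 4.0]
source_best_params = [0.5, 0.7, 0.9, 1.1]
model = fit_line(source_features, source_best_params)

# Target collection: no relevance judgments, only query features.
target_features = [2.5, 3.5]
predicted = [predict(model, x) for x in target_features]
```

The key property is that the target collection contributes only query features, so no relevance judgments on the target are needed to set a per-query parameter value.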

We then investigate transfer learning techniques that directly transfer relevance information from a source collection to derive "pseudo-relevance" judgments on an unlabeled target collection. From these pseudo-relevance judgments, any standard learning algorithm can be used to learn a function that ranks documents in the target collection. In various experiments, the learned function outperformed standard IR models as well as other state-of-the-art transfer learning algorithms.
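The two-stage pipeline described here (derive pseudo-relevance labels on the target, then learn a ranking function from them) can be sketched generically. The sketch below swaps in two simple stand-ins that are assumptions, not the techniques from the talk: a nearest-centroid rule to transfer labels, and a perceptron as the "standard learning algorithm". Feature vectors and labels are toy values.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pseudo_label(source_X, source_y, target_X):
    """Label each target vector by its nearest source class centroid."""
    rel = centroid([x for x, y in zip(source_X, source_y) if y == 1])
    non = centroid([x for x, y in zip(source_X, source_y) if y == 0])
    return [1 if sq_dist(x, rel) < sq_dist(x, non) else 0 for x in target_X]

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Learn a linear scoring function from (pseudo-)labels."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = label - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def rank_score(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy query-document feature vectors (made up for illustration).
source_X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
source_y = [1, 1, 0, 0]
target_X = [[0.85, 0.7], [0.15, 0.3], [0.7, 0.9], [0.2, 0.2]]

target_pseudo = pseudo_label(source_X, source_y, target_X)
ranker = train_perceptron(target_X, target_pseudo)
```

The ranking function is trained only on target-collection vectors, so it can adapt to the target's feature distribution even though all true relevance information comes from the source.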

Although a ranking function learned by a learning algorithm is effective, its form is predefined by the algorithm used. We therefore introduce an exhaustive discovery approach that searches for ranking functions in a space of simple functions. Our experiments show that some of the discovered functions are highly competitive with standard IR models.
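An exhaustive search over a small function space might look like the sketch below. The building blocks (three term-frequency transforms, three document-frequency transforms, combined by multiplication), the pairwise-accuracy measure, and the toy judged pair are all assumptions chosen for illustration. The function space and evaluation protocol explored in the talk are not given in this abstract.

```python
import itertools
import math

# Hypothetical building blocks for candidate term-weighting functions.
TF_FORMS = {
    "tf": lambda tf: tf,
    "log(1+tf)": lambda tf: math.log(1 + tf),
    "sqrt(tf)": lambda tf: math.sqrt(tf),
}
IDF_FORMS = {
    "1": lambda df, n: 1.0,
    "log(n/df)": lambda df, n: math.log(n / df),
    "n/df": lambda df, n: n / df,
}

def candidates():
    """Enumerate every product of one tf form and one idf form."""
    for (tf_name, f), (idf_name, g) in itertools.product(TF_FORMS.items(), IDF_FORMS.items()):
        yield f"{tf_name} * {idf_name}", (lambda tf, df, n, f=f, g=g: f(tf) * g(df, n))

def score_doc(func, query, doc_tf, doc_freq, n):
    return sum(func(doc_tf[t], doc_freq[t], n) for t in query if doc_tf.get(t, 0) > 0)

def pairwise_accuracy(func, judged_pairs, doc_freq, n):
    """Fraction of (query, relevant, non-relevant) triples ranked correctly."""
    correct = sum(
        score_doc(func, q, rel, doc_freq, n) > score_doc(func, q, non, doc_freq, n)
        for q, rel, non in judged_pairs
    )
    return correct / len(judged_pairs)

def discover(judged_pairs, doc_freq, n):
    """Evaluate every candidate and return {name: accuracy}."""
    return {name: pairwise_accuracy(f, judged_pairs, doc_freq, n) for name, f in candidates()}

# Toy collection statistics and one judged pair (made up for illustration).
doc_freq = {"solar": 10, "the": 1000}
pairs = [(["solar", "the"], {"solar": 3, "the": 5}, {"the": 20})]
results = discover(pairs, doc_freq, 1000)
best = max(results, key=results.get)
```

On this toy pair, raw term frequency alone ranks the non-relevant document first, while candidates that discount the ubiquitous term rank the pair correctly. Because the space is enumerated in full, every candidate is evaluated rather than only those reachable by a particular learning algorithm.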

Keywords: thesis defense

 Information

  • Added by: Gricad Vidéos
  • Updated on: January 1, 2021 00:00
  • Channel:
  • Type: Other
  • Main language: French