Personalization for Search Engine

Learning Notes of CMU Course 11-642 Search Engine

Posted by haohanz May 03, 2019 · Stay Hungry · Stay Foolish

Topic-based personalization

\begin{equation} \beta \Pr(d \mid q) + (1 - \beta) \Pr_{cat}(d \mid q, u) \end{equation}

\begin{equation} \Pr_{cat}(d \mid q, u) = \sum_c \Pr(c \mid d) \Pr(c \mid u) \end{equation}

  • Metric: MRR (mean reciprocal rank), which improved a lot on ambiguous queries
  • Main idea:
    • Consider how well the user’s and the document’s topic categories match
    • A personalized search engine considers:
      • Relevance
      • Query-independent scores
      • Whether the user is interested in the topic
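
The two equations above can be sketched in a few lines of Python. This is only a minimal illustration, not the implementation from the course: the function names, the dictionary representation of Pr(c|d) and Pr(c|u), the default beta = 0.5, and the toy "jaguar" data are all assumptions made for the example.

```python
def category_match(doc_topics, user_topics):
    """Pr_cat(d | q, u) = sum over categories c of Pr(c | d) * Pr(c | u).

    Both arguments are dicts mapping a category name to a probability
    (an assumed representation; the notes do not specify one).
    """
    return sum(p * user_topics.get(c, 0.0) for c, p in doc_topics.items())


def personalized_score(relevance, doc_topics, user_topics, beta=0.5):
    """beta * Pr(d | q) + (1 - beta) * Pr_cat(d | q, u)."""
    return beta * relevance + (1.0 - beta) * category_match(doc_topics, user_topics)


def rerank(candidates, user_topics, beta=0.5):
    """Sort (doc_id, relevance, doc_topics) tuples by personalized score, best first."""
    ranked = sorted(
        candidates,
        key=lambda c: personalized_score(c[1], c[2], user_topics, beta),
        reverse=True,
    )
    return [doc_id for doc_id, _, _ in ranked]


# Toy data for the ambiguous query "jaguar" (made up for illustration).
user = {"cars": 0.8, "animals": 0.2}
docs = [
    ("jaguar_animal", 0.6, {"animals": 0.9, "cars": 0.1}),
    ("jaguar_car", 0.5, {"cars": 0.9, "animals": 0.1}),
]

print(rerank(docs, user, beta=1.0))  # relevance only: ['jaguar_animal', 'jaguar_car']
print(rerank(docs, user, beta=0.5))  # personalized:   ['jaguar_car', 'jaguar_animal']
```

With beta = 1 the ranking falls back to plain relevance; lowering beta lets the user's topic interests reorder results, which is where an ambiguous query benefits most.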

Long-short term personalization

10 features for discovering atypical sessions

  1. Query length
  2. Query length divergence
  3. SAT Reading Level
  4. SAT Reading Level Divergence
  5. Topic divergence
  6. Noun ratio
  7. Verb ratio divergence
  8. Adjective ratio divergence
  9. Longest query position
  10. Question ratio

Divergence = the distance between the current session and the user's history, measured over vocabulary, topic categories, or the features above
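
As a rough sketch of how such divergence features might be computed: the notes do not say which distance is used, so the code below assumes an absolute difference for scalar features (such as query length) and total variation distance for topic distributions. All function names and the sample queries are hypothetical.

```python
from collections import Counter


def mean_query_length(queries):
    """Average number of whitespace-separated terms per query."""
    lengths = [len(q.split()) for q in queries]
    return sum(lengths) / len(lengths)


def scalar_divergence(session_value, history_value):
    """Assumed distance for scalar features: absolute difference."""
    return abs(session_value - history_value)


def topic_distribution(labels):
    """Turn a list of topic labels into a probability distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def distribution_divergence(session_dist, history_dist):
    """Assumed distance for distributions: total variation, in [0, 1]."""
    topics = set(session_dist) | set(history_dist)
    return 0.5 * sum(
        abs(session_dist.get(t, 0.0) - history_dist.get(t, 0.0)) for t in topics
    )


# Hypothetical history and current session for one user.
history_queries = ["nba scores", "lakers game", "nfl draft"]
session_queries = ["symptoms of iron deficiency in adults"]

length_div = scalar_divergence(
    mean_query_length(session_queries), mean_query_length(history_queries)
)
topic_div = distribution_divergence(
    topic_distribution(["health"]),
    topic_distribution(["sports", "sports", "sports"]),
)
print(length_div, topic_div)  # 4.0 1.0
```

A session whose divergence values are large relative to the user's usual behavior would be a candidate atypical session; any threshold for "large" would have to be learned or tuned and is not given in the notes.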