Boosted negative sampling by quadratically constrained entropy maximization

Journal Article (2019)
Author(s)

I.T. Kekeç (TU Delft - Pattern Recognition and Bioinformatics)

David Mimno (Cornell University)

DMJ Tax (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
DOI related publication
https://doi.org/10.1016/j.patrec.2019.04.027
Publication Year
2019
Language
English
Volume number
125
Pages (from-to)
310-317

Abstract

Learning probability densities for natural language representations is a difficult problem because language is inherently sparse and high-dimensional. Negative sampling is a popular and effective way to avoid intractable maximum likelihood problems, but it requires correct specification of the sampling distribution. Previous state-of-the-art methods rely on heuristic distributions that appear to do well in practice. In this work, we define conditions for optimal sampling distributions and demonstrate how to approximate them using Quadratically Constrained Entropy Maximization (QCEM). Our analysis shows that state-of-the-art heuristics are restrictive approximations to our proposed framework. To demonstrate the merits of our formulation, we apply QCEM to matching synthetic exponential family distributions and to finding high-dimensional word embedding vectors for English. We achieve faster inference on synthetic experiments and improve the correlation on semantic similarity evaluations on the Rare Words dataset by 4.8%.
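
As an illustration of what a heuristic sampling distribution can look like, the sketch below builds one widely used choice: the unigram distribution raised to the power 0.75 and renormalized, as popularized by word2vec. This is an assumption chosen for illustration only; the abstract does not name the heuristics it analyzes, and this is not the QCEM method. The toy corpus and the helper names `smoothed_unigram` and `draw_negatives` are hypothetical.

```python
import random
from collections import Counter


def smoothed_unigram(counts, power=0.75):
    """Raise unigram counts to `power` and renormalize to a distribution."""
    weights = {word: count ** power for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def draw_negatives(dist, k, exclude, rng):
    """Draw k negative words from `dist`, never returning the positive word."""
    words = [word for word in dist if word != exclude]
    probs = [dist[word] for word in words]
    # random.choices renormalizes the weights internally.
    return rng.choices(words, weights=probs, k=k)


# Toy corpus (hypothetical); a real vocabulary would be far larger.
corpus = "the cat sat on the mat the dog sat".split()
counts = Counter(corpus)
dist = smoothed_unigram(counts)

rng = random.Random(0)  # fixed seed so the draw is repeatable
negatives = draw_negatives(dist, k=5, exclude="cat", rng=rng)
```

The exponent below 1 flattens the distribution, so frequent words such as "the" are sampled less often than their raw frequency would suggest and rare words more often. The exponent is a hand-tuned constant, which is the kind of heuristic choice the abstract contrasts with a principled sampling distribution.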

No files available

Metadata only record. There are no files for this record.