76 - Increasing In-Class Similarity by Retrofitting Embeddings with Demographics, with Dirk Hovy

Nov 27, 2018 · 30 min

Episode description

EMNLP 2018 paper by Dirk Hovy and Tommaso Fornaciari. https://www.semanticscholar.org/paper/Improving-Author-Attribute-Prediction-by-Linguistic-Hovy-Fornaciari/71aad8919c864f73108aafd8e926d44e9df51615

In this episode, Dirk Hovy talks about natural language as a social phenomenon that can provide insights about the people who produce it. As an example, this paper uses retrofitted embeddings to improve performance on two tasks: predicting the gender and the age group of a person from their online reviews. In this approach, author embeddings are first generated using Doc2Vec, then retrofitted so that authors with similar attributes are closer together in the vector space. To estimate retrofitted vectors for authors whose attributes are unknown, a linear transformation is learned that maps Doc2Vec vectors to their retrofitted counterparts.

Dirk also used a similar approach to encode geographic information and model regional linguistic variation, in another EMNLP 2018 paper with Christoph Purschke titled “Capturing Regional Variation with Distributed Place Representations and Geographic Retrofitting” [link: https://www.semanticscholar.org/paper/Capturing-Regional-Variation-with-Distributed-Place-Hovy-Purschke/6d9babd835d0cdaaf175f098bb4fd61fd75b1be0].
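The retrofit-then-map pipeline described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses the common retrofitting update from Faruqui et al. (2015) with uniform neighbor weights, random vectors stand in for real Doc2Vec output, and the names `retrofit` and `fit_linear_map`, the `alpha` weight, and the iteration count are all hypothetical choices.

```python
import numpy as np


def retrofit(vectors, labels, alpha=1.0, iterations=10):
    """Pull each vector toward the other vectors that share its label.

    Assumed update (Faruqui et al., 2015, uniform neighbor weights):
    new_i = (alpha * original_i + mean of same-label neighbors) / (alpha + 1)
    """
    original = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    retrofitted = original.copy()
    for _ in range(iterations):
        for label in np.unique(labels):
            idx = np.flatnonzero(labels == label)
            if len(idx) < 2:
                continue  # no same-label neighbors to pull toward
            class_sum = retrofitted[idx].sum(axis=0)
            for i in idx:
                # Mean of the same-label vectors, excluding vector i itself.
                neighbor_mean = (class_sum - retrofitted[i]) / (len(idx) - 1)
                new = (alpha * original[i] + neighbor_mean) / (alpha + 1.0)
                class_sum += new - retrofitted[i]  # keep the running sum in sync
                retrofitted[i] = new
    return retrofitted


def fit_linear_map(source, target):
    """Least-squares matrix W such that source @ W approximates target."""
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


# Stand-in data: random vectors play the role of Doc2Vec author embeddings,
# and a random binary label plays the role of a known author attribute.
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(200, 16))
labels = rng.integers(0, 2, size=200)

retro = retrofit(doc_vectors, labels)
W = fit_linear_map(doc_vectors, retro)

# Authors with unknown attributes: project their vectors into the retrofitted space.
unseen = rng.normal(size=(5, 16))
mapped = unseen @ W
```

The learned matrix is what makes the approach usable at prediction time: retrofitting itself needs the attribute labels, while the linear map only needs the original embedding. A classifier for the attribute would then be trained on the retrofitted (or mapped) vectors.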