Topic Modeling

Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret. Recently, neural …

Cross-lingual Contextualized Topic Models with Zero-shot Learning

We introduce a novel topic modeling method that can make use of contextulized embeddings (e.g., BERT) to do zero-shot cross-lingual topic modeling.