When and Why Vision-Language Models Behave like Bags-Of-Words, and What to Do About It?

Despite the success of large vision and language models (VLMs) in many downstream applications, it is unclear how well they encode the compositional relationships between objects and attributes. Here, we create the Attribution, Relation, and Order …

Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

We show that text-to-image generation models can amplify stereotypes at large scale.

Language Invariant Properties in Natural Language Processing

Meaning is context-dependent, but many properties of language (should) remain the same even if we transform the context. For example, sentiment, entailment, or speaker properties should be the same in a translation and original of a text. We …

Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals

Current language technology is ubiquitous and directly influences individuals' lives worldwide. Given the recent trend in AI on training and constantly releasing new and powerful large language models (LLMs), there is a need to assess their biases …

Pipelines for Social Bias Testing of Large Language Models

The maturity level of language models is now at a stage in which many companies rely on them to solve various tasks. However, while research has shown how biased and harmful these models are, **systematic ways of integrating social bias tests into …

XLM-EMO: Multilingual Emotion Prediction in Social Media Text

We use Large Multilingual Language Models to create a tool for multilingual emotion prediction

Beyond NDCG: behavioral testing of recommender systems with RecList

In this paper, we propose RecList, a behavioral-based testing methodology. RecList organizes recommender systems by use case and introduces a general plug-and-play procedure to scale up behavioral testing. We demonstrate its capabilities by analyzing known algorithms and black-box commercial systems, and we release RecList as an open source, extensible package for the community.

On the Gap between Adoption and Understanding in NLP

There are some issues with current research trends in NLP that can hamper the free development of scientific research. We identify five of particular concern: 1) the early adoption of methods without sufficient understanding or analysis; 2) the …

Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret. Recently, neural …

Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction

We investigate grounded language learning through real-world data, by modelling a teacher-learner dynamics through the natural interactions occurring between users and search engines.