In this talk, we explore how to train a CLIP (Contrastive Language-Image Pre-training) model for the Italian language and for a fashion use case. We also explore the insights that the joint language-image capabilities of these models can offer.