3-minute Pitch: Retrieval Guided Contrastive Learning for Hateful Memes Detection
[898 words, 3-minute read]
Hateful memes are captioned images that promote hostility towards specific social groups. Most hateful memes detection systems are logistic classifiers built on the embedding space of a pre-trained vision-language model (e.g., CLIP). However, we find that in these embedding spaces, hateful memes and benign memes are located