Jinghong Chen

3-minute Pitch: Retrieval Guided Contrastive Learning for Hateful Memes Detection

[898 words, 3-minute read] Hateful memes are captioned images promoting hostility towards specific social groups. Most hateful meme detection systems are logistic classifiers built on the embedding space of a pre-trained vision-language model (e.g., CLIP). However, we find that in these embedding spaces, hateful memes and benign memes are located

[Paper Express] Data Selection for Language Models via Importance Resampling (DSIR)

README. Data Selection (DS) aims to select a given number of samples from a large, unlabeled dataset to train a capable model in a target domain. In the case of training language models, practical DS methods need to select efficiently from raw text corpora containing trillions of tokens. This paper,

Catch up on Speculative Decoding in 5 minutes: a survey for researchers as of December 2023

Speculative decoding speeds up LLM inference without any loss of generation quality. As of December 2023, researchers have reported ~2x speed-up from applying speculative decoding to 3B to 1T models. This survey explains the latest speculative decoding methods that enable lossless speed-up, examines reported experimental results, and suggests future research

Estimate LLM inference speed and VRAM usage quickly: with a Llama-7B case study

You can estimate Time-To-First-Token (TTFT), Time-Per-Output-Token (TPOT), and the VRAM (Video Random Access Memory) needed for Large Language Model (LLM) inference in a few lines of calculation. I will show you how with a real example using Llama-7B. LLM Inference Basics: LLM inference consists of two stages: prefill and decode.

MultiLoRA explained in 3 minutes: Democratizing LoRA for Better Multi-Task Learning

Executive Summary: LoRA (Low-Rank Adaptation) fine-tunes a low-rank weight update matrix instead of the whole weight matrix. MultiLoRA modifies LoRA to better learn multiple tasks simultaneously. A MultiLoRA module can be viewed as several LoRA modules connected in parallel and weighted by learnable scaling factors. Finetuning LLaMA with MultiLoRA enhances

Papers with Practical Values for Vision-Language Research @NeurIPS 2023 Day 5.

These 9 papers below offer practical solutions or guidance for vision-language research. I describe each work in 5 sentences. Invited Talk: Systems and Foundation Models (FM). General-purpose FM solves niche problems such as data cleaning better than dedicated algorithms. Christopher Ré shares two directions to make FMs more efficient from

(Vision-Language Researcher) Selected Papers @NeurIPS 2023

Here are some papers that we, who mainly work on vision-language models, think are interesting on Day 4 of NeurIPS 2023. Tree of Thoughts: Deliberate Problem Solving with Large Language Models. Tree of Thoughts (ToT) is a decoding scheme for auto-regressive Transformers. A thought is defined as a coherent piece