What is Zero-Shot Learning?



Asking a model to perform a task using only natural language instructions, without providing any examples. The model relies entirely on its pre-trained knowledge to understand and complete the task.

Zero-shot learning means giving a language model a task description and expecting it to perform correctly without any demonstrations or examples. For instance, simply saying "Classify this review as positive or negative: [review text]" is a zero-shot prompt. The model must rely entirely on its understanding of language and the task description to produce the right output. This is the simplest form of prompting and the baseline against which other techniques are compared.
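As a minimal sketch of the prompt described above, the snippet below assembles a zero-shot sentiment prompt from a task description and an input, with no demonstrations. The function name `build_zero_shot_prompt` and the sample review are illustrative assumptions. Sending the prompt to a model is left out because it depends on the provider.

```python
def build_zero_shot_prompt(review: str) -> str:
    """Return a zero-shot prompt: a task description plus the input, no examples."""
    instruction = "Classify this review as positive or negative:"
    return f"{instruction} {review}"


# Hypothetical input, used only for illustration.
prompt = build_zero_shot_prompt("The battery died after two days.")
print(prompt)
```

The resulting string is what would be passed to the model. Nothing in it shows the model what a correct answer looks like.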

The strong zero-shot capabilities of modern LLMs stem largely from large-scale pre-training on diverse internet text. That text contains countless instances of classification, summarization, translation, and other tasks. When you describe a task in natural language, the model can draw on the patterns it learned from those instances to perform it. Larger, more capable models generally perform better zero-shot, largely because they tend to absorb more patterns from more data.

Zero-shot prompting works best for well-defined, common tasks that the model likely encountered during training: sentiment analysis, summarization, translation, simple question answering, and general text generation. It tends to struggle with highly specific output formats, unusual tasks, and domains where outputs must be precisely consistent. If zero-shot performance is insufficient, few-shot learning (adding examples to the prompt) is usually the first escalation before considering fine-tuning.
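The escalation from zero-shot to few-shot can be sketched as a single prompt builder in which the examples are optional. This is one possible layout and not a standard. The `Review:` / `Sentiment:` labels, the function name `build_prompt`, and the sample reviews are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def build_prompt(
    instruction: str,
    text: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Build a prompt; with no examples it is zero-shot, with examples it is few-shot."""
    parts = [instruction]
    # Each demonstration pairs an input with its expected label.
    for example_input, example_label in examples:
        parts.append(f"Review: {example_input}\nSentiment: {example_label}")
    # The final block leaves the label blank for the model to complete.
    parts.append(f"Review: {text}\nSentiment:")
    return "\n\n".join(parts)


instruction = "Classify the review as positive or negative."

# Zero-shot: task description and input only.
zero_shot = build_prompt(instruction, "Arrived broken.")

# Few-shot: the same prompt with two hypothetical demonstrations added.
few_shot = build_prompt(
    instruction,
    "Arrived broken.",
    examples=[
        ("Works perfectly, very happy.", "positive"),
        ("Waste of money.", "negative"),
    ],
)
```

Keeping both variants behind one function makes it easy to compare them on the same inputs before deciding whether the extra examples are worth the longer prompt.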

In practice, zero-shot and few-shot sit on a spectrum. "Zero-shot with instructions", where you provide detailed task descriptions but no examples, can be surprisingly effective, especially with instruction-tuned models like GPT-4 and Claude. The line between a detailed zero-shot instruction and a few-shot example is often blurry. Most developers start with zero-shot prompts, evaluate the results, and add examples or refine instructions only where needed.
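The "start with zero-shot, evaluate, then escalate" workflow can be sketched as a small evaluation loop. Everything here is an assumption for illustration: `toy_model` is a keyword-based stand-in for a real LLM call, the three-item `dev_set` is invented, and the 0.9 threshold is arbitrary.

```python
from typing import Callable, Sequence, Tuple


def evaluate(
    model: Callable[[str], str],
    make_prompt: Callable[[str], str],
    labeled: Sequence[Tuple[str, str]],
) -> float:
    """Return the fraction of labeled inputs the model classifies correctly."""
    if not labeled:
        raise ValueError("labeled must contain at least one example")
    hits = 0
    for text, expected in labeled:
        prediction = model(make_prompt(text)).strip().lower()
        if prediction == expected.lower():
            hits += 1
    return hits / len(labeled)


def zero_shot_prompt(text: str) -> str:
    return f"Classify this review as positive or negative: {text}"


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): labels by a single keyword."""
    return "positive" if "great" in prompt.lower() else "negative"


# Tiny hypothetical development set of (input, expected label) pairs.
dev_set = [
    ("Great sound quality.", "positive"),
    ("Stopped working in a week.", "negative"),
    ("Loved it.", "positive"),
]

score = evaluate(toy_model, zero_shot_prompt, dev_set)
# Escalate only if the zero-shot baseline misses the (arbitrary) target.
needs_examples = score < 0.9
```

Because `evaluate` takes the prompt builder as an argument, the same loop can score a refined instruction or a few-shot variant against the zero-shot baseline.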
