Temperature
A parameter that controls the randomness of a model's output. Lower values (e.g., 0.0) make responses more deterministic and focused, while higher values (e.g., 1.0) make them more creative and varied.
Temperature is a sampling parameter that scales the probability distribution over the model's vocabulary at each step of text generation. Technically, it divides the logits (raw prediction scores) before applying the softmax function. A temperature of 0 makes the model always pick the highest-probability token, while higher temperatures flatten the distribution, giving lower-probability tokens a better chance of being selected.
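The scaling described above can be sketched in a few lines of plain Python. This is a minimal illustration with made-up logits for a three-token vocabulary, not any particular library's implementation; real samplers work the same way but on tensors over the full vocabulary.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax."""
    if temperature == 0:
        # By convention, T=0 means greedy decoding: all mass on the argmax.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative logits for a tiny three-token vocabulary.
logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, 1.0))  # plain softmax
print(softmax_with_temperature(logits, 2.0))  # flatter: probabilities closer together
```

Running this shows the effect directly: at 0.5 the top token takes most of the probability mass, while at 2.0 the three tokens move toward an even split.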
In practice, temperature acts as a creativity dial. For factual tasks like data extraction, code generation, or classification, a low temperature (0.0 to 0.3) produces consistent, predictable outputs. For creative writing, brainstorming, or generating diverse responses, a higher temperature (0.7 to 1.0) adds variety and surprise. Going above 1.0 is possible but often produces incoherent or nonsensical text.
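To see the "creativity dial" in action, the sketch below draws repeated samples from a fixed, hypothetical next-token distribution at several temperatures. The vocabulary and logits are invented for illustration; the point is that 0 always picks the same token while higher values mix in lower-probability choices.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample one token index from temperature-scaled logits."""
    if temperature == 0:
        return logits.index(max(logits))  # greedy: always the top token
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights)[0]

# Hypothetical next-token logits for a three-word vocabulary.
vocab = ["blue", "grey", "luminous"]
logits = [3.0, 2.0, 0.1]
rng = random.Random(42)  # seeded so the demo is repeatable

for t in (0.0, 0.3, 1.0):
    picks = [vocab[sample_token(logits, t, rng)] for _ in range(10)]
    print(f"T={t}: {picks}")
```

At temperature 0.0 every draw is "blue"; at 1.0 the rarer words start to appear, which is exactly the variety that creative tasks benefit from and factual tasks do not.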
Temperature interacts with other sampling parameters such as top-p and top-k. At temperature 0, sampling reduces to greedy decoding, so top-p and top-k have no effect and the output is effectively deterministic for a given input. When both temperature and top-p are exposed, most practitioners adjust one and leave the other at its default; OpenAI's documentation likewise recommends changing one or the other, not both at once.
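In typical sampler implementations, temperature is applied to the logits first and top-p then truncates the tail of the resulting distribution before a token is drawn. A minimal sketch of that top-p (nucleus) step, using invented probabilities:

```python
def top_p_filter(probs, p):
    """Zero out all but the smallest set of tokens whose cumulative
    probability reaches p, then renormalize the survivors."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

# Probabilities already shaped by temperature; top-p now cuts the tail.
probs = [0.6, 0.3, 0.08, 0.02]
print(top_p_filter(probs, 0.8))  # only the top two tokens survive
```

Because both knobs reshape the same distribution, turning them simultaneously makes their effects hard to attribute, which is the practical reason for adjusting one at a time.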
Choosing the right temperature depends on your application. Customer-facing chatbots often use 0.3-0.5 for a balance of reliability and naturalness. Code assistants typically use 0.0-0.2 for precision. Creative applications might use 0.8-1.0. Many developers run the same prompt at multiple temperatures to find the sweet spot for their specific use case.