Training Data
Training data is the collection of text, images, or other content that an AI model learns from during its development. A model's capabilities, biases, and knowledge are largely determined by its training data. An LLM trained primarily on internet text will reflect the patterns, perspectives, and information present on the internet, including its gaps and inaccuracies.
Training data has a cutoff date, the point after which no further content was collected. This is why an AI model does not know, from training alone, about events that happened after that date.