The “We Don’t Train on Your Data” Question
AI vendors are aware that data handling is a concern, so their communications tend to emphasise the most favourable interpretation of their policies. “We don’t train on your data” sometimes means exactly that. Sometimes it means “we don’t train on your data on this specific tier” — implying other tiers operate differently. Sometimes it means “we don’t use your data for model training but may use it for safety monitoring, fraud detection, or improving our systems in other ways.”
Anthropic is notably transparent: their privacy policy distinguishes clearly between what is collected, what is retained, and what is used for training across different tiers. OpenAI’s policy is more complex, with different rules for different products. Google Workspace integration with Gemini has separate data handling rules from consumer Gemini.
For any AI tool you use for work, the practical question is: is there a data processing agreement, and what does it cover? If nobody can give you a clear answer, you are probably on a free or consumer tier, and nothing sensitive should be going there.
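To make that decision concrete, here is a minimal sketch of the kind of checklist the question implies. Every field name, tool name, and threshold below is illustrative, not drawn from any vendor's actual terms:

```python
# Hypothetical checklist for deciding whether an AI tool is cleared for
# sensitive work data. All fields and example values are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AIToolReview:
    name: str
    has_dpa: bool                    # is a data processing agreement in place?
    training_excluded: bool          # is your data excluded from training on this tier?
    retention_days: Optional[int]    # stated retention period; None if unstated

def cleared_for_sensitive_data(tool: AIToolReview) -> bool:
    """A tool qualifies only if a DPA exists, training is excluded,
    and the vendor states a finite retention period."""
    return (tool.has_dpa
            and tool.training_excluded
            and tool.retention_days is not None)

consumer_tier = AIToolReview("chat-free", has_dpa=False,
                             training_excluded=False, retention_days=None)
enterprise_tier = AIToolReview("chat-enterprise", has_dpa=True,
                               training_excluded=True, retention_days=30)

print(cleared_for_sensitive_data(consumer_tier))    # → False
print(cleared_for_sensitive_data(enterprise_tier))  # → True
```

The point of the sketch is that the test is conjunctive: a single unanswered question (no DPA, no training exclusion, no stated retention) is enough to keep sensitive material out.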