Why Worry
Depending on the generative AI service you use, you may be inadvertently providing training data for its model. The Large Language Models that power services like ChatGPT, Claude and Gemini need as much information as possible to improve their accuracy. The more data these models have access to, the more powerful they become. Some of these companies are running out of readily available training data, so they rely on data entered by their users, often those on the free tiers.
Even setting training aside, any data you have given a service permission to use can be stored by that service. In the event of a data breach, sensitive data may be compromised.
Best Practices
While you should always check the data privacy policy of any service you use, there are a few best practices that will keep your data relatively safe regardless of the company's policies.
- With few exceptions, don’t enter information that is not publicly available or that you would not feel comfortable posting publicly on social media.
- Never enter demographic data. Do not include specific information about yourself or any other individual in prompts. There is never a reason to include this information.