I’m not seeing where any of this gives Google permission to train AI using your data. As far as I can see it’s all about using AI to manage your data, which is a completely different thing. The word “training” appears to originate in Dave Jones’ tweet, not in any of the Google pages being quoted. Is there any confirmation that this is actually happening, and not just a social media panic?
I would opt out just in case. I remember using Adobe Acrobat at work and noticing that it read every single PDF and generated a few comments about it, even when I never asked it to. That means they’re scanning through potentially confidential data. I have no doubt Google will do the same sooner or later.
The only option is smart features, on or off. That requires Google to read your email to categorize it and do a lot of other basic stuff. It doesn’t let you narrowly keep more privacy on specific features; it’s all or nothing, and if you get a lot of email it’s hard to turn off once you already rely on categories. Google always makes it all or nothing because they know people need some of it. Same with location: it has to be precise location tracking to use anything, you can’t just share a rough location.
Yes, but the point is that granting Google permission to manage your data with AI is a very different thing from training the AI on your data. You can do all the things you describe without also having the AI train on the data; in fact, training on it would be a hard bit of extra work on top.
If the setting isn’t specifically saying that it’s to let them train AI on your data, then I’m inclined to believe that’s not what it’s for. They’re very different processes, both technically and legally. I think there’s just some click-baiting going on here with the scary “they’re training on your data!” accusation; it seems to be baseless.
So you think that they’re not using your data simply because they’re not telling you that they are? Don’t be naive. Since when are these companies asking for permission? I’m not even confident opting out does anything. At this point, your safest bet is to not use their services.
Yes, exactly. Training an AI is a completely different process from prompting it, it takes orders of magnitude more work and can’t be done on a model that’s currently in use.
Understand that basically ANYTHING that “uses AI” is using you for training data.
At its simplest, it is the old-fashioned A/B testing where you are used as part of a reinforcement/labeling pipeline. Sometimes it gets considerably more bullshit, as your queries themselves, and whatever led you to make them, are used to “give you a better experience” and so forth.
And if you read any of the EULAs (for the stuff that Google opted users into…) you’ll see verbiage along those lines.
Of course, the reality is that Google is going to train off our data regardless. But that is why it is a good idea to decouple your life from Google as much as possible. It takes a long-ass time but… no better time than today.
Understand that basically ANYTHING that “uses AI” is using you for training data.
No, that’s not necessarily the case. A lot of people don’t understand how AI training and AI inference work; they are two completely separate processes. Doing one does not entail doing the other. In fact, a lot of research is going into making it possible to do both together, because that would be really handy, and it can’t really be done yet.
And if you read any of the EULAs
Go ahead and do so, they will have separate sections specifically about the use of data for training. Data privacy is regulated by a lot of laws, even in the United States, and corporate users are extremely picky about that sort of stuff.
If the checkbox you’re ticking in the settings isn’t explicitly saying “this is to give permission to use your data for training”, then it probably isn’t doing that. There might be a separate one somewhere, or it might just be a blanket thing covered in the EULA, but “tricking” the user like that wouldn’t make any sense. It doesn’t save them any legal hassle to do it that way.
A lot of people don’t understand how AI training and AI inference work; they are two completely separate processes.
Yes, they are. Not sure why you are bringing that up.
For those wondering what the actual difference is (possibly because they don’t seem to know):
At a high level, training is when you ingest data to create a model based on characteristics of that data. Inference is when you then apply a model to (preferably new) data. So think of training as “teaching” a model what a cat is, and inference as having that model scan through images for cats.
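To make the cat example concrete, here is a toy sketch of the two phases using a hypothetical nearest-centroid classifier (nothing like what Google actually runs, just the smallest thing that has both a training step and an inference step). Training ingests labeled data and produces a model; inference applies the frozen model to new data without changing it.

```python
# Toy illustration of training vs. inference: a nearest-centroid "cat detector"
# on 2-D feature points. Purely hypothetical; real systems differ enormously.

def train(samples):
    """Training: ingest labeled data and build a model (here, per-label centroids)."""
    sums, counts = {}, {}
    for (x, y), label in samples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}

def infer(model, point):
    """Inference: apply the frozen model to a new point; the model is never modified."""
    def dist2(centroid):
        return (point[0] - centroid[0]) ** 2 + (point[1] - centroid[1]) ** 2
    return min(model, key=lambda label: dist2(model[label]))

labeled = [((0.9, 0.8), "cat"), ((1.0, 1.1), "cat"),
           ((-1.0, -0.9), "not-cat"), ((-0.8, -1.2), "not-cat")]
model = train(labeled)           # the expensive step, done ahead of time
print(infer(model, (0.7, 0.9)))  # the cheap step, run per request -> "cat"
```

Note that `infer` never writes to `model`: making a deployed model learn from live traffic is exactly the extra machinery being argued about above.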
And a huge part of making a good model is providing good data. That is, generally speaking, done by labeling things ahead of time. Back in the day it was paying people to take an Amazon survey where they said “hot dog or not hot dog”. These days… it is “anti-bot” technology that gets that for free (think about WHY every single website cares which squares contain a fire hydrant or a bicycle…)
But that is ALSO just simple metrics like “Did the user use what we suggested”. Instead of saying “not hot dog” it is “good reply” or “no reply” or “still read email” or “ignored email” and so forth.
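As a hypothetical sketch (the event names and label scheme here are made up, not any real Gmail telemetry), turning those implicit signals into training labels takes only a few lines:

```python
# Made-up example of harvesting implicit feedback as labels.
# Each user action on a suggested reply becomes a labeled training row,
# exactly like "hot dog / not hot dog", minus the survey.
events = [
    {"suggestion": "Sounds good!",    "action": "sent_as_is"},
    {"suggestion": "Thanks, got it.", "action": "edited"},
    {"suggestion": "See you then.",   "action": "ignored"},
]

LABELS = {"sent_as_is": 1, "edited": 1, "ignored": 0}  # 1 = good suggestion

training_rows = [(e["suggestion"], LABELS[e["action"]]) for e in events]
print(training_rows)
# -> [('Sounds good!', 1), ('Thanks, got it.', 1), ('See you then.', 0)]
```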
And once you know what your pain points are with TOTALLY anonymized user data, you can then “reproduce” said user data to add to your training set. Which is the kind of bullshit facebook, allegedly, has done for years where they’ll GLADLY delete your data if you request it… but not that picture of you at the McDonald’s down the street because that belongs to Ronjon Buck who worked there one summer. But they’ll gladly anonymize your user data so the picture of you actually just corresponds to “User 25156161616” that happens to be the sibling of your sister and so forth…
In fact, a lot of research is going into making it possible to do both together, because that would be really handy, and it can’t really be done yet.
That is literally just a feedback loop and is core to pretty much any “agentic” network/graph.
Go ahead and do so, they will have separate sections specifically about the use of data for training. Data privacy is regulated by a lot of laws, even in the United States, and corporate users are extremely picky about that sort of stuff.
There also tend to be laws about opting in and forced EULA agreements. It is almost like the megacorps have acknowledged that they’ll just do whatever and MAYBE pay a fee after they have made so much more money already.
You can do all the things you describe without also having the AI train on the data
So they are using AI, but not in a capacity that would make it learn? Doubtful.
Not sure why you are bringing that up.
I am bringing it up because the setting Google is presenting only describes using AI on your data, not training AI on your data.
Is there any confirmation that this is actually happening, and not just a social media panic?
Not that I’ve seen, no.