They say AI is now filling up the Internet with gaslighting posts from humans who do not exist and tweets from machines that make stuff up. Nowadays we discuss things with bots, not humans.
Question: How can I stop bots and AI machines from stealing my content and using this info in a twisted fashion against humans, so that the confusion continues?
@off_grid_living@twtxt.net There is not much one can do, other than avoid putting any of it in places that get scraped frequently or that harvest data for other purposes.
You can also check the site https://haveibeentrained.com. There you can see if any of your images are already in the scraped datasets used to train AI. If they are, you can request that they be removed.
@off_grid_living@twtxt.net No problem. The dataset the site searches consists only of images with very detailed text descriptions attached to them, as that’s something all images used for AI training need. Therefore I think this site works more from those descriptions than from recognizing the text on the images themselves. 🤔
@prologic@twtxt.net Google scanned all kinds of books to improve their search results, letting people find books and studies based on any of the text in them, while not making the content of the books freely available, for obvious legal reasons.
Even though that was all they did, it still resulted in a big lawsuit that dragged on forever, was then settled, only to be brought back to court again. Eventually Google won, more or less because their service did more good than harm, both for book sales and for people looking for books to do book things with.
This case was also recently brought up by many when some artists filed a class-action lawsuit against the makers of Stable Diffusion and Midjourney, over their AI being trained on copyrighted images.
This is just a “brief” and maybe not entirely accurate summary, mostly based on this stream by Uncivil Law, who is a real lawyer and surely more qualified to talk about this than I am: https://www.youtube.com/live/CwTWwvLRdeo
@off_grid_living@twtxt.net There is a difference between data scraped for AI training and the advertising companies that track your behavior online, across sites, to target you with more relevant advertisements. Those tracking tools also give the companies that use them statistics about how people use their sites, so they can work out what needs to be changed to increase their profits, but no AI is involved in any of that yet.
There might not be much you can do to prevent what you publish online from being scraped to make some AI, but you can fight against targeted advertising and corporate analytics by hardening your browser.
I think it’s best to combine things like adblockers, script blockers/filters, incognito modes, settings that delete everything when the browser is closed, not allowing unnecessary cookies, and logging out of services right after you’re done using them (unless you are sure they don’t track you).
There are more extreme measures too, but those are a bit overkill for normal web browsing, in my opinion.
@off_grid_living@twtxt.net By more extreme measures, I meant the things that Tor Browser does. For all that is holy, or unholy, please don’t ever return to Internet Explorer. It has destroyed the sanity of enough web developers as is, and we can’t risk its user numbers ever increasing. Never!