AmazonBot - What is Bezos up to ?

Yet another shitty corporate bot sucking up data and probably for another LLM

Every time this happens I get unreasonably upset about it. The people at a company that decides things on the internet are theirs to hoover up, repackage and resell are the same kind of bros who think consent is given because you were too drunk to say no at a frat party.

I don’t have a problem with crawlers as long as their intentions are right. The internet is for sharing: I want to share my things with the world, and if your bot comes to index my site for search results, that’s cool. I allow plenty of crawlers that I know of to come as often as they like, and in fact I encourage projects like the Internet Archive to take what they want.

# Allow the Internet Archive's crawlers to access my whole site
User-agent: ia_archiver
Allow: /
User-agent: archive.org_bot
Allow: /

The act of having a .fr domain already says yes, I want the BnF to take a copy and preserve it, even the cringe shit from 2001.

What I absolutely take exception to is assholes like Sam Altman crying that they should be able to fuck over everybody because their business of making plagiarising machines doesn’t work if people don’t want their work to be plagiarised.

PISS

Back in 2022 I found that Salesforce had a bot that went wild on one of my sites and started trying to break into the comments section. It took months to get an answer from them, and I had to threaten them on two fronts: copyright and GDPR. The work involved is unreal, though most of the time they’ll back down, as my copyright policy is pretty clear: Attribution, Non Commercial, Share Alike. GDPR is also a clear one if they’re collecting information to build profiles without consent, which is exactly what Salesforce does.

But again, this takes my time and I shouldn’t have to do this. I shouldn’t have to figure out which asshole bots are being created daily. Our dipshit governments, far too busy trying to break encryption, should be coming up with a legal framework that puts a limit on what companies can take and allows legitimate uses such as research, while forcing an opt-in for anything that isn’t a simple search index. There are much better people on this planet who could lay out the pros and cons of this kind of thing and write a proposal that respects the openness of the internet while restricting the capitalist theft machines.

But what is BezosBot up to ? Probably wants us all to piss in bottles, but Amazon claims that “Amazonbot is Amazon’s web crawler used to improve our services, such as enabling Alexa to answer even more questions for customers.”

I don’t even know what that means

We can guess that trawling my blog at 3 in the morning and hoovering up my entire LiveJournal archive has absolutely nothing to do with shopping tips, and is more likely a sign that the world’s data monster is trying to make line go up with its own rendition of an LLM.

In short, this means I have to send yet another stern email to force them to delete everything they just captured and never do it again.
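
The robots.txt half of this is the easy bit. A minimal sketch, assuming the crawler really does announce itself as Amazonbot and honours robots.txt the way Amazon says it does; it only covers future visits and does nothing about what has already been taken:

# Tell Amazon's crawler to stay out of the whole site
# (assumes the bot identifies itself as "Amazonbot" and respects this file)
User-agent: Amazonbot
Disallow: /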

Ugh