In this episode of the DevOps Sauna, Darren and Pinja discuss the impact of artificial intelligence and bot accounts on social media platforms like X and Bluesky and the wider implications AI will have on these platforms in the future.
[Darren] (0:06 - 0:25)
They're going to start inserting AI-generated content into your feeds, like deliberately.
Welcome to the DevOps Sauna, the podcast where we deep dive into the world of DevOps, platform engineering, security, and more as we explore the future of development.
[Pinja] (0:25 - 0:35)
Join us as we dive into the heart of DevOps, one story at a time. Whether you're a seasoned practitioner or only starting your DevOps journey, we're happy to welcome you into the DevOps Sauna.
[Darren] (0:48 - 0:53)
Welcome back to the DevOps Sauna. I'm here with Pinja. How's things going today, Pinja?
[Pinja] (0:53 - 0:58)
Hi, things are okay for me. It's a nice sunny day here in Helsinki. How about you, Darren?
[Darren] (0:59 - 1:10)
Pretty good. I, like 23 million other people, have been diving into Bluesky lately to see what the Twitter clone-slash-competitor has to offer.
[Pinja] (1:10 - 1:20)
Yeah, the new service is a competitor not only to the platform formerly known as Twitter, now X, but also to Threads from Meta. So what's the difference here?
[Darren] (1:21 - 1:50)
It's actually a tool that's kind of similar to Mastodon, if you were ever familiar with that. Mastodon was the very tech-oriented version, and Bluesky has a similar federated style. But I think the thing that's most important to talk about is bots.
Because over the last couple of years, we've seen this explosion of AI, and I don't think it's any clearer than in these kinds of faked interactions we see on social media.
[Pinja] (1:50 - 2:12)
No, it is even harder nowadays to spot what is a bot and what is not. I dove into this subject a couple of days ago, and I was looking into the guidance documents. I think they're pretty outdated at the moment, to be honest.
They're from a few years ago. But now, as I say, with the rise of AI, the situation has changed quite a lot.
[Darren] (2:12 - 2:55)
Yes, we now have these LLM systems, large language models. It was the same with phishing attacks. Phishing attacks used to be very easily detectable because they used poor language. They had weird spelling issues. They had punctuation in the wrong places or doubled up.
It's like they weren't proofreading anything they put out, and that gave us a great clue of “Hey, this is maybe not what we expect it to be”. But now, anyone can pipe the output of OpenAI's ChatGPT into their bot and have a perfectly fluent English-speaking bot, and people will struggle to pick those out from regular people talking.
[Pinja] (2:56 - 3:17)
Yeah, that's true. And previously, in my native language Finnish, it was even worse than for English. But nowadays, even for Finnish, telling them apart is pretty difficult, actually.
Even if we compare written Finnish with what the LLMs are producing, it is really difficult to spot the difference at the moment.
[Darren] (3:18 - 3:25)
I guess the only advantage there is that, from what I hear, no Finns actually speak with written Finnish. No one talks like that.
[Pinja] (3:26 - 3:35)
No, nobody speaks like that. Nobody writes like that, especially on social media, and nobody's that polite, like the AI bots are at the moment.
[Darren] (3:35 - 4:14)
Yeah, it seems to be a similar case with German, because one thing I stumbled across while exploring Bluesky was these bots they've dubbed reply-as-a-service bots. They seem to be bots whose only purpose is to scan for comments on whatever topic and disagree in the replies. It doesn't matter what it is, they'll just disagree and say something kind of generic.
It doesn't really say much, but it's enough to cause this kind of discord, this kind of clash, as they come together.
[Pinja] (4:15 - 4:59)
Yeah, and Germany is facing its federal election now, a little bit ahead of time in February, so that is why this topic has come up, and it is a very timely topic for Germany as well. And that relates to one of the points you just made. One thing about the comments now raised on Bluesky was that they were very generic.
It felt like the bots had not actually read the content and the context they were replying to. Well, them being bots, that makes a lot of sense. But the commenters also said, yeah, nobody's that polite, even in German.
So the language struck many people in these comments as a sign that this cannot be a real person, but they're disrupting the conversation nonetheless.
[Darren] (4:59 - 5:10)
Yeah, neither that polite nor that accurate, because, again, it's one of those languages where words somehow end up 700 letters long, and no one can figure out which order is the correct one, so.
[Pinja] (5:11 - 5:23)
Yeah, yeah, exactly. One thing the AI bots are not doing, at the moment at least, is picking up local dialects and local variations of words.
So that's one of the telltale signs.
[Darren] (5:23 - 5:33)
Yeah, regional dialects are just complicated. They've been the bane of my existence in Finland, when I try to talk to someone from Oulu and they start saying words I've never heard before.
So, yeah.
[Pinja] (5:33 - 6:04)
I actually tried this yesterday. I asked ChatGPT in Finnish to tell me something about the eastern parts of Finland in the Savo region dialect. It did a pretty good job, actually.
It was detectable for a Finn to see that, yes, this was Savo dialect. But when it comes to these AI-generated bot responses, those dialects, whether in Finnish, English, or in this case German, are not being used. That's why the written language strikes people as odd.
[Darren] (6:05 - 6:47)
Yeah. But one thing I find kind of fascinating about Bluesky compared to any other social media network at the moment is that, in the way they've federated everything, they've also kind of democratized moderation. Bluesky has features like moderation lists and block lists, and these are lists that are predominantly made by users.
So they have this kind of community that's working towards identifying bots, flagging them, and adding them to moderation lists. And it was actually kind of impressive for me to see a community working together to, basically, improve the platform they're invested in.
[Pinja] (6:47 - 6:59)
Yeah. I saw that people are now sharing these lists and warning one another of the bots. So, please add this to your own lists.
So, the community is improving the service here.
[Darren] (6:59 - 7:21)
Yeah. The community is improving the service, but they're also spreading information: sharing how to detect bots, how to cast a critical eye over the followers you're receiving, how to examine users. If an account popped up three hours ago and is already following 4,000 accounts, you can probably assume that's a bot.
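That rule of thumb can be sketched in a few lines of Python. The 50-follows-per-hour threshold here is purely an illustrative assumption, not an official Bluesky heuristic:

```python
from datetime import datetime, timedelta, timezone

def looks_like_bot(created_at: datetime, follows: int,
                   max_follow_rate_per_hour: float = 50.0) -> bool:
    """Flag an account whose follow rate since creation is implausibly high.

    Note: the 50-follows-per-hour threshold is an illustrative guess,
    not an official Bluesky heuristic.
    """
    age_hours = (datetime.now(timezone.utc) - created_at).total_seconds() / 3600
    age_hours = max(age_hours, 0.01)  # avoid division by zero for brand-new accounts
    return follows / age_hours > max_follow_rate_per_hour

# An account created three hours ago that already follows 4,000 others:
fresh = datetime.now(timezone.utc) - timedelta(hours=3)
print(looks_like_bot(fresh, 4000))  # True: roughly 1,333 follows per hour
```

A rate-based check like this is more robust than a raw follower count, since an old, legitimate account can accumulate thousands of follows over years without ever looking bot-like.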
[Pinja] (7:22 - 7:45)
Yeah, that's one of the telltale signs. The lack of personal engagement is also one. Somebody pointed out the profile pictures and the account descriptions as well.
So, there was one example where the account description said “I enjoy walking on the beach, I enjoy the sea,” and then the profile picture was this very clearly AI-generated photo of waves.
[Darren] (7:45 - 8:14)
Yeah, I can imagine that one. There's also another one: I read a comment that people putting “no DMs” in their profile is a sign of being a bot, because you can always turn DMs off in Bluesky. So a real person doesn't need to type that.
They just need to set the moderation settings themselves. It's kind of like the people making bots haven't stepped away from their script. They're still using scripts they would have had from old platforms.
[Pinja] (8:14 - 8:20)
Yeah, that's a really good sign to be aware of. But this is not only a Bluesky problem, is it, Darren?
[Darren] (8:21 - 8:42)
No, no. This is a huge problem. It's been a massive problem for every social media network.
And again, this comes from one of the posters on Bluesky who put together a list of the number of bot accounts, and I was kind of staggered by the numbers. In the pre-Musk era, Twitter was removing a million accounts a day.
[Pinja] (8:42 - 8:51)
A million a day. I think that gives us the scale of the numbers here, how many bots and spam accounts we were talking about at that time.
[Darren] (8:51 - 9:33)
Yeah, it comes down to scale. Let's get into scale in a moment, because there are some other networks I think we can highlight. Take Discord: something like 8 million accounts removed in Q2, compared to 300,000 for policy violations.
So that's 8 million accounts removed for spamming and 300,000 for policy violations. And we see that on every platform. On Reddit, for example, 173 million pieces of content were removed from July to December 2023, and 70% of those were due to spam.
173 million pieces of content.
[Pinja] (9:33 - 9:48)
That's a huge number. If we think about how many posts are made on Reddit per day, it's not per se the largest share, of course, but it's still a significant amount of bot-related and spam-related posts.
[Darren] (9:48 - 10:52)
Yep. And we see it on what I wouldn't call traditional social media, too. We see it on YouTube, where they removed 15 million channels and 104 million videos in total in Q1 of 2024.
96% of those were spam bots. Wow. And the reason they do this, as I said, is scale.
So, what we know is that data is valuable, and being able to reach people is valuable. So, we know about spam bots. Everyone understands spam bots.
They are usually advertising illicit services or something weird to try and get clicks, to try and get malware on your system. But I feel like there is a hidden type of bot: the data-gathering bot. Because being able to gather data from social media networks is what propelled Meta to become one of the richest companies on the planet.
The sale of people's data at scale is a comically large business.
[Pinja] (10:52 - 11:04)
And this is actually, if we think of how LLMs are run and what their main input is, it's data. So, again, this whole loop seems to go around.
[Darren] (11:04 - 11:55)
It does lead us to the idea of AI stagnation, though. It will get to the point where, if more of the data coming into the LLMs has actually been generated by LLMs, they won't actually improve. They'll hit a plateau, and then all they can do is decline, because original content is no longer being added.
And there's actually, I would say, a kind of problematic idea behind that, because we're talking about Bluesky and how they've democratized moderation and how they've tried to improve things. And then if we look at Meta: at Meta Connect in September, I think, Mark Zuckerberg literally announced the idea of AI personas, saying they're going to start inserting AI-generated content into your feeds, like deliberately.
[Pinja] (11:56 - 12:41)
Yeah, and it will also be available to other content creators, so that you can have your AI twin, basically. Take a content creator from Instagram or Facebook creating an AI twin of themselves, and that twin actually being available for, I'm going to use quote marks, “personal engagement” with the followers. Some people would like to have more time with their favorite creators, and an AI-created twin of that creator might allow them to be online 24/7.
It might allow them to respond faster to all the messages coming from the followers. But is it something that people will want?
[Darren] (12:41 - 13:27)
That's a great question. And not only that question, but is it something people will notice? Because right now, the EU is the only place that has any kind of law to prevent this kind of thing.
The EU AI Act requires, as far as I'm aware, the labeling of AI-generated content, meaning the EU is one of the only places that will be protected from this kind of, as you say, 24/7 AI-based availability of content creators. So yeah, it will be easy to spot in the EU because it will have a notice at the bottom saying generated by AI. But for the 350 million people in the US and the billions accessing from everywhere else in the world, it's a pathway to manipulation.
[Pinja] (13:28 - 14:06)
Yeah. And as you said, the stagnation of AI, of course, because it feeds on the data it gathers itself. So how real is the data provided?
How much can we trust the content and data provided by LLMs at the moment? Right now, we know that many people are using, for example, ChatGPT to generate ideas and data for themselves. But if AI bots become available at a larger scale, what is that going to be like?
Will that help us in the long run? How can we navigate around this? Those are the questions I have, at least at the moment.
[Darren] (14:06 - 14:56)
Yeah, I honestly wish I had answers for those. I feel like the idea of, well, let's say social media without the social aspect is just pandering to this whole loop. I assume everyone has seen and understood the idea behind social networks: how they basically are these dopamine loops that keep you scrolling and keep you clicking, just so you get that little hit of dopamine from the like, from the response.
The idea of introducing AI into this mix, and not just AI but legitimized AI tooling that replaces actual social interaction, basically turns social media into dopamine farms: keeping someone staring at the screen for as long as possible with minimum input.
[Pinja] (14:56 - 15:09)
Yeah, that's a really powerful point, because there is also the hunter instinct in human beings and how that creates dopamine, right? That's one of the reasons why we get looped in.
[Darren] (15:10 - 15:55)
Yeah, the idea of the hunter instinct. And it's kind of curious, because we've had AI systems before. I don't know if you're familiar with the uncanny valley. It's basically the idea that something that looks human but isn't human is somehow repellent to human beings.
And there are lots of fascinating theories around this. My favorite is: why does this exist as an evolutionary instinct? As in, why did humans need to be terrified of something that looked human in the past?
But that's a different discussion. The question here is, will interacting purely with AI trigger that kind of feeling of unpleasantness?
[Pinja] (15:55 - 16:23)
Yeah, and then we can turn this around. Let's take social media, for example, here. Let's take a content creator on social media.
Will it be an advantage in the future if there's a tag saying no AI has been used? A couple of years ago, what trended was the hashtag #NoFilter on photos. Now, is #NoAI the next thing?
[Darren] (16:23 - 17:14)
I mean, it will be a requirement in Europe, at least, and apparently it will be a selling point.
And it actually leads us to this thing that's happening in Norway. I read about this idea for social media in Norway.
In Finland, we have strong authentication, where you basically use bank credentials to log in and prove you are who you say you are. And now it seems that in Norway, someone has built a social media network that requires a similar proof of identity. So they are guaranteeing no bots.
They are guaranteeing full human interaction on that service. And that seems like it will insert some much-needed accountability that hasn't really existed on the internet for, I don't know, maybe a couple of decades.
[Pinja] (17:14 - 18:26)
And in a similar manner, there's a startup in Finland called Koll. It's spelled with a K: K-O-L-L. Their idea is that calls in the app are always made with full identity, not just a number.
In this sense, you would no longer get calls from bots, from any marketing bots, for example. You're also able to input the reason why you're calling somebody, because nowadays many younger people do not pick up their calls, by the way. So in Koll, you would have the face and the name, and you would always know that this is the person calling me.
In a similar manner, with Hudd.no, you would always have the ability to know who you're dealing with. So then comes the question you raised: accountability being added to the internet and to our experience of being online. There are many people who, of course, take advantage of the anonymity that we have.
But what about when people start saying, what about free speech? Of course, those elements come into play when we talk about everybody being identifiable, everybody having only one account, always linked to their own verified identity.
[Darren] (18:26 - 19:39)
Yeah, but it's a good discussion, because there is the privacy concern. And while this is isolated to something like one optional social media network, I think that's a great idea. It gets harder when you try to apply it to the whole internet at large. You know, I'm a security specialist, which means I occasionally have to go to weird websites and download weird malware.
And if I then end up on some watch list because I've gone to a weird website and downloaded weird malware, that makes my job quite difficult. But I do think both of these, the Koll app and Hudd.no, are good steps towards accountability. And that's what we're seeing in Bluesky too.
But in Bluesky, it's being led by community effort, which actually gives me some hope: people pushing for these things because the community wants it, because the community wants to make sure there aren't bots. And it's not that Bluesky is against bots. Bluesky has an API, it has documentation, it has examples, and it just has rules. They say make sure your bot only interacts if people specifically tag it, and make sure it's stated that it's a bot.
Bots are not unwelcome there, but bots pretending to be people are.
[Pinja] (19:39 - 20:07)
Yeah, we use a lot of bots in our day-to-day life. Many companies have chatbots, for example. In Slack, we might have some bots helping our day-to-day work, but we always know it's a bot.
So, as you say, in social media it's important to know who's real and who's not. Especially nowadays: if we take the federal election in Germany, the attempts to influence voters and create chaos around it are a real threat in this case.
[Darren] (20:07 - 21:00)
It is, and it's actually never been more of a threat, given the implementation of AI models. Not only is there the content we've talked about, but it's actually extremely easy to create a bot just using OpenAI and ChatGPT. I coded a bot in Python using the AT Proto library for interacting with Bluesky, and it worked in a couple of hours, with some minor debugging and telling ChatGPT how to do basic things. It's so weird.
It does some things extremely well, and then trips up. If you want to do an experiment, go to OpenAI and ask it to draw someone holding an umbrella upside down. It can't comprehend the idea that an umbrella would not be held upwards against the rain.
So, no matter what you do, it will constantly draw the umbrella facing the correct direction.
[Pinja] (21:01 - 21:31)
That seems like a very simple request to fulfill, and it's really staggering how often, when you ask an AI to draw a picture of a human being, you end up with feet instead of hands in the photo, right? So, as you said, it can seem very intelligent and then trip on the simplest requests. So what can a non-technical person do to navigate this?
What can be done in this landscape?
[Darren] (21:31 - 22:54)
That's a very good question, and one I think we've been asking ourselves for every kind of interaction with non-technical people who are being forced to deal with IT that they don't fully understand and shouldn't fully need to. So I think one of the things is: listen to the people around you.
Become part of the community. We'll take Bluesky as the example, because it's what we've been talking about. If you go there, you can find people who will make these lists and instruct you on how to keep yourself safe and how to vet the people following you, to make sure they are people and not bots.
And a lot of the things we see are still the same everywhere. Every bot has to automatically generate a username, right? And the easiest way to do that is by adding a string of numbers to the end of a generic name.
So if you see someone called RobertFranklin849462, you can fairly safely assume that's a bot. And Bluesky is actually doing really well in that they take a kind of shoot-first, ask-questions-later approach: if they suspect a bot, they'll just ban it and wait for an appeal. And obviously, accounts that are bots aren't appealing, so that's working pretty well.
But I think it's difficult. It's always difficult to try and make people understand that the internet is dangerous.
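The RobertFranklin849462 pattern, a generic name with a long run of digits tacked on the end, can be checked with a quick sketch in Python. The four-digit threshold here is an illustrative assumption, not anything Bluesky actually uses:

```python
import re

def suspicious_handle(handle: str, min_digits: int = 4) -> bool:
    """True when the handle ends in a long digit run,
    e.g. 'robertfranklin849462' -- a common auto-generated bot pattern."""
    match = re.search(r"\d+$", handle)  # digits at the very end of the handle
    return match is not None and len(match.group()) >= min_digits

print(suspicious_handle("robertfranklin849462"))  # True: six trailing digits
print(suspicious_handle("pinja.bsky.social"))     # False: no trailing digit run
```

Like any single heuristic, this one has false positives: a real person whose handle ends in a birth year, say darren1984, would also be flagged, which is why such checks work best combined with other signals like account age and follow rate.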
[Pinja] (22:54 - 23:33)
Yeah, especially now with AI and LLMs getting more sophisticated constantly. The guidance documents I mentioned on how to detect a bot seem outdated already, even though the latest ones I found online were only six months old. So that's the threat I fear at the moment: can we keep up with this?
How can we make people aware of that? But the healthy thing, in my opinion, is always to keep that critical sense of thinking, right? When you go online, if something seems too odd, if something seems off, you might want to investigate a little further.
[Darren] (23:33 - 24:02)
Yeah, I think critical thinking will get you far. I do think we're going to have to start seeing answers to these questions relatively soon, because, yeah, AI is accelerating. No one's stopping it.
Very few people are even trying to slow it down. And we're people who work in the tech business, and we're having this discussion. So imagine how this discussion is going for people who don't really understand computers.
It's going to be a difficult few years for them to navigate, I think.
[Pinja] (24:02 - 24:10)
Yeah, the acceleration of technology is now wild. So what we can do is follow the discussions around us.
[Darren] (24:11 - 24:23)
Yep, and hopefully write some of our own. But luckily, we don't have to have any answers now, because this seems like a difficult task to cover. I fully agree.
That's everything we have for the DevOps Sauna today. Thank you, Pinja, for joining me.
[Pinja] (24:24 - 24:25)
Thank you, Darren. It was a pleasure.
[Darren] (24:25 - 24:28)
And we will see you next time. Bye.
[Pinja] (24:28 - 24:35)
Thank you. Bye. We'll now tell you a little bit about who we are.
[Darren] (24:36 - 24:38)
I'm Darren Richardson, Security Consultant at Eficode.
[Pinja] (24:39 - 24:43)
I'm Pinja Kujala. I specialize in Agile and portfolio management topics at Eficode.
[Darren] (24:43 - 24:46)
Thanks for tuning in. We'll catch you next time.
[Pinja] (24:46 - 24:54)
And remember, if you like what you hear, please like, rate, and subscribe on your favorite podcast platform. It means the world to us.