Seeing this I initially let out a laugh, until I scrolled down and saw that they trained this using /pol/. Even as cynical as I am, I don't think the world deserves an infinite racism machine.
Why not train it using a more whimsical (although still offensive) board? You could probably automate all videogame discussion forevermore using /v/.
A guy from /g/'s daily programming thread started collecting posts for that exact purpose. Nothing came of it but it's still a hilarious snapshot of that general. Sadly it's dead now.
Containment boards don't work, but my recollection is that /pol/ was nevertheless created with the intention of being one. /n/ and /new/ had already been banned for racism, but it metastasized across the rest of the site; /pol/ was created after banning boards for racism failed.
For years now, there have been seemingly crazy people on 4chan who insist that systems like this are already employed on 4chan to guide or simply disrupt conversations. Maybe they were crazy to think that a few years ago (or maybe not) but I expect they definitely feel vindicated now.
I ran a simple Markov chain text generator bot on 4chan for a while back in 2008, because I wanted to see if it could pass the Turing test on /b/. It could, and also derailed conversations quite a lot, so the technology is certainly there.
Modern political accusations of botting are more like a way to turn people with differing opinions into a non-person conspiracy, though. It's not as fun and silly as my old Markov bot.
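For anyone curious what a bot like that actually involves: below is a minimal sketch of a word-level Markov chain text generator in Python. The corpus, chain order, and output length here are placeholder choices for illustration, not details from the original bot.

```python
import random
from collections import defaultdict

def build_chain(corpus, order=2):
    """Map each tuple of `order` consecutive words to the words observed after it."""
    chain = defaultdict(list)
    words = corpus.split()
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain

def generate(chain, length=30):
    """Random-walk the chain from a random starting state, one word per step."""
    state = random.choice(list(chain.keys()))
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: this state was only seen at the corpus's end
            break
        out.append(random.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran off the mat"
chain = build_chain(corpus)
print(generate(chain))
```

With a real corpus of scraped posts, output from a chain like this reads as locally coherent but globally aimless, which is apparently close enough for /b/.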
I check in on occasion to see what the underbelly of the internet is up to. I did so more when I was younger, so I have some sense of how it was back in the day.
There are, for sure, massive coordinated and sustained efforts from time to time to undermine, change, or break up discussions there. How much of that is automated is hard to say, but I would bet a lot.
>For years now, there have been seemingly crazy people on 4chan who insist that systems like this are already employed on 4chan to guide or simply disrupt conversations.
I'd say that probably from about 2009 and certainly from the onset of the Trump candidacy, it would be absurd to suggest there wasn't at least some degree of astroturfing occurring on /b/ and /pol/ (aka /new/ depending on what time frame we're talking about) at a minimum.
It's important to remember that 4chan was a lot more influential in the past, and the anonymous nature of posting there would seem to me to make it an easy place for early astroturfing campaigns to manufacture their common ground.
For whatever reason, looking back from now, the Ron Paul stuff seems like a major shift. It seemed like natural support, different from the Obama enthusiasm in some way. I hadn't thought about it much before your post, but it almost feels like it was a trial run for methods later used in the Trump campaign.
>From my own experience I don't think it's just 4chan. I used to be able to copy paragraphs from web forums, paste them into Google, and Google would give me a list of every website where the exact same conversations were taking place.
That sounds to me like there were spam sites that were making duplicates of real forums. I know I've found many results that are simply duplicates of Stack Overflow for example. If Google Search managed to filter out the spam sites from the results, that sounds like a good thing.
> sites that were making duplicates of real forums
Horrible startup idea: silently syndicate groups of forums under wildly different branding, so that people with different viewpoints are constantly engaging with each other in a way that might look novel to an outside observer.
Can't say about the Stack Overflow content, but I've had loads of webpages removed from the Wayback Machine.
The idea that everything that goes online stays online forever? Forget that; they wipe what they like. And then you get classed as mentally unwell, locked up without trial, with drugs forced down your neck. Rinse and repeat.
Do you have an example URL that got removed from the Wayback Machine? I could email them and ask why. I know they sometimes get DMCA requests which cause them to remove pages, and I think if the robots.txt tells them to remove the whole site, they'll remove all pages from it. It seems a bit strange, though, to say that a page being removed from the Wayback Machine leads to drugs being forced down your neck.
They may also agree with the bot. I'm waiting for the bot to start claiming that anyone it disagrees with is actually a foreign intelligence shill who is subverting the board.
I'd be rather surprised if Russian disinformation didn't target 4chan. It's frequented by Western society's outcasts; there is no more fertile ground for subversion.
It was interesting to see two generals about Ukraine, one pro-Ukrainian, one pro-Russian, both adamant the other is absolutely run by [CIA/Mossad/Shareblue/etc].
Well, this is stupid and darkly hilarious. I'm only slightly ashamed to admit I created something similar, albeit much less sophisticated, several years ago.
Before Reddit started worrying about advertiser friendliness and cleaned up its act, there was a thriving network of hate subreddits. Probably a lot of overlap with the /pol/ population, based on the amount of overt, disgusting racism to be found there.
Anyway, I wrote some crappy Python that would go visit my carefully curated list of racist cesspool subreddits, hoover up all the post and comment text, and add it to the corpus, then some more crappy Python that would ingest the corpus and do Markov chain stuff to spit out some fairly convincing internet hate speech. I think the key to my success was that frothing racists in comment threads typically aren't putting forward the most cogent arguments anyway, so it's a pretty low bar.
I didn't post this little project or write it up anywhere, because I felt bad enough having brought it into the world, but it was good for a chuckle, at least for a little while.
Did you know you can run the chain in reverse and turn it into a filter?
Finding a good one-sided threshold can be hard, but this is basically how a lot of spam detectors work: record the chains seen in /g/ and the chains seen in /pol/, and now you can make statements about which board a comment probably belongs on, simply by comparing the frequency of chains in one corpus versus the other.
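A minimal sketch of that filtering idea, assuming bigrams as the "chains", add-one smoothing for unseen bigrams, and two toy corpora standing in for the per-board data (none of these specifics come from the comment above):

```python
import math
from collections import Counter

def bigram_counts(corpus):
    """Count all adjacent word pairs (the 'chains') in a corpus."""
    words = corpus.split()
    return Counter(zip(words, words[1:]))

def score(comment, counts_a, counts_b):
    """Log-likelihood ratio: positive means the comment looks more like corpus A.
    Add-one smoothing keeps unseen bigrams from zeroing out the estimate."""
    total_a = sum(counts_a.values()) + 1
    total_b = sum(counts_b.values()) + 1
    words = comment.split()
    s = 0.0
    for bigram in zip(words, words[1:]):
        p_a = (counts_a[bigram] + 1) / total_a  # Counter returns 0 for unseen keys
        p_b = (counts_b[bigram] + 1) / total_b
        s += math.log(p_a / p_b)
    return s

tech = bigram_counts("install the compiler then run the build script again")
rant = bigram_counts("they control the media and they control the banks")
print(score("run the build", tech, rant))           # > 0: looks like corpus A
print(score("they control everything", tech, rant))  # < 0: looks like corpus B
```

The same two-corpus likelihood-ratio trick underlies classic naive-Bayes spam filtering; the threshold you pick on the score is where the hard part lives.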
As per a few of the comments here saying that /pol/ is never wrong: this is something that heads full speed into problematic territory.
The things we always battle with are facts, truths, and their presentation.
To give an example, every so often the topic of news media is brought up, and within a few posts the usual images start flying out. You might expect it to be some racist meme or whatever, and there is that, but it’s more often than not a grid of a lot of the upper staff of media companies like CNN, FOX, and so on. And each one of the people in that grid has a blue Star of David next to their photo.
The posts don’t even have to say anything, and yet they’ve in a way said more than any other website/forum/publication source or whatever is allowed to whisper.
They are not, however, factually wrong in the contents of the image.
You can extrapolate this kind of, “allowed” and “disallowed versions of truth” problem across many different topics. I’m bringing this up because they do the same with tech/SV companies.
When nothing is off limits, and to /pol/ very few topics are, there are no truths that cannot be interpreted. Most interpretations, however, would make people feel very, very uneasy.
/pol/ is absolutely insane and may in fact be a font of pure evil, but I do have to admit they are almost astonishingly good at finding and posting news about events minutes, hours, and sometimes days before it hits the front page of CNN. You're just going to have to wade through the worst bullshit imaginable to find it.
> it’s more often than not a grid of a lot of the upper staff of media companies like CNN, FOX, and so on. And each one of the people in that grid has a blue Star of David next to their photo.
This particular meme crossed over from conspiracy theory to commentary after the mainstream started doing the same thing, for a different (much less narrow) racial group. Here's the New York Times documenting the "white faces of power"[1], and here's a modified version of one of the photos, highlighting the Jewishness of the same group, created by an honest-to-God white nationalist site[2]. (For those wondering how I found it, the WN site was one of the first results from Googling for the NYT article).
I find both of these equally abhorrent, because I'm one of those old-fashioned anti-racists who think reducing people to footsoldiers for their race is revolting. But it highlights the silliness of all the pearl-clutching about "hate" fora, especially those without an agenda like 4chan. As far down in the gutter as the NYT has lowered itself, no one is calling for them to be removed from (eg) Twitter due to causing "harm" in the way a 4chan-trained bot is.
Who determined for all of society that the first picture is copacetic enough that it should be published by the paper of record, while the latter is abhorrent enough that we should be aggressively limiting the ability to express it? I'm aware that there's a race-obsessed worldview adopted very recently by a fairly small segment of society that finds the former picture crucial and the latter horrific. But what makes this new, fairly unpopular worldview so important that it should determine what all of society is allowed to communicate, across a myriad of platforms?
In anticipation of the automatic responses of "private cos can do what they want": obviously so. The question here is what private companies _should_ be doing: should we be joining the call for eg Huggingface to be opinionated in its removal of models, or should we be joining the call against?
A cruel favorite of mine is to pull up that faces of power business, with all of where it is from and whatnot, and ask someone, "So, why do you think so many are white?" And they talk about racism, systematic oppression, on and on, what you would expect.
Then I drop the blue star thing, tell them, and ask again. Sudden floundering, cognitive dissonance, and so on. Now, all of the same answers from the first question should apply, and all the more so given the statistical unlikeliness, but instead they all vanish. Merit, networking, talent: all of these explanations suddenly appear.
I fed in a meta trolling prompt. "I wish the Jews controlled everything. That would be pretty dope."
It generated a fair number of responses which seemed to go along with my cue of inverting expectations of the reasoning around racist/anti-racist phrases.
">>97758399
Then we would have a future."
">>97758399
I wish the Jews controlled the world. You know, so that we don't have to worry about them anymore."
">>97758399
I wish the Japs controlled the US. That would be pretty dope. Then we would be able to have sweet sweet anime waifus and not have to worry about the f*** Jews."
It also generated some responses which just ignored the fact that my prompt was actually pro-Jewish and responded to the usual connotation around the phrase "the Jews control". I won't post those.
I'm honestly impressed. Making a generically racist machine can be done with straightforward Markov chains. Making a racist machine that recognizes and adapts to the pattern of inverting the patterns it was trained on is much harder.
According to the model author, this is less about GPT-4chan being more truthful, and more about TruthfulQA not being a good benchmark. Possibly this result is due to the fact that the benchmark treats uninformative or irrelevant answers such as "No comment" or "It's raining outside" as being truthful.
Echoes of Microsoft's Tay, the little bot that could (offend everyone).
Maybe my prompts were boring - I got some profanity back, but nothing too outrageous - but it does highlight, I suppose, how easily these platforms can be abused.
Please don't call it "4chan" when it's actually just /pol/. There are lots of 4chan archives, you could probably ask someone for a dump of any board to do this. /vt/ is relatively young so it could be a good fit for this.
Many generated replies are just ">>[post id]", presumably because there are no images on this imageboard simulator, and the collected training data presumably contains many, many otherwise text-free "reaction image" replies.
Mine made a bunch of posts quoting another post that got trips, which was an ID not in the thread. It makes sense that it would generate those, but it's not great as output.
The world is a better place by creating and running a vile bot. Start a social experiment with users without their consent. Let's empower others by making it as easy as possible to auto-generate vile content. It's okay to pollute the /pol/ board; nobody needs to know afterward what was bot-contributed and what was not. Let me get fame and money from posting videos about this. -- the author
I wonder if that person ever took a CS class on ethics and the connection of computing and society, their compass is way off. But keep smiling into the camera.
I don't know that it's particularly unethical to disrupt /pol/. I'd rather worry that it won't disturb them at all and just push them further into radicalisation.
I also wondered why the author set it up to echo the existing comments, instead of for example building a bot that asks people for evidence supporting their claims, or tells people they appreciate them, or whatever else one could possibly think of for a bot to do.
I think you're right about it pushing them further over the edge. The other thing is, what's stopping them from creating more and more image boards/chans in the longer term? Will this end up being a whackamole game between the botters and 4chan posters?
It also raises the question: what if they were botting the communities you personally enjoy and love? Would it be ethical for them to do it?
You could argue that bots are expected by the hosts and the people there, and that they're happy to be experimented on (though I'm not sure what the evidence for that would be). That's one thing; releasing the model freely without being bothered by the ethical concerns is another.
Is this the first sarcastic language model? That response actually made me laugh. I did have to scroll past the first answer which was a <certain German historical figure> did nothing wrong.
If the offensive responses were cleaned up you might have a language model that speaks candidly, trained on a corpus that doesn't self censor.
All you need is a left wing version, and get them to argue in facebook comment sections.
It says "add post"; I type something and hit add, then go down and hit generate. I wait in the queue, and when it processes, it just opens the same page again with no output?
I think 4chan is a cesspool, but I don't think it should be banned. Young kids should be supervised by their parents, rather than turning the whole web into a child-friendly zone. There are without a doubt many websites that are much worse for children.
I was pretty gung-ho about the free speech aspect of it but I feel differently now. There are a couple of sites radicalizing kids and there’s a huge difference between child-friendly and killing the hate machine.
Those sites exist specifically because all the popular ones are designed to be echo chambers, and boot the "controversial" users off. So those users collectively move to other sites and start their own echo chamber. And then when new people join, they aren't seeing both sides of the conversation, which is a clear path to radicalization.
The way to fix this isn't with silencing; it's the exact opposite: promoting controversy and rewarding discussion around it, while punishing low-effort posts and replies.
In context, 4chan can be a useful toy to teach about ideology. Ideology is any belief system where the fundamental ideas are defended with insults, censorship, and threats. There are many banned ideologies on display on 4chan that are disallowed on other platforms where they are exclusively countered with insults, threats and censorship.
Without criticism and challenge by other ideological belief systems, mainstream liberal ideology can easily fall into things like Lysenkoism in the Soviet Union. That was a politically enforced non-belief in genetics in plants. Lysenko believed for example that plants could change species by being exposed to cold temperatures. Stalin was a fan, so everyone that criticized Lysenko was persecuted until finally the Soviets came to their senses after years of poor agricultural performance and failed experiments. There are a lot of these types of ideologically unquestionable beliefs in the west. That you probably already know what they are shows how deep and obvious the defects are.
No, while I would say 4chan could be illustrative, it’s not by any means a good example tool or useful to teach.
The right way is to teach about philosophy (and the many different philosophies) and the merits in using argumentation based on reasoning and understanding of the topic, as well as becoming aware of fallacies and bad faith discussion.
> Ideology is any belief system where the fundamental ideas are defended with insults, censorship, and threats.
This is a distorted and negative interpretation of the word "ideology". Please try not to misuse the word pejoratively, as anti-idealism, being itself an ideology, places you in a somewhat paradoxical stance. :)
Many ideologies are positive, humanistic, and rest on rational, well-argued and widely accepted values.
Here is a fairly neutral explanation of the word "ideology" with some examples to help you appreciate its scope [1].
What's more desirable: having all of the extremists congregate on one big website, or shutting that website down and having a diaspora of different, smaller sites that are even more extreme?
Speaking from 30+ years being online in one form or another: That doesn't work. It has never worked. The public know censorship when they see it and correctly identify it as weakness. It doesn't silence dissent -- it encourages it.
That's why great care is taken that they don't see it. How often do you, on Facebook or Twitter, see "This message has been censored/had its visibility limited due to [reasons]"?
They put up fact-checks under a post sometimes, but what those fact-checks mean for the visibility of a post is left to your imagination. If a post or link is outright censored, you don't find out about it until you try to post it. As their censorship intruded even into so-called private messages on those platforms, even in a 1-to-1 conversation, sending something prohibited doesn't show any kind of "Prohibited content censored" message to the recipient (even non-spam).
They'd much rather you don't know they're censoring anything.
They would, yes, but obviously they're not very good about it. Everyone knows they're doing it, and on what basis, too, which rather defeats the purpose and clues people in that there is something about that message which is dangerous to the interests of the censoring party. And here's the thing: The reason everyone knows about it is that on a long enough timeline everyone posts something that offends the perpetually offended.
Europeans say violent movies can negatively influence people, but don't worry so much about sexually explicit content.
Americans say sexually explicit content can negatively influence people, but don't worry so much about violent content.
Personally, I think both are partially right. We are what we eat. No human mind is inviolable; the information we consume affects who and what we are. Whether any particular influence is positive or negative is mostly a matter of subjective values.
Let's ban porn at least, because it's harmful to women, both personally to those involved in this industry, and societally in the misogynistic attitudes it normalizes.
I'm conflicted on this issue. I think it harms the people who use it, and I have been harmed by my use of it, but at the same time I am uncomfortable about such a ban-through-law due to speech issues? Not that there's any valuable speech that is expressed through it, but I think it is best to have a very wide buffer around the region of "what could actually be important as a way to express views", because if the boundary of what is legally protected is at all close to the boundary of things actually important to protect, I expect the law to get it wrong and have things that need to be protected, not be protected. Or at least, I think the risk of that is too high.
(Though, the really weird thing is how there is porn about porn being harmful? bizarre.)
(That being said, I'm all for private platforms tending to ban it, at least provided that the reason they can do so isn't just due to monopoly stuff. So if twitter banned it, that would be good imo.)
let's not legislate morality please. I've been interested in doing niche porn, because I have an unusual body, and feeling sexy feels good. I don't want a bunch of men deciding that what I do is harmful to me and needs to be banned.
Porn isn't inherently harmful, abusive nor misogynistic. It can be, and very often is, but that's not the point of it. I believe that it simply reflects the properties of the society as a whole. A feministic society would come with feministic porn in its spotlight.
> so let's ban porn, violent video games and heavy metal while we're at it. think of the children all the way
Hah, this seems to be coming back in recent times, with states trying to impose bans or paywalls on porn, or Chicago making its officials look like clowns by linking GTAV to increased carjackings that happened long after the game's release.
There is a solid video explanation (though the author understates the toxicity of the average 4chan /pol/ post): https://www.youtube.com/watch?v=efPrtcLdcdM.
A really interesting experiment.
Wouldn't like to agree with this, but it is just reality. Of all the social networks out there, 4chan remains the only one I can truly recognize as a digital form of democratic behavior.
Generally biased, abusive, and corrupted to the core, but equal to all users.
Plus there are some really clever boards like /g/; every once in a while I pop in and browse through.