Why AI safety is like a bolt in a croissant
Jacob Ward on technological progress, casino gambling, and how to make AI safe for humanity
Three years after OpenAI debuted ChatGPT in late 2022, AI technologies have gone from a curiosity among academic scientists to one of the most popular products ever shipped. Billions of people now use AI for everything from sundry amusements to mission-critical applications, and it has started to diffuse into nearly every industry imaginable. But along with such power comes great responsibility, or at least one would hope.
Jacob Ward — the former editor of Popular Science, long-time tech correspondent, host of the podcast “Rip Current” and author of the popular book The Loop — is skeptical. Drawing on his own reporting and personal experience, he sees AI’s addictive qualities and its lack of safety as a serious challenge for regulators and society as a whole. He frames that challenge through the contrasting cultures of software and hardware engineers: software says, “if we ship, then we’re going to sort it out,” while hardware knows that “scale compounds your problems.”
We talk about biases and decision-making, the connections between AI and casino gambling, why LLMs are like experimenting on people in the wild, how to think about regulating edge cases, ex-ante legal frameworks, Nita Farahany’s idea of cognitive liberty and why product enthusiasm is not a substitute for safety.
This conversation has been edited for length and clarity. For the full version, please subscribe to our podcast.
Danny Crichton:
You’ve described your work as the “Black Mirror” of Thinking, Fast and Slow. So for listeners who haven’t read your book, The Loop, or know your body of work, talk a little bit about your main thesis.
Jacob Ward:
Sure. I was part of a documentary production that PBS did. We got a couple million dollars from the National Science Foundation to create a kind of crash course on Daniel Kahneman’s work and the constellation of research that came out of it. That show was called “Hacking Your Mind.” And for me, it was a totally life-changing experience. Not only were we meeting all of these incredibly important thinkers in behavioral science, but I was also kind of a guinea pig.
I was the prototypical person who had no idea how automatic my thinking was.
The conceit of the show was here’s a guy who thinks he’s immune to all of this stuff, right? He makes his own choices and has no biases. And then they subjected me to test after test after test, and I did exactly as anybody else would have done. I was the prototypical person who had no idea how automatic my thinking was. I realized how much of it was out of my control and based on these heuristics that Kahneman and others had identified.
At the same time, I was also a technology correspondent for Al Jazeera and at NBC. In my day job as a tech correspondent, I was bumping into company after company after company using these pre-transformer systems — machine learning, neural networks, human-reinforcement learning — to try to identify behavior in people and predict (and sometimes influence) their choices. Some companies were trying to help people lose weight or save money — there were some positive paternalistic ideas there. But in many cases, I was finding people who were working for big gambling companies and trying to shape people’s behavior that way.
As I learned more and more about transformer models — which then came along and made current large language models possible — I suddenly realized, “Oh, we’re about to enter a world in which you have all of these behavioral science findings creating a kind of manual for how people make choices.” And then you also have a whole world of pattern recognition systems that can pick apart that data in ways that human researchers would never be able to and come to predictive conclusions about how we’re going to behave.
And if I knew anything from 20 years as a correspondent studying how businesses make choices, it was that people with the best of intentions wind up preying on human vulnerabilities if they start to run out of funding and need to come to a minimally viable product as quickly as they can.
People with the best of intentions wind up preying on human vulnerabilities if they start to run out of funding.
Suddenly I was like, “Wow, this is an emergency.” Where a lot of people worry about the “Terminator” possibility, I worry about the “Idiocracy” possibility. I worry we are going to wind up amplifying our most ancient, primitive instincts and become, if we’re not careful, a more primitive version of ourselves.
Danny Crichton:
When I think about the last 15 years, I see two sides of the coin. On one hand, you have a neuro side focused on gamification and taking insights from behavioral psychology. For example, the idea of offering variable rewards, to which humans respond very, very well. And then on the other side, about 15 years ago, Nicholas Carr was working on The Shallows. More recently, Shoshana Zuboff wrote about surveillance capitalism, and Karen Hao came out this year with Empire of AI.
So there are two sides. One is like, “Here’s how to make money — a guide for how to do it effectively, efficiently.” And then there is an increasing number of books on the shattering of the attentional commons and the inability of folks to focus.
When you think about all this in 2025, three years after ChatGPT came out, how do you start to think about the impact that’s having?
Jacob Ward:
One of the things I’m most concerned about is that there are some very dated assumptions about how problems with something like this get worked out. What I come back to all the time is the idea that people experimenting on other people in the wild is kind of the essence of a lot of these large language models. The companies behind them have the idea that we’re just going to get it out there. If we ship, then we’re going to sort it out. And that has to do with, I think, people running these companies who came from software. They are people who believe scale solves your problems — over time, scale will work out the bugs.
Hardware people instead will tell you that scale compounds your problems. I once interviewed the CEO of Midjourney and was asking him about some of the problems he’d been seeing on his platform, how much porn was being developed, how much violent porn was out there, all kinds of stuff. And he basically said, “Well, I take this attitude that people are generally good and we have to act on that assumption and we’ll sort of work it out over time.”
And then he said, just sort of thinking out loud, “If I’m somebody who makes muffins — I’ve made 10,000 muffins — and somebody gets food poisoning, am I supposed to stop making muffins?” And there was this awkward silence between us. I was like, “Yeah, dude. Yeah, you’re supposed to stop making muffins. That’s what the FDA would say. You’ve got to find out what’s wrong. An inspector is going to come to the facility. That’s the whole point.”
They are people who believe scale solves your problems… over time, scale will work out the bugs. Hardware people will tell you that scale compounds your problems.
I mean, I’ve gone to big industrial bakeries. There’s one here in the Bay Area called Semifreddi’s. I went on a tour of the Semifreddi’s factory. I was there for an inflation story, but at the end of the assembly line, he showed me the metal detector all the products go through. And I was like, “Metal detector? Why do you put croissants and granola through a metal detector?” He’s like, “You’ve seen all of the machinery we have here. Well, a bolt is eventually going to come loose and fall into the dough. And if even one person cracks a molar on a bolt in a croissant, then that’s the end of my business.”
So you’ve got people who are in the business of putting stuff out to 800 million people thinking edge cases are such a small percentage of the total, it doesn’t matter. But meanwhile, you’ve got people who make croissants for a living saying, “The edge case is my responsibility. It’s my fundamental responsibility to make sure that never happens.” And that disconnect for me is where I get really worried.
On Monday this week as we record this episode, OpenAI released a big report on the mental health effects we’re seeing, and they said that 0.15% of their users are developing an outsized emotional attachment to the chatbots. Similar numbers of people are developing or showing signs of psychosis and mania in their interactions with ChatGPT. And 0.07% of people are talking openly about suicide with this stuff.
You could look at that and be like, “Oh, well, that’s a tiny fraction of a percent.” But with a product that reaches 800 million weekly users, that’s 500,000 people a week exhibiting suicidal ideation. That’s more than a million people a week exhibiting signs of psychosis. We have to be thinking about the circuitry of human decision-making as if it is hardware, or at the very least we’ve got to meet the standards we hold food to, because this stuff is really, really important. It’s going to change how we think.
Danny Crichton:
At the same time, we’re also dealing with the fact that, in these cases, we have a single piece of software with a very specific ethical, moral — we’ll even call it religious — point of view. It is Western, it’s built in Silicon Valley. It is generally imbued with the norms of the builders of these platforms, but it’s now deployed everywhere.
I wrote about this a couple of weeks ago — why doesn’t your chatbot recommend religion? It’s not going to do that because it’s been trained not to, even though maybe someone should be counseled that way. So I do think there’s this open question of ethical quandaries as you scale up.
Jacob Ward:
The new fashion for the heads of these companies is to say, “Well, now we’re just a reflection of humanity. And humanity has a bunch of problems. We can’t be expected to solve those problems.”
I think you wind up in a place where you sort of say what a lot of the social media companies say, which is, “Well, at this point, it’s gotten to be so big and unwieldy. We can’t truly take responsibility for it anymore.” The new fashion for the heads of these companies is to say, “Well, now we’re just a reflection of humanity. And humanity has a bunch of problems. We can’t be expected to solve those problems.”
But now that we’re seeing real effects, I’d like to think that attitude could change. And it’s going to have to change; we haven’t even begun to see how powerfully this stuff is going to seal people off socially, seal people off psychologically. It’s going to invite some new assessment of what “harm” legally means.
In this country, we only like to think about financial harm and physical harm. We regulate against money losses and death pretty well, but we don’t like to regulate anything else, and we certainly don’t like to imagine that people are anything short of entirely responsible for their choices. I’m a former drinker, and when I see “drink responsibly” at the bottom of alcohol ads, it makes me crazy. There’s no such thing. That’s why I had to quit drinking.
One thing — and I’ll be curious to hear your perspective on this, Danny — is how ready these companies seem to be to own the short-term horrors that are going to be visited on people on the way to some utopia with AGI that’s curing cancer for us and taking us to Mars and the rest.
It’s weird to hear companies say, “Oh, there’s going to be enormous job loss, and tons of scams are going to come out of this.” It’s this weird ex-ante public relations tactic. That promise of utopia is the new excuse that, I think, people are using now to avoid responsibility for the short-term stuff. They just say, “It’s going to be worth it when we get there.” I’m curious what your perspective is on that.
Danny Crichton:
What’s interesting is that AI is the first time I’ve seen alignment teams and superalignment teams with many, many people focused on this. It goes beyond PR, at least in my view. There are too many people out there who are very deeply passionate about this subject. Now, I don’t know if they’re always empowered to do what they need to do.
And unfortunately or fortunately, there’s an immense amount of competition in the AI modeling space. It feels like you’ve just got to keep running to maintain your edge. You can’t take a step back and ask, “How many jobs are going to be affected?”
I bring this up in a lot of the policy work that I do. People can transition, they can retrain, they can do a lot of stuff. We did that to some degree in manufacturing, but it was not fully complete. And that has led to populism and a lot of today’s challenges. That was a 30-year transition. This one is going to happen at a speed we’ve never seen before. 30 years will be three years. There is no world in which people can retrain that fast, and that’s what worries me — just the speed.
We’re just on the threshold of figuring out all the implications for these technologies. But I am curious about this alignment piece. It doesn’t feel like window dressing, but also doesn’t seem like anyone’s really willing to shut things down at this time.
Jacob Ward:
That’s right. I mean, I have a new book project I’m working on. The tentative title is Great Ideas We Should Not Pursue. It is a tongue-in-cheek look at points in human history in which we have had a great idea and said, “let’s not do that.” So I’ve been looking for examples of places where we’ve held back or in some way slowed down the commercialization of a thing because we thought maybe it wasn’t a good idea. It’s really hard to find examples of that!
There’s a disconnect between how powerfully this stuff is going to work on our civic lives and how little civics the people making it seem to have studied.
Jennifer Doudna and the CRISPR revolution — they signed a one-year moratorium on experimentation on human germ lines. That was one example, but that was also one where no one was beholden to financial interests. That was an innovation that came from inside public universities.
Meanwhile, I think about the disconnect between how powerfully this stuff is going to work on our civic lives and how little civics the people making it seem to have studied.
I was at a meeting once — this was pre-transformer models, or right when they were coming out — and I can’t name names, but early founders of some of the big companies were at this meeting. One of them was presenting the idea of a human-reinforcement learning program, where they were having human workers complete some sort of fill-in-the-blank sentences. And he said, “There’ll be sentences like this: ‘I would never blank with a coworker because that would be unethical,’ that kind of thing.” And after they have done enough of those, he then said, “we will arrive at a set of universal human values, and now I’ll take your questions.” And every hand in the room goes up. Some of them are trembling, they’re so outraged.
The first person was this political scientist. She says, “Okay, I actually have three questions. What’s universal? What is human? And what are values?” And the whole meeting implodes. And that was it, because he’d presented this freshman’s idea of how political science, morality and philosophy work — while holding the keys to the car.
I think everyone who makes this stuff has good intentions. To my mind, there are really good long-term intentions that are easily distracted by the need to pay off vast capital expenditures. And as a result, I don’t think these companies and good intentions are going to save the day here.
Danny Crichton:
As we’re recording this today, Character.ai said that they will start to do identity checks for anyone under the age of 18. So there’s a sense that at least children should not be part of this story. I’m curious: How do you think about precautionary principles around harm?
Jacob Ward:
I’ve mostly just been thinking about the big dumb hammer of litigation — and that discovery in litigation will reveal some of the techniques being deployed and create a little bit of coalesced legislation around those. Now, I just have to always remind everybody: there are no federal data privacy regulations in this country. We have no rules about that stuff.
When you look at a regime like Australia’s, which is a real ex-ante legal framework, the idea is to get out in front of these harms rather than sort them out afterward. But it is so powerful and important to remember that the models are being built by that guy who was in front of that room of social scientists. This is a guy who’s trying to build into the system an ex-ante regime of values that are going to govern this stuff. Anthropic has gone on to create this constitutional AI. They have a similar notion. There’s this obvious idea that they’re going to have to pre-build some rules into this stuff.
The enthusiasm we will have as people for the experience of the product is not going to be the right measure of whether that product is okay to be selling in the first place.
Well, if the technology is going to be pre-building all this decision-making, then we have to pre-build some decision-making around the legal liability. That’s going to require real civic debate; what do we agree or disagree on about what’s right and what’s wrong? There are some really smart thinkers about this. I feel like I mention her name in every public appearance I make, but Nita Farahany is a Duke law professor who wrote a book called The Battle for Your Brain. She has this whole idea about cognitive liberty and how we need to enshrine it as a civil right. It sounds very, very gray, but she’s a sharp legal thinker who’s created a real framework for assessing this stuff. She has a whole framework for figuring out the difference between harm and manipulation — and where they overlap.
And I think we’re going to have to get into that world in the same way we eventually had to deal with cigarettes. If you asked your average cigarette smoker in 1957, “Do you like cigarettes? Do you want to keep going with the cigarettes?” they’d say, “Yeah, they’re a positive refresher. This is a fantastic experience.” The enthusiasm we will have as people for the experience of the product is not going to be the right measure of whether that product is okay to be selling in the first place.
Something similar is going to have to happen here. We’re going to look at people and be like, “Oh, it’s working on your circuitry.” The loop that this has set you on is something beyond your control. It feels good to you, but that doesn’t matter because it’s causing you to make all these terrible decisions. And here are the ways in which these companies have clearly made these choices internally to pursue this kind of behavior.
Once we get to that point, you’re going to start to see big financial losses. And that’s where I think the rubber is going to meet the road in terms of people changing how they conduct business with this stuff.
Danny Crichton:
I want to pick up on this idea of the lack of humanistic thought in tech circles. I’ve been in the tech industry arguably since my first startup experience in 2007, so almost 20 years.
I do think people had a much more well-rounded education in an earlier generation. One argument would be that, particularly in AI, people had to go deep into their degrees; they had to get a master’s degree.
When did you have a chance to walk over to the English department and read literature and understand the human condition? So I wonder how you’re thinking about bringing that back into the equation.
Jacob Ward:
Man, I wish I knew the answer to this question. I’ve been in these weird conversations with people, where I sort of want to say, “Hey, bring me on as an advisor. I can help you.” I’ve had a couple of companies come to me and say, “Hey, I want to do this the right way. I’ve read your book and other books like yours, and I’d like to try and be on the right side of history.” I’ve had a couple conversations like that, but I really feel like that hydrophobic sand, the kind you put in water and it can’t get wet. That’s how I feel around here.
But I like to assume that, as the people of the generation you’re describing have kids, they’ll look at the world their creations make possible in a new way. I mean, all great breakthrough thinking about how humans evolved, for instance, has to do with long-term generational thinking. So we have to get into long-term generational thinking.
We’re not built to think beyond the circle of our campfire. And that’s where some of our higher functioning needs to come into play. But it doesn’t feel good, and it doesn’t make you money in the short term.
Unfortunately, that’s not the way we are incentivized right now. It’s not even how our circuitry is supposed to work. I got to spend some time with a tribe in Tanzania. They’re one of the last remaining tribes on earth that lives the way we all did 60,000 years ago. They have no last names, they have no property, there’s no marriage, and they are nomadic. They don’t raise crops or anything.
And one of the most amazing things about them is that they have no word for any number beyond five, because why would you need that? In the way that our brains are built, you don’t need to be thinking more than five moons in the future. You don’t need to count the number of people in a group. Once it’s more than five, you can just say, “It’s lots of people.” There’s three pieces of meat or there’s enough for everybody.
And this is the whole thesis of my book — we’re not built to think beyond the circle of our campfire. And that’s where some of our higher functioning needs to come into play. But it doesn’t feel good, and it doesn’t make you money in the short term. And so that’s the essential problem I think we’re going to be facing in this generation and the next.