AI risk was not invented by AI CEOs to hype their companies

Crossposted from world spirit sock puppet.

I hear that many people believe that the idea of advanced AI threatening human existence was invented by AI CEOs to hype their products. I’ve even been condescendingly informed of this, as if I am the one at risk of naively accepting AI companies’ preferred narratives.

If you are reading this, you are probably familiar enough with the decades-old AI safety community to know this isn’t true. But I don’t have a good direct way to reach the people who could use this information, and still I hate to leave such a falsehood uncontested. So if this is obvious, I hope the post is still perhaps useful to point more distant and confused people toward.

~

I personally know that AI risk was not invented by the tech CEOs because I have been near the middle of it since at least 2009—before any of the prominent AI companies existed, let alone had CEOs who might be trying to hype their products.

Here are some miscellaneous events over the years to give you a sense of the implausibility of this:

  • 2008 – I attempt to contact Eliezer Yudkowsky to inform him that I am ‘trying to figure out the optimal way to use my life’ and would like to hear a better account of why his plan (of worrying about AI risk) is good. I have read about it online, but would like a clearer account. Later, traveling the world shortly after undergrad, I meet a handful of people in person in the Bay Area who care about this, and one argues strongly that I should prioritize AI risk over my previously preferred causes, e.g. climate change. I decide to think about this.
  • 2009 – I am still not very convinced that AI is the most important thing to work on, but go to stay with the people who are worried about it for a few months. I argue about it a lot with a handful of them. There seem to be about twenty of them locally in the South Bay, though many more who comment on the relevant blogs. My photography collection from this era is quite sparse.


    I go to The Singularity Summit for my first time (and its fourth), which is very lively and full of people who are thinking seriously about the future of AI.

  • 2010 – DeepMind is founded. (I am back at school.)
  • 2011 – I start a philosophy PhD at CMU, hoping to be eligible to work at somewhere like the Future of Humanity Institute one day, which is a happening hub of discussion about existential risk, AI and other important issues, that I like to visit.
  • 2012 – I visit the Bay more and hang out with the growing AI risk community there. I visit the UK and do the same. I go to the AGI 2012 Winter Intelligence Conference.
  • 2013 – I move to Berkeley and work at MIRI for a semester during grad school. I measure algorithmic progress over time across various computer science domains, as input to expectations for artificial intelligence in future. I visit the UK and attend the Center for Effective Altruism’s ‘weekend away’ where we have a debate on which cause is best, between global poverty, animal welfare and extinction risk. Extinction risk wins: the crowd leaves having changed their mind in that direction on net. [Photo: the three advocates just before or after.]

  • 2014 – I join MIRI properly. I research The Asilomar Conference and Leó Szilárd as evidence about whether it is worth people trying to deal with risks early, because people around mostly believe that the risks from AI are at least a decade away, and there is disagreement about whether that makes it futile. I run an online reading group about Superintelligence, a new book about AI risk. I co-found AI Impacts, a project to answer questions about the future of AI, because AI risk seems at least fairly plausibly the most important thing to work on, and I want to investigate more and share my thinking with others.
  • 2015 – I attend the first FLI conference—it seems that more people and more prominent people are interested in AI safety! OpenAI is founded.

  • 2016 – I lead a team to run the first Expert Survey on Progress in AI. The median probability given to an outcome of advanced AI that is “Extremely Bad (e.g., human extinction)” is already 5%.
  • 2017 – Some people around me are getting very worried, and saying AGI will happen within several years. My survey gets a shocking amount of media attention, becoming the ‘16th most discussed paper’ in 2017 according to Altmetric. Apparently there is interest in this topic.
  • 2018 – I go to a big workshop for people working on AI risk in the English countryside, and a Chilean summit where I talk on TV and the radio about AI risk. It feels like interest is still picking up, and I feel optimistic about talking to the public.

  • 2019 – GPT-2 comes out. Someone tries to get it to name our house. My favorite names include things like “World peace: tigers and humans” and “rooftop hillside: the highest place in the world”. It is hilarious and useless, but also magical and wild. The things we have worried about for years are feeling more tangible, and people’s ‘AI timelines’ are shrinking.
  • 2020 – The world is reminded that really crazy things can happen. AI Impacts becomes remote. I spend the year with my household, who are almost all working on AI risk. We enjoy whiteboards a lot and run at least one good house conference in this period.

  • 2021 – Anthropic is founded.

AI unemployment and AI extinction are often the same

Crossposted from world spirit sock puppet.

How much should the ideal person cry wolf?

Crossposted from world spirit sock puppet.

It is a fact about wolves and rationality that you should warn people about wolves quite a few times for every effective wolf attack.

In particular, there is an asymmetry between the costs of having one’s flock devoured and averting a non-eventuating wolf attack. If the carnage is a hundred times worse, then it’s worth up to ninety-nine false alarms to stop it.
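
To make the arithmetic concrete, here is a minimal sketch of the break-even calculation. It rests on simplifying assumptions that are mine rather than anything stated above: an alarm costs the same whether or not the wolf turns up, and an alarm fully averts the attack. The function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def break_even_probability(alarm_cost, attack_cost):
    """Attack probability above which raising the alarm has lower expected cost.

    Assumes (for illustration only) that an alarm always costs `alarm_cost`
    and always prevents the attack, while staying silent costs `attack_cost`
    with probability p. Warning is worthwhile when p * attack_cost > alarm_cost.
    """
    return Fraction(alarm_cost) / Fraction(attack_cost)


def max_false_alarms_per_real_attack(alarm_cost, attack_cost):
    """False alarms per real attack if you warn right at the break-even probability."""
    p = break_even_probability(alarm_cost, attack_cost)
    return (1 - p) / p


# Carnage a hundred times worse than a false alarm:
print(break_even_probability(1, 100))            # 1/100
print(max_false_alarms_per_real_attack(1, 100))  # 99
```

Under these assumptions, warning is worth it whenever the chance of a wolf exceeds one in a hundred, which is where the figure of ninety-nine false alarms per real attack comes from.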

The original fable was about a boy who would continually lie about wolves, and that is definitely poor form.

But in modern parlance, ‘crying wolf’ seems to be used for just being openly alarmed about things that turn out ok—I don’t hear much implication of deceit.

And in modern sensibilities, being seen to ‘cry wolf’ (by even once raising an alarm that isn’t consummated with disaster) is something people seem to really fear. I think multiple people have asked me about whether AI safety people might have ‘cried wolf’ about some earlier GPT model. I’m not aware of anyone doing that, but the idea that they might have is so tantalizing that it bears investigating. Because if even a few people somewhere did, it would be such a nice embarrassing blow to AI safety people.

And I probably responded in the tempting way: jumping to assure them that I don’t recall hearing any such fears from these quarters. But I think that worsens public thought norms by implicitly buying into the unspoken premise that it would be quite shameful and naive to have raised even one warning.

And so relatedly, probably people who see real risks from AI are scared to voice them, lest they be seen to ‘cry wolf’ and tank the credit of the movement for the next round of dangers. Because it is taken for granted that one should only get one chance to raise an alarm. That the first warning must be for the most undeniably big, bad, real wolf.

This is not the wolf lookout system we want.

‘Warnings’ are usually about fairly bad events, and therefore they tend to be worth making when the probability of those events is still low. This creates a real difficulty for society in adjusting people’s credit when the low probability events they have warned of do not come to pass. Most of the time, if the person is right, the events still shouldn’t happen! The person wasn’t saying they were likely! Yet you don’t want to let the alarmist off the hook, with plausible deniability for arbitrarily many alarms.

I think the solution to this difficulty should look much more quantitative, like collecting rich track records of the predictions made by a person or a movement, and scoring them well. The present solution of childishly denouncing any unmet danger is insane.
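
As one illustration of what scoring a track record could look like, here is a small sketch using the Brier score, a standard scoring rule (mean squared error between stated probabilities and what happened). This is my choice of example, not necessarily the scheme intended above, and the forecasters and numbers are made up for illustration.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and outcomes.

    `forecasts` is a list of (probability, outcome) pairs, where outcome is
    1 if the event happened and 0 if it did not. Lower scores are better.
    """
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)


# Hypothetical record: twenty possible dangers, exactly one of which eventuates.
outcomes = [1] + [0] * 19

# Three made-up forecasters facing the same twenty events.
low_probability_warner = [(0.05, o) for o in outcomes]  # says 5% every time
never_warns = [(0.0, o) for o in outcomes]              # says 0% every time
constant_alarmist = [(0.5, o) for o in outcomes]        # says 50% every time

for name, record in [
    ("low-probability warner", low_probability_warner),
    ("never warns", never_warns),
    ("constant alarmist", constant_alarmist),
]:
    print(name, round(brier_score(record), 4))
```

In this toy record the person who warned at 5% every time scores best, even though nineteen of their twenty warnings came to nothing. They beat both the person who never warned and the one who was alarmed about everything, which a rule of denouncing every unmet danger would get wrong.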

And meanwhile if there are bad risks that have a low chance of appearing on every warning, we should still warn of them, and not be too much cowed by innumerate customs.

Eggs, rooms, puzzles, and talking about AI

Crossposted from world spirit sock puppet.

We can prevent progress! Conceptual clarity, and inspiration from the FDA

Crossposted from world spirit sock puppet.

“We can’t prevent progress” say the people for some reason enthusiastically advocating that we just risk dying by AI rather than even consider contravening this law.

I have several problems with this, beyond those unsubtly hinted at above.

First, it seems to be willfully conflating “increasing technology understanding and/or tools” with “things getting better”. The word ‘progress’ generally means ‘things getting better’, but here in a debate about whether it is good or not for society to acquire and spread some specific information and tools, we are being asked to label all increases in information and tools as ‘progress’, which is quite the presumption of a particular conclusion.

(Yes, the sub-debate here is more narrowly about whether averting technology is feasible, not whether it is good, but the bid here to implicitly grant that the infeasible thing is also reprehensible and backward to want (i.e. anti-“progress”) seems unfriendly.)

If we separate the conflated concepts—i.e. distinguish ‘increasing technological information and tools’ from ‘things getting better’—the statement doesn’t seem remotely true for either of them.

First: Preventing things from getting better is a capability humans have had perhaps at least as far back as the Sea Peoples of Bronze Age collapse fame. (If indeed we go ahead and make machines that do in fact destroy humanity, we will also have prevented ‘progress’ in the normal sense.)

But now let’s consider preventing “increasing technology information and tools”, which seems like the more relevant contention. I’m a bit unsure what the position is here, honestly—do people think for instance that the FDA doesn’t slow down the pharmaceutical industry? Do they think that the pharmaceutical industry is too small and insulated from financial incentives for its slowing down to be evidence about AI?

Perhaps we just don’t usually think of the pharmaceutical industry as ‘slowed down’ because we are used to that as the way it operates? Or perhaps this doesn’t count because the point isn’t to slow it down, it’s just to have it proceed at the rate it can do so safely for people, with the slowness as an unfortunate side-effect. In which case, fine—that would also do for AI!

In case this example is for some reason wanting, here are more examples of technologies slowed down to something more like a halt, from a previous post (more detail here also):

  1. Huge amounts of medical research, including really important medical research, e.g. the FDA banned human trials of strep A vaccines from the 70s to the 2000s, in spite of 500,000 global deaths every year. A lot of people also died while covid vaccines went through all the proper trials.
  2. Nuclear energy
  3. Fracking
  4. Various genetics things: genetic modification of foods, gene drives, early recombinant DNA researchers famously organized a moratorium and then ongoing research guidelines including prohibition of certain experiments (see the Asilomar Conference)
  5. Nuclear, biological, and maybe chemical weapons (or maybe these just aren’t useful)
  6. Various human reproductive innovation: cloning of humans, genetic manipulation of humans (a notable example of an economically valuable technology that is to my knowledge barely pursued across different countries, without explicit coordination between those countries, even though it would make those countries more competitive. Someone used CRISPR on babies in China, but was imprisoned for it.)
  7. Recreational drug development
  8. Geoengineering
  9. Much of science about humans? I recently ran this survey, and was reminded how encumbering ethical rules are for even incredibly innocuous research. As far as I could tell the EU now makes it illegal to collect data in the EU unless you promise to delete the data from anywhere that it might have gotten to if the person who gave you the data wishes for that at some point. In all, dealing with this and IRB-related things added maybe more than half of the effort of the project. Plausibly I misunderstand the rules, but I doubt other researchers are radically better at figuring them out than I am.
  10. […]

Aside from the seeming disconnect with empirical evidence, I’m confused by the theoretical model here. Do people think the rate of technological development can’t be affected by funding, or by the costs of inputs, or by regulation? Or do they think these factors would affect technology, but that this will never in practice happen because the relevant decisionmakers will never have the will?

Do they also think technology cannot be sped up? If so, how is that different?

Do they just mean you can’t fully grind it to a halt, preventing all progress? That may be so, but in that case, slowing it down a lot would generally suffice!