Orgs: unreasonable boyfriend as service

Crossposted from world spirit sock puppet.

Suppose you and Bobby the car salesman are haggling over the price of a car. You could try saying that you won’t pay more than $3k, but Bobby can equally retort that he won’t sell it for less than $4k. If you guys manage to negotiate a sale, it will probably be at more than $3k (and involve revealing both of you as liars).

Now imagine the same situation, but you only have $3k and Bobby knows it. Now, if $3k is actually ok for him, you win and get your price.

Now imagine you are rich but you have a boyfriend at home who has only agreed to a $3k expenditure on a used car at this time, and thinks any more would be crazy. It’s shared money, so to pay more you would need to go away and get his permission, and it wouldn’t be easy. If Bobby believes you, then your situation is much like being poor again, and you win.

My guess is I read about this in Thomas Schelling’s The Strategy of Conflict when I was a teenager. The general observation is that being more constrained can often be helpful in a negotiation. Which is a bit shocking because it undermines the seeming truism that more power—more options, more resources—is always better for getting what you want.

A less general observation that also stuck with me about this is that you can trivially arrange to have such constraints through having an associate, such as a stubborn and spending-conscious boyfriend. (Ok, finding one of those is not trivial, especially if you have other desiderata.)

This is all background. The thing I want to point out is that being part of an organization rather than a free agent means creating and using this effect all over the place.

This is most obvious with timing and deadlines. I am a relatively free agent, and I am quite good at making deadlines for myself and then taking them seriously. But I feel like other people I casually negotiate with about how to spend time, aka my friends, often feel like deadlines I make are not very real, since I could just ignore them. Because it’s just an agreement with myself, it’s up for negotiation with myself. And if I insist on respecting these lines I drew myself that have no legible consequences, then it feels like I’m being weird and stubborn and unfriendly or perhaps charmingly neurodivergent. So I often don’t—once it’s a negotiation, then negotiating hard for my own goals, against my friends, doesn’t feel very friendly to anyone.

Now consider a friend working in an org. They can casually throw out that they have this thing due tomorrow, and everyone will take it as a hard constraint. I will take it as a hard constraint. I might even offer to help get it done, even though I have other things I want to do. Whereas if I had not only insisted on my imaginary deadline but hoped for any help in fulfilling it, I think that would often feel unreasonable of me.

The org believably cuts off the person’s options, like the boyfriend, and so the person implicitly wins many negotiations (or what would have been negotiations), all in the direction of doing more for the org, and without seeming unfriendly to their friends.

My own difficulties with this are partly a me problem—I’m probably not very good at ‘defending boundaries’. But my point is that if you are a solo human then there’s a whole skill-requiring task of ‘defending boundaries’ that just becomes trivially easy if you have an org around you to cut off certain possibilities. And also if your boundaries are ‘I am going to do this project tonight definitely regardless of if you want me to do something else’ then that will land with other people in a way that reporting on your org’s boundary policing—‘I have to do this by tomorrow, alas’—will not.

I think this ‘service’ and making use of it is rarely intentional, but I’d guess it’s very effective, and is a dynamic that makes people more likely to join orgs rather than being solo. It just looks like ‘it’s harder to get things done on my own’ and a component of ‘it’s harder to structure my time’ and ‘I find I keep on doing stuff other than my work’.

When will AI surpass us at being limited?

Crossposted from world spirit sock puppet.

It’s not always better to be more capable. As I mentioned yesterday, it can (famously) be helpful in negotiations to have your hands tied. That is, to be disempowered from giving up everything the other party wants.

I had previously thought of this as a somewhat rare corner case of human behavior—I for one don’t haggle very often—but I now think negotiations where this is an element are quite common: yesterday I described it in friendly (and honest) negotiations about how to spend time, for instance. And I also see a related thing in the practice of dietary commitments.

But is being less capable helpful outside of negotiating? And is this going to become AI related?

Yes and yes!

Commitments: more good things come to those who can commit (e.g. rides out of deserts, secrets, trust, love). ‘Committing’ generally involves cutting off certain options to yourself, whether in practical terms or via you being the kind of honorable person who can’t bear to do a thing they promised not to do. These are both kinds of limitations. If you were a more powerful creature, who was fully capable of breaking down any barrier, and fully capable of breaking a promise—a creature to whom all options were always open—then commitments would be less available to you.

Transparency: a big way humans know what is going on inside other humans, well enough to trust them, is that there is a connection between what is happening inside them and what is happening on their faces and in their bodies, and they usually can’t control this very well. People who can break this connection and control their external behavior independently tend to be feared and distrusted. It is valuable to be unable to stop these signals escaping.

Consistency: a big way we predict how a specific human will behave in the future is that each human has specific kinds of behavior that come easily to them, and it is hard for them to behave entirely differently. So if you are friends with someone who you have observed be attentive and kind to other people for five years, it is very likely that they continue behaving in that way going forward. Whereas a creature with more freedom of behavior could wholly inhabit that persona for five years, then change to a different one.

Relatedly, we know a lot about what to expect from a human stranger because of our prior knowledge of humans. If humans had the power to rewrite their internal dynamics and become totally different creatures, then we would know much less about what to expect from one.

Scope of risk: people are safer to interact with if you know they are limited in their ability to cause destruction. You might prefer to hire a person who you think would be less able to wrest control of your organization if they wanted to. You might prefer to babysit a child who does not know how to pick locks or set fires. So a person might be more employable, or be taken care of by better babysitters, if they are less capable. Similarly, an extremely capable AI system might be a less desirable accountant than a human, if you can only fully trust the human to not be up to the task of hacking your accounts.

These are all to do with interacting with other creatures. For a creature alone in the universe, I don’t know of any situation where they are better off being less capable. But when you need to trust another creature, it is better to know more about them, and better to know they are cut off from options that might harm you.

In the usual picture of AI progress, AI is worse than humans at various tasks, and we are waiting for it to surpass us everywhere, at which point humans will be obsolete as labor. But in a world where AI needs to interact with other agents (humans or AIs) the aforementioned value of being less capable complicates things: perhaps there are skills where AI is already more capable than humans, but where that capability is a liability. For instance, lying smoothly and otherwise generating outward behavior that isn’t revealing about internal dynamics, switching between entirely different personas, and hacking skills. Given that, what does the trajectory look like?

Missing markets in executive function

Crossposted from world spirit sock puppet.

It’s early in the morning, and sadly 1:29pm. After spending some time looking at things and picking them up and walking up the stairs and down the stairs and considering questions like “what should I…”, which my brain apparently considered objects of art more than of imperative, I inched into a decision to go out somewhere. Perhaps it would be clearer there.

After a blur of climbing and descending stairs and seeking objects and forgetting what I was doing and appreciating how beautiful my bag is, I set out. After remembering I should take various medications and going back inside to do that, I set out.

Often my favorite cafe seems too far away, at about four blocks, but today I had wandered half way there while I considered my options, so I decided to go. It’s a German place that feels homely and wholesome to me in its unamericanness. I too-carefully contemplated different places to sit, and chose outside: today a sunny explosion of roses and umbrellas with words like ‘Reissdorf kölsch’.

I stared at the menu until the waitress had asked me a couple of different questions she hoped would open a conversation about ordering. I tried to go along, but digressed into the pronunciation of ‘Spätzle’ to give myself longer to think. I nearly forgot to order coffee. I slopped my coffee on the floor on the way outside, which the waitress offered to clean up. She brought me my food outside just as I was deciding to move all my objects to a different table, at which moment I slopped much more coffee all over my computer.

My computer was closed, but she seemed concerned by this, and perhaps concerned about me in general. She had already told me where to get silverware and napkins, but she went and got them for me anyway, which was nice because otherwise I was maybe just going to not eat things for fifteen minutes until I became fully conscious that that was why I wasn’t eating.

I’m not usually like this, but sometimes I am, and it’s hard to put a finger on what the difference is, except to point at behaviors such as ‘how long will I inexplicably stare at my arm? If I go to buy a drink, what is the chance I will lose it?’ My understanding is that this kind of thing is called ‘executive function’ and that I don’t have heaps of it at the best of times, but much less at the worst of times.

This restaurant was providing me with a certain amount of executive function alongside afternoon breakfast, just out of kindness and obligation. But what if I could recognize the need, and intentionally buy it? Just go to a place that specialized in that, where they wouldn’t only make sure I order eventually and get my utensils and clean up after me, but actively take charge on causing me to get my shit together and do something in the day?

I was reminded of an idea I had before (from ‘10 things society might try having if it only contained variants of me’):

Shopfronts where you can go and someone else figures out what you want. And you aren’t expected to be friendly or coherent about it. Like, if you are shopping, and yet not having fun, you go there and they figure out that you are the wrong temperature, don’t have enough blood sugar, are taking too serious an attitude to shopping, need ten minutes away from your companions, and should probably buy a pencil skirt. So they get you a smoothie and some comedy and a quiet place to sit down by yourself for a bit, and then send you off to the correct store.

I had thought of the value-add there as ‘figure out what you want’, but I think part of what I was imagining is that they take charge and keep the process happening and ensure that decisions are made and blood sugar is acquired for instance. Instead of the thought of blood sugar leading to staring into space or being reminded of a different idea to do with blood sugar that you want to write down but you can’t figure out where to write because there are too many tabs in your computer and you think you should close them but first you want to record the idea..

You can buy executive function in some formats—for instance, I recently hired a Chief of Staff. But what if for instance you just want to buy a little bit of executive function sometimes, on demand? Like on the occasional morning when you are failing particularly hard at being a coherent agent, or when you are stressed or in pain and failing to figure out what to do about the stress or pain because you are stressed or in pain? Are these things that only happen to me? (Humorous ADHD YouTube suggests no.)

In my vision for this kind of service, it might live in the category of ‘way to treat yourself’, like getting a manicure (which—for those who haven’t done that—often involves more hand massage and offers of champagne than it might if treated as a more pragmatic nail improvement chore). Instead of just sitting in your living room considering stuff you should maybe do, you can sit in a comfy chair in a nice smelling place petting a cute puppy while someone charming and encouraging talks to you, figures out how you should proceed, and prompts you to do it in easy and compelling pieces.

The salad market mystery

Crossposted from world spirit sock puppet.

It often happens that I desire kale, but I want it to be clean and cut up, and while shops do sell this product by the bucketload, they are actually only willing to sell it by the bucketload. As a normal-sized person wanting a one-off salad, rather than a family of nine celebrating a kale festival, the market seems very uninterested in my existence.

‘Just put it in the fridge and eat it over the coming week, this isn’t a big deal’ I hear someone say. But I already have several plotlines going on in my life. I don’t want an additional kale arc that I need to track to resolution. I don’t want to commit. I just want a no-strings-attached salad that I can consume and walk away from.

‘Just throw out the rest of the kale’, I hear somebody say. But I don’t like throwing out mounds of delicious food that were elaborately grown and brought to me. This might be a moral failing, but so it is—‘salad + perfectly good kale destruction’ is a much less delicious prospect.

The same situation holds for other greens. I love parsley, but I generally want a fistful, not a promise of parsley for the foreseeable future. Basil becomes black and bad if you don’t eat it for too long, but basically the only way to get some basil is to invest in that outcome.

Why can’t I buy greens in convenient units? I’m not the only person who often eats alone, or doesn’t like throwing out food. My dislike of owning a pile of mildly decaying greens and feeling obliged to eat them is stronger than most, but surely not that rare. Greens don’t last well. I would have thought ‘one meal’s worth’ would be the most likely quantity of greens to want, but instead there is no apparent market for that (at least where I am, in California).

What is going on?

My current best theory: kale is pretty cheap, so a lot of the cost of providing it is in non-kale components, such as packaging and people putting it out on shelves. This means if you sold a single serve of kale, it would cost a disproportionate fraction of the price of five serves of kale. And most people, even if they did just want one serving of kale, would feel unjustified paying a much higher per-weight price for that, and so buy the mound of kale anyway and hope to figure out what to do with it. This might be a false economy—if they are like me and enacting that hope takes attention or is improbable—or not.
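
To make the arithmetic behind this theory concrete, here is a toy calculation. The numbers are entirely made up for illustration (I don’t know real kale costs), and the model, a fixed per-pack overhead plus a per-serve kale cost, is just my guess at the structure, not anything a grocer told me.

```python
# Toy model of the theory above. All figures are hypothetical,
# chosen only to illustrate the shape of the argument.
FIXED_COST_PER_PACK = 2.00  # assumed: packaging, shelving, handling
KALE_COST_PER_SERVE = 0.25  # assumed: the kale itself


def pack_price(serves: int) -> float:
    """Price of a pack: fixed overhead plus a per-serve kale cost."""
    return FIXED_COST_PER_PACK + KALE_COST_PER_SERVE * serves


single = pack_price(1)  # one serve
family = pack_price(5)  # five serves

print(f"one serve:   ${single:.2f} (${single:.2f} per serve)")
print(f"five serves: ${family:.2f} (${family / 5:.2f} per serve)")
print(f"the single serve costs {single / family:.0%} of the five-serve pack")
```

Under these assumed numbers, one serve costs most of what five serves cost, so its per-serve price is several times higher. That is the kind of gap that could make the single serve feel unjustifiable even to someone who only wants one salad.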

I love home-made salad, and probably eat much less of it than I would for this kind of reason, so the question of why I can’t buy convenient scale greens often crosses my mind, and I welcome better answers (both to why the market is like this, and the question of how to eat delicious salad now and then anyway).

AI risk was not invented by AI CEOs to hype their companies

Crossposted from world spirit sock puppet.

I hear that many people believe that the idea of advanced AI threatening human existence was invented by AI CEOs to hype their products. I’ve even been condescendingly informed of this, as if I am the one at risk of naively accepting AI companies’ preferred narratives.

If you are reading this, you are probably familiar enough with the decades-old AI safety community to know this isn’t true. But I don’t have a good direct way to reach the people who could use this information, and still I hate to leave such a falsehood uncontested. So if this is obvious, I hope the post is still perhaps useful to point more distant and confused people toward.

~

I personally know that AI risk was not invented by the tech CEOs because I have been near the middle of it since at least 2009—before any of the prominent AI companies existed, let alone had CEOs who might be trying to hype their products.

Here are some miscellaneous events over the years to give you a sense of the implausibility of this:

  • 2008 – I attempt to contact Eliezer Yudkowsky to inform him that I am ‘trying to figure out the optimal way to use my life’ and would like to hear a better account of why his plan (of worrying about AI risk) is good. I have read about it online, but would like a clearer account. Later, traveling the world shortly after undergrad, I meet a handful of people in person in the Bay Area who care about this, and one argues strongly that I should prioritize AI risk over my previously preferred causes e.g. climate change. I decide to think about this.
  • 2009 – I am still not very convinced that AI is the most important thing to work on, but go to stay with the people who are worried about it for a few months. I argue about it a lot with a handful of them. There seem to be about twenty of them locally in the South Bay, though many more who comment on the relevant blogs. My photography collection from this era is quite sparse.


    I go to The Singularity Summit for my first time (and its fourth), which is very lively and full of people who are thinking seriously about the future of AI.

  • 2010 – Deepmind is founded. (I am back at school.)
  • 2011 – I start a philosophy PhD at CMU, hoping to be eligible to work at somewhere like the Future of Humanity Institute one day, which is a happening hub of discussion about existential risk, AI and other important issues, that I like to visit.
  • 2012 – I visit the Bay more and hang out with the growing AI risk community there. I visit the UK and do the same. I go to the AGI 2012 Winter Intelligence Conference.
  • 2013 – I move to Berkeley and work at MIRI for a semester during grad school. I measure algorithmic progress over time across various computer science domains, as input to expectations for artificial intelligence in future. I visit the UK and attend the Center for Effective Altruism’s ‘weekend away’ where we have a debate on which cause is best, between global poverty, animal welfare and extinction risk. Extinction risk wins—the crowd leaves having changed their mind in that direction on net. (Photo: the three advocates, just before or after.)

  • 2014 – I join MIRI properly. I research The Asilomar Conference and Leó Szilárd as evidence about whether it is worth people trying to deal with risks early, because people around mostly believe that the risks from AI are at least a decade away, and there is disagreement about whether that makes it futile. I run an online reading group about Superintelligence, a new book about AI risk. I co-found AI Impacts, a project to answer questions about the future of AI, because AI risk seems at least fairly plausibly the most important thing to work on, and I want to investigate more and share my thinking with others.
  • 2015 – I attend the first FLI conference—it seems that more people and more prominent people are interested in AI safety! OpenAI is founded.

  • 2016 – I lead a team to run the first Expert Survey on Progress in AI. The median probability given to an outcome of advanced AI that is “Extremely Bad (e.g., human extinction)” is already 5%.
  • 2017 – Some people around me are getting very worried, and saying AGI will happen within several years. My survey gets a shocking amount of media attention, becoming the ‘16th most discussed paper’ in 2017 according to Altmetric. Apparently there is interest in this topic..
  • 2018 – I go to a big workshop for people working on AI risk in the English countryside, and a Chilean summit where I talk on TV and the radio about AI risk. It feels like interest is still picking up, and I feel optimistic about talking to the public.

  • 2019 – GPT-2 comes out. Someone tries to get it to name our house. My favorite names include things like “World peace: tigers and humans” and “rooftop hillside: the highest place in the world”. It is hilarious and useless, but also magical and wild. The things we have worried about for years are feeling more tangible, and people’s ‘AI timelines’ are shrinking.
  • 2020 – The world is reminded that really crazy things can happen. AI Impacts becomes remote. I spend the year with my household, who are almost all working on AI risk. We enjoy whiteboards a lot and run at least one good house conference in this period.

  • 2021 – Anthropic is founded.