Tag Archives: values

Ignorance of non-existent preferences

I often hear it said that since you can’t know what non-existent people or creatures want, you can’t count bringing them into existence as a benefit to them, even if you guess they will probably like it. For instance, Adam Ozimek makes this argument here.

Does this absolute agnosticism about non-existent preferences mean it is also a neutral act to bring someone into existence when you expect them to have a net nasty experience?

Does it look like it’s all about happiness?

Humanity’s obsession with status and money is often attributed to a misguided belief that these will bring the happiness we truly hunger for. Would-be reformers repeat the worldview-shattering news that we can be happier just by being grateful and spending more time with our families and on other admirable activities. Yet the crowds begging for happiness do not appear to heed them.

This popular theory doesn’t explain why people remain so ignorant after billions of lifetimes of data about what brings happiness, or alternatively why, given that information, they are helpless to direct their behavior toward it. The usual counterargument to this story is simply that money and status and all that do in fact bring happiness, so people aren’t that silly after all.

Another explanation for the observed facts is that we don’t actually want happiness that badly; we like status and money too even at the expense of happiness. That requires the opposite explanation, of why we think we like happiness so much.

But first, what’s the evidence that we really want happiness or don’t? Here is some I can think of (please add):

For “We are mostly trying to get happiness and failing”:

  • We discuss plans in life, even in detail, as if the purpose were happiness
  • When we are wondering if something was a good choice we ask things like ‘are you happy with it?’
  • Some things, such as some entertainment, don’t seem to lead to much benefit besides enjoyment, yet are avidly sought.
  • We seem by all accounts both motivated to get happiness and good at getting it in immediate-term activities – we don’t accidentally watch a TV show or eat chocolate for long before noticing whether we enjoy it. The confusion seems to be about long-term activities and investments.

For “We often aren’t trying to get happiness”:

  • The recent happiness research appears to have fuelled lots of writing and not much hungry implementation of advice. E.g. I’ve noticed no fashion starting up for writing down what you are grateful for at night. Have I just missed it?
  • Few people get a few years into a prestigious job, realize status and money don’t bring happiness, declare it all a mistake, and take up joyful, poor, low-status activities.
  • Most things take less than years to evaluate, so this is hard to put down to slow feedback.
  • I don’t seem to do the things that I think would make me most happy.
  • It seems we pursue romance and sex at the expense of happiness often, incapable of giving it up in the face of anticipated misery. Status and money have traditionally been closely involved with romance and sex, so it would be unsurprising if we were driven to have them too in spite of happiness implications.
  • Most of the things we seek that make us happy also make us more successful in other ways. People are generally happy when they receive more money than usual, or sex, or a better job, or compliments. So the fact that we often seek things that make us happy doesn’t tell us much.
  • Explicitly seeking status, money and sex looks bad, but seeking happiness does not. Thus if we were seeking sex or status, we would be more likely to claim we were seeking happiness than to admit to seeking those things.
  • Many people accept that lowering their standards would make them happier, but don’t try to.
  • We seem, and believe ourselves to be, willing to forgo our own happiness often for the sake of ‘higher’ principles such as ethics

It looks to me like we don’t care only about happiness, though we do a bit. I suspect we care more about happiness in the present and more about other things in the long term, and thus are confused when long-term plans don’t seem to lead to happiness, because introspection says we like it.

Perfect principles are for bargaining

When people commit to principles, they often consider one transgression ruinous to the whole agenda. Eating a sausage by drunken accident can end years of vegetarianism.

As a child I thought this crazy. Couldn’t vegetarians just eat meat when it was cheap under their own rationale? Scrumptious leftovers at our restaurant, otherwise to be thrown away, couldn’t tempt the vegetarian kids I knew. It would break their vegetarianism. Break it? Why did the integrity of the whole string of meals matter? Any given sausage was such a tiny effect.

I eventually found two explanations. First, it’s easier to thwart temptation if you stake the whole deal on every choice. This is similar to betting a thousand dollars that you won’t eat chocolate this month. Second, commitment without gaps makes you seem a nicer, more reliable person to deal with. Viewers can’t necessarily judge the worthiness of each transgression, so they suspect the selectively committed of hypocrisy. Plus everyone can better rely on and trust a person who honors his commitments with less regard to consequence.

There’s another good reason though, which is related to the first. For almost any commitment there are constantly other people saying things like ‘What?! You want me to cook a separate meal because you have some fuzzy notion that there will be slightly less carbon emitted somewhere if you don’t eat this steak?’ Maintaining an ideal requires constantly negotiating with other parties who must suffer for it. Placing a lot of value on unmarred principles gives you a big advantage in these negotiations.

In negotiating generally, it is often useful to arrange visible costs to yourself for relinquishing too much ground. This is to persuade the other party that if they insist on the agreement being in that region, you will truly not be able to make a deal, so they are forced to agree to a position more favorable to you. This is the idea behind arranging for your parents to punish you viciously for smoking, if you don’t want to give in much to friends pressuring you to smoke. Similarly, attaching a visible large cost – the symbolic sacrifice of your principles – to relieving a friend of cooking tofu persuades your friend that you just can’t eat with them unless they concede. So that whole conversation is avoided, determined in your favor from the outset.

I used to be a vegetarian, and it was much less embarrassing to ask for vegetarian food then than it was afterward, when I merely wanted to eat vegetarian most of the time. Not only does absolute commitment get you a better deal, but it allows you to commit to such a position without disrespectfully insisting on sacrificing the other’s interests for a small benefit.

Prompted by The Strategy of Conflict by Thomas Schelling.

Romantic idealism: true love conquers almost all

More romantic people tend to be vocally in favor of more romantic fidelity in my experience. If you think about it though, faith in romance is not a very romantic ideal. True love should overcome all things! The highest mountains, the furthest distances, social classes, families, inconveniences, ugliness, but NOT previous love apparently. There shouldn’t be any competition there. The love that got there first is automatically the better one, winning the support and protection of the sentimental against all other love on offer. Other impediments are allowed to test love, sweetened with ‘yes, you must move a thousand miles apart, but if it’s really true love, he’ll wait for you’. You can’t say, ‘yes, he has another girlfriend, but if you really are better for him he’ll come back – may the truest love win!’.

Perhaps more commitment in general allows better and more romance? There are costs as well as benefits to being tied to anything though. Just as it’s not clear that more commitment in society to stay with your current job would be pro-productivity, it’s hard to see that more commitment to stay with your current partner would be especially pro-romance. Of course this is all silly – being romantic and vocally supporting faithfulness are about signaling that you will stick around, not about having consistent values or any real preference about the rest of the world. Is there some other explanation?


Why will we be extra wrong about AI values?

I recently discussed the unlikelihood of an AI taking off and leaving the rest of society behind. The other part I mentioned of Singularitarian concern is that powerful AIs will be programmed with the wrong values. This would be bad even if the AIs did not take over the world entirely, but just became a powerful influence. Is that likely to happen?

Don’t get confused by talk of ‘values’. When people hear this they often think an AI could fail to have values at all, or that we would need to work out how to give an AI values. ‘Values’ just means what the AI does. In the same sense your refrigerator might value making things inside it cold (or for that matter making things behind it warm). Every program you write has values in this sense. It might value outputting ‘#t’ if and only if it’s given a prime number for instance.
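To make this thin sense of ‘values’ concrete, here is a minimal sketch of such a program – written in Python purely for illustration (the ‘#t’ notation above suggests Scheme, but the post specifies no language) – whose whole ‘value system’ is to output ‘#t’ exactly when given a prime number:

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, checked by simple trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def respond(n: int) -> str:
    # The program's entire 'value system' in the thin sense used here:
    # output '#t' if and only if the input is prime, '#f' otherwise.
    return "#t" if is_prime(n) else "#f"


if __name__ == "__main__":
    for n in (7, 10, 97):
        print(n, respond(n))
```

Nothing deeper is meant by ‘values’ than this: the mapping from inputs to what the program does.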

The fear then is that a super-AI will do something other than what we want. We are unfortunately picky, and most things other than what we want, we really don’t want. Situations such as being enslaved by an army of giant killer robots, or having your job taken by a simulated mind, are really incredibly close to what you do want compared to situations such as your universe being efficiently remodeled into stationery. If you have a machine with random values and the ability to manipulate everything in the universe, the chance of its final product having humans and tea and crumpets in it is unfathomably small. Some SIAI members seem to believe that almost anyone who manages to make a powerful general AI will be so incapable of giving it suitable values as to approximate a random selection from mind design space.

The fear is not that whoever picks the AI’s goals will do so at random, but rather that they won’t foresee the extent of the AI’s influence, and will pick narrow goals that may as well be random when they act on the world outside the realm they were intended for. For instance, an AI programmed to like finding really big prime numbers might find methods that are outside the box, such as hacking computers to covertly divert others’ computing power to the task. If it improves its own intelligence immensely and copies itself, we might quickly find ourselves amongst a race of superintelligent creatures whose only value is to find prime numbers. The first thing they would presumably do is stop the needless worldwide waste of resources on everything other than that task.

Having an impact outside the intended realm is a problem that could exist for any technology. Our devices do what we want for a certain time, but left long enough they diverge at some point, depending on how well we have designed them to do what we want. In the past a car driving itself would have diverged from what you wanted at the first corner; after more work it diverges at the point another car gets in its way, and after more work still it will diverge at the point that you unexpectedly need to pee.

Notice that at all stages we know over what realm the car’s values coincide with ours, and design it to run accordingly. The same goes with just about all the technology I can think of. Because your toaster’s values and yours diverge as soon as you cease to want bread heated, your toaster is programmed to turn off at that point and not to be very powerful.

Perhaps the concern about strong AI having the wrong goals is like saying ‘one day there will be cars that can drive themselves. It’s much easier to make a car that drives by itself than to make it steer well, so when this technology is developed, the cars will probably have the wrong goals and drive off the road.’ The error here is assuming that the technology will be used outside the realm where it does what we want, just because the imagined amazing prototype could be, and because programming what we do want it to do seems hard. In practice we hardly ever encounter this problem, because we know approximately what our creations will do, and can control where they are set to do something. Is AI different?

One suggestion that it might be different comes from looking at technologies that intervene in very messy systems. Medicines, public policies and attempts to intervene in ecosystems, for instance, are used without total knowledge of their effects, and often with broader and worse effects than anticipated. If it’s hard to design a single policy with known consequences, and hard to tell what the consequences are, safely designing a machine which will intervene in everything in ways you don’t anticipate is presumably harder. But it seems the effects of medicine and policy aren’t usually orders of magnitude larger than anticipated. Nobody accidentally starts a holocaust by changing the road rules. Also, in the societal cases the unanticipated effects often come from society reacting to the intervention, rather than from the mechanism used having unpredictable reach. E.g. it is not often that a policy intended to improve childhood literacy accidentally improves adult literacy as well, but it might change where people want to send their children to school, and hence where they live and what children do in their spare time. This is not such a problem, as human reactions presumably reflect human goals. It seems incredibly unlikely that AI will not have huge social effects of this sort.

Another suggestion that human-level AI might have the ‘wrong’ values is that the more flexible and complicated things are, the harder it is to predict them in all of the circumstances in which they might be used. Software has bugs and failures sometimes because those making it could not think of every relevant difference in the situations it will be used in. But again, we have an idea of how fast these errors turn up, and don’t move forward faster than enough are corrected.

The main reason the realm in which technology can be trusted to please us is predictable is that we accumulate technology incrementally and in pace with the corresponding science, so we have knowledge and similar cases to go by. So another way AI could be different is if there is a sudden, huge jump in AI ability. As far as I can tell this is the basis for SIAI concern: for instance, after years of playing with not very useful code, a researcher suddenly figures out a fundamental equation of intelligence and finds the reachable universe at his command. Because he hasn’t seen anything like it, when he runs it he has virtually no idea how much it will influence or what it will do. So the danger of bad values is dependent on the danger of a big jump in progress. As I explained previously, a jump seems unlikely. If artificial intelligence is reached more incrementally, even if it ends up being a powerful influence in society, there is little reason to think it will have particularly bad values.