Consciousness and how it got to be that way

Saturday, August 21, 2010

Thoughts on Newcomb

I'm currently reading Robert Nozick's Socratic Puzzles. It contains two essays about Newcomb's Problem. If you've not encountered Newcomb before, a brief description follows, and if you want more, the most discussion I've seen anywhere is at Less Wrong. I can sum up this post thusly: how can Newcomb be a hard problem?

Imagine a superintelligent being (a god, or an alien grad student as Nozick imagines, or more plausibly a UCSD medical student; it's up to you). This superintelligent being says that it can predict your actions perfectly. It shows you two boxes, Box #1 and Box #2, into which it will place money according to rules that I will shortly give. As for you, you have two options: either open both boxes and take the money from both if there is any, or open only Box #2 and take the money from just Box #2. Now here are the rules, and the kicker. Since the being can predict your actions perfectly, it does the following trick. If it predicts that you're going to take just Box #2, it will place a thousand dollars in Box #1 and a million dollars in Box #2. So in this instance, you will get a million dollars, but you'll miss out on the thousand in Box #1. On the other hand, if it predicts that you will take both boxes, the being will place a thousand dollars in Box #1 but nothing in Box #2. In that case, you end up with just a thousand dollars. So in other words: the being always puts a thousand dollars in Box #1, whereas in Box #2 there's either a million or nothing.

So, now the superintelligent being has gone back to its home planet of La Jolla, and you are left wondering what to do. Assuming you want the most money possible, which option do you pick and why?

Figure 1. Decision table for Newcomb's Problem.
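The figure itself isn't reproduced in this text, so here is a minimal Python sketch of the same table, built only from the rules stated above. It assumes the table's two columns are "the being predicted correctly" and "the being predicted wrongly"; the function names `payoff` and `decision_table` are mine, made up for illustration.

```python
# Payoffs as stated in the post: Box #1 always holds $1,000;
# Box #2 holds $1,000,000 only if the being predicted one-boxing.
BOX1 = 1_000
BOX2 = 1_000_000


def payoff(choice: str, prediction: str) -> int:
    """Dollars received for a choice ('one' or 'both') given the being's prediction."""
    box2_contents = BOX2 if prediction == "one" else 0
    if choice == "one":
        return box2_contents
    return BOX1 + box2_contents


def decision_table() -> dict:
    """Payoff for each choice when the prediction is right and when it is wrong."""
    opposite = {"one": "both", "both": "one"}
    return {
        choice: {
            "predicted_correctly": payoff(choice, choice),
            "predicted_wrongly": payoff(choice, opposite[choice]),
        }
        for choice in ("one", "both")
    }


for choice, row in decision_table().items():
    print(choice, row)
```

The "predicted_wrongly" column is where all the argument comes from: it is the only place a one-boxer walks away with nothing, or a two-boxer with everything.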

There's been a lot of discussion about Newcomb's Box, and not all of the responses adhere to the standard one-or-both answer. But I take the point of this particular logic-koan to be that we're to decide based on the givens of the problem which of the options we would take, so cute answers about trying to cheat, making side bets, etc. are wasting our time. If we're going to introduce those kinds of non-systematic "real world" options into this exercise, then we're going to need a lot more context than we currently have to make a decision. In fact after ten years living in Berkeley I'm surprised that I haven't yet met someone on a street corner claiming to be an alien with a million dollars for me, but if I did I would walk away and not play at all. (Come to think of it, I frequently get similar spontaneous offers of a million dollars or more in my spam folder which I ignore at my peril.)

My own answer is to take only Box #2, expecting to get a million dollars. Why? Because I want a million dollars, and the superintelligent alien is apparently smart enough to know that I'll gladly cooperate and not try to make myself unpredictable (more on this in a moment). Why try to be a smart-ass about it? (It's both to your disadvantage and not even possible anyway per the terms of the problem.) The being told you where it would put the million dollars (or not) based on your actions, and it's a given in the problem that the being is perfect at predicting your actions. This is what gives the both-boxers fits. They say one-boxers are idiots because, if the being got my choice wrong, it put nothing in Box #2, thinking I would choose both. In that case I open only Box #2 and get zero (because the alien thought I was going to take both and at least get a grand, but he was having an off day).

I will be beating the following dead horse a lot here: the problem states you have a reliable predictor. Why does Figure 1 above even have a right-side column? If you assume the being is fallible then you're not thinking about Newcomb's problem as stated any more: you're either ascribing properties to the being that conflict with what is given in the problem, or you're making stuff up. (Maybe the alien is fallible and copper and zinc are toxic to it! That way it won't predict in time that I'm going to kill it by throwing my spare pennies and brass keys at it, and then I can get the full amount from both boxes! Sucker. Ridiculous? No more than worrying about the given perfect predictor's not being perfect.)

Figure 2. Correction to Figure 1. This figure is the actual real table for Newcomb's Problem. Figure 1 is somebody else-not-Newcomb's problem that features fallible aliens.

Complaints about the logic of the Box #2-only response (which is the majority's response, if the ones Nozick cites in one of his essays are representative) typically focus on two things. One, that we're assuming reverse causality: that we must think our choice of box will cause there to be a million dollars in it. Two, that it suggests we don't have free will. I dismiss the second objection out of hand because the whole point of the problem is that the being is a reliable predictor of human behavior - for that one aspect of your behavior, in this problem, no, you don't have free will. Look: we already accepted a being with perfect predictive powers. Without that, the problem changes and we have to guess how likely the being is to get it right. But as long as we have Mr./Ms. Perfect Predictor, the nature or mechanism of the prediction is unimportant. You can justify how it accomplishes this however you like (we don't have free will in this respect, or the alien can travel through time) but the point is, any cleverness or strategy or philosophizing you do has already been taken into account by the alien.

But things can be predicted in our world, including human behavior, and for some reason this doesn't seem to provoke outcries about undermining the concept of free will. Like it or not, other humans predict things about you all the time that you think you'd have some conscious control over - whether you'll quit smoking, your credit score, your mortality - and across the population, these predictions are quite robust. They don't always have the individual exactitude that our alien friend does, of course. But at the very least you must concede that if our alien friend is even as smart as humans, after playing this game multiple times with us, its ability to predict which box you take would be greater than random chance, and you would get some information about which box you should pick based on this. Being completely honest, I think a lot of the resistance to one-boxing comes from the repugnance with which some people regard the idea that their behavior is extremely predictable. (Hey! News flash: it is.) Nozick even offers additional information in his example by saying that you've seen friends and colleagues play the same game, and the being predicted their choice reliably each time. Come on Plato, do you want a million dollars or not? Absolute no-brainer!
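To put rough numbers on "greater than random chance", here is a small Python sketch of the expected payout for each choice when the being is right only some fraction of the time. It assumes the being's accuracy is the same whichever way you choose and that you simply want the most dollars; the name `expected_payoff` and the sample accuracies are my own, not Nozick's.

```python
BOX1 = 1_000        # always in Box #1, per the problem as stated
BOX2 = 1_000_000    # in Box #2 only if one-boxing was predicted


def expected_payoff(choice: str, accuracy: float) -> float:
    """Expected dollars for 'one' or 'both', assuming the being's prediction
    matches the actual choice with probability `accuracy`."""
    if choice == "one":
        # Box #2 is full only when the being correctly foresaw one-boxing.
        return accuracy * BOX2
    # Two-boxers always get Box #1; Box #2 is full only if the being erred.
    return BOX1 + (1 - accuracy) * BOX2


for accuracy in (0.5, 0.6, 0.9, 1.0):
    one = expected_payoff("one", accuracy)
    both = expected_payoff("both", accuracy)
    print(f"accuracy {accuracy:.0%}: one-box ${one:,.0f}, two-box ${both:,.0f}")
```

Under these assumptions two-boxing comes out ahead only when the being is no better than a coin flip, and then only by the thousand dollars in Box #1; at 60% accuracy one-boxing is already ahead by a wide margin.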

The first objection (regarding self-referential decision-making) is slightly more fertile ground for argument and it's the one to which Nozick devotes the most time. The idea is that you're engaging in circular logic: I'm deciding to one-box, therefore the being knew I would one-box, therefore I should decide to one-box. (Again: what's the whole point of the exercise? That whatever decision you're about to make, the being knew you would do it, including all the mental gyrations you're going through to get to your answer.) Nozick gives the example of a person who doesn't know whether his father is Person A or Person B. Person A was a university scientist and died of a painful disease in mid-life which would certainly be passed on to all offspring; children of Person A would be expected to display an aptitude for technical subjects. Person B was an athlete, and likewise his children would be expected to display an athletic character. So the troubled young man is deciding on a career, noting that he has excelled equally in both baseball and engineering. "I certainly wouldn't want to have a painful genetic disease. Therefore, I'll choose a career in baseball. Since I've chosen a career in baseball, that means my true prowess is in athletics and therefore, B was my father, and I won't get a genetic disease. Phew!"

Yes, that would be a ridiculous decision process. The difference between the two is this: in Newcomb, the category the decider is in the whole time is defined as definitely determining the decision, whereas in Nozick's parallel it does not (he could've gone either way). Whatever you decide in Newcomb, the alien knew you would go through your whole sequence of contortions, and you were in that category all the while. Whether such a deterministic category is meaningful is a different and probably more interesting question than Newcomb as-is.

Here's another example: you're in a national park, following a marked trail. You get so far along the trail until you come to a frighteningly steep rock face with only a set of cables hammered into it. You reason, "I am about to proceed up these cables. If I'm about to do it, it's only because my action was anticipated by the national park people who design the maps and trails and they can predict my actions as a reasonably fit and sensible hiker, and furthermore they put these cables here; they're not in the business of encouraging people to do foolishly dangerous things. Therefore, because I am going to do it, it is safe and I should do it." (Any reader who's ever braved the cables on Half Dome in Yosemite by him or herself without knowing ahead of time what they were getting into has had this exact experience.) This replicates the decision process relating to the for-some-reason mysterious perfect predictor: "I am about to open Box #2 only. If I'm about to open it, the superintelligent being would have put a million dollars in it. Therefore I should open Box #2 only." In fact, all the time we go through such circular reasoning processes as they relate to other human beings who are predicting our actions either in general or specifically for us: I am going to do A, and A wouldn't be available unless other agents who can predict my actions reasonably well knew I would come along and do A, therefore I should do A. This still may be an epistemological mess (something I'm not going to debate here) but the fact is that we use this kind of reasoning constantly, living in a world shaped in the to-us most salient ways by other agents who can predict our actions.

Incidentally, I used the example of the national park intentionally. That we use this kind of reasoning becomes obvious when you're deciding whether to climb something or take some other risk in a wilderness area, rather than on developed trails with markers: you become acutely aware that this circular justification heuristic based on other agents predicting your actions is suddenly unavailable, and then when it's available again (five miles further on, you run across an old trail) the arrangement seems quite obvious.

As a final note, as in other games (like Prisoner's Dilemma) the payouts can be critically important to how we choose. As the problem is traditionally stated (always a thousand in Box #1, either zero or a million in Box #2), it actually makes the decision quite easy for us, even if we're worried about the fallibility of our brilliant alien benefactor (which again, if we are, then what's the point of this whole exercise!?!?). Making a decision that throws away a thousand for a crack at a million is not a bad deal for most humans in Western democracies. (If someone could show me a business plan that had a 50% chance of turning a thousand bucks into a million within the few minutes that the Newcomb problem could presumably take place in, I'd be stupid not to do it!) On the other hand if I lived in the developing world and made $50 a month and had six kids to feed, I might think harder about this. (This is the idea behind the classic resolution of the St. Petersburg lottery problem: the utility of the same payout differs between agents based on their own context, and the same idea can be applied to other problems as well.) Similarly if it were five hundred thousand in Box #1 and a million in Box #2, things would be more interesting, for my own expected utility at least. Opening a box expecting a million and getting nothing doesn't hurt so much if you would have only got a thousand by playing it safe and opening both; it would be pretty bad if you'd expected a million and got nothing but could still have had half a million if you'd played it safe. (For me. Bill Gates would probably shrug.)
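How much the payouts matter can be made concrete with a short Python sketch. It computes how accurate the being has to be before one-boxing beats two-boxing in expected dollars, assuming the accuracy is the same for either choice. The function name `break_even_accuracy` is my own, and it counts raw dollars only, so it ignores the point above that the same dollars are worth different amounts to different people.

```python
def break_even_accuracy(box1: float, box2: float) -> float:
    """Predictor accuracy p at which one-boxing and two-boxing pay the same
    in expectation. Solves p * box2 == box1 + (1 - p) * box2 for p."""
    return (box1 + box2) / (2 * box2)


# Traditional payouts: a thousand in Box #1, a million (or nothing) in Box #2.
print(break_even_accuracy(1_000, 1_000_000))      # 0.5005

# The variant above: five hundred thousand in Box #1.
print(break_even_accuracy(500_000, 1_000_000))    # 0.75
```

With the traditional payouts, a being barely better than a coin flip is enough to justify one-boxing. With half a million sitting in Box #1, it has to be right three times out of four, which is why that version is more interesting.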

Overall, the whole exercise of Newcomb's Box, as given, seems to me uninteresting and obvious. But enough smart people have gone on debating it for long enough that I must be some kind of philistine who's missing something about it. Nonetheless the arguments I've seen so far are not compelling; feel free to share more.
