The Beginning of Infinity: Explanations That Transform the World - David Deutsch (2011)

Chapter 13. Choices

In March 1792 George Washington exercised the first presidential veto in the history of the United States of America. Unless you already know what he and Congress were quarrelling about, I doubt that you will be able to guess, yet the issue remains controversial to this day. With hindsight, one may even perceive a certain inevitability in it, for, as I shall explain, it is rooted in a far-reaching misconception about the nature of human choice, which is still prevalent.

On the face of it, the issue seems no more than a technicality: in the US House of Representatives, how many seats should each state be allotted? This is known as the apportionment problem, because the US Constitution requires seats to be ‘apportioned among the several States … according to their respective Numbers [i.e. their populations]’. So, if your state contained 1 per cent of the US population, it would be entitled to 1 per cent of the seats in the House. This was intended to implement the principle of representative government - that the legislature should represent the people. It was, after all, about the House of Representatives. (The US Senate, in contrast, represents the states of the Union, and hence each state, regardless of population, has two senators.)

At present there are 435 seats in the House of Representatives; so, if 1 per cent of the US population did live in your state, then by strict proportionality the number of representatives to which it would be entitled - known as its quota - would be 4.35. When the quotas are not whole numbers, which of course they hardly ever are, they have to be rounded somehow. The method of rounding is known as an apportionment rule. The Constitution did not specify an apportionment rule; it left such details to Congress, and that is where the centuries of controversy began.

An apportionment rule is said to ‘stay within the quota’ if the number of seats that it allocates to each state never differs from the state’s quota by as much as a whole seat. For instance, if a state’s quota is 4.35 seats, then to ‘stay within the quota’ a rule must assign that state either four seats or five. It may take all sorts of information into account in choosing between four and five, but if it is capable of assigning any other number it is said to ‘violate quota’.
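The arithmetic of quotas can be sketched in a few lines of Python. The function names `quota` and `stays_within_quota` are illustrative assumptions rather than established terminology, and exact fractions are used so that floating-point rounding does not intrude on a discussion that is itself about rounding.

```python
from fractions import Fraction

def quota(state_population, total_population, house_size):
    """Exact, unrounded share of seats under strict proportionality."""
    return Fraction(state_population * house_size, total_population)

def stays_within_quota(seats, q):
    """True if the allocation differs from the quota by less than one whole seat."""
    return abs(seats - q) < 1

# A state holding 1 per cent of the population, in a House of 435 seats.
q = quota(1, 100, 435)
print(float(q))  # 4.35

# The only whole numbers of seats that stay within that quota.
print([n for n in range(10) if stays_within_quota(n, q)])  # [4, 5]
```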

When one first hears of the apportionment problem, compromises that seem to solve it at a stroke spring easily to mind. Everyone asks, ‘Why couldn’t they just … ?’ Here is what I asked: Why couldn’t they just round each state’s quota to the nearest whole number? Under that rule, a quota of 4.35 seats would be rounded down to four; 4.6 seats would be rounded up to five. It seemed to me that, since this sort of rounding can never add or subtract more than half a seat, it would keep each state within half a seat of its quota, thus ‘staying within the quota’ with room to spare.

I was wrong: my rule violates quota. This is easy to demonstrate by applying it to an imaginary House of Representatives with ten seats, in a nation of four states. Suppose that one of the states has just under 85 per cent of the total population, and the other three have just over 5 per cent each. The large state therefore has a quota of just under 8.5, which my rule rounds down to eight. Each of the three small states has a quota of just over half a seat, which my rule rounds up to one. But now we have allocated eleven seats, not ten. In itself that hardly matters: the nation merely has one more legislator to feed than planned. The real problem is that this apportionment is no longer representative: 85 per cent of eleven is not 8.5 but 9.35. So the large state, with only eight seats, is in fact short of its quota by well over one seat. My rule under-represents 85 per cent of the population. Because we intended to allocate ten seats, the exact quotas necessarily add up to ten; but the rounded ones add up to eleven. And if there are going to be eleven seats in the House, the principle of representative government - and the Constitution - requires each state to receive its fair share of those, not of the ten that we merely intended.
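The failure of round-to-nearest can be reproduced with concrete numbers. The populations below are hypothetical figures chosen only to match 'just under 85 per cent' and 'just over 5 per cent'; they do not come from any census.

```python
from fractions import Fraction
from math import floor

# Hypothetical populations (an assumption made for illustration).
populations = {"Large": 8490, "Small1": 505, "Small2": 503, "Small3": 502}
intended_seats = 10
total = sum(populations.values())

def round_nearest(x):
    """Round to the nearest whole number (no quota here is exactly a half)."""
    return floor(x + Fraction(1, 2))

# Round each state's quota of the intended ten seats.
alloc = {s: round_nearest(Fraction(p * intended_seats, total))
         for s, p in populations.items()}
actual_seats = sum(alloc.values())

# The large state's fair share of the seats actually created.
large_quota = Fraction(populations["Large"] * actual_seats, total)
shortfall = large_quota - alloc["Large"]

print(alloc)               # {'Large': 8, 'Small1': 1, 'Small2': 1, 'Small3': 1}
print(actual_seats)        # 11
print(float(large_quota))  # 9.339
print(float(shortfall))    # 1.339
```

The shortfall exceeds one whole seat, so the rule has violated quota even though no individual rounding moved any state by more than half a seat.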

Again, many ‘why don’t they just … ?’ ideas spring to mind. Why don’t they just create three additional seats and give them to the large state, thus bringing the allocation within the quota? (Curious readers may check that no fewer than three additional seats are needed to achieve this.) Alternatively, why don’t they just transfer a seat from one of the small states to the large state? Perhaps it should be from the state with the smallest population, so as to disadvantage as few people as possible. That would not only bring all the allocations within the quota, but also restore the number of seats to the originally intended ten.

Such strategies are known as reallocation schemes. They are indeed capable of staying within the quota. So, what is wrong with them? In the jargon of the subject, the answer is apportionment paradoxes - or, in ordinary language, unfairness and irrationality.

For example, the last reallocation scheme that I described is unfair by being biased against the inhabitants of the least populous state. They bear the whole cost of correcting the rounding errors. On this occasion their representation has been rounded down to zero. Yet, in the sense of minimizing the deviation from the quotas, the apportionment is almost perfectly fair: previously, 85 per cent of the population were well outside the quota, and now all are within it and 95 per cent are at the closest whole numbers to their quotas. It is true that 5 per cent now have no representatives - so they will not be able to vote in congressional elections at all - but that still leaves them within the quota, and indeed only slightly further from their exact quota than they were. (The numbers zero and one are almost equidistant from the quota of just over one half.) Nevertheless, because those 5 per cent have been completely disenfranchised, most advocates of representative government would regard this outcome as much less representative than it was before.

That must mean that the ‘minimum total deviation from quota’ is not the right measure of representativeness. But what is the right measure? What is the right trade-off between being slightly unfair to many people and very unfair to a few people? The Founding Fathers were aware that different conceptions of fairness, or representativeness, could conflict. For example, one of their justifications for democracy was that government was not legitimate unless everyone who was subject to the law had a representative, of equal power, among the lawmakers. This was expressed in their slogan ‘No taxation without representation’. Another of their aspirations was to abolish privilege: they wanted the system of government to have no built-in bias. Hence the requirement of proportional allocation. Since these two aspirations can conflict, the Constitution contains a clause that explicitly adjudicates between them: ‘Each State shall have at least one Representative.’ This favours the principle of representative government in the no-taxation-without-representation sense over the same principle in the abolish-privilege sense.

Another concept that frequently appeared in the Founding Fathers’ arguments for representative government was ‘the will of the people’. Governments are supposed to enact it. But that is a source of further inconsistencies. For in elections, only the will of voters counts, and not all of ‘the people’ are voters. At the time, voters were a fairly small minority: only free male citizens over the age of twenty-one. To address this point, the ‘Numbers’ referred to in the Constitution constituted the whole population of a state, including non-voters such as women, children, immigrants and slaves. In this way the Constitution attempted to treat the population equally by treating voters unequally.

So voters in states with a higher proportion of non-voters were allocated more representatives per capita. This had the perverse effect that in the states where the voters were already the most privileged within the state (i.e. where they were an exceptionally small minority), they now received an additional privilege relative to voters in other states: they were allocated more representation in Congress. This became a hot political issue in regard to slave-owners. Why should slave-owning states be allocated more political clout in proportion to how many slaves they had? To reduce this effect, a compromise was reached whereby a slave counted as three-fifths of a person for the purpose of apportioning seats in the House. But, even so, three-fifths of an injustice was still considered an injustice by many.* The same controversy exists today in regard to illegal immigrants, who also count as part of the population for apportionment purposes. So states with large numbers of illegal immigrants receive extra seats in Congress, while other states correspondingly lose out.

Following the first US census, in 1790, notwithstanding the new Constitution’s requirement of proportionality, seats in the House of Representatives were apportioned under a rule that violated quota. Proposed by the future president Thomas Jefferson, this rule also favoured states with higher populations, giving them more representatives per capita. So Congress voted to scrap it and substitute a rule proposed by Jefferson’s arch-rival Alexander Hamilton, which is guaranteed to give a result that stays within quota as well as having no obvious bias between states.

That was the change that President Washington vetoed. The reason he gave was simply that it involved reallocation: he considered all reallocation schemes unconstitutional, because he interpreted the term ‘apportioned’ as meaning divided by a suitable numerical divisor - and then rounded, but nothing else. Inevitably, some suspected that his real reason was that he, like Jefferson, came from the most populous state, Virginia, which would have lost out under Hamilton’s rule.
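The chapter does not spell out either rule, so the following is a sketch based on the way the two rules are usually stated. Hamilton's is the largest-remainder rule. Jefferson's divides every population by a common divisor and rounds down; it is computed here by the equivalent procedure of awarding seats one at a time. The populations are hypothetical, and the constitutional minimum of one seat per state is deliberately not modelled.

```python
from fractions import Fraction
from math import floor

def hamilton(populations, seats):
    """Largest-remainder rule: round every quota down, then hand the
    leftover seats to the states with the largest fractional parts."""
    total = sum(populations.values())
    quotas = {s: Fraction(p * seats, total) for s, p in populations.items()}
    alloc = {s: floor(q) for s, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    # Ties are broken by input order here, which is an arbitrary choice.
    ranked = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in ranked[:leftover]:
        alloc[s] += 1
    return alloc

def jefferson(populations, seats):
    """Divisor rule with rounding down, computed one seat at a time: each
    seat goes to the state with the largest population / (seats held + 1)."""
    alloc = {s: 0 for s in populations}
    for _ in range(seats):
        winner = max(populations,
                     key=lambda s: Fraction(populations[s], alloc[s] + 1))
        alloc[winner] += 1
    return alloc

def worst_deviation(populations, alloc):
    """Largest gap between any state's seats and its exact quota."""
    total, house = sum(populations.values()), sum(alloc.values())
    return max(abs(alloc[s] - Fraction(p * house, total))
               for s, p in populations.items())

# Hypothetical populations (an assumption made for illustration).
populations = {"Large": 8490, "Small1": 505, "Small2": 503, "Small3": 502}

for rule in (hamilton, jefferson):
    alloc = rule(populations, 10)
    print(rule.__name__, alloc, float(worst_deviation(populations, alloc)))
```

With these figures Hamilton's rule keeps every state within a seat of its quota, while Jefferson's hands all ten seats to the large state, which is more than one and a half seats above its quota of 8.49.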

Ever since, Congress has continually debated and tinkered with the rules of apportionment. Jefferson’s rule was eventually dropped in 1841 in favour of one proposed by Senator Daniel Webster, which does use reallocation. It also violates quota, but very rarely; and it was, like Hamilton’s rule, deemed to be impartial between states.

A decade later, Webster’s rule was in turn dropped in favour of Hamilton’s. The latter’s supporters now believed that the principle of representative government was fully implemented, and perhaps hoped that this would be the end of the apportionment problem. But they were to be disappointed. It was soon causing more controversy than ever, because Hamilton’s rule, despite its impartiality and proportionality, began to make allocations that seemed outrageously perverse. For instance, it was susceptible to what came to be called the population paradox: a state whose population has increased since the last census can lose a seat to one whose population has decreased.

So, ‘why didn’t they just’ create new seats and assign them to states that lose out under a population paradox? They did so. But unfortunately that can bring the allocation outside quota. It can also introduce another historically important apportionment paradox: the Alabama paradox. That happens when increasing the total number of seats in the House results in some state losing a seat.
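Both paradoxes can be exhibited with three states and tiny numbers. The sketch assumes that Hamilton's rule is the largest-remainder rule, as it is usually stated, and every population figure is invented for the demonstration.

```python
from fractions import Fraction
from math import floor

def hamilton(populations, seats):
    """Largest-remainder rule: round every quota down, then hand the
    leftover seats to the states with the largest fractional parts."""
    total = sum(populations.values())
    quotas = {s: Fraction(p * seats, total) for s, p in populations.items()}
    alloc = {s: floor(q) for s, q in quotas.items()}
    leftover = seats - sum(alloc.values())
    ranked = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in ranked[:leftover]:
        alloc[s] += 1
    return alloc

# Alabama paradox: enlarging the House costs state C a seat.
pops = {"A": 6, "B": 6, "C": 2}
print(hamilton(pops, 10))  # {'A': 4, 'B': 4, 'C': 2}
print(hamilton(pops, 11))  # {'A': 5, 'B': 5, 'C': 1}

# Population paradox: A grows and C shrinks, yet A loses a seat to C,
# because the rapid growth of B shifts everyone's quota.
before = {"A": 552, "B": 297, "C": 151}
after = {"A": 553, "B": 327, "C": 150}
print(hamilton(before, 10))  # {'A': 6, 'B': 3, 'C': 1}
print(hamilton(after, 10))   # {'A': 5, 'B': 3, 'C': 2}
```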

And there were other paradoxes. These were not necessarily unfair in the sense of being biased or disproportionate. They are called ‘paradoxes’ because an apparently reasonable rule makes apparently unreasonable changes between one apportionment and the next. Such changes are effectively random, being due to the vagaries of rounding errors, not to any bias, and in the long run they cancel out. But impartiality in the long run does not achieve the intended purpose of representative government. Perfect ‘fairness in the long run’ could be achieved even without elections, by selecting the legislature randomly from the electorate as a whole. But, just as a coin tossed randomly one hundred times is unlikely to produce exactly fifty heads and fifty tails, so a randomly chosen legislature of 435 would in practice never be representative on any one occasion: statistically, the typical deviation from representativeness would be about eight seats. There would also be large fluctuations in how those seats were distributed among states. The apportionment paradoxes that I have described have similar effects.
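The coin-tossing comparison can be made quantitative under some loudly stated assumptions: that the electorate is split into two equal blocs, that each of the 435 members is drawn independently, and that 'typical deviation' is read as the mean absolute deviation from an even split. The chapter does not say how its own figure was obtained, so this is only one reading, though it gives a number of similar size.

```python
from math import comb

def prob_exactly_half(n):
    """Chance that n fair coin tosses give exactly n/2 heads (n even)."""
    return comb(n, n // 2) / 2 ** n

def mean_abs_deviation(n):
    """Average distance between the number of heads in n fair tosses and
    n/2, computed exactly from the binomial distribution."""
    weighted = sum(comb(n, k) * abs(2 * k - n) for k in range(n + 1))
    return weighted / 2 ** (n + 1)

# One hundred tosses rarely give exactly fifty heads (under one chance in ten).
print(prob_exactly_half(100))

# Average departure, in seats, of a random 435-member body from an even split.
print(mean_abs_deviation(435))
```

Under these assumptions the second figure comes out a little above eight seats.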

The number of seats involved is usually small, but that does not make it unimportant. Politicians worry about this because votes in the House of Representatives are often very close. Bills quite often pass or fail by one vote, and political deals often depend on whether individual representatives join one faction or another. So, whenever apportionment paradoxes have caused political discord, people have tried to invent an apportionment rule that is mathematically incapable of causing that particular paradox. Particular paradoxes always make it look as though everything would be fine if only ‘they’ made some simple change or other. Yet the paradoxes as a whole have the infuriating property that, no matter how firmly they are kicked out of the front door, they instantly come in again at the back.

After Hamilton’s rule was adopted, in 1851, Webster’s still enjoyed substantial support. So Congress tried, on at least two occasions, a trick that seemed to provide a judicious compromise: adjust the number of seats in the House until the two rules agree. Surely that would please everyone! Yet the upshot was that in 1871 some states considered the result to be so unfair, and the ensuing compromise legislation was so chaotic, that it was unclear what allocation rule, if any, had been decided upon. The apportionment that was implemented - which included the last-minute creation of several additional seats for no apparent reason - satisfied neither Hamilton’s rule nor Webster’s. Many considered it unconstitutional.

For the next few decades after 1871, every census saw either the adoption of a new apportionment rule or a change in the number of seats, designed to compromise between different rules. In 1921 no apportionment was made at all: they kept the old one (a course of action that may well have been unconstitutional again), because Congress could not agree on a rule.

The apportionment issue has been referred several times to eminent mathematicians, including twice to the National Academy of Sciences, and on each occasion these authorities have made different recommendations. Yet none of them ever accused their predecessors of making errors in mathematics. This ought to have warned everyone that this problem is not really about mathematics. And on each occasion, when the experts’ recommendations were implemented, paradoxes and disputes kept on happening.

In 1901 the Census Bureau published a table showing what the apportionments would be for every number of seats between 350 and 400 using Hamilton’s rule. By a quirk of arithmetic of a kind that is common in apportionment, Colorado would get three seats for each of these numbers except 357, when it would get only two seats. The chairman of the House Committee on Apportionment (who was from Illinois: I do not know whether he had anything against Colorado) proposed that the number of seats be changed to 357 and that Hamilton’s rule be used. This proposal was regarded with suspicion, and Congress eventually rejected it, adopting a 386-member apportionment and Webster’s rule, which also gave Colorado its ‘rightful’ three seats. But was that apportionment really any more rightful than Hamilton’s rule with 357 seats? By what criterion? Majority voting among apportionment rules?

What exactly would be wrong with working out what a large number of rival apportionment rules would do, and then allocating to each state the number of representatives that the majority of the schemes would allocate? The main thing is that that is itself an apportionment rule. Similarly, combining Hamilton’s and Webster’s schemes as they tried to do in 1871 just constituted adopting a third scheme. And what does such a scheme have going for it? Each of its constituent schemes was presumably designed to have some desirable properties. A combined scheme that was not designed to have those properties will not have them, except by coincidence. So it will not necessarily inherit the good features of its constituents. It will inherit some good ones and some bad ones, and have additional good and bad features of its own - but if it was not designed to be good, why should it be?

A devil’s advocate might now ask: if majority voting among apportionment rules is such a bad idea, why is majority voting among voters a good idea? It would be disastrous to use it in, say, science. There are more astrologers than astronomers, and believers in ‘paranormal’ phenomena often point out that purported witnesses of such phenomena outnumber the witnesses of most scientific experiments by a large factor. So they demand proportionate credence. Yet science refuses to judge evidence in that way: it sticks with the criterion of good explanation. So if it would be wrong for science to adopt that ‘democratic’ principle, why is it right for politics? Is it just because, as Churchill put it, ‘Many forms of Government have been tried and will be tried in this world of sin and woe. No one pretends that democracy is perfect or all-wise. Indeed, it has been said that democracy is the worst form of government except all those other forms that have been tried from time to time’? That would indeed be a sufficient reason. But there are cogent positive reasons as well, and they too are about explanation, as I shall explain.

Sometimes politicians have been so perplexed by the sheer perverseness of apportionment paradoxes that they have been reduced to denouncing mathematics itself. Representative Roger Q. Mills of Texas complained in 1882, ‘I thought … that mathematics was a divine science. I thought that mathematics was the only science that spoke to inspiration and was infallible in its utterances [but] here is a new system of mathematics that demonstrates the truth to be false.’ In 1901 Representative John E. Littlefield, whose own seat in Maine was under threat from the Alabama paradox, said, ‘God help the State of Maine when mathematics reach for her and undertake to strike her down.’

As a matter of fact, there is no such thing as mathematical ‘inspiration’ (mathematical knowledge coming from an infallible source, traditionally God): as I explained in Chapter 8, our knowledge of mathematics is not infallible. But if Representative Mills meant that mathematicians are, or somehow ought to be, society’s best judges of fairness, then he was simply mistaken.* The National Academy of Sciences panel that reported to Congress in 1948 included the mathematician and physicist John von Neumann. It decided that a rule invented by the statistician Joseph Adna Hill (which is the one in use today) is the most impartial between states. But the mathematicians Michel Balinski and Peyton Young have since concluded that it favours smaller states. This illustrates again that different criteria of ‘impartiality’ favour different apportionment rules, and which of them is the right criterion cannot be determined by mathematics. Indeed, if Representative Mills intended his complaint ironically - if he really meant that mathematics alone could not possibly be causing injustice and that mathematics alone could not cure it - then he was right.

However, there is a mathematical discovery that has changed for ever the nature of the apportionment debate: we now know that the quest for an apportionment rule that is both proportional and free from paradoxes can never succeed. Balinski and Young proved this in 1975.

Balinski and Young’s Theorem
Every apportionment rule that stays within the quota suffers from the population paradox.

This powerful ‘no-go’ theorem explains the long string of historical failures to solve the apportionment problem. Never mind the various other conditions that may seem essential for an apportionment to be fair: no apportionment rule can meet even the bare-bones requirements of proportionality and the avoidance of the population paradox. Balinski and Young also proved no-go theorems involving other classic paradoxes.

This work had a much broader context than the apportionment problem. During the twentieth century, and especially following the Second World War, a consensus had emerged among most major political movements that the future welfare of humankind would depend on an increase in society-wide (preferably worldwide) planning and decision-making. The Western consensus differed from its totalitarian counterparts in that it expected the object of the exercise to be the satisfaction of individual citizens’ preferences. So Western advocates of society-wide planning were forced to address a fundamental question that totalitarians do not encounter: when society as a whole faces a choice, and citizens differ in their preferences among the options, which option is it best for society to choose? If people are unanimous, there is no problem - but no need for a planner either. If they are not, which option can be rationally defended as being ‘the will of the people’ - the option that society ‘wants’? And that raises a second question: how should society organize its decision-making so that it does indeed choose the options that it ‘wants’? These two questions had been present, at least implicitly, from the beginning of modern democracy. For instance, the US Declaration of Independence and the US Constitution both speak of the right of ‘the people’ to do certain things such as remove governments. Now they became the central questions of a branch of mathematical game theory known as social-choice theory.

Thus game theory - formerly an obscure and somewhat whimsical branch of mathematics - was suddenly thrust to the centre of human affairs, just as rocketry and nuclear physics had been. Many of the world’s finest mathematical minds, including von Neumann, rose to the challenge of developing the theory to support the needs of the countless institutions of collective decision-making that were being set up. They would create new mathematical tools which, given what all the individuals in a society want or need, or prefer, would distil what that society ‘wants’ to do, thus implementing the aspiration of ‘the will of the people’. They would also determine what systems of voting and legislating would give society what it wants.

Some interesting mathematics was discovered. But little, if any, of it ever met those aspirations. On the contrary, time and again the assumptions behind social-choice theory were proved to be incoherent or inconsistent by ‘no-go’ theorems like that of Balinski and Young.

Thus it turned out that the apportionment problem, which had absorbed so much legislative time, effort and passion, was the tip of an iceberg. The problem is much less parochial than it looks. For instance, rounding errors are proportionately smaller with a larger legislature. So why don’t they just make the legislature very big - say, ten thousand members - so that all the rounding errors would be trivial? One reason is that such a legislature would have to organize itself internally to make any decisions. The factions within the legislature would themselves have to choose leaders, policies, strategies, and so on. Consequently, all the problems of social choice would arise within the little ‘society’ of a party’s contingent in the legislature. So it is not really about rounding errors. Also, it is not only about people’s top preferences: once we are considering the details of decision-making in large groups - how legislatures and parties and factions within parties organize themselves to contribute their wishes to ‘society’s wishes’ - we have to take into account their second and third choices, because people still have the right to contribute to decision-making if they cannot persuade a majority to agree to their first choice. Yet electoral systems designed to take such factors into account invariably introduce more paradoxes and no-go theorems.

One of the first of the no-go theorems was proved in 1951 by the economist Kenneth Arrow, and it contributed to his winning the Nobel prize for economics in 1972. Arrow’s theorem appears to deny the very existence of social choice - and to strike at the principle of representative government, and apportionment, and democracy itself, and a lot more besides.

This is what Arrow did. He first laid down five elementary axioms that any rule defining the ‘will of the people’ - the preferences of a group - should satisfy, and these axioms seem, at first sight, so reasonable as to be hardly worth stating. One of them is that the rule should define a group’s preferences only in terms of the preferences of that group’s members. Another is that the rule must not simply designate the views of one particular person to be ‘the preferences of the group’ regardless of what the others want. That is called the ‘no-dictator’ axiom. A third is that if the members of the group are unanimous about something - in the sense that they all have identical preferences about it - then the rule must deem the group to have those preferences too. Those three axioms are all expressions, in this situation, of the principle of representative government.

Arrow’s fourth axiom is this. Suppose that, under a given definition of ‘the preferences of the group’, the rule deems the group to have a particular preference - say, for pizza over hamburger. Then it must still deem that to be the group’s preference if some members who previously disagreed with the group (i.e. they preferred hamburger) change their minds and now prefer pizza. This constraint is similar to ruling out a population paradox. A group would be irrational if it changed its ‘mind’ in the opposite direction to its members.

The last axiom is that if the group has some preference, and then some members change their minds about something else, then the rule must continue to assign the group that original preference. For instance, if some members have changed their minds about the relative merits of strawberries and raspberries, but none of their preferences about the relative merits of pizza and hamburger have changed, then the group’s preference between pizza and hamburger must not be deemed to have changed either. This constraint can again be regarded as a matter of rationality: if no members of the group change any of their opinions about a particular comparison, nor can the group.
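To see how easily a plausible rule can break this last axiom, consider a points-based rule of the kind often called a Borda count. That rule is an assumption introduced here for illustration and is not discussed in the text, and the third option, salad, is likewise invented. Each voter ranks the options, an option earns more points the higher it is ranked, and the group is deemed to prefer whichever of two options has more points.

```python
def borda_scores(ballots):
    """Total points per option; ballots is a list of (ranking, voters)."""
    scores = {}
    for ranking, voters in ballots:
        # Last place earns 0 points, next-to-last 1, and so on upwards.
        for points, option in enumerate(reversed(ranking)):
            scores[option] = scores.get(option, 0) + points * voters
    return scores

def group_prefers(ballots, a, b):
    """True if the rule deems the group to prefer a over b."""
    scores = borda_scores(ballots)
    return scores[a] > scores[b]

def individual_views(ballots, a, b):
    """For each bloc of voters, whether it ranks a above b."""
    return [ranking.index(a) < ranking.index(b) for ranking, _ in ballots]

before = [(("pizza", "hamburger", "salad"), 3),
          (("hamburger", "salad", "pizza"), 2)]
# The bloc of two changes its mind about salad versus pizza only.
after = [(("pizza", "hamburger", "salad"), 3),
         (("hamburger", "pizza", "salad"), 2)]

# Nobody's view of pizza versus hamburger has changed ...
print(individual_views(before, "pizza", "hamburger"))  # [True, False]
print(individual_views(after, "pizza", "hamburger"))   # [True, False]

# ... yet the group's deemed preference between them has flipped.
print(group_prefers(before, "pizza", "hamburger"))  # False
print(group_prefers(after, "pizza", "hamburger"))   # True
```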

Arrow proved that the axioms that I have just listed are, despite their reasonable appearance, logically inconsistent with each other. No way of conceiving of ‘the will of the people’ can satisfy all five of them. This strikes at the assumptions behind social-choice theory at an arguably even deeper level than the theorems of Balinski and Young. First, Arrow’s axioms are not about the apparently parochial issue of apportionment, but about any situation in which we want to conceive of a group having preferences. Second, all five of these axioms are intuitively not just desirable to make a system fair, but essential for it to be rational. Yet they are inconsistent.

It seems to follow that a group of people jointly making decisions is necessarily irrational in one way or another. It may be a dictatorship, or under some sort of arbitrary rule; or, if it meets all three representativeness conditions, then it must sometimes change its ‘mind’ in a direction opposite to that in which criticism and persuasion have been effective. So it will make perverse choices, no matter how wise and benevolent the people who interpret and enforce its preferences may be - unless, possibly, one of them is a dictator (see below). So there is no such thing as ‘the will of the people’. There is no way to regard ‘society’ as a decision-maker with self-consistent preferences. This is hardly the conclusion that social-choice theory was supposed to report back to the world.

As with the apportionment problem, there were attempts to fix the implications of Arrow’s theorem with ‘why don’t they just … ?’ ideas. For instance, why not take into account how intense people’s preferences are? For, if slightly over half the electorate barely prefers X to Y, but the rest consider it a matter of life and death that Y should be done, then most intuitive conceptions of representative government would designate Y as ‘the will of the people’. But intensities of preferences, and especially the differences in intensities among different people, or between the same person at different times, are notoriously difficult to define, let alone measure - like happiness. And, in any case, including such things makes no difference: there are still no-go theorems.

As with the apportionment problem, it seems that whenever one patches up a decision-making system in one way, it becomes paradoxical in another. A further serious problem that has been identified in many decision-making institutions is that they create incentives for participants to lie about their preferences. For instance, if there are two options of which you mildly prefer one, you have an incentive to register your preference as ‘strong’ instead. Perhaps you are prevented from doing that by a sense of civic responsibility. But a decision-making system moderated by civic responsibility has the defect that it gives disproportionate weight to the opinions of people who lack civic responsibility and are willing to lie. On the other hand, a society in which everyone knows everyone sufficiently well to make such lying difficult cannot have an effectively secret ballot, and the system will then give disproportionate weight to the faction most able to intimidate waverers.

One perennially controversial social-choice problem is that of devising an electoral system. Such a system is mathematically similar to an apportionment scheme, but, instead of allocating seats to states on the basis of population, it allocates them to candidates (or parties) on the basis of votes. However, it is more paradoxical than apportionment and has more serious consequences, because in the case of elections the element of persuasion is central to the whole exercise: an election is supposed to determine what the voters have become persuaded of. (In contrast, apportionment is not about states trying to persuade people to migrate from other states.) Consequently an electoral system can contribute to, or can inhibit, traditions of criticism in the society concerned.

For example, an electoral system in which seats are allocated wholly or partly in proportion to the number of votes received by each party is called a ‘proportional-representation’ system. We know from Balinski and Young that, if an electoral system is too proportional, it will be subject to the analogue of the population paradox and other paradoxes. And indeed the political scientist Peter Kurrild-Klitgaard, in a study of the most recent eight general elections in Denmark (under its proportional-representation system), showed that every one of them manifested paradoxes. These included the ‘More-Preferred-Less-Seats paradox’, in which a majority of voters prefer party X to party Y, but party Y receives more seats than party X.

But that is really the least of the irrational attributes of proportional representation. A more important one - which is shared by even the mildest of proportional systems - is that they assign disproportionate power in the legislature to the third-largest party, and often to even smaller parties. It works like this. It is rare (in any system) for a single party to receive an overall majority of votes. Hence, if votes are reflected proportionately in the legislature, no legislation can be passed unless some of the parties cooperate to pass it, and no government can be formed unless some of them form a coalition. Sometimes the two largest parties manage to do this, but the most common outcome is that the leader of the third-largest party holds the ‘balance of power’ and decides which of the two largest parties shall join it in government, and which shall be sidelined, and for how long. That means that it is correspondingly harder for the electorate to decide which party, and which policies, will be removed from power.
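A toy calculation shows how little the third party's seat share constrains its bargaining position. The seat numbers and the function `pivotal_counts` are hypothetical. For every coalition that commands a majority, the sketch counts the parties whose departure would destroy that majority; this is a simple pivot count, not a model of any real legislature.

```python
from itertools import combinations

# Hypothetical seat shares in a 100-seat legislature.
seats = {"First": 45, "Second": 43, "Third": 12}
majority = sum(seats.values()) // 2 + 1  # 51 seats needed to govern

def pivotal_counts(seats, majority):
    """For each party, the number of majority coalitions that would lose
    their majority if that party walked out."""
    counts = {party: 0 for party in seats}
    parties = list(seats)
    for size in range(1, len(parties) + 1):
        for coalition in combinations(parties, size):
            strength = sum(seats[p] for p in coalition)
            if strength < majority:
                continue  # not a governing coalition
            for p in coalition:
                if strength - seats[p] < majority:
                    counts[p] += 1
    return counts

print(pivotal_counts(seats, majority))  # {'First': 2, 'Second': 2, 'Third': 2}
```

On this count a party with 12 seats is pivotal exactly as often as one with 45.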

In Germany (formerly West Germany) between 1949 and 1998, the Free Democratic Party (FDP) was the third largest.* Though it never received more than 12.8 per cent of the vote, and usually much less, the country’s proportional-representation system gave it power that was insensitive to changes in the voters’ opinions. On several occasions it chose which of the two largest parties would govern, twice changing sides and three times choosing to put the less popular of the two (as measured by votes) into power. The FDP’s leader was usually made a cabinet minister as part of the coalition deal, with the result that for the last twenty-nine years of that period Germany had only two weeks without an FDP foreign minister. In 1998, when the FDP was pushed into fourth place by the Green Party, it was immediately ousted from government, and the Greens assumed the mantle of kingmakers. And they took charge of the Foreign Ministry as well. This disproportionate power that proportional representation gives the third-largest party is an embarrassing feature of a system whose whole raison d’être, and supposed moral justification, is to allocate political influence proportionately.

Arrow’s theorem applies not only to collective decision-making but also to individuals, as follows. Consider a single, rational person faced with a choice between several options. If the decision requires thought, then each option must be associated with an explanation - at least a tentative one - for why it might be the best. To choose an option is to choose its explanation. So how does one decide which explanation to adopt?

Common sense says that one ‘weighs’ them - or weighs the evidence that their arguments present. This is an ancient metaphor. Statues of Justice have carried scales since antiquity. More recently, inductivism has cast scientific thinking in the same mould, saying that scientific theories are chosen, justified and believed - and somehow even formed in the first place - according to the ‘weight of evidence’ in their favour.

Consider that supposed weighing process. Each piece of evidence, including each feeling, prejudice, value, axiom, argument and so on, depending on what ‘weight’ it had in that person’s mind, would contribute that amount to that person’s ‘preferences’ between various explanations. Hence for the purposes of Arrow’s theorem each piece of evidence can be regarded as an ‘individual’ participating in the decision-making process, where the person as a whole would be the ‘group’.

Now, the process that adjudicates between the different explanations would have to satisfy certain constraints if it were to be rational. For instance, if, having decided that one option was the best, the person received further evidence that gave additional weight to that option, then the person’s overall preference would still have to be for that option - and so on. Arrow’s theorem says that those requirements are inconsistent with each other, and so seems to imply that all decision-making - all thinking - must be irrational. Unless, perhaps, one of the internal agents is a dictator, empowered to override the combined opinions of all the other agents. But this is an infinite regress: how does the ‘dictator’ itself choose between rival explanations about which other agents it would be best to override?
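A small example may help to show the kind of inconsistency at issue. The sketch below is not a proof of Arrow's theorem, which covers every possible aggregation rule; it illustrates only one rule, pairwise majority, and the three explanations P, Q and R and the three 'pieces of evidence' that rank them are hypothetical. Each piece of evidence has a perfectly consistent ranking, yet the combined 'weighing' has none.

```python
from itertools import permutations

EXPLANATIONS = ["P", "Q", "R"]

# Hypothetical rankings: each "piece of evidence" orders the explanations
# from most to least favoured.
RANKINGS = [
    ["P", "Q", "R"],
    ["Q", "R", "P"],
    ["R", "P", "Q"],
]


def majority_prefers(rankings, a, b):
    """True if more than half of the rankings place a above b."""
    wins = sum(1 for r in rankings if r.index(a) < r.index(b))
    return wins * 2 > len(rankings)


pairwise = {
    (a, b): majority_prefers(RANKINGS, a, b)
    for a, b in permutations(EXPLANATIONS, 2)
}

# An overall ranking is consistent with the pairwise results only if each
# option beats every option listed after it.
consistent_orderings = [
    order
    for order in permutations(EXPLANATIONS)
    if all(
        pairwise[(order[i], order[j])]
        for i in range(len(order))
        for j in range(i + 1, len(order))
    )
]

print("P beats Q:", pairwise[("P", "Q")])
print("Q beats R:", pairwise[("Q", "R")])
print("R beats P:", pairwise[("R", "P")])
print("Consistent overall rankings:", consistent_orderings)
```

The majority prefers P to Q, Q to R and R to P, so no overall ranking agrees with all three verdicts.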

There is something very wrong with that entire conventional model of decision-making, both within single minds and for groups as assumed in social-choice theory. It conceives of decision-making as a process of selecting from existing options according to a fixed formula (such as an apportionment rule or electoral system). But in fact that is what happens only at the end of decision-making - the phase that does not require creative thought. In terms of Edison’s metaphor, the model refers only to the perspiration phase, without realizing that decision-making is problem-solving, and that without the inspiration phase nothing is ever solved and there is nothing to choose between. At the heart of decision-making is the creation of new options and the abandonment or modification of existing ones.

To choose an option, rationally, is to choose the associated explanation. Therefore, rational decision-making consists not of weighing evidence but of explaining it, in the course of explaining the world. One judges arguments as explanations, not justifications, and one does this creatively, using conjecture, tempered by every kind of criticism. It is in the nature of good explanations - being hard to vary - that there is only one of them. Having created it, one is no longer tempted by the alternatives. They have been not outweighed, but out-argued, refuted and abandoned. During the course of a creative process, one is not struggling to distinguish between countless different explanations of nearly equal merit; typically, one is struggling to create even one good explanation, and, having succeeded, one is glad to be rid of the rest.

Another misconception to which the idea of decision-making by weighing sometimes leads is that problems can be solved by weighing - in particular, that disputes between advocates of rival explanations can be resolved by creating a weighted average of their proposals. But the fact is that a good explanation, being hard to vary at all without losing its explanatory power, is hard to mix with a rival explanation: something halfway between them is usually worse than either of them separately. Mixing two explanations to create a better explanation requires an additional act of creativity. That is why good explanations are discrete - separated from each other by bad explanations - and why, when choosing between explanations, we are faced with discrete options.

In complex decisions, the creative phase is often followed by a mechanical, perspiration phase in which one ties down details of the explanation that are not yet hard to vary but can be made so by non-creative means. For example, an architect whose client asks how tall a skyscraper can be built, given certain constraints, does not just calculate that number from a formula. The decision-making process may end with such a calculation, but it begins creatively, with ideas for how the client’s priorities and constraints might best be met by a new design. And, before that, the clients had to decide - creatively - what those priorities and constraints should be. At the beginning of that process they would not have been aware of all the preferences that they would end up presenting to architects. Similarly, a voter may look through lists of the various parties’ policies, and may even assign each issue a ‘weight’ to represent its importance; but one can do that only after one has thought about one’s political philosophy, and has explained to one’s own satisfaction how important that makes the various issues, what policies the various parties are likely to adopt in regard to those issues, and so on.

The type of ‘decision’ considered in social-choice theory is choosing from options that are known and fixed, according to preferences that are known, fixed and consistent. The quintessential example is a voter’s choice, in the polling booth, not of which candidate to prefer but of which box to check. As I have explained, this is a grossly inadequate, and inaccurate, model of human decision-making. In reality, the voter is choosing between explanations, not checkboxes, and, while very few voters choose to affect the checkboxes themselves, by running for office, all rational voters create their own explanation for which checkbox they personally should choose.

So it is not true that decision-making necessarily suffers from those crude irrationalities - not because there is anything wrong with Arrow’s theorem or any of the other no-go theorems, but because social-choice theory is itself based on false assumptions about what thinking and deciding consist of. It is Zeno’s mistake. It is mistaking an abstract process that it has named decision-making for the real-life process of the same name.

Similarly, what is called a ‘dictator’ in Arrow’s theorem is not necessarily a dictator in the ordinary sense of the word. It is simply any agent to whom the society’s decision-making rules assign the sole right to make a particular decision regardless of the preferences of anyone else. Thus, every law that requires an individual’s consent for something - such as the law against rape, or against involuntary surgery - establishes a ‘dictatorship’ in the technical sense used in Arrow’s theorem. Everyone is a dictator over their own body. The law against theft establishes a dictatorship over one’s own possessions. A free election is, by definition, one in which every voter is a dictator over their own ballot paper. Arrow’s theorem itself assumes that all the participants are in sole control of their contributions to the decision-making process. More generally, the most important conditions for rational decision-making - such as freedom of thought and of speech, tolerance of dissent, and the self-determination of individuals - all require ‘dictatorships’ in Arrow’s mathematical sense. It is understandable that he chose that term. But it has nothing to do with the kind of dictatorship that has secret police who come for you in the middle of the night if you criticize them.

Virtually all commentators have responded to these paradoxes and no-go theorems in a mistaken and rather revealing way: they regret them. This illustrates the confusion to which I am referring. They wish that these theorems of pure mathematics were false. If only mathematics permitted it, they complain, we human beings could set up a just society that makes its decisions rationally. But, faced with the impossibility of that, there is nothing left for us to do but to decide which injustices and irrationalities we like best, and to enshrine them in law. As Webster wrote, of the apportionment problem, ‘That which cannot be done perfectly must be done in a manner as near perfection as can be. If exactness cannot, from the nature of things, be attained, then the nearest practicable approach to exactness ought to be made.’

But what sort of ‘perfection’ is a logical contradiction? A logical contradiction is nonsense. The truth is simpler: if your conception of justice conflicts with the demands of logic or rationality then it is unjust. If your conception of rationality conflicts with a mathematical theorem (or, in this case, with many theorems) then your conception of rationality is irrational. To stick stubbornly to logically impossible values not only guarantees failure in the narrow sense that one can never meet them, it also forces one to reject optimism (‘every evil is due to lack of knowledge’), and so deprives one of the means to make progress. Wishing for something that is logically impossible is a sign that there is something better to wish for. Moreover, if my conjecture in Chapter 8 is true, an impossible wish is ultimately uninteresting as well.

We need something better to wish for. Something that is not incompatible with logic, reason or progress. We have already encountered it. It is the basic condition for a political system to be capable of making sustained progress: Popper’s criterion that the system facilitate the removal of bad policies and bad governments without violence. That entails abandoning ‘who should rule?’ as a criterion for judging political systems. The entire controversy about apportionment rules and all other issues in social-choice theory has traditionally been framed by all concerned in terms of ‘who should rule?’: what is the right number of seats for each state, or for each political party? What does the group - presumed entitled to rule over its subgroups and individuals - ‘want’, and what institutions will get it what it ‘wants’?

So let us reconsider collective decision-making in terms of Popper’s criterion instead. Instead of wondering earnestly which of the self-evident yet mutually inconsistent criteria of fairness, representativeness and so on are the most self-evident, so that they can be entrenched, we judge such criteria, along with all other actual or proposed political institutions, according to how well they promote the removal of bad rulers and bad policies. To do this, they must embody traditions of peaceful, critical discussion - of rulers, policies and the political institutions themselves.

In this view, any interpretation of the democratic process as merely a way of consulting the people to find out who should rule or what policies to implement misses the point of what is happening. An election does not play the same role in a rational society as consulting an oracle or a priest, or obeying orders from the king, did in earlier societies. The essence of democratic decision-making is not the choice made by the system at elections, but the ideas created between elections. And elections are merely one of the many institutions whose function is to allow such ideas to be created, tested, modified and rejected. The voters are not a fount of wisdom from which the right policies can be empirically ‘derived’. They are attempting, fallibly, to explain the world and thereby to improve it. They are, both individually and collectively, seeking the truth - or should be, if they are rational. And there is an objective truth of the matter. Problems are soluble. Society is not a zero-sum game: the civilization of the Enlightenment did not get where it is today by cleverly sharing out the wealth, votes or anything else that was in dispute when it began. It got here by creating ex nihilo. In particular, what voters are doing in elections is not synthesizing a decision of a superhuman being, ‘Society’. They are choosing which experiments are to be attempted next, and (principally) which are to be abandoned because there is no longer a good explanation for why they are best. The politicians, and their policies, are those experiments.

When one uses no-go theorems such as Arrow’s to model real decision-making, one has to assume - quite unrealistically - that none of the decision-makers in the group is able to persuade the others to modify their preferences, or to create new preferences that are easier to agree on. The realistic case is that neither the preferences nor the options need be the same at the end of a decision-making process as they were at the beginning.

Why don’t they just … fix social-choice theory by including creative processes such as explanation and persuasion in its mathematical model of decision-making? Because it is not known how to model a creative process. Such a model would be a creative process: an AI.

The conditions of ‘fairness’ as conceived in the various social-choice problems are misconceptions analogous to empiricism: they are all about the input to the decision-making process - who participates, and how their opinions are integrated to form the ‘preference of the group’. A rational analysis must concentrate instead on how the rules and institutions contribute to the removal of bad policies and rulers, and to the creation of new options.

Sometimes such an analysis does endorse one of the traditional requirements, at least in part. For instance, it is indeed important that no member of the group be privileged or deprived of representation. But this is not so that all members can contribute to the answer. It is because such discrimination entrenches in the system a preference among their potential criticisms. It does not make sense to include everyone’s favoured policies, or parts of them, in the new decision; what is necessary for progress is to exclude ideas that fail to survive criticism, and to prevent their entrenchment, and to promote the creation of new ideas.

Proportional representation is often defended on the grounds that it leads to coalition governments and compromise policies. But compromises - amalgams of the policies of the contributors - have an undeservedly high reputation. Though they are certainly better than immediate violence, they are generally, as I have explained, bad policies. If a policy is no one’s idea of what will work, then why should it work? But that is not the worst of it. The key defect of compromise policies is that when one of them is implemented and fails, no one learns anything because no one ever agreed with it. Thus compromise policies shield the underlying explanations which do at least seem good to some faction from being criticized and abandoned.

The system used to elect members of the legislatures of most countries in the British political tradition is that each district (or ‘constituency’) in the country is entitled to one seat in the legislature, and that seat goes to the candidate with the largest number of votes in that district. This is called the plurality voting system (‘plurality’ meaning ‘largest number of votes’) - often called the ‘first-past-the-post’ system, because there is no prize for any runner-up, and no second round of voting (both of which feature in other electoral systems for the sake of increasing the proportionality of the outcomes). Plurality voting typically ‘over-represents’ the two largest parties, compared with the proportion of votes they receive. Moreover, it is not guaranteed to avoid the population paradox, and is even capable of bringing one party to power when another has received far more votes in total.

These features are often cited as arguments against plurality voting and in favour of a more proportional system - either literal proportional representation or other schemes such as transferable-vote systems and run-off systems which have the effect of making the representation of voters in the legislature more proportional. However, under Popper’s criterion, that is all insignificant in comparison with the greater effectiveness of plurality voting at removing bad governments and policies.

Let me trace the mechanism of that advantage more explicitly. Following a plurality-voting election, the usual outcome is that the party with the largest total number of votes has an overall majority in the legislature, and therefore takes sole charge. All the losing parties are removed entirely from power. This is rare under proportional representation, because some of the parties in the old coalition are usually needed in the new one. Consequently, the logic of plurality is that politicians and political parties have little chance of gaining any share in power unless they can persuade a substantial proportion of the population to vote for them. That gives all parties the incentive to find better explanations, or at least to convince more people of their existing ones, for if they fail they will be relegated to powerlessness at the next election.

In the plurality system, the winning explanations are then exposed to criticism and testing, because they can be implemented without mixing them with the most important claims of opposing agendas. Similarly, the winning politicians are solely responsible for the choices they make, so they have the least possible scope for making excuses later if those are deemed to have been bad choices. If, by the time of the next election, they are less convincing to the voters than they were, there is usually no scope for deals that will keep them in power regardless.

Under a proportional system, small changes in public opinion seldom count for anything, and power can easily shift in the opposite direction to public opinion. What counts most is changes in the opinion of the leader of the third-largest party. This shields not only that leader but most of the incumbent politicians and policies from being removed from power through voting. They are more often removed by losing support within their own party, or by shifting alliances between parties. So in that respect the system badly fails Popper’s criterion. Under plurality voting, it is the other way round. The all-or-nothing nature of the constituency elections, and the consequent low representation of small parties, makes the overall outcome sensitive to small changes in opinion. When there is a small shift in opinion away from the ruling party, it is usually in real danger of losing power completely.

Under proportional representation, there are strong incentives for the system’s characteristic unfairnesses to persist, or to become worse, over time. For example, if a small faction defects from a large party, it may then end up with more chance of having its policies tried out than it would if its supporters had remained within the original party. This results in a proliferation of small parties in the legislature, which in turn increases the necessity for coalitions - including coalitions with the smaller parties, which further increases their disproportionate power. In Israel, the country with the world’s most proportional electoral system, this effect has been so severe that, at the time of writing, even the two largest parties combined cannot muster an overall majority. And yet, under that system - which has sacrificed all other considerations in favour of the supposed fairness of proportionality - even proportionality itself is not always achieved: in the election of 1992, the right-wing parties as a whole received a majority of the popular vote, but the left-wing ones had a majority of the seats. (That was because a greater proportion of the fringe parties that failed to reach the threshold for receiving even one seat were right-wing.)
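The threshold effect described in the parenthesis can be reproduced with invented numbers. In the sketch below the parties, their vote counts, the 2 per cent threshold and the largest-remainder rounding are all assumptions made for illustration; they are not the actual 1992 figures, nor the exact Israeli seat formula. Only the 120-seat size is taken from the real Knesset.

```python
# Hypothetical parties: name -> (bloc, votes out of 1000).
PARTIES = {
    "L1": ("left", 300),
    "L2": ("left", 185),
    "R1": ("right", 330),
    "R2": ("right", 140),
    "R3": ("right", 15),  # fringe parties below the threshold
    "R4": ("right", 15),
    "R5": ("right", 15),
}
SEATS = 120
THRESHOLD_PERMILLE = 20  # assumed threshold: 2 per cent of all votes


def allocate(parties, seats, threshold_permille):
    """Proportional allocation among parties that clear the threshold."""
    total_votes = sum(v for _, v in parties.values())
    qualifying = {
        name: v
        for name, (_, v) in parties.items()
        if v * 1000 >= threshold_permille * total_votes
    }
    pool = sum(qualifying.values())
    alloc = {name: v * seats // pool for name, v in qualifying.items()}
    leftover = seats - sum(alloc.values())
    ranked = sorted(qualifying, key=lambda n: qualifying[n] * seats % pool, reverse=True)
    for name in ranked[:leftover]:
        alloc[name] += 1  # hand out remaining seats by largest remainder
    return alloc


def bloc_totals(parties, per_party):
    """Sum a per-party quantity (votes or seats) over each bloc."""
    totals = {}
    for name, amount in per_party.items():
        bloc = parties[name][0]
        totals[bloc] = totals.get(bloc, 0) + amount
    return totals


seat_alloc = allocate(PARTIES, SEATS, THRESHOLD_PERMILLE)
vote_blocs = bloc_totals(PARTIES, {name: v for name, (_, v) in PARTIES.items()})
seat_blocs = bloc_totals(PARTIES, seat_alloc)

print("Votes by bloc:", vote_blocs)
print("Seats by bloc:", seat_blocs)
```

Here the right-wing bloc has 515 of 1,000 votes but 59 of 120 seats, because 45 of its votes went to parties that fell below the threshold.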

In contrast, the error-correcting attributes of the plurality voting system have a tendency to avoid the paradoxes to which the system is theoretically prone, and quickly to undo them when they do happen, because all those incentives are the other way round. For instance, in the Canadian province of Manitoba in 1926, the Conservative Party received more than twice as many votes as any other party, but won none of the seventeen seats allocated to that province. As a result it lost power in the national Parliament despite having received the most votes nationally too. And yet, even in that rare, extreme case, the disproportionateness between the two main parties’ representations in Parliament was not that great: the average Liberal voter received 1.31 times as many members of Parliament as the average Conservative one. And what happened next? In the following election the Conservative Party again had the largest number of votes nationally, but this time that gave it an overall majority in Parliament. Its vote had increased by 3 per cent of the electorate, but its representation had increased by 17 per cent of the total number of seats, bringing the parties’ shares of seats back into rough proportionality and satisfying Popper’s criterion with flying colours.

This is partly due to yet another beneficial feature of the plurality system, namely that elections are often very close, in terms of votes as well as in the sense that all members of the government are at serious risk of being removed. In proportional systems, elections are rarely close in either sense. What is the point of giving the party with the most votes the most seats, if the party with the third-largest number of seats can then put the second-largest party in power regardless - there to enact a compromise platform that absolutely no one voted for? The plurality voting system almost always produces situations in which a small change in the vote produces a relatively large change (in the same direction!) in who forms a government. The more proportional a system is, the less sensitive the content of the resulting government and its policies are to changes in votes.
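The difference in sensitivity can be illustrated with a deliberately crude toy model. Everything in the sketch below is an assumption: two parties only, 100 single-seat districts whose support for party A is spread evenly around the national figure, and a uniform swing that shifts every district by the same amount. It compares how many seats party A wins under plurality and under strict proportionality when its national vote moves from 49 to 51 per cent.

```python
DISTRICTS = 100  # assumed number of single-seat districts


def district_shares(national_permille):
    """Party A's share in each district, in tenths of a per cent.

    District offsets run from -99 to +99 in steps of 2, so they average
    to zero and the mean district share equals the national share.
    """
    return [national_permille + (2 * i - 99) for i in range(DISTRICTS)]


def plurality_seats(national_permille):
    """Seats won by A when each district goes to whoever has over half."""
    return sum(1 for share in district_shares(national_permille) if share > 500)


def proportional_seats(national_permille):
    """Seats won by A when seats simply mirror the national vote share."""
    return national_permille * DISTRICTS // 1000


for national in (490, 510):
    print(
        "A's national vote:", national / 10, "per cent;",
        "plurality seats:", plurality_seats(national), ";",
        "proportional seats:", proportional_seats(national),
    )
```

In this model a two-point shift in the national vote moves ten seats under plurality but only two under proportionality. With a third party present, which this sketch omits, the proportional outcome would in addition depend on that party's coalition choice.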

Unfortunately there are political phenomena that can violate Popper’s criterion even more strongly than bad electoral systems - for example, entrenched racial divisions, or various traditions of political violence. Hence I do not intend the above discussion of electoral systems to constitute a blanket endorsement of plurality voting as the One True System of democracy, suitable for all polities under all circumstances. Even democracy itself is unworkable under some circumstances. But in the advanced political cultures of the Enlightenment tradition the creation of knowledge can and should be paramount, and the idea that representative government depends on proportionate representation in the legislature is unequivocally a mistake.

In the United States’ system of government, the Senate is required to be representative in a different sense from the House of Representatives: states are represented equally, in recognition of the fact that each state is a separate political entity with its own distinctive political and legal tradition. Each of them is entitled to two Senate seats, regardless of population. Because the states differ greatly in their populations (currently the most populous state, California, has nearly seventy times the population of the least populous, Wyoming), the Senate’s apportionment rule creates enormous deviations from population-based proportionality - much larger than those that are so hotly disputed in regard to the House of Representatives. And yet historically, after elections, it is rare for the Senate and the House of Representatives to be controlled by different parties. This suggests that there is more going on in this vast process of apportionments and elections than merely ‘representation’ - the mirroring of the population by the legislature. Could it be that the problem-solving that is promoted by the plurality voting system is continually changing the options of the voters, and also their preferences among the options, through persuasion? And so opinions and preferences are, despite appearances, converging - not in the sense of there being less disagreement (since solutions create new problems), but in the sense of creating ever more shared knowledge.

In science, we do not consider it surprising that a community of scientists with different initial hopes and expectations, continually in dispute about their rival theories, gradually come into near-unanimous agreement over a steady stream of issues (yet still continue to disagree all the time). It is not surprising because, in their case, there are observable facts that they can use to test their theories. They converge with each other on any given issue because they are all converging on the objective truth. In politics it is customary to be cynical about that sort of convergence being possible.

But that is a pessimistic view. Throughout the West, a great deal of philosophical knowledge that is nowadays taken for granted by almost everyone - say, that slavery is an abomination, or that women should be free to go out to work, or that autopsies should be legal, or that promotion in the armed forces should not depend on skin colour - was highly controversial only a matter of decades ago, and originally the opposite positions were taken for granted. A successful truth-seeking system works its way towards broad consensus or near-unanimity - the one state of public opinion that is not subject to decision-theoretic paradoxes and where ‘the will of the people’ makes sense. So convergence in the broad consensus over time is made possible by the fact that all concerned are gradually eliminating errors in their positions and converging on objective truths. Facilitating that process - by meeting Popper’s criterion as well as possible - is more important than which of two contending factions with near-equal support gets its way at a particular election.

In regard to the apportionment issue too, since the United States’ Constitution was instituted there have been enormous changes in the prevailing conception of what it means for a government to be ‘representative’. Recognizing the right of women to vote, for instance, doubled the number of voters - and implicitly admitted that in every previous election half the population had been disenfranchised, and the other half over-represented compared with a just representation. In numerical terms, such injustices dwarf all the injustices of apportionment that have absorbed so much political energy over the centuries. But it is to the credit of the political system, and of the people of the United States and of the West in general, that, while they were fiercely debating the fairness of shifting a few percentage points’ worth of representation between one state and another, they were also debating, and making, these momentous improvements. And they too became uncontroversial.

Apportionment systems, electoral systems and other institutions of human cooperation were for the most part designed, or evolved, to cope with day-to-day controversy, to cobble together ways of proceeding without violence despite intense disagreement about what would be best. And the best of them succeed as well as they do because they have, often unintentionally, implemented solutions with enormous reach. Consequently, coping with controversy in the present has become merely a means to an end. The purpose of deferring to the majority in democratic systems should be to approach unanimity in the future, by giving all concerned the incentive to abandon bad ideas and to conjecture better ones. Creatively changing the options is what allows people in real life to cooperate in ways that no-go theorems seem to say are impossible; and it is what allows individual minds to choose at all.

The growth of the body of knowledge about which there is unanimous agreement does not entail a dying-down of controversy: on the contrary, human beings will never disagree any less than they do now, and that is a very good thing. If those institutions do, as they seem to, fulfil the hope that it is possible for changes to be for the better, on balance, then human life can improve without limit as we advance from misconception to ever better misconception.

TERMINOLOGY

Representative government: A system of government in which the composition or opinions of the legislature reflect those of the people.

Social-choice theory: The study of how the ‘will of society’ can be defined in terms of the wishes of its members, and of what social institutions can cause society to enact its will, thus defined.

Popper’s criterion: Good political institutions are those that make it as easy as possible to detect whether a ruler or policy is a mistake, and to remove rulers or policies without violence when they are.

MEANINGS OF ‘THE BEGINNING OF INFINITY’ ENCOUNTERED IN THIS CHAPTER

- Choice that involves creating new options rather than weighing existing ones.

- Political institutions that meet Popper’s criterion.

SUMMARY

It is a mistake to conceive of choice and decision-making as a process of selecting from existing options according to a fixed formula. That omits the most important element of decision-making, namely the creation of new options. Good policies are hard to vary, and therefore conflicting policies are discrete and cannot be arbitrarily mixed. Just as rational thinking does not consist of weighing the justifications of rival theories, but of using conjecture and criticism to seek the best explanation, so coalition governments are not a desirable objective of electoral systems. They should be judged by Popper’s criterion of how easy they make it to remove bad rulers and bad policies. That designates the plurality voting system as best in the case of advanced political cultures.