We have all had arguments. Occasionally these reach an agreed-upon conclusion, but usually the parties involved either agree to disagree or end up thinking the other party hopelessly stupid, ignorant or irrationally stubborn. Very rarely do people consider the possibility that it is they who are ignorant, stupid, irrational or stubborn, even when they have good reason to believe that the other party is at least as intelligent or educated as themselves.
Sometimes the argument is about something factual where the facts can be easily checked, e.g. who won a certain football match in 1966.
Sometimes the facts aren’t so easily checked because they are difficult to understand, but the problem is clear and objective. One famous example is the Monty Hall dilemma. In a certain game show a contestant is presented with three doors, behind one of which is a really desirable main prize and behind the other two are booby prizes. The contestant is asked to choose the door he thinks the main prize is behind. The game show host, who knows where the prize is, then opens a booby prize door other than the one the contestant chose. He can always do that, even if the contestant picked the main prize door. The contestant is now offered the chance to change his choice to the other closed door. The question is: should he? The correct answer is that he should, because switching doubles his chances of winning the main prize (from 1 in 3 to 2 in 3). This fact proved so hard to grasp that only 8% of the laymen and 35% of the academics who tried it got it right initially. Many of them damned Marilyn vos Savant for misleading the public with her answer in her Parade column Ask Marilyn. Among the academics getting it wrong were a fair number of senior professional mathematicians and statisticians, including the great Paul Erdős. After a great deal of explanation, the proportion of people who accepted the correct answer increased to 56% of laymen and 71% of academics. The figure is more like 97-100% for those who carried out an experiment or simulation, so in essence a large proportion of those approaching the problem through pure logic continue to fail to grasp it.
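Since simulation is what seems to convince most people, here is a minimal Python sketch of the game as described above. The function names (play_round, win_rate), the trial count and the random seed are illustrative choices of mine. The sketch assumes the host always opens a booby prize door other than the contestant's pick, choosing at random when he has two to choose from.

```python
import random

def play_round(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant ends up with the main prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the prize is and opens a booby prize door
    # that is not the contestant's pick (assumed behaviour of the host).
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the only door that is neither the original pick nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated rounds won when always staying or always switching."""
    rng = random.Random(seed)
    return sum(play_round(switch, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(f"stay:   {win_rate(False):.3f}")
    print(f"switch: {win_rate(True):.3f}")
```

With enough trials the two rates should settle near 1/3 for staying and 2/3 for switching.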
Sometimes the facts aren’t as mathematical or logical as the Monty Hall solution. Each party to the argument appeals to ‘facts’ which the other party disputes. The disputed facts could be anything from the validity of the theory underlying a phenomenon to the empirical results supposedly shedding light on the topic. A good example would be the debate among economists about the causes of the most recent US recession and the most useful way out of it. On one side are those who think the solution is less government spending to reduce deficits, and simply leaving the economy to painfully sort out major structural mal-investments and imbalances. They believe intervention is likely to create worse problems later. The other side says aggregate demand is the problem and that the recession can and should be fixed via some kind of fiscal and monetary intervention. This side believes intervention will make things better, not worse, in both the long and short term. Both sides claim that the other side’s view has been thoroughly discredited by past events, and will point to current events ‘obviously’ supporting their expectations or, when events haven’t done so, will issue dire warnings that they soon will. A similarly insoluble argument is being had around the ‘facts’ of global warming.
Sometimes the arguments boil down to differences in values. For example, which tastes better, chocolate or vanilla ice cream, or who is prettier, Jane or Mary? In these cases there isn’t really a correct answer, even when a large majority favors a particular alternative. Values also have a strong way of influencing what people accept as evidence or indeed what they perceive at all.
The unreasonableness of continuing disagreement
The interesting thing is that when the disagreement isn’t a pure values difference it should always be possible to reach agreement. Robert Aumann, a Nobel winning game theorist (who I’ve had the pleasure of chatting to) proved that under conditions of common knowledge it isn’t possible to agree to disagree – even when the parties start with completely different information, facts and theories. In a simplified format it goes like this. Party A says the answer is X. Party B, who considers the answer to be Y, hears this and, if rational, will think “A must either have access to evidence that I don’t or doesn’t have access to some evidence that I do.” Under the notion of evidence I include not only facts but also the reasoning process. B may say that “Notwithstanding the possibility that I may have missed evidence I am still very confident in the evidence that I have so I will say I think the answer is Y”. A now hears B’s answer and goes through a similar chain of reasoning. He thinks “Gosh B has evidence he is so confident about that exposure to the knowledge that I have different evidence hasn’t shaken him. He must think his evidence is particularly strong. I should take that into account when evaluating my own evidence.” At this point A could decide he isn’t all that confident in his own evidence, and concede the argument to B. Alternatively he could decide that in spite of B’s confidence he still considers his own evidence to be persuasive, and re-affirm that he thinks the answer is X.
The ball now passes back to B who, now faced with A’s continued confidence in his evidence even after making allowances for B’s confidence in his own evidence, must upgrade his view of the strength of A’s evidence relative to his own. He must then decide whether he still thinks his evidence is strong enough to carry the day. He can decide “No it isn’t”, and concede the argument to A, or “Yes it is” and say he still considers Y to be correct. The process goes on until one of the parties concedes. At any point either party’s actual evidence can, and probably will, be shared and explained. Some readers may recognize this as an iterative Bayesian process. Others have extended Aumann’s analysis, and have shown that the process won’t go to infinity and should come to a conclusion in a reasonable number of iterations. The upshot of this is that if an argument doesn’t result in an agreement, at least one of the parties involved is being irrational or dishonest.
The rest of the article makes the unrealistic assumption that people will be rational and honest when arguing.
IQ and relative correctness
Item response theory connects a latent trait, e.g. intelligence or IQ, with the probability of getting a particular item in a test correct. Typically the item response curves look like this.
The formula producing these lines is
p(X)=exp(a*IQ+b)/(1+exp(a*IQ+b)).
The “a” coefficient tells one how steeply the probability rises as IQ rises i.e. how much solving the item depends on ability as measured by IQ versus how much the solution depends on uncertainty and luck. The “b” coefficient tells one the difficulty level of the item.
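As a rough illustration of the formula, the sketch below evaluates the curve at a few IQ levels. The value a = 0.086 is the one quoted later in this article for efficient IQ test items; the value of b is hypothetical, chosen only so that the item is a coin flip at IQ 100 (in this parameterization the 50% point sits at IQ = -b/a).

```python
import math

def p_correct(iq: float, a: float, b: float) -> float:
    """Probability of a correct answer: exp(a*IQ + b) / (1 + exp(a*IQ + b))."""
    z = a * iq + b
    return 1.0 / (1.0 + math.exp(-z))  # algebraically the same expression

A = 0.086        # steepness, as quoted later for efficient IQ items
B = -A * 100     # hypothetical difficulty: p = 0.5 exactly at IQ 100

if __name__ == "__main__":
    for iq in (85, 100, 115, 130):
        print(iq, round(p_correct(iq, A, B), 3))
```

A larger "a" makes the curve steeper around its midpoint, while changing "b" slides the whole curve left or right along the IQ axis.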
Suppose we select two IQ levels and compare the chances of a correct answer at each level. With a bit of arithmetic we can show that the ratio of the odds of a correct answer, where the odds are p/(1-p), is
[p(IQ2)/(1-p(IQ2))] / [p(IQ1)/(1-p(IQ1))] = exp(a*(IQ2 - IQ1)).
Suppose two people with IQs at level 1 and 2 respectively disagree about the answer. In effect they argue about it. Then the probability of person 2 being right in the event of a disagreement is
p(2 is right) = p(2)*(1-p(1)) / (p(2)*(1-p(1)) + p(1)*(1-p(2))). Similarly for p(1 is right).
More arithmetic gives
p(2 is right)/p(1 is right) = exp(a*( IQ2 - IQ1)) too.
Therefore, when two people argue over the correctness of something, the probability of who is right is determined by the difference between their respective abilities and the degree to which solving that problem actually depends on ability. The difficulty of the item is irrelevant.
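The claim that difficulty drops out can be checked numerically. The sketch below follows the disagreement formula as written above, which treats a disagreement as a case where exactly one of the two people is right and treats their answers as independent given IQ. The IQ levels and the three values of b are hypothetical.

```python
import math

def p_correct(iq: float, a: float, b: float) -> float:
    """Item response curve: exp(a*IQ + b) / (1 + exp(a*IQ + b))."""
    return 1.0 / (1.0 + math.exp(-(a * iq + b)))

def p2_right_given_disagreement(iq1: float, iq2: float, a: float, b: float) -> float:
    """P(person 2 is right | exactly one of the two is right)."""
    p1, p2 = p_correct(iq1, a, b), p_correct(iq2, a, b)
    return p2 * (1 - p1) / (p2 * (1 - p1) + p1 * (1 - p2))

def odds(p: float) -> float:
    return p / (1 - p)

if __name__ == "__main__":
    a, iq1, iq2 = 0.086, 100, 115
    print("exp(a * gap):", math.exp(a * (iq2 - iq1)))
    for b in (-12.0, -9.0, -6.0):  # hypothetical items, from hard to easy
        ratio = odds(p2_right_given_disagreement(iq1, iq2, a, b))
        print(f"b = {b}: p(2 is right)/p(1 is right) = {ratio}")
```

The printed ratio should be the same for every b, up to floating point noise, and should match exp(a * gap).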
A chess diversion
Chess’s ELO rating system uses a similar method to calculate the probability of a player winning, but uses base 10 rather than base e. So according to the ELO rating system used by FIDE
p(player a)/p(player b) = 10**((Ra - Rb)/400) (which has the same form as exp(a*(IQ2 - IQ1))).
- where Ra is the ELO rating of player a.
This means that if two players’ ratings differ by 200 points then the higher ranked player should take roughly 3 out of every 4 wins between them. A ratings difference of 400 points means 10 wins for the higher ranked player for every win for the lower ranked player. The median rating for members of the US Chess Federation was 657, and a rating of 1000 is regarded as a bright beginner. International Grandmasters typically rate 2500+, and the very best players have ratings slightly over 2800. To give you an idea of the differences in skill, consider that if the very best were to play an average player the ratio of wins is likely to be 227772 to 1. The difference between a grandmaster and a good beginner would be 5623 to 1.
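These win ratios come straight from the rating formula, and a few lines of Python reproduce them. This is only a sketch: the function names are mine, and, like the text, it counts wins only and ignores draws.

```python
def win_ratio(rating_diff: float, scale: float = 400.0) -> float:
    """Expected wins of the higher rated player per win of the lower rated one."""
    return 10 ** (rating_diff / scale)

def win_share(rating_diff: float, scale: float = 400.0) -> float:
    """Share of decisive games expected to go to the higher rated player."""
    r = win_ratio(rating_diff, scale)
    return r / (1 + r)

if __name__ == "__main__":
    matchups = [
        ("200 point gap", 200),
        ("400 point gap", 400),
        ("very best (2800) vs median USCF member (657)", 2800 - 657),
        ("grandmaster (2500) vs bright beginner (1000)", 2500 - 1000),
    ]
    for label, diff in matchups:
        print(f"{label}: {win_ratio(diff):,.1f} to 1, share {win_share(diff):.4f}")
```

The scale argument is left adjustable because other games divide the rating difference by a different number.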
The identities above mean that we can easily convert an IQ difference into an equivalent ELO ratings difference, using the following formula
Ratings difference = 173.7178*a*(IQ difference)
- “a” is the coefficient telling us how much the item depends on IQ for its solution.
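The constant 173.7178 is not derived in the text. One way to see where it comes from is to set the chess win ratio equal to the item response odds ratio and solve for the ratings difference. The symbols \Delta R (ratings difference) and \Delta IQ (IQ difference) are shorthand introduced here for the sketch.

```latex
\[
10^{\Delta R / 400} = e^{a \, \Delta IQ}
\;\Longrightarrow\;
\frac{\Delta R}{400} \ln 10 = a \, \Delta IQ
\;\Longrightarrow\;
\Delta R = \frac{400}{\ln 10} \, a \, \Delta IQ \approx 173.7178 \, a \, \Delta IQ
\]
```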
I looked at the distribution of FIDE ratings in order to convert them to an IQ metric. For example, about 2% of chess players have ratings of 2300 or higher and 0.02% have ratings over 2700. If IQs are normally distributed, with an SD of 16, then these ELO ratings would correspond to 132 and 157 on the IQ scale. Note this doesn’t mean that chess players with a rating of 2700 will actually have an average IQ of 157; it’s just a different way of specifying the same thing, like a change from the Imperial to the Metric system.
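The percentile to IQ conversion can be sketched with Python's standard library. This assumes, as the text does, a normal distribution with mean 100 and SD 16; the function name is mine, and the two fractions are the ones quoted above.

```python
from statistics import NormalDist

# IQ metric assumed in the text: normal, mean 100, SD 16.
IQ_SCALE = NormalDist(mu=100, sigma=16)

def iq_equivalent(top_fraction: float) -> float:
    """Score on the IQ metric that only `top_fraction` of the group exceeds."""
    return IQ_SCALE.inv_cdf(1.0 - top_fraction)

if __name__ == "__main__":
    for rating, fraction in ((2300, 0.02), (2700, 0.0002)):
        print(f"rating {rating}: IQ metric {iq_equivalent(fraction):.1f}")
```

The results should land close to the 132 and 157 quoted above; small differences come down to rounding of the percentages.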
I did however find a study (1) that allowed me to map real IQs onto chess ratings in experienced players. The equation is
Chess rating = 18.75*IQ – 275.
It turns out that the expected real IQs are very close to the IQ metric I calculated from the ratings distribution. (Note that the equation I developed is quite different from the one hypothesized by Jonathan Levitt (3) i.e. Rating = 10*IQ + 1000.) One should also be aware that the equation gives an average IQ – the actual IQs vary quite a bit around the expected figure. For example, the authors show that threshold effects exist and that the minimum IQ needed to achieve a rating of 2000 is around 85-90. This is 30-35 IQ points lower than the expected IQ. Also from his peak rating Garry Kasparov’s expected IQ is 167 (and wild claims of 180+ have been made) but his actual IQ was measured at 135 (in a test sponsored by Stern magazine), some 32 IQ points lower.
I suppose one could derive a rule that the minimum IQ required for a peak rating is some 32 IQ points (or 2 SDs) below the expected IQ. Alternatively, it means that if you have a combination of memory and industry in line with elite professional chess players, your peak rating is likely to be 600 ELO points higher than it would be if you were like an average chess player in these respects. Your chances of winning could be as much as 31.6 fold higher than your IQ suggests – or that much lower. That says something about the relative value of focused application.
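The arithmetic in the last two paragraphs can be reproduced from the regression equation. In this sketch the function names are mine, and Kasparov's peak rating of 2851 is a figure I am supplying as an assumption, since the article does not state it.

```python
def expected_rating(iq: float) -> float:
    """Chess rating = 18.75*IQ - 275 (the equation from the study cited above)."""
    return 18.75 * iq - 275

def expected_iq(rating: float) -> float:
    """The same equation solved for IQ."""
    return (rating + 275) / 18.75

def win_ratio(rating_diff: float) -> float:
    """Chess win ratio implied by a ratings difference."""
    return 10 ** (rating_diff / 400)

KASPAROV_PEAK = 2851  # assumption: peak rating, not given in the article

if __name__ == "__main__":
    print("expected IQ at rating 2000:", round(expected_iq(2000), 1))
    print("expected IQ at Kasparov's peak:", round(expected_iq(KASPAROV_PEAK), 1))
    gap = expected_rating(132) - expected_rating(100)  # a 32 point IQ gap
    print("rating points per 32 IQ points:", gap)
    print("win ratio for that gap:", round(win_ratio(gap), 1))
```

The 32 point figure is the observed shortfall discussed above, so the 600 point and 31.6 fold figures are only as firm as that estimate.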
Assuming that the distribution of combined effort and memory is symmetrical, it also means that a 64 IQ point advantage cannot be overcome, even if the brighter player is also among the very laziest with a bad memory, and the less intelligent player has a superb memory and is among the most dedicated.
Even after accounting for IQ and work the predicted ratings are still a little fuzzy so perhaps random factors play a role too.
IQ and ELO rating differences in other domains
Here I look at converting the effect of IQ gaps to ELO rating differences across a variety of domains.
Let’s get back to the IQ to ELO rating conversion. Recall that the equation is
Ratings difference = 173.7178*a*(IQ difference).
All that remains is to find “a” for everything we are interested in.
Clear Objective problems
The obvious place to start is IQ test items. The “a” coefficient for fuzzier IQ test items tends to be around 0.046, and around 0.086 for really efficient IQ test items. That means that for fuzzy items each additional IQ point is worth 8 ELO points, and for good IQ items it is worth 15 ELO points. If these items are used in a weird tournament where players compete to solve puzzles instead of playing games, and we set the bar at a 3 to 1 win ratio (roughly a 200 point ELO rating difference), then fuzzy items will require a 25 point IQ gap, and efficient items a 13-14 point IQ gap.
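The numbers in this paragraph follow from the conversion equation, as the sketch below shows. The two "a" values are the ones quoted above; the function names and the constant name are mine.

```python
import math

ELO_PER_LOGIT = 400 / math.log(10)  # the 173.7178 in the conversion equation

def elo_per_iq_point(a: float) -> float:
    """ELO rating points gained per additional IQ point for an item with slope a."""
    return ELO_PER_LOGIT * a

def iq_gap_for(elo_diff: float, a: float) -> float:
    """IQ gap needed to produce a given ELO rating difference."""
    return elo_diff / elo_per_iq_point(a)

if __name__ == "__main__":
    for label, a in (("fuzzy item", 0.046), ("efficient item", 0.086)):
        print(f"{label}: {elo_per_iq_point(a):.1f} ELO per IQ point, "
              f"{iq_gap_for(200, a):.1f} IQ points for a 200 point gap")
```

The same two functions work for any other domain once its "a" coefficient is known.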
Physics mastery
Using information from an article by Steve Hsu (2) I worked out that a 3 fold advantage at “winning” at a physics exam – where a ‘win’ is an A in the exam when your opponent failed to get an A – requires an IQ gap of 12 points. If however a win is defined as a 3.5 GPA (where your opponent fails to attain this), then a mere 6 IQ points will provide a 3 fold advantage.
Crime
We could view cops and killers as being involved in a grim contest. In the USA around 65% of all murders are solved. That converts to an average “murder” ELO rating difference between police and murderers of 108 ELO points. It is also known that the mean IQs of murderers and policemen are 87 and 102, respectively. So if successfully solving murders is treated as a puzzle, then the “a” coefficient is 0.041, and each IQ point difference is worth 7.2 ELO points. A 3 fold advantage could be had with a 28 point gap between cops and killers. In other words some 31% of outstanding murders could be solved if the USA selected its policemen to have an average IQ of 125 i.e. to be as smart as an average lawyer. I’m not sure if that’s worth it but maybe some cost benefit analysis would help. Such an analysis would have to take into account the drop in murder rates (with a life currently being valued at $2 million) due to the greater odds of being caught, and the opportunity cost of taking professional level IQs out of the pool for other professions, where they might be even more productive.
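The first few figures in this paragraph can be recovered by running the rating formula backwards from the clearance rate. The sketch below uses only the inputs quoted above (65% solved, mean IQs of 102 and 87); the function names are mine.

```python
import math

def elo_diff_from_win_share(share: float) -> float:
    """ELO rating difference implied by the stronger side's share of wins."""
    return 400 * math.log10(share / (1 - share))

def a_from(elo_diff: float, iq_gap: float) -> float:
    """The "a" coefficient implied by an ELO difference and an IQ gap."""
    return elo_diff * math.log(10) / (400 * iq_gap)

SOLVED_SHARE = 0.65                   # share of murders solved, from the text
IQ_POLICE, IQ_MURDERERS = 102, 87     # mean IQs, from the text

if __name__ == "__main__":
    gap = IQ_POLICE - IQ_MURDERERS
    elo = elo_diff_from_win_share(SOLVED_SHARE)
    print("ELO difference:", round(elo))
    print("a coefficient:", round(a_from(elo, gap), 3))
    print("ELO points per IQ point:", round(elo / gap, 1))
    print("IQ gap for a 200 point difference:", round(200 / (elo / gap), 1))
```

This treats the clearance rate as the police's share of wins in a series of one-on-one contests, which is the framing used in the text.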
Controversial issues
Finally we get back to real arguments – disagreements over controversial topics. According to my Smart Vote concept (see http://garthzietsman.blogspot.com/2011/10/smart-vote-concept.html), if proportionally more smart people systematically favor an alternative then that alternative is likely to be correct or better. Using that definition of “correct”, and information in the General Social Survey, I calculated the “a” coefficients and ELO to IQ ratios etc for a few controversial questions. Typically it would take an IQ difference of 30-50 points to gain a 3 fold advantage in a rational argument.
Tasks with a high level of uncertainty
For comparison I looked at an intellectual game that includes a large element of chance – backgammon. A rating equation gives
p(player a)/p(player b) = 10**(Rating diff/2000)
for individual games. The difference from chess is that it divides the rating difference by 2000 instead of 400. It would take the result of a string of 21-22 games to provide the same test of relative skill as does a single game of chess. Controversial questions are less fuzzy, less uncertain, or more tractable to intelligence, than backgammon. If people ‘played’ a series of arguments over controversial questions instead of a series of backgammon games, then it would take maybe 5-6 such arguments to provide the same test of relative skill/wisdom in arguments, as a single game does in chess.
A summary table
Some meaningful IQ or ELO rating differences
Some research shows that friends and spouses have an average IQ difference of 12 points, that for IQ differences less than 20 points a reciprocal intellectual relationship is the rule, for IQ differences between 20-30 points the intellectual relationship tends to be one way, and that IQ differences greater than 30 points tend to create real barriers to communication.
An IQ gap of 12 points implies a roughly 67-72% chance of winning an argument over a clear objective issue, like verbal or math problems, and close to a 57-61% chance of winning an argument over a controversial question. A 30 point IQ gap implies an 87-91% chance of winning a verbal/math item argument, and a 63-67% chance of winning an argument over controversial issues. It seems as though there is a very fine line between intimacy and incomprehension on controversial issues (a mere 6 percentage point difference) but a fair gap on more objective issues (a 20 percentage point difference).
Perhaps it isn’t so much the IQ gap that matters to people, as the proportion of differences of opinion they win or lose i.e. the ELO rating difference. That in turn depends on the balance of clear objective, uncertain, and controversial issues in their disagreements. In general however, it seems that people don’t like to lose more than 2 in 3 disagreements, and when they lose more than 3 in 4 of them they feel like they aren’t on the same planet anymore. Those proportions correspond roughly to ELO ratings differences of 100 and 200 respectively. An ELO rating difference of less than 100 feels tolerable and reciprocal while a difference of more than 200 feels unfair or unbalanced. If that theory is right, when most of the issues are fuzzy or uncertain the larger IQ differences should occur between friends and spouses, but when they are mostly clear objective issues then those IQ differences will be smaller.
References
1. Individual differences in chess expertise: A psychometric investigation. Roland H Grabner, Elsbeth Stern & Aljoscha C Neubauer. Acta Psychologica (2006), but I got it from www.sciencedirect.com.
2. Non-linear Psychometric Thresholds for Physics and Mathematics. Steven D.H. Hsu & James Schombert. http://arxiv.org/abs/1011.0663
3. Genius in Chess. Jonathan Levitt.



