My point is that you have two distinct variables or percentages. It's unfortunat...

kragen · on July 10, 2011

There are an infinite number of such pairs of numbers. 87, 25 may be one of them; perhaps 83, 22 is another, and 88, 28 is another, all describing the same distribution. But of all of those pairs of numbers, there is exactly one pair that sums to 100. If you want to compare the inequality of two distributions of wealth (or whatever), that pair of numbers is a reasonable candidate metric: an 80, 20 distribution is more equal than a 90, 10 distribution, and more unequal than a 70, 30 distribution.

The Gini coefficient is another scalar that can be used to rank distributions by inequality.

jamesbritt · on July 10, 2011

There are an infinite number of such pairs of numbers. 87, 25 may be one of them; perhaps 83, 22 is another, and 88, 28 is another, all describing the same distribution.

But they don't, and that's a key point. The numbers are, in fact (in this example) 25% and 87%. Different numbers would describe a different situation. They are not percentages of the same thing; they are percentages of two different things.

kragen · on July 11, 2011

You are mistaken in your assertion that those sets of numbers necessarily describe different distributions or situations.

I will explain this more carefully so that you can understand what the key points actually are. Please take the time to read and understand the explanation below.

You are correct that they are percentages of two different things.

However, if 87% of the wealth, whatever that is, belongs to the richest 25% of the population, then it's entirely possible for 88% of the wealth to simultaneously belong to the richest 28% of the population. That would just mean that 1% of the wealth belongs to the 3% of the population between the 72nd and 75th percentile, which is an entirely plausible state of affairs.

Consider, for any number X from 0 to 100, you can find a number Y that makes the statement "The richest X% of the population owns Y% of the wealth" true, without changing the distribution. Y is continuous and increases monotonically with X; and when X=0, Y=0; and when X=100, Y=100. Under those conditions, there is guaranteed to be exactly one point in [0, 100] where X = 100-Y.
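To make that concrete, here is a minimal sketch that finds the crossing point numerically. The names `top_share` and `crossover` are made up for illustration. It assumes a finite list of non-negative holdings with a positive total, and it interpolates linearly inside a single holder so that Y stays continuous in X.

```python
def top_share(wealth, x):
    """Percent of total wealth held by the richest x percent (0 <= x <= 100).

    Interpolates linearly within one holder so the curve is continuous.
    """
    holdings = sorted(wealth, reverse=True)   # richest first
    total = sum(holdings)
    n = len(holdings)
    people = x / 100 * n                      # possibly fractional head count
    whole = int(people)
    held = sum(holdings[:whole])
    if whole < n:
        held += (people - whole) * holdings[whole]
    return 100 * held / total


def crossover(wealth, tol=1e-9):
    """Bisect for the X at which the richest X% hold (100 - X)% of the wealth.

    X + top_share(X) rises strictly from 0 to 200, so it crosses 100 once.
    """
    lo, hi = 0.0, 100.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid + top_share(wealth, mid) < 100:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x = crossover([90, 5, 5, 0, 0])
print(round(x, 2), round(100 - x, 2))
```

For holdings of 90, 5, 5, 0, 0 the crossing falls at roughly 18% of the population holding roughly 82% of the total, even though the same data also supports the statement "the richest 20% hold 90%".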

If you want to compare two different distributions, it's helpful to oversimplify them to scalars, since otherwise you have vectors in an infinite-dimensional Hilbert space, which are tricky to compare. If you know that in the US, the richest 25% of the population controls 87% of the wealth, while in Argentina, the richest 10% of the population controls 70% of the wealth (it doesn't), you don't know which country is more unequal. It could be that the richest 10% of the population in the US controls as little as 34.8% of the wealth, or as much as 84.4% (not quite the full 87%, since the people between the 10% and 25% marks must each hold at least as much as anyone below them), or anything in between. Furthermore, it could simultaneously be the case that, in the US, the richest 10% of the population controls 80% of the wealth (making the US seem more unequal), while in Argentina, the richest 25% of the population controls 90% of the wealth (making Argentina seem more unequal).
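The lower end of that range, 34.8%, is 87% × 10/25: the case where wealth is spread evenly within the richest 25%. A rough sketch of the bounds calculation follows. `top_share_bounds` is a hypothetical helper, and it assumes nothing beyond people being ranked richest-first. Under that assumption the ceiling comes out slightly below 87%, because the people between the 10% and 25% marks cannot hold less per head than the 75% below them.

```python
def top_share_bounds(p, s, q):
    """Bounds on the share (%) held by the richest q%, given only that the
    richest p% hold s%, for 0 < q <= p < 100.

    Relies solely on people being ranked richest-first.
    """
    # Lowest case: wealth spread evenly within the richest p%.
    lower = s * q / p
    # Highest case: everyone outside the richest q% holds the same amount,
    # namely the per-head average of the people below the p% mark.
    upper = s - (p - q) * (100 - s) / (100 - p)
    return lower, upper


low, high = top_share_bounds(25, 87, 10)
print(round(low, 1), round(high, 1))
```

With the figures above, the richest 10% hold somewhere between about 34.8% and 84.4%, which is far too wide a range to settle the comparison with a country where the richest 10% hold 70%.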

There are lots of possible choices of scalar. The smallest percentage of the population that controls 50% of the wealth is one reasonable candidate. The percentage of the wealth controlled by the richest 50% of the population is another. The Gini coefficient is a third. And that unique point of intersection where the richest X% of the population controls (100-X)% of the wealth is a fourth.
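As an illustration of one of those scalars, here is a small sketch of the Gini coefficient using the mean-absolute-difference definition. The function name `gini` and the sample holdings are assumptions for the example, not data from anywhere; it assumes non-negative holdings with a positive total.

```python
def gini(wealth):
    """Gini coefficient: mean absolute difference over all ordered pairs,
    divided by twice the mean. 0 means perfectly equal holdings."""
    n = len(wealth)
    total = sum(wealth)
    pair_gaps = sum(abs(a - b) for a in wealth for b in wealth)
    return pair_gaps / (2 * n * total)


print(gini([1, 1, 1, 1]))        # equal holdings
print(gini([90, 5, 5, 0, 0]))    # one holder dominates
```

Any of the four scalars gives a total ordering of distributions; they will not always agree on which of two distributions is the more unequal.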

Does that clarify matters?

jamesbritt · on July 12, 2011

Thanks very much for taking the time and trouble to write this. Hopefully I'll learn something, and perhaps also make clear my own points.

However, if 87% of the wealth, whatever that is, belongs to the richest 25% of the population, then it's entirely possible for 88% of the wealth to simultaneously belong to the richest 28% of the population.

Sure. And you can look through the data to find assorted pairs like that, if you have all the data. If you don't then you're guessing.

Consider, for any number X from 0 to 100, you can find a number Y that makes the statement "The richest X% of the population owns Y% of the wealth" true, without changing the distribution. Y is continuous and increases monotonically with X; and when X=0, Y=0; and when X=100, Y=100. Under those conditions, there is guaranteed to be exactly one point in [0, 100] where X = 100-Y.

Sure, and I understand the value of normalizing data in order to compare like things. What I do not see is that all expressions of the Pareto Principle must be given in some normalized form that assumes complete knowledge of the distribution. That knowledge may not be available. That doesn't mean one cannot observe and convey an instance of the Pareto Principle.

Basically, my points are: a) examples of the Pareto Principle do not have to sum to 100. For example, if I have a team of five hackers, and one writes 90% of the code, then 20% of my team does 90% of the work. It's a 90/20 thing, and that's a valid example of the Pareto Principle as given.

Could this be adjusted to some X/Y such that X+Y == 100? Here's where I may be missing your point: if all I know is that one person, hacker #1, is writing 90% of the code, how can I know the actual distributions from 20% to 40% to 60%, etc.? Suppose hacker #2 is writing the other 10% of the code; then I have a 100/40 situation. If hackers 2 and 3 are each writing 5% of the code, then that's 95/40, 100/60 (and 100/80 as well).
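A quick sketch of that arithmetic, assuming the full per-hacker breakdown is known (which is exactly the information that may be missing). `pareto_pairs` is a made-up helper name.

```python
from itertools import accumulate


def pareto_pairs(contributions):
    """List (percent of work, percent of team) pairs, adding one person
    at a time from the largest contributor down."""
    ranked = sorted(contributions, reverse=True)
    total = sum(ranked)
    n = len(ranked)
    return [
        (100 * done / total, 100 * k / n)
        for k, done in enumerate(accumulate(ranked), start=1)
    ]


# Hacker #2 writes the remaining 10%.
print(pareto_pairs([90, 10, 0, 0, 0]))
# Hackers #2 and #3 write 5% each.
print(pareto_pairs([90, 5, 5, 0, 0]))
```

Both breakdowns start with the same 90/20 pair and diverge from the second person on, so knowing only the 90/20 figure leaves the rest of the curve open.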

This is why I say shifting 90/20 to some X+Y==100 formulation is expressing something different (albeit possibly something true). So, point b) is that some sets of numbers are both more accurate and more germane to making a particular point. E.g. 90/20 is more striking, and more accurate, than perhaps 74/26, which may reflect a truth about the distribution but fails to convey anything salient (though it may be handy for comparison with some other data).

In other words, there's an important difference between making an arbitrarily true statement about a distribution, and pointing out a uniquely interesting aspect of that distribution.

My apologies if I'm still being dense, or missing your point entirely, and I do appreciate your explanation. I get the sense we've been talking past each other.

kragen · on July 12, 2011

Agreed.