ZIPF'S LAW

This is an excerpt from The Quark and the Jaguar by Murray Gell-Mann, Freeman & Co, 1994.
... Often, however, we
encounter less than ideal cases. We may find regularities, predict
that similar regularities will occur elsewhere, discover that the
prediction is confirmed, and thus identify a robust pattern: however,
it may be a pattern for which the explanation continues to elude
us. In such a case we speak of an "empirical" or "phenomenological"
theory, using fancy words to mean basically that we see what is
going on but do not yet understand it. There are many such empirical
theories that connect together facts encountered in everyday life.
Suppose we pick up a
book of statistical facts, like the World Almanac. Looking inside,
we find a list of U.S. metropolitan areas in order of decreasing
population, together with the population figures. There may also
be corresponding lists for the cities in individual states and in
other countries. In each list every city can be assigned a rank,
equal to 1 for the most populous city, 2 for the next most populous,
and so on. Is there a general rule for all these lists that describes
how the population decreases as the rank increases? Roughly speaking,
yes. With fair accuracy, the population is inversely proportional
to the rank; in other words, the successive populations are roughly
proportional to 1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, 1/10,
1/11, and so on.
Now let us look at the
list of the largest business firms in decreasing order of volume
of business (say the monetary value of sales during a given year).
Is there an approximate rule that describes how the sales figures
of the firms vary with their ranks? Yes, and it is the same rule
as for populations. The volume of business is approximately in inverse
proportion to the rank of the firm.
How about the exports
from a given country in a given year in decreasing order of monetary
value? Again, we find the same rule is a fair approximation.
An interesting consequence
of that rule is easily verified by perusing any of the lists mentioned,
for example a list of cities with their populations. First let us
look at, say, the third digit of each population figure. As expected,
the third digit is randomly distributed; the numbers of 0s, 1s,
2s, 3s, etc. in the third place are all roughly equal. A totally
different situation obtains for the distribution of first digits,
however. There is an overwhelming preponderance of 1s, followed
by 2s, and so forth. The percentage of population figures with initial
9s is extremely small. That behavior of the first digit is predicted
by the rule, which, if exactly obeyed, would give a proportion of
initial 1s to initial 9s of 45 to 1.
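The 45-to-1 figure can be checked with a short calculation. The sketch below assumes the rule holds exactly and that the list is long enough for rank to be treated as continuous, so that the number of entries with a value of at least x is proportional to 1/x. The function name is illustrative and does not come from the original text.

```python
from fractions import Fraction

def first_digit_shares():
    """Share of entries whose leading digit is d, under an idealized 1/rank rule.

    Assumption: the number of entries with value >= x is proportional to 1/x,
    so within any one decade the count with leading digit d is proportional to
    1/d - 1/(d + 1). The same proportions repeat in every decade.
    """
    weights = {d: Fraction(1, d) - Fraction(1, d + 1) for d in range(1, 10)}
    total = sum(weights.values())  # the weights telescope to 1 - 1/10
    return {d: w / total for d, w in weights.items()}

shares = first_digit_shares()
ratio_1_to_9 = shares[1] / shares[9]  # (1/2) divided by (1/90)
print(ratio_1_to_9)
```

Under these assumptions the weight for an initial 1 is 1/2 and the weight for an initial 9 is 1/90, which is where the ratio of 45 to 1 comes from.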
| Rank n | City | Population (1990) | Unmodified Zipf's law: 10,000,000 divided by n | Modified Zipf's law: 5,000,000 divided by (n - 2/5)^(3/4) |
|---|---|---|---|---|
| 1 | New York | 7,322,564 | 10,000,000 | 7,334,265 |
| 7 | Detroit | 1,027,974 | 1,428,571 | 1,214,261 |
| 13 | Baltimore | 736,014 | 769,231 | 747,639 |
| 19 | Washington, D.C. | 606,900 | 526,316 | 558,258 |
| 25 | New Orleans | 496,938 | 400,000 | 452,656 |
| 31 | Kansas City, Mo. | 434,829 | 322,581 | 384,308 |
| 37 | Virginia Beach, Va. | 393,089 | 270,270 | 336,015 |
| 49 | Toledo | 332,943 | 204,082 | 271,639 |
| 61 | Arlington, Texas | 261,721 | 163,934 | 230,205 |
| 73 | Baton Rouge, La. | 219,531 | 136,986 | 201,033 |
| 85 | Hialeah, Fla. | 188,008 | 117,647 | 179,243 |
| 97 | Bakersfield, Calif. | 174,820 | 103,093 | 162,270 |

Populations of U.S. cities from the 1994 World Almanac compared with Zipf's original law and a modified version of it.
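The two prediction columns of the chart can be recomputed directly from its column headings. This is a minimal sketch that reads the modified law as 5,000,000 divided by (n - 2/5) raised to the power 3/4, a reading that reproduces the chart's figures to within rounding. The function names are illustrative only.

```python
def unmodified_zipf(n):
    """Zipf's original law as used in the chart: 10,000,000 divided by the rank."""
    return 10_000_000 / n

def modified_zipf(n):
    """The chart's modified law: 5,000,000 divided by (n - 2/5) to the power 3/4."""
    return 5_000_000 / (n - 2 / 5) ** (3 / 4)

# Print rank, unmodified prediction, and modified prediction for a few chart rows.
for n in (1, 7, 13, 19, 25):
    print(n, round(unmodified_zipf(n)), round(modified_zipf(n)))
```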
What if we put down the
World Almanac and pick up a book on secret codes, containing a list
of the most common words in a certain kind of English text arranged
in decreasing order of frequency of occurrence? What is the approximate
rule for the frequency of occurrence of each word as a function
of its rank? Again, we encounter the same rule, which works for
other languages as well.
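A rank-and-frequency list of the kind described can be built from any text in a few lines. The sketch below is only illustrative: the function name and the toy sentence are invented for the example, and a sample this small is far too short to show the rule itself. On a large body of text one would look at whether rank times count stays roughly constant.

```python
from collections import Counter
import re

def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common()  # sorted by decreasing count
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts, start=1)]

sample = "the cat and the dog and the bird"  # toy text, for illustration only
table = rank_frequency(sample)
for rank, word, count in table:
    # Under the rule, rank * count would be roughly constant for a real corpus.
    print(rank, word, count, rank * count)
```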
Many of these relationships
were noticed in the early 1930s by a certain George Kingsley Zipf,
who taught German at Harvard, and they are all aspects of what is
now called Zipf's law. Today, we would say that Zipf's law is one
of many examples of so-called scaling laws or power laws, encountered
in many places in the physical, biological, and behavioral sciences.
But in the 1930s such laws were still something of a novelty.
In Zipf's law the quantity
under study is inversely proportional to the rank, that is, proportional
to 1, 1/2, 1/3, 1/4, etc. Benoit Mandelbrot has shown that a more
general power law (nearly the most general) is obtained by subjecting
this sequence successively to two kinds of modification. The first
alteration is to add a constant to the rank, giving 1/(1 + constant),
1/(2 + constant), 1/(3 + constant), 1/(4 + constant), etc. The further
change allows, instead of these fractions, their squares or their
cubes or their square roots or any other powers of them. The choice
of the squares, for instance, would yield the sequence 1/(1 + constant)^2,
1/(2 + constant)^2, 1/(3 + constant)^2, 1/(4 + constant)^2, etc. The power
in the more general power
law is 1 for Zipf's law, 2 for the squares, 3 for the cubes, 1/2
for the square roots, and so on. Mathematics gives a meaning to
intermediate values of the power as well, such as 3/4 or 1.0237.
In general, we can think of the power as 1 plus a second constant.
Just as the first constant was added to the rank, so the second
one is added to the power. Zipf's law is then the special case in
which those two constants are zero.
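The two modifications can be written as a single function of the rank. This is a sketch with hypothetical parameter names: `rank_constant` is the number added to the rank and `power_constant` is the number added to the power 1, following the description above.

```python
def generalized_term(rank, rank_constant=0.0, power_constant=0.0):
    """One term of the generalized power law: 1 / (rank + a) ** (1 + b).

    With both constants zero this reduces to Zipf's law, 1 / rank.
    """
    return 1.0 / (rank + rank_constant) ** (1.0 + power_constant)

# Zipf's law itself: 1, 1/2, 1/3, 1/4
zipf = [generalized_term(n) for n in range(1, 5)]

# The squares: power 2, that is, a power constant of 1
squares = [generalized_term(n, power_constant=1.0) for n in range(1, 5)]

# An intermediate power such as 3/4 corresponds to a power constant of -1/4
three_quarters = [generalized_term(n, power_constant=-0.25) for n in range(1, 5)]

print(zipf)
print(squares)
print(three_quarters)
```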
Mandelbrot's generalization
of Zipf's law is still very simple: the additional complexity lies
only in the introduction of the two new adjustable constants, a
number added to the rank and a number added to the power 1. (An
adjustable constant, by the way, is called a "parameter," a word
that has been widely misused lately, perhaps under the influence
of the somewhat similar word "perimeter." The modified power law
has two additional parameters.) In any given case, instead of comparing
data with Zipf's original law, one can introduce those two constants
and adjust them for an optimal fit to the data. We can see in the
chart on page 94 how a slightly modified version of Zipf's law fits
some population data significantly better than Zipf's original rule
(with both constants set equal to zero), which already works fairly
well. "Slightly modified" means that the new constants have rather
small values in the altered power law used for the comparison. (The
constants in the chart were chosen by mere inspection of the data.
An optimal fit would have yielded even better agreement with the
actual populations.)
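To illustrate what adjusting the constants for a better fit might look like, the sketch below runs a coarse grid search over the number added to the rank and over the power, using the twelve cities in the chart. Several choices here are assumptions made for the example and are not taken from the text: the measure of fit (the sum of squared differences between the logarithms of predicted and actual populations), the grid ranges, and all function names. The overall scale is also treated as adjustable, since the chart itself uses 5,000,000 in place of 10,000,000.

```python
import math

# (rank, 1990 population) pairs copied from the chart of U.S. cities.
DATA = [(1, 7_322_564), (7, 1_027_974), (13, 736_014), (19, 606_900),
        (25, 496_938), (31, 434_829), (37, 393_089), (49, 332_943),
        (61, 261_721), (73, 219_531), (85, 188_008), (97, 174_820)]

def log_error(scale, rank_constant, power):
    """Sum of squared differences between log(prediction) and log(population)."""
    return sum(
        (math.log(scale) - power * math.log(n + rank_constant) - math.log(pop)) ** 2
        for n, pop in DATA
    )

def best_scale(rank_constant, power):
    """Scale minimizing log_error for fixed constants (mean of the log residuals)."""
    mean_log = sum(math.log(pop) + power * math.log(n + rank_constant)
                   for n, pop in DATA) / len(DATA)
    return math.exp(mean_log)

def grid_fit():
    """Coarse grid search; returns (error, scale, rank_constant, power)."""
    best = None
    for i in range(-90, 201):          # rank constant from -0.90 to 2.00
        c = i / 100
        for j in range(50, 151):       # power from 0.50 to 1.50
            p = j / 100
            scale = best_scale(c, p)
            err = log_error(scale, c, p)
            if best is None or err < best[0]:
                best = (err, scale, c, p)
    return best

zipf_error = log_error(10_000_000, 0.0, 1.0)        # Zipf's original rule
chart_error = log_error(5_000_000, -2 / 5, 3 / 4)   # constants used in the chart
fit_error, fit_scale, fit_c, fit_p = grid_fit()

print(zipf_error, chart_error, fit_error)
print(fit_scale, fit_c, fit_p)
```

Because the grid contains the chart's own constants, the search cannot do worse than the chart by this measure. A different measure of fit, or a finer search, would generally give somewhat different constants.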
....