Archive for the ‘Statistics’ Category

Erdos Online

Thursday, January 15th, 2009

The collected works of Paul Erdős are now available online. That’s 2% of the twentieth-century mathematical literature right there.

Via God Plays Dice.

What is a Statistic?

Saturday, November 1st, 2008

From the request thread, I was hoping for a nice easy softball, maybe from an undergraduate or mathematical amateur. Apparently, though, I have finally scared off anyone other than procrastinating professional mathematicians, who want me to actually write the posts I promised.

In the comments here I promised a post explaining why most statistics satisfy the Central Limit Theorem. I thought I’d start slowly with an explanation of what a statistic is.

A statistic is just something you compute from the data. This definition is so uninteresting that statistics books are a little apologetic about how contentless the definition sounds. (This usage of the term “statistic” was coined by Fisher. There is a cutting quote by Pearson on the terminology that is impossible to Google for, since all I remember is that it’s about the word statistic, and it involves Fisher and Pearson, who are probably the two most famous statisticians.)

Probability distributions are mathematical abstractions, while statistics are numbers we compute from actual data. If we believe that we can model the data as if it were generated by a random variable, then we have to relate the statistic to some property of the probability distribution. Usually, we are interested in some property of the underlying distribution, and we use statistics to estimate it. For example, we may be interested in the mean of the underlying random variable, which we can approximate by the mean of the data.

Approximating the mean of the random variable in this way is a special case of a general technique for estimating a property of a random variable. A random sample drawn from a probability distribution can be thought of as a (discrete) probability distribution in its own right, one that puts equal mass on each observed data point. The value of the property for this sample distribution can be used as an estimate of its value for the true distribution; this is known as the plug-in estimate of the property. An analogue of the law of large numbers shows that this estimate converges to the true value as the sample grows.
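To make the plug-in idea concrete, here is a minimal Python sketch. The function names (empirical_distribution, mean, variance) and the choice of an exponential distribution with rate 1 are my own, picked purely for illustration. The sample is turned into a discrete distribution with mass 1/n on each point, and the mean and variance of that distribution serve as the estimates.

```python
import random

def empirical_distribution(sample):
    """Treat the sample as a discrete distribution: mass 1/n on each point."""
    n = len(sample)
    return [(x, 1.0 / n) for x in sample]

def mean(dist):
    """Mean of a discrete distribution given as (value, probability) pairs."""
    return sum(x * p for x, p in dist)

def variance(dist):
    """Variance of a discrete distribution given as (value, probability) pairs."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist)

# Illustrative assumption: the "true" distribution is exponential with
# rate 1, whose mean and variance are both 1.
rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1000, 100000):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    dist = empirical_distribution(sample)
    print(n, mean(dist), variance(dist))
```

Both columns should drift toward 1 as n grows. Note that the plug-in variance divides by n rather than n - 1, because it is literally the variance of the sample distribution, so it is slightly biased for small samples even though it still converges.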

Next time: the analogue of the central limit theorem.

Tremellius and Naibod

Sunday, September 7th, 2008

God Plays Dice has a post that answers a question I’ve long had about the Mathematics Genealogy Project: just how far back can you go? The answer is 1572, when Immanuel Tremellius and Valentine Naibod advised Rudolph Snellius. Snellius was the father of Willebrord Snellius, who discovered Snell’s law.

Tremellius was a Bible translator who was briefly jailed for being a Calvinist. It sounds like he was forced to move frequently as the prevailing winds for Protestants changed. (This was the early Reformation.) Naibod was an astrologer who had a book banned by the Catholic Church. An astrological prediction told him that his life was in danger, so he tried holing up in his house until the danger passed. Since the house showed no external signs of life, thieves thought the house was abandoned and broke in. Discovering Naibod, they murdered him. Apparently astrology works after all.

The Genealogy Project has a page dedicated to what it calls extrema. I would support a campaign to rename the Guinness Book of World Records the Guinness Book of Extrema.

Update. Between when I hit “Post” and now, the Mathematics Genealogy Project site updated its database, making this post completely obsolete.

Looting the Library

Tuesday, March 18th, 2008

I promised a while back to write a post describing why so many statistics have a central limit theorem. I went to the library to look up the result I had in mind, to refresh my memory as to the details. The book I wanted was checked out. I thought about requesting the book, but it seemed a bit much to request a book just for a blog post. A couple of days later, I found out who had the book checked out: me.