Archive for randomness

Poisson (d’Avril) Point Processes

Posted in Uncategorized with tags , , , on April 2, 2019 by telescoper

I was very unimpressed by yesterday’s batch of April Fool jokes. Some of them were just too obvious:

I’m glad I didn’t try to do one.

Anyway, I noticed that an old post of mine was getting some traffic and when I investigated I found that some of the links to pictures were dead. So I’ve decided to refresh it and post again.

–0–

I’ve got a thing about randomness. For a start I don’t like the word, because it covers such a multitude of sins. People talk about there being randomness in nature when what they really mean is that they don’t know how to predict outcomes perfectly. That’s not quite the same thing as things being inherently unpredictable; statements about the nature of reality are ontological, whereas I think randomness is only a useful concept in an epistemological sense. It describes our lack of knowledge: just because we don’t know how to predict doesn’t mean that it can’t be predicted.

Nevertheless there are useful mathematical definitions of randomness and it is also (sometimes) useful to make mathematical models that display random behaviour in a well-defined sense, especially in situations where one has to take into account the effects of noise.

I thought it would be fun to illustrate one such model. In a point process, the random element is a "dot" that occurs at some location in time or space. Such processes occur in a wide range of contexts: arrivals of buses at a bus stop, photons in a detector, darts on a dartboard, and so on.

Let us suppose that we think of such a process happening in time, although what follows can straightforwardly be generalised to things happening over an area (such as a dartboard) or within some higher-dimensional region. It is also possible to invest the points with some other attributes; processes like this are sometimes called marked point processes, but I won't discuss them here.

The "most" random way of constructing a simple point process is to assume that each event happens independently of every other event, and that there is a constant probability per unit time of an event happening. This type of process is called a Poisson process, after the French mathematician Siméon-Denis Poisson, who was born in 1781. He was one of the most creative and original physicists of all time: besides fundamental work on electrostatics and the theory of magnetism for which he is famous, he also built greatly upon Laplace's work in probability theory. His principal result was to derive a formula giving the probability of a given number of random events occurring when the probability of each one is very low. The Poisson distribution, as it is now known and which I will come to shortly, is related to this original calculation; it was subsequently shown that this distribution is a limiting form of the binomial distribution. Just to add to the connections between probability theory and astronomy, it is worth mentioning that in 1833 Poisson wrote an important paper on the motion of the Moon.

In a finite interval of duration T the mean (or expected) number of events for a Poisson process will obviously just be proportional to the product of the rate per unit time and T itself; call this product λ.

The full distribution is then of the form:

P(x) = λ^{x} e^{-λ} / x!

This gives the probability that a finite interval contains exactly x events. It can be neatly derived from the binomial distribution by dividing the interval into a very large number of very tiny pieces, each one of which becomes a Bernoulli trial. The probability of success (i.e. of an event occurring) in each trial is extremely small, but the number of trials becomes extremely large in such a way that the mean number of successes is λ. In this limit the binomial distribution takes the form of the above expression. The variance of this distribution is interesting: it is also λ. This means that the typical fluctuations within the interval are of order the square root of λ on a mean level of λ, so the fractional variation is of the famous "one over root n" form that is a useful estimate of the expected variation in point processes. Indeed, it's a useful rule-of-thumb for estimating likely fluctuation levels in a host of statistical situations.

If football were a Poisson process with a mean number of goals per game of, say, 2 then we would expect most games to have 2 plus or minus 1.4 (the square root of 2) goals, i.e. between about 0.6 and 3.4. That is actually not far from what is observed and the distribution of goals per game in football matches is actually quite close to a Poisson distribution.
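As an aside, the claim that the variance equals the mean is easy to check numerically. Here is a quick sketch in Python (my own illustration, not part of the original post), using λ = 2 as in the football example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw a large sample of Poisson counts with mean lambda = 2,
# standing in for goals per football match.
lam = 2.0
counts = rng.poisson(lam, size=100_000)

print("mean    :", counts.mean())   # ~ 2.0
print("variance:", counts.var())    # ~ 2.0, equal to the mean
print("std dev :", counts.std())    # ~ 1.41, i.e. sqrt(2)

# Fraction of 'matches' within one standard deviation of the mean,
# i.e. with 1, 2 or 3 goals:
lo, hi = lam - np.sqrt(lam), lam + np.sqrt(lam)
print("fraction within mean ± sqrt(lambda):",
      np.mean((counts >= lo) & (counts <= hi)))   # ~ 0.72
```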

This idea can be straightforwardly extended to higher dimensional processes. If points are scattered over an area with a constant probability per unit area then the mean number in a finite area will also be some number λ and the same formula applies.

As a matter of fact I first learned about the Poisson distribution when I was at school, doing A-level mathematics (which in those days actually included some mathematics). The example used by the teacher to illustrate this particular bit of probability theory was a two-dimensional one from biology. The skin of a fish was divided into little squares of equal area, and the number of parasites found in each square was counted. A histogram of these numbers accurately follows the Poisson form. For years I laboured under the delusion that it was given this name because it was something to do with fish, but then I never was very quick on the uptake.

This is all very well, but point processes are not always of this Poisson form. Points can be clustered, so that having one point at a given position increases the conditional probability of having others nearby. For example, galaxies like those shown in the nice picture are distributed throughout space in a clustered pattern that is very far from the Poisson form. But it’s very difficult to tell from just looking at the picture. What is needed is a rigorous statistical analysis.

 

The statistical description of clustered point patterns is a fascinating subject, because it makes contact with the way in which our eyes and brain perceive pattern. I’ve spent a large part of my research career trying to figure out efficient ways of quantifying pattern in an objective way and I can tell you it’s not easy, especially when the data are prone to systematic errors and glitches. I can only touch on the subject here, but to see what I am talking about look at the two patterns below:

[Figure: two point patterns, one above the other]

You will have to take my word for it that one of these is a realization of a two-dimensional Poisson point process and the other contains correlations between the points. One therefore has a real pattern to it, and one is a realization of a completely unstructured random process.

I show this example in popular talks and get the audience to vote on which one is the random one. The vast majority usually think that the top one is the one that is random and the bottom one is the one with structure to it. It is not hard to see why. The top pattern is very smooth (what one would naively expect for a constant probability of finding a point at any position in the two-dimensional space), whereas the bottom one seems to offer a profusion of linear, filamentary features and densely concentrated clusters.

In fact, it's the bottom picture that was generated by a Poisson process using a Monte Carlo random number generator. All the structure that is visually apparent is imposed by our own sensory apparatus, which has evolved to be so good at discerning patterns that it finds them when they're not even there!

The top process is also generated by a Monte Carlo technique, but the algorithm is more complicated. In this case the presence of a point at some location suppresses the probability of having other points in the vicinity. Each event has a zone of avoidance around it; the points are therefore anticorrelated. The result of this is that the pattern is much smoother than a truly random process should be. In fact, this simulation has nothing to do with galaxy clustering really. The algorithm used to generate it was meant to mimic the behaviour of glow-worms which tend to eat each other if they get too close. That's why they spread themselves out in space more uniformly than in the random pattern.
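If you want to play with patterns like these yourself, here is a minimal sketch in Python (my own, not the code behind the original figures) of the two kinds of process. The "zone of avoidance" is implemented by the simplest possible rejection step; I don't know the exact algorithm that was used for the glow-worm picture.

```python
import numpy as np

rng = np.random.default_rng(42)

def poisson_pattern(n, box=1.0):
    """Completely unstructured pattern: independent, uniform positions."""
    return rng.uniform(0, box, size=(n, 2))

def inhibited_pattern(n, r_min, box=1.0, max_tries=200_000):
    """Anticorrelated pattern: reject any proposed point that falls
    within r_min (the 'zone of avoidance') of one already accepted.
    A toy stand-in for the glow-worm algorithm described above."""
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        p = rng.uniform(0, box, size=2)
        if all(np.hypot(*(p - q)) >= r_min for q in points):
            points.append(p)
        tries += 1
    return np.array(points)

random_pts = poisson_pattern(500)
smooth_pts = inhibited_pattern(500, r_min=0.03)
```

Plotting the two arrays side by side reproduces the effect described above: the inhibited pattern looks smoother and more evenly spread, even though it is the less random of the two.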

Incidentally, I got both pictures from Stephen Jay Gould’s collection of essays Bully for Brontosaurus and used them, with appropriate credit and copyright permission, in my own book From Cosmos to Chaos. I forgot to say this in earlier versions of this post.

The tendency to find things that are not there is quite well known to astronomers. The constellations which we all recognize so easily are not physical associations of stars, but are just chance alignments on the sky of things at vastly different distances in space. That is not to say that they are random, but the pattern they form is not caused by direct correlations between the stars. Galaxies form real three-dimensional physical associations through their direct gravitational effect on one another.

People are actually pretty hopeless at understanding what “really” random processes look like, probably because the word random is used so often in very imprecise ways and they don’t know what it means in a specific context like this.  The point about random processes, even simpler ones like repeated tossing of a coin, is that coincidences happen much more frequently than one might suppose.

I suppose there is an evolutionary reason why our brains like to impose order on things in a general way. More specifically scientists often use perceived patterns in order to construct hypotheses. However these hypotheses must be tested objectively and often the initial impressions turn out to be figments of the imagination, like the canals on Mars.

Now, I think I'll complain to WordPress about the widget that links pages to a "random blog post". I'm sure it's not really random….

 

 

What happens if you ask people to pick a number `at random’ between 1 and 100?

Posted in Bad Statistics with tags on April 11, 2018 by telescoper

I saw this circulating on Twitter and thought I would share it here; it was originally posted on reddit.

The graph shows the results obtained when 6750 people were asked to pick an integer `at random’ between 1 and 100. You might naively expect the histogram to be flat (give or take some Poisson errors), consistent with each number having the same probability of being picked, but there are clearly some numbers that are more likely to be chosen than a constant probability would imply. The most popular picks are in fact 69, 77 and 7 (in descending order).
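For comparison, here is a rough sketch in Python of what a genuinely uniform version of the same experiment looks like, and how one would test the "flat histogram" expectation with a chi-square test. The counts are simulated, since I don't have the raw reddit data:

```python
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(0)

# Simulate 6750 picks made genuinely 'at random' between 1 and 100.
picks = rng.integers(1, 101, size=6750)
observed = np.bincount(picks, minlength=101)[1:]   # counts for 1..100

print("expected count per number :", 6750 / 100)      # 67.5
print("typical Poisson scatter   :", np.sqrt(67.5))   # ~ 8

# Chi-square test against a flat distribution; for the real survey data,
# with favourites like 7 and 69, the p-value would be vanishingly small.
stat, p = chisquare(observed)
print("chi-square statistic:", stat, "  p-value:", p)
```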

It’s well known amongst purveyors of conjuring tricks and the like that if you ask people to pick a number between 1 and 10, far more people choose 7 than any other number. And I suppose 77 is an extension of that. More interestingly, however, the top result implies that, given the choice, more people seem to prefer a 69 to anything else…

Anyway, it proves a point that I’ve made more than a few times on this blog, namely that people generally have a very poor idea of what randomness is and are particularly bad at making random choices or generating random sequences.

P.S. Please direct any criticism of the graph (e.g. why the x-axis goes up to 104 or why the x-values are given to two decimal places) to the reddit page…

A Question of Entropy

Posted in Bad Statistics with tags , , on August 10, 2015 by telescoper

We haven’t had a poll for a while so here’s one for your entertainment.

An article has appeared on the BBC Website entitled Web’s random numbers are too weak, warn researchers. The piece is about the techniques used to encrypt data on the internet. It’s a confusing piece, largely because of the use of the word “random” which is tricky to define; see a number of previous posts on this topic. I’ll steer clear of going over that issue again. However, there is a paragraph in the article that talks about entropy:

An unshuffled pack of cards has a low entropy, said Mr Potter, because there is little surprising or uncertain about the order the cards would be dealt. The more a pack was shuffled, he said, the more entropy it had because it got harder to be sure about which card would be turned over next.

I won’t prejudice your vote by saying what I think about this statement, but here’s a poll so I can try to see what you think.

Of course I also welcome comments via the box below…

Uncertainty, Risk and Probability

Posted in Bad Statistics, Science Politics with tags , , , , , , , , on March 2, 2015 by telescoper

Last week I attended a very interesting event on the Sussex University campus, the Annual Marie Jahoda Lecture, which was given this year by Prof. Helga Nowotny, a distinguished social scientist. The title of the talk was A social scientist in the land of scientific promise and the abstract was as follows:

Promises are a means of bringing the future into the present. Nowhere is this insight by Hannah Arendt more applicable than in science. Research is a long and inherently uncertain process. The question is open which of the multiple possible, probable or preferred futures will be actualized. Yet, scientific promises, vague as they may be, constitute a crucial link in the relationship between science and society. They form the core of the metaphorical ‘contract’ in which support for science is stipulated in exchange for the benefits that science will bring to the well-being and wealth of society. At present, the trend is to formalize scientific promises through impact assessment and measurement. Against this background, I will present three case studies from the life sciences: assisted reproductive technologies, stem cell research and the pending promise of personalized medicine. I will explore the uncertainty of promises as well as the cunning of uncertainty at work.

It was a fascinating and wide-ranging lecture that touched on many themes. I won't try to comment on all of them, but just pick up on a couple that struck me from my own perspective as a physicist. One was the increasing aversion to risk demonstrated by research funding agencies, such as the European Research Council, which she helped set up but described in the lecture as "a clash between a culture of trust and a culture of control". This will ring true to any scientist applying for grants even in "blue skies" disciplines such as astronomy: we tend to trust our peers, who have some control over funding decisions, but the machinery of control from above gets stronger every day. Milestones and deliverables are everything. Sometimes I think that in order to get funding you have to be so confident of the outcomes of your research that you would have to have done it already, in which case funding isn't even necessary. The importance of extremely speculative research is rarely recognized, although that is where there is the greatest potential for truly revolutionary breakthroughs.

Another theme that struck me was the role of uncertainty and risk. This grabbed my attention because I've actually written a book about uncertainty in the physical sciences. In her lecture, Prof. Nowotny referred to the definition (which was quite new to me) of these two terms by Frank Hyneman Knight in a book on economics called Risk, Uncertainty and Profit. The distinction made there is that "risk" is "randomness" with "knowable probabilities", whereas "uncertainty" involves "randomness" with "unknowable probabilities". I don't like these definitions at all. For one thing they both involve a reference to "randomness", a word which I don't know how to define anyway; I'd be much happier to use "unpredictability". Even more importantly, perhaps, I find the distinction between "knowable" and "unknowable" probabilities very problematic. One always knows something about a probability distribution, even if that something means that the distribution has to be very broad. And in any case these definitions imply that the probabilities concerned are "out there", rather than being statements about a state of knowledge (or lack thereof). Sometimes we know what we know and sometimes we don't, but there are more than two possibilities. As the great American philosopher and social scientist Donald Rumsfeld (Shurely Shome Mishtake? Ed) put it:

“…as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.”

There may be a proper Bayesian formulation of the distinction between "risk" and "uncertainty" that involves a transition between prior-dominated (uncertain) and posterior-dominated (risky) regimes, but basically I don't see any qualitative difference between the two from such a perspective.

Anyway, it was a very interesting lecture that differed from many talks I’ve attended about the sociology of science in that the speaker clearly understood a lot about how science actually works. The Director of the Science Policy Research Unit invited the Heads of the Science Schools (including myself) to dinner with the speaker afterwards, and that led to the generation of many interesting ideas about how we (I mean scientists and social scientists) might work better together in the future, something we really need to do.

When random doesn’t seem random..

Posted in Crosswords, The Universe and Stuff with tags , , , , , on February 21, 2015 by telescoper

A few months have passed since I last won a dictionary as a prize in the Independent Crossword competition. That’s nothing remarkable in itself, but since my average rate of dictionary accumulation has been about one a month over the last few years, it seems a bit of a lull.  Have I forgotten how to do crosswords and keep sending in wrong solutions? Is the Royal Mail intercepting my post? Has the number of correct entries per week suddenly increased, reducing my odds of winning? Have the competition organizers turned against me?

In fact, statistically speaking, there's nothing significant in this gap. Even if my grids are all correct, the number of correct grids each week has remained constant, and the winner is pulled at random from those submitted (i.e. in such a way that all correct entries are equally likely to be drawn), a relatively long unsuccessful period such as I am experiencing at the moment is not at all improbable. The point is that such runs are far more likely in a truly random process than most people imagine, as indeed are runs of successes. Chance coincidences happen more often than you think.
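To put a rough number on this, here is a small simulation in Python. The weekly win probability is my own guess (roughly one dictionary a month suggests something like one chance in four or five per week); the real odds depend on how many correct entries arrive each week:

```python
import numpy as np

rng = np.random.default_rng(7)

p_win = 0.2          # assumed chance of winning in any given week
weeks = 52 * 3       # three years of entries
trials = 10_000      # number of simulated three-year histories

longest_gaps = []
for _ in range(trials):
    wins = rng.random(weeks) < p_win
    gap = longest = 0
    for w in wins:
        gap = 0 if w else gap + 1
        longest = max(longest, gap)
    longest_gaps.append(longest)

# The longest losing streak in a typical three-year run:
print("median longest gap:", np.median(longest_gaps), "weeks")
# Usually twenty-odd weeks, i.e. several months without a prize,
# even though the weekly odds never change.
```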

I try this out in lectures sometimes, by asking a member of the audience to generate a random sequence of noughts and ones in their head. It seems people are so conscious that the number of ones should be roughly equal to the number of noughts that they impose that constraint as they go along. Almost universally, the supposedly random sequences people produce only have very short runs of 1s or 0s because, say, a run like '00000' just seems too unlikely. Well, it is unlikely, but that doesn't mean it won't happen. In a truly random binary sequence like this (i.e. one in which 1 and 0 both have a probability of 0.5 and each selection is independent of the others), coincidental runs of consecutive 0s and 1s happen with surprising frequency. Try it yourself, with a coin.
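Here is the same point as a few lines of Python (a sketch of my own), counting the longest run of identical outcomes in a sequence of fair coin tosses:

```python
import random

random.seed(1)

def longest_run(seq):
    """Length of the longest block of consecutive identical symbols."""
    best = current = 1
    for a, b in zip(seq, seq[1:]):
        current = current + 1 if a == b else 1
        best = max(best, current)
    return best

tosses = [random.randint(0, 1) for _ in range(100)]
print("".join(map(str, tosses)))
print("longest run:", longest_run(tosses))
# For 100 fair tosses the longest run is typically 6 or 7 symbols,
# far longer than most hand-written 'random' sequences ever contain.
```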

Coincidentally, the subject of randomness was suggested to me independently yesterday by an anonymous email correspondent by the name of John Peacock, as I have blogged about it before (one particular post on this topic is actually one of this blog's most popular articles). What triggered this was a piece about music players such as Spotify (whatever that is) which have a "random play" feature. Apparently people don't accept that it is "really random" because of the number of times the same track comes up. To deal with this "problem", experts are working on algorithms that don't actually play things randomly but in such a way that accords with what people think randomness means.

I think this fiddling is a very bad idea. People understand probability so poorly anyway that attempting to redefine the word’s meaning is just going to add confusion. You wouldn’t accept a casino that used loaded dice, so why allow cheating in another context? Far better for all concerned for the general public to understand what randomness is and, perhaps more importantly, what it looks like.

I have to confess that I don’t really like the word “randomness”, but I haven’t got time right now for a rant about it. There are, however, useful mathematical definitions of randomness and it is also (sometimes) useful to make mathematical models that display random behaviour in a well-defined sense, especially in situations where one has to take into account the effects of noise.

I thought it would be fun to illustrate one such model. In a point process, the random element is a “dot” that occurs at some location in time or space. Such processes can be defined in one or more dimensions and relate to a wide range of situations: arrivals of buses at a bus stop, photons in a detector, darts on a dartboard, and so on.

The statistical description of clustered point patterns is a fascinating subject, because it makes contact with the way in which our eyes and brain perceive pattern. I’ve spent a large part of my research career trying to figure out efficient ways of quantifying pattern in an objective way and I can tell you it’s not easy, especially when the data are prone to systematic errors and glitches. I can only touch on the subject here, but to see what I am talking about look at the two patterns below:

[Figure: two point patterns, one above the other]

You will have to take my word for it that one of these is a realization of a two-dimensional Poisson point process and the other contains correlations between the points. One therefore has a real pattern to it, and one is a realization of a completely unstructured random process.

I show this example in popular talks and get the audience to vote on which one is the random one. In fact, I did this just a few weeks ago during a lecture in our module Quarks to Cosmos, which attempts to explain scientific concepts to non-science students. As usual when I do this, I found that the vast majority thought that the top one is random and the bottom one is the one with structure to it. It is not hard to see why. The top pattern is very smooth (what one would naively expect for a constant probability of finding a point at any position in the two-dimensional space), whereas the bottom one seems to offer a profusion of linear, filamentary features and densely concentrated clusters.

In fact, it's the bottom picture that was generated by a Poisson process using a Monte Carlo random number generator. All the structure that is visually apparent in the second example is imposed by our own sensory apparatus, which has evolved to be so good at discerning patterns that it finds them when they're not even there!

The top process is also generated by a Monte Carlo technique, but the algorithm is more complicated. In this case the presence of a point at some location suppresses the probability of having other points in the vicinity. Each event has a zone of avoidance around it; the points are therefore anticorrelated. The result of this is that the pattern is much smoother than a truly random process should be. In fact, this simulation has nothing to do with galaxy clustering really. The algorithm used to generate it was meant to mimic the behaviour of glow-worms which tend to eat each other if they get too close. That's why they spread themselves out in space more uniformly than in the "really" random pattern.

I assume that Spotify’s non-random play algorithm will have the effect of producing a one-dimensional version of the top pattern, i.e. one with far too few coincidences to be genuinely random.

Incidentally, I got both pictures from Stephen Jay Gould’s collection of essays Bully for Brontosaurus and used them, with appropriate credit and copyright permission, in my own book From Cosmos to Chaos.

The tendency to find things that are not there is quite well known to astronomers. The constellations which we all recognize so easily are not physical associations of stars, but are just chance alignments on the sky of things at vastly different distances in space. That is not to say that they are random, but the pattern they form is not caused by direct correlations between the stars. Galaxies form real three-dimensional physical associations through their direct gravitational effect on one another.

People are actually pretty hopeless at understanding what “really” random processes look like, probably because the word random is used so often in very imprecise ways and they don’t know what it means in a specific context like this.  The point about random processes, even simpler ones like repeated tossing of a coin, is that coincidences happen much more frequently than one might suppose.

I suppose there is an evolutionary reason why our brains like to impose order on things in a general way. More specifically scientists often use perceived patterns in order to construct hypotheses. However these hypotheses must be tested objectively and often the initial impressions turn out to be figments of the imagination, like the canals on Mars.

Perhaps I should complain to WordPress about the widget that links pages to a “random blog post”. I’m sure it’s not really random….

Galaxies, Glow-worms and Chicken Eyes

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , , , , on February 26, 2014 by telescoper

I just came across a news item based on a research article in Physical Review E by Jiao et al. with the abstract:

Optimal spatial sampling of light rigorously requires that identical photoreceptors be arranged in perfectly regular arrays in two dimensions. Examples of such perfect arrays in nature include the compound eyes of insects and the nearly crystalline photoreceptor patterns of some fish and reptiles. Birds are highly visual animals with five different cone photoreceptor subtypes, yet their photoreceptor patterns are not perfectly regular. By analyzing the chicken cone photoreceptor system consisting of five different cell types using a variety of sensitive microstructural descriptors, we find that the disordered photoreceptor patterns are “hyperuniform” (exhibiting vanishing infinite-wavelength density fluctuations), a property that had heretofore been identified in a unique subset of physical systems, but had never been observed in any living organism. Remarkably, the patterns of both the total population and the individual cell types are simultaneously hyperuniform. We term such patterns “multihyperuniform” because multiple distinct subsets of the overall point pattern are themselves hyperuniform. We have devised a unique multiscale cell packing model in two dimensions that suggests that photoreceptor types interact with both short- and long-ranged repulsive forces and that the resultant competition between the types gives rise to the aforementioned singular spatial features characterizing the system, including multihyperuniformity. These findings suggest that a disordered hyperuniform pattern may represent the most uniform sampling arrangement attainable in the avian system, given intrinsic packing constraints within the photoreceptor epithelium. In addition, they show how fundamental physical constraints can change the course of a biological optimization process. Our results suggest that multihyperuniform disordered structures have implications for the design of materials with novel physical properties and therefore may represent a fruitful area for future research.

The point made in the paper is that the photoreceptors found in the eyes of chickens possess a property called disordered hyperuniformity, which means that they appear disordered on small scales but exhibit order over large distances. Here's an illustration:

[Figure: chicken cone photoreceptor distribution (left) and a computer-simulation model (right)]

It’s an interesting paper, but I’d like to quibble about something it says in the accompanying news story. The caption with the above diagram states

Left: visual cell distribution in chickens; right: a computer-simulation model showing pretty much the exact same thing. The colored dots represent the centers of the chicken’s eye cells.

Well, as someone who has spent much of his research career trying to discern and quantify patterns in collections of points – in my case they tend to be galaxies rather than photoreceptors – I find it difficult to defend the use of the phrase "pretty much the exact same thing". It's notoriously difficult to look at realizations of stochastic point processes and decide whether they are statistically similar or not. For that you generally need quite sophisticated mathematical analysis. In fact, to my eye, the two images above don't look at all like "pretty much the exact same thing". I'm not at all sure that the model works as well as it is claimed, as the statistical analysis presented in the paper is relatively simple: I'd need to see some more quantitative measures of pattern morphology and clustering, especially higher-order correlation functions, before I'm convinced.
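Just to give a flavour of what even the crudest quantitative measure looks like, here is a counts-in-cells sketch in Python. This is my own toy diagnostic, much simpler than the microstructural descriptors used in the paper, but it is the kind of statistic one would want to see:

```python
import numpy as np

rng = np.random.default_rng(3)

def counts_in_cells(points, n_cells, box=1.0):
    """Count points of a 2-D pattern in an n_cells x n_cells grid."""
    edges = np.linspace(0, box, n_cells + 1)
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                  bins=[edges, edges])
    return counts.ravel()

# For a Poisson pattern the variance of the cell counts equals the mean;
# clustered patterns give variance > mean, while hyperuniform (or simply
# inhibited) patterns give variance < mean.
poisson_pts = rng.uniform(0, 1, size=(2000, 2))
counts = counts_in_cells(poisson_pts, n_cells=10)
print("mean:", counts.mean(), "  variance:", counts.var())   # both ~ 20
```

Applied to the two images above, a statistic like this (and its higher-order relatives) is what would be needed before calling them "pretty much the exact same thing".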

Anyway, all this reminded me of a very old post of mine about the difficulty of discerning patterns in distributions of points. Take the two (not very well scanned)  images here as examples:

[Figure: two point patterns, side by side]

You will have to take my word for it that one of these is a realization of a two-dimensional Poisson point process (which is, in a well-defined sense completely “random”) and the other contains spatial correlations between the points. One therefore has a real pattern to it, and one is a realization of a completely unstructured random process.

I sometimes show this example in popular talks and get the audience to vote on which one is the random one. The vast majority usually think that the one on the right is the one that is random and the left one is the one with structure to it. It is not hard to see why. The right-hand pattern is very smooth (what one would naively expect for a constant probability of finding a point at any position in the two-dimensional space), whereas the left one seems to offer a profusion of linear, filamentary features and densely concentrated clusters.

In fact, it’s the left picture that was generated by a Poisson process using a Monte Carlo random number generator. All the structure that is visually apparent is imposed by our own sensory apparatus, which has evolved to be so good at discerning patterns that it finds them when they’re not even there!

The right process is also generated by a Monte Carlo technique, but the algorithm is more complicated. In this case the presence of a point at some location suppresses the probability of having other points in the vicinity. Each event has a zone of avoidance around it; the points are therefore anticorrelated. The result of this is that the pattern is much smoother than a truly random process should be. In fact, this simulation has nothing to do with galaxy clustering really. The algorithm used to generate it was meant to mimic the behaviour of glow-worms (a kind of beetle) which tend to eat each other if they get too close. That's why they spread themselves out in space more uniformly than in the random pattern. In fact, the tendency of the points in this image to spread themselves out more smoothly than a random distribution is in some ways reminiscent of the chicken eye problem.

The moral of all this is that people are actually pretty hopeless at understanding what “really” random processes look like, probably because the word random is used so often in very imprecise ways and they don’t know what it means in a specific context like this. The point about random processes, even simpler ones like repeated tossing of a coin, is that coincidences happen much more frequently than one might suppose. By the same token, people are also pretty hopeless at figuring out whether two distributions of points resemble each other in some kind of statistical sense, because that can only be made precise if one defines some specific quantitative measure of clustering pattern, which is not easy to do.

The Monkey Complex

Posted in Bad Statistics, The Universe and Stuff with tags , , , , , on November 15, 2009 by telescoper

There’s an old story that if you leave a set of monkeys hammering on typewriters for a sufficiently long time then they will eventually reproduce the entire text of Shakespeare’s play Hamlet. It comes up in a variety of contexts, but the particular generalisation of this parable in cosmology is to argue that if we live in an enormously big universe (or “multiverse“), in which the laws of nature (as specified by the relevant fundamental constants) vary “sort of randomly” from place to place, then there will be a domain in which they have the right properties for life to evolve. This is one way of explaining away the apparent fine-tuning of the laws of physics: they’re not finely tuned, but we just live in a place where they allowed us to evolve. Although it may seem an easy step from monkeys to the multiverse, it always seemed to me a very shaky one.

For a start, let's go back to the monkeys. The supposition that the monkeys must produce everything that's possible in a finite sequence is not necessarily true, even if one does allow an infinite time. It depends on how they type. If the monkeys were always to hit two adjoining keys at the same time then they would never produce a script for Hamlet, no matter how long they typed for, as the combinations QW or ZX do not appear anywhere in that play. To guarantee the result we need, their typing has to be ergodic, a very specific requirement not possessed by all "random" sequences.

A more fundamental problem is what is meant by randomness in the first place. I’ve actually commented on this before, in a post that still seems to be collecting readers so I thought I’d develop one or two of the ideas a little.

 It is surprisingly easy to generate perfectly deterministic mathematical sequences that behave in the way we usually take to characterize indeterministic processes. As a very simple example, consider the following “iteration” scheme:

 X_{j+1}= 2 X_{j} \mod(1)

If you are not familiar with the notation, the term mod(1) just means “drop the integer part”.  To illustrate how this works, let us start with a (positive) number, say 0.37. To calculate the next value I double it (getting 0.74) and drop the integer part. Well, 0.74 does not have an integer part so that’s fine. This value (0.74) becomes my first iterate. The next one is obtained by putting 0.74 in the formula, i.e. doubling it (1.48) and dropping  the integer part: result 0.48. Next one is 0.96, and so on. You can carry on this process as long as you like, using each output number as the input state for the following step of the iteration.

Now to simplify things a little bit, notice that, because we drop the integer part each time, all iterates must lie in the range between 0 and 1. Suppose I divide this range into two bins, labelled "heads" for X less than ½ and "tails" for X greater than or equal to ½. In my example above the first value of X is 0.37 which is "heads". Next is 0.74 (tails); then 0.48 (heads), 0.96 (tails), and so on.

This sequence now mimics quite accurately the tossing of a fair coin. It produces a pattern of heads and tails with roughly 50% frequency in a long run. It is also difficult to predict the next term in the series given only the classification as “heads” or “tails”.

However, given the seed number which starts off the process, and of course the algorithm, one could reproduce the entire sequence. It is not random, but in some respects  looks like it is.
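Here is the iteration written out as a few lines of Python (my own sketch). Exact fractions are used because in ordinary double-precision floating point the map shifts one bit out of the number at every step, so the sequence collapses to zero after fifty-odd iterations:

```python
from fractions import Fraction

x = Fraction(37, 100)    # the seed 0.37 used in the example above
for _ in range(12):
    x = (2 * x) % 1      # double and drop the integer part
    print(float(x), "heads" if x < Fraction(1, 2) else "tails")
```

The printed sequence reproduces the hand calculation above (0.74 tails, 0.48 heads, 0.96 tails, …), and yet it is completely determined by the seed.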

One can think of "heads" or "tails" in more general terms, as indicating the "0" or "1" states in the binary representation of a number. This method can therefore be used to generate any sequence of digits. In fact algorithms like this one are used in computers for generating what are called pseudorandom numbers. They are not precisely random because computers can only do arithmetic to a finite number of decimal places. This means that only a finite number of possible sequences can be computed, so some repetition is inevitable, but these limitations are not always important in practice.
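The "finite number of possible sequences" point is easy to see with a deliberately tiny pseudorandom generator (my own toy example, not a generator anyone would actually use):

```python
# A deliberately tiny linear congruential generator: with a modulus of 16
# the sequence of states must repeat with period at most 16. Real
# generators use enormous moduli, so the period is astronomically long,
# but it is still finite.
def tiny_lcg(seed, a=5, c=3, m=16):
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = tiny_lcg(seed=7)
print([next(gen) for _ in range(20)])   # the same states soon come round again
```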

The ability to generate  random numbers accurately and rapidly in a computer has led to an entirely new way of doing science. Instead of doing real experiments with measuring equipment and the inevitable errors, one can now do numerical experiments with pseudorandom numbers in order to investigate how an experiment might work if we could do it. If we think we know what the result would be, and what kind of noise might arise, we can do a random simulation to discover the likelihood of success with a particular measurement strategy. This is called the “Monte Carlo” approach, and it is extraordinarily powerful. Observational astronomers and particle physicists use it a great deal in order to plan complex observing programmes and convince the powers that be that their proposal is sufficiently feasible to be allocated time on expensive facilities. In the end there is no substitute for real experiments, but in the meantime the Monte Carlo method can help avoid wasting time on flawed projects:

…in real life mistakes are likely to be irrevocable. Computer simulation, however, makes it economically practical to make mistakes on purpose.

(John McLeod and John Osborne, in Natural Automata and Useful Simulations).

So is there a way to tell whether a set of numbers is really random? Consider the following sequence:

1415926535897932384626433832795028841971

Is this a random string of numbers? There doesn’t seem to be a discernible pattern, and each possible digit seems to occur with roughly the same frequency. It doesn’t look like anyone’s phone number or bank account. Is that enough to make you think it is random?

Actually this is not at all random. If I had started it with a three and a decimal place you might have cottoned on straight away. "3.1415926.." is the first few digits in the decimal representation of π. The full representation goes on forever without repeating. This is a sequence that satisfies most naïve definitions of randomness. It does, however, provide something of a hint as to how we might construct an operational definition, i.e. one that we can apply in practice to a finite set of numbers.

The key idea originates from the Russian mathematician Andrei Kolmogorov, who wrote the first truly rigorous mathematical work on probability theory in 1933. Kolmogorov’s approach was considerably ahead of its time, because it used many concepts that belong to the era of computers. In essence, what he did was to provide a definition of the complexity of an N-digit sequence in terms of the smallest amount of computer memory it would take to store a program capable of generating the sequence. Obviously one can always store the sequence itself, which means that there is always a program that occupies about as many bytes of memory as the sequence itself, but some numbers can be generated by codes much shorter than the numbers themselves. For example the sequence

111111111111111111111111111111111111

can be generated by the instruction to “print 1 35 times”, which can be stored in much less memory than the original string of digits. Such a sequence is therefore said to be algorithmically compressible.

There are many ways of calculating the digits of π numerically also, so although it may look superficially like a random string it is most definitely not random. It is algorithmically compressible.
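A very crude way to play with this idea is to use an off-the-shelf compressor as a stand-in for the "shortest program". The sketch below (my own, in Python) compresses a highly regular string and a pseudorandom one with zlib; note that zlib would do no better on the digits of π, which is exactly the limitation discussed a little further down:

```python
import random
import zlib

random.seed(0)

ones = "1" * 10_000                                          # highly regular
bits = "".join(random.choice("01") for _ in range(10_000))   # pseudorandom

for name, s in [("repeated 1s", ones), ("pseudorandom bits", bits)]:
    packed = len(zlib.compress(s.encode()))
    print(f"{name}: {len(s)} characters -> {packed} bytes compressed")

# The repeated string shrinks to a few dozen bytes; the pseudorandom one
# barely compresses at all. But zlib is only a stand-in: it cannot find
# the short program behind the digits of pi, so failure to compress never
# proves that a sequence is random.
```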

The complexity of a sequence can be defined to be the length of the shortest program capable of generating it. If no algorithm can be found that compresses the sequence into a program shorter than itself then it is maximally complex and can suitably be defined as random. This is a very elegant description, and has good intuitive appeal.  

I’m not sure how compressible Hamlet is, but it’s certainly not entirely random. At any rate, when I studied it at school, I certainly wished it were a little shorter…

However, this still does not provide us with a way of testing rigorously whether a given finite sequence has been produced “randomly” or not.

If an algorithmic compression can be found then that means we declare the given sequence not to be random. However we can never be sure if the next term in the sequence would fit with what our algorithm would predict. We have to argue, inferentially, that if we have fitted a long sequence with a simple algorithm then it is improbable that the sequence was generated randomly.

On the other hand, if we fail to find a suitable compression that doesn’t mean it is random either. It may just mean we didn’t look hard enough or weren’t clever enough.

Human brains are good at finding patterns. When we can’t see one we usually take the easy way out and declare that none exists. We often model a complicated system as a random process because it is  too difficult to predict its behaviour accurately even if we know the relevant laws and have  powerful computers at our disposal. That’s a very reasonable thing to do when there is no practical alternative. 

It’s quite another matter, however,  to embrace randomness as a first principle to avoid looking for an explanation in the first place. For one thing, it’s lazy, taking the easy way out like that. And for another it’s a bit arrogant. Just because we can’t find an explanation within the framework of our current theories doesn’t mean more intelligent creatures than us won’t do so. We’re only monkeys, after all.

Random Thoughts: Points and Poisson (d’Avril)

Posted in The Universe and Stuff with tags , , , on April 4, 2009 by telescoper

I’ve got a thing about randomness. For a start I don’t like the word, because it covers such a multitude of sins. People talk about there being randomness in nature when what they really mean is that they don’t know how to predict outcomes perfectly. That’s not quite the same thing as things being inherently unpredictable; statements about the nature of reality are ontological, whereas I think randomness is only a useful concept in an epistemological sense. It describes our lack of knowledge: just because we don’t know how to predict doesn’t mean that it can’t be predicted.

Nevertheless there are useful mathematical definitions of randomness and it is also (sometimes) useful to make mathematical models that display random behaviour in a well-defined sense, especially in situations where one has to take into account the effects of noise.

I thought it would be fun to illustrate one such model. In a point process, the random element is a "dot" that occurs at some location in time or space. Such processes occur in a wide range of contexts: arrivals of buses at a bus stop, photons in a detector, darts on a dartboard, and so on.

Let us suppose that we think of such a process happening in time, although what follows can straightforwardly be generalised to things happening over an area (such as a dartboard) or within some higher-dimensional region. It is also possible to invest the points with some other attributes; processes like this are sometimes called marked point processes, but I won't discuss them here.

The "most" random way of constructing a simple point process is to assume that each event happens independently of every other event, and that there is a constant probability per unit time of an event happening. This type of process is called a Poisson process, after the French mathematician Siméon-Denis Poisson, who was born in 1781. He was one of the most creative and original physicists of all time: besides fundamental work on electrostatics and the theory of magnetism for which he is famous, he also built greatly upon Laplace's work in probability theory. His principal result was to derive a formula giving the probability of a given number of random events occurring when the probability of each one is very low. The Poisson distribution, as it is now known and which I will come to shortly, is related to this original calculation; it was subsequently shown that this distribution is a limiting form of the binomial distribution. Just to add to the connections between probability theory and astronomy, it is worth mentioning that in 1833 Poisson wrote an important paper on the motion of the Moon.

In a finite interval of duration T the mean (or expected) number of events for a Poisson process will obviously just be proportional to the product of the rate per unit time and T itself; call this product λ.
 
The full distribution is then:

P(x) = λ^{x} e^{-λ} / x!

This gives the probability that a finite interval contains exactly x events. It can be neatly derived from the binomial distribution by dividing the interval into a very large number of very tiny pieces, each one of which becomes a Bernoulli trial. The probability of success (i.e. of an event occurring) in each trial is extremely small, but the number of trials becomes extremely large in such a way that the mean number of successes is λ. In this limit the binomial distribution takes the form of the above expression. The variance of this distribution is interesting: it is also λ. This means that the typical fluctuations within the interval are of order the square root of λ on a mean level of λ, so the fractional variation is of the famous "one over root n" form that is a useful estimate of the expected variation in point processes. Indeed, it's a useful rule-of-thumb for estimating likely fluctuation levels in a host of statistical situations.

If football were a Poisson process with a mean number of goals per game of, say, 2 then we would expect most games to have 2 plus or minus 1.4 (the square root of 2) goals, i.e. between about 0.6 and 3.4. That is actually not far from what is observed and the distribution of goals per game in football matches is actually quite close to a Poisson distribution.

This idea can be straightforwardly extended to higher dimensional processes. If points are scattered over an area with a constant probability per unit area then the mean number in a finite area will also be some number λ and the same formula applies.

As a matter of fact I first learned about the Poisson distribution when I was at school, doing A-level mathematics (which in those days actually included some mathematics). The example used by the teacher to illustrate this particular bit of probability theory was a two-dimensional one from biology. The skin of a fish was divided into little squares of equal area, and the number of parasites found in each square was counted. A histogram of these numbers accurately follows the Poisson form. For years I laboured under the delusion that it was given this name because it was something to do with fish, but then I never was very quick on the uptake.

This is all very well, but point processes are not always of this Poisson form. Points can be clustered, so that having one point at a given position increases the conditional probability of having others nearby. For example, galaxies like those shown in the nice picture are distributed throughout space in a clustered pattern that is very far from the Poisson form. But it’s very difficult to tell from just looking at the picture. What is needed is a rigorous statistical analysis.

The statistical description of clustered point patterns is a fascinating subject, because it makes contact with the way in which our eyes and brain perceive pattern. I’ve spent a large part of my research career trying to figure out efficient ways of quantifying pattern in an objective way and I can tell you it’s not easy, especially when the data are prone to systematic errors and glitches. I can only touch on the subject here, but to see what I am talking about look at the two patterns below:

 

[Figure: two point patterns, one above the other]

You will have to take my word for it that one of these is a realization of a two-dimensional Poisson point process and the other contains correlations between the points. One therefore has a real pattern to it, and one is a realization of a completely unstructured random process.

I show this example in popular talks and get the audience to vote on which one is the random one. The vast majority usually think that the top one is the one that is random and the bottom one is the one with structure to it. It is not hard to see why. The top pattern is very smooth (what one would naively expect for a constant probability of finding a point at any position in the two-dimensional space), whereas the bottom one seems to offer a profusion of linear, filamentary features and densely concentrated clusters.

In fact, it's the bottom picture that was generated by a Poisson process using a Monte Carlo random number generator. All the structure that is visually apparent is imposed by our own sensory apparatus, which has evolved to be so good at discerning patterns that it finds them when they're not even there!

The top process is also generated by a Monte Carlo technique, but the algorithm is more complicated. In this case the presence of a point at some location suppresses the probability of having other points in the vicinity. Each event has a zone of avoidance around it; the points are therefore anticorrelated. The result of this is that the pattern is much smoother than a truly random process should be. In fact, this simulation has nothing to do with galaxy clustering really. The algorithm used to generate it was meant to mimic the behaviour of glow-worms which tend to eat each other if they get too close. That's why they spread themselves out in space more uniformly than in the random pattern.

Incidentally, I got both pictures from Stephen Jay Gould’s collection of essays Bully for Brontosaurus and used them, with appropriate credit and copyright permission, in my own book From Cosmos to Chaos. I forgot to say this in earlier versions of this post.

The tendency to find things that are not there is quite well known to astronomers. The constellations which we all recognize so easily are not physical associations of stars, but are just chance alignments on the sky of things at vastly different distances in space. That is not to say that they are random, but the pattern they form is not caused by direct correlations between the stars. Galaxies form real three-dimensional physical associations through their direct gravitational effect on one another.

People are actually pretty hopeless at understanding what “really” random processes look like, probably because the word random is used so often in very imprecise ways and they don’t know what it means in a specific context like this.  The point about random processes, even simpler ones like repeated tossing of a coin, is that coincidences happen much more frequently than one might suppose.

I suppose there is an evolutionary reason why our brains like to impose order on things in a general way. More specifically scientists often use perceived patterns in order to construct hypotheses. However these hypotheses must be tested objectively and often the initial impressions turn out to be figments of the imagination, like the canals on Mars.

Now, I think I'll complain to WordPress about the widget that links pages to a "random blog post".

I’m sure it’s not really random….