Archive for Big Bang

A Little Bit of Bayes

Posted in Bad Statistics, The Universe and Stuff on November 21, 2010 by telescoper

I thought I’d start a series of occasional posts about Bayesian probability. This is something I’ve touched on from time to time, but it’s perhaps worth covering this relatively controversial topic in a slightly more systematic fashion, especially with regard to how it works in cosmology.

I’ll start with Bayes’ theorem, which for three logical propositions (such as statements about the values of parameters in a theory) A, B and C can be written in the form

P(B|AC) = K^{-1}P(B|C)P(A|BC) = K^{-1} P(AB|C)

where

K=P(A|C).

This is (or should be!) uncontroversial as it is simply a result of the sum and product rules for combining probabilities. Notice, however, that I’ve not restricted it to two propositions A and B as is often done, but carried an extra one (C) throughout. This is to emphasize the fact that, to a Bayesian, all probabilities are conditional on something; usually, in the context of data analysis, this is a background theory that furnishes the framework within which measurements are interpreted. If you say this makes everything model-dependent, then I’d agree. But every interpretation of data in terms of parameters of a model is dependent on the model. It has to be. If you think it can be otherwise then I think you’re misguided.
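To spell the derivation out (it is just the product rule applied in both orders):

P(AB|C) = P(A|BC)P(B|C) = P(B|AC)P(A|C),

so dividing through by K = P(A|C), assumed not to vanish, gives the form of the theorem quoted above.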

In the equation, P(B|C) is the probability of B being true, given that C is true. The information C need not be definitely known, but may be assumed for the sake of argument. The left-hand side of Bayes’ theorem denotes the probability of B given both A and C, and so on. The presence of C has not changed anything, but is just there as a reminder that it all depends on what is being assumed in the background. The equation states a theorem that can be proved to be mathematically correct, so it is – or should be – uncontroversial.

Now comes the controversy. In the “frequentist” interpretation of probability, the entities A, B and C would be interpreted as “events” (e.g. the coin is heads) or “random variables” (e.g. the score on a dice, a number from 1 to 6) attached to which is their probability, indicating their propensity to occur in an imagined ensemble. These things are quite complicated mathematical objects: they don’t have specific numerical values, but are represented by a measure over the space of possibilities. They are sort of “blurred-out” in some way, the fuzziness representing the uncertainty in the precise value.

To a Bayesian, the entities A, B and C have a completely different character to what they represent for a frequentist. They are not “events” but  logical propositions which can only be either true or false. The entities themselves are not blurred out, but we may have insufficient information to decide which of the two possibilities is correct. In this interpretation, P(A|C) represents the degree of belief that it is consistent to hold in the truth of A given the information C. Probability is therefore a generalization of the “normal” deductive logic expressed by Boolean algebra: the value “0” is associated with a proposition which is false and “1” denotes one that is true. Probability theory extends  this logic to the intermediate case where there is insufficient information to be certain about the status of the proposition.

A common objection to Bayesian probability is that it is somehow arbitrary or ill-defined. “Subjective” is the word that is often bandied about. This is only fair to the extent that different individuals may have access to different information and therefore assign different probabilities. Given different information C and C′ the probabilities P(A|C) and P(A|C′) will be different. On the other hand, the same precise rules for assigning and manipulating probabilities apply as before. Identical results should therefore be obtained whether these are applied by any person, or even a robot, so that part isn’t subjective at all.

In fact I’d go further. I think one of the great strengths of the Bayesian interpretation is precisely that it does depend on what information is assumed. This means that such information has to be stated explicitly. The essential assumptions behind a result can be – and, regrettably, often are – hidden in frequentist analyses. Being a Bayesian forces you to put all your cards on the table.

To a Bayesian, probabilities are always conditional on other assumed truths. There is no such thing as an absolute probability, hence my alteration of the form of Bayes’s theorem to represent this. A probability such as P(A) has no meaning to a Bayesian: there is always conditioning information. For example, if  I blithely assign a probability of 1/6 to each face of a dice, that assignment is actually conditional on me having no information to discriminate between the appearance of the faces, and no knowledge of the rolling trajectory that would allow me to make a prediction of its eventual resting position.

In the Bayesian framework, probability theory becomes not a branch of experimental science but a branch of logic. Like any branch of mathematics it cannot be tested by experiment but only by the requirement that it be internally self-consistent. This brings me to what I think is one of the most important results of twentieth century mathematics, but which is unfortunately almost unknown in the scientific community. In 1946, Richard Cox derived the unique generalization of Boolean algebra under the assumption that such a logic must involve associating a single number with any logical proposition. The result he got is beautiful and anyone with any interest in science should make a point of reading his elegant argument. It turns out that the only way to construct a consistent logic of uncertainty incorporating this principle is by using the standard laws of probability. There is no other way to reason consistently in the face of uncertainty than probability theory. Accordingly, probability theory always applies when there is insufficient knowledge for deductive certainty. Probability is inductive logic.

This is not just a nice mathematical property. This kind of probability lies at the foundations of a consistent methodological framework that not only encapsulates many common-sense notions about how science works, but also puts at least some aspects of scientific reasoning on a rigorous quantitative footing. This is an important weapon that should be used more often in the battle against the creeping irrationalism one finds in society at large.

I posted some time ago about an alternative way of deriving the laws of probability from consistency arguments.

To see how the Bayesian approach works, let us consider a simple example. Suppose we have a hypothesis H (some theoretical idea that we think might explain some experiment or observation). We also have access to some data D, and we also adopt some prior information I (which might be the results of other experiments or simply working assumptions). What we want to know is how strongly the data D supports the hypothesis H given my background assumptions I. To keep it easy, we assume that the choice is between whether H is true or H is false. In the latter case, “not-H” or H′ (for short) is true. If our experiment is at all useful we can construct P(D|HI), the probability that the experiment would produce the data set D if both our hypothesis and the conditional information are true.

The probability P(D|HI) is called the likelihood; to construct it we need to have   some knowledge of the statistical errors produced by our measurement. Using Bayes’ theorem we can “invert” this likelihood to give P(H|DI), the probability that our hypothesis is true given the data and our assumptions. The result looks just like we had in the first two equations:

P(H|DI) = K^{-1}P(H|I)P(D|HI) .

Now we can expand the “normalising constant” K because we know that either H or H′ must be true. Thus

K=P(D|I)=P(H|I)P(D|HI)+P(H^{\prime}|I) P(D|H^{\prime}I)

The P(H|DI) on the left-hand side of the first expression is called the posterior probability; the right-hand side involves P(H|I), which is called the prior probability, and the likelihood P(D|HI). The principal controversy surrounding Bayesian inductive reasoning involves the prior and how to define it, which is something I’ll comment on in a future post.
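To make this concrete, here is a minimal numerical sketch in Python of the two-hypothesis case. The prior and likelihood values are made up purely for illustration; they don’t come from any real experiment.

# Bayes' theorem for a simple two-hypothesis test.
# All the numbers below are illustrative assumptions, not real data.

prior_H = 0.5                    # P(H|I): prior probability of the hypothesis
prior_notH = 1.0 - prior_H       # P(H'|I): either H or H' must be true

likelihood_H = 0.8               # P(D|HI): probability of the data if H is true
likelihood_notH = 0.3            # P(D|H'I): probability of the data if H is false

# The normalising constant K = P(D|I)
K = prior_H * likelihood_H + prior_notH * likelihood_notH

posterior_H = prior_H * likelihood_H / K     # P(H|DI)
print(posterior_H)               # about 0.73 with these made-up numbers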

The Bayesian recipe for testing a hypothesis assigns a large posterior probability to a hypothesis for which the product of the prior probability and the likelihood is large. It can be generalized to the case where we want to pick the best of a set of competing hypotheses, say H1 … Hn. Note that this need not be the set of all possible hypotheses, just those that we have thought about. We can only choose from what is available. The hypotheses may be relatively simple, such as that some particular parameter takes the value x, or they may be composite, involving many parameters and/or assumptions. For instance, the Big Bang model of our universe is a very complicated hypothesis, or in fact a combination of hypotheses joined together, involving at least a dozen parameters which can’t be predicted a priori but which have to be estimated from observations.

The required result for multiple hypotheses is pretty straightforward: the sum of the two alternatives involved in K above simply becomes a sum over all possible hypotheses, so that

P(H_i|DI) = K^{-1}P(H_i|I)P(D|H_iI),

and

K=P(D|I)=\sum_j P(H_j|I)P(D|H_jI)

If the hypothesis concerns the value of a parameter – in cosmology this might be, e.g., the mean density of the Universe expressed by the density parameter Ω0 – then the allowed space of possibilities is continuous. The sum in the denominator should then be replaced by an integral, but conceptually nothing changes. Our “best” hypothesis is the one that has the greatest posterior probability.
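Again by way of illustration, here is a rough Python sketch of the continuous case, evaluating the posterior on a grid of parameter values; the flat prior and the Gaussian likelihood peaked at 1.0 are assumptions made purely for the sake of example.

import numpy as np

# Grid of values for some parameter; the range is arbitrary.
theta = np.linspace(0.0, 2.0, 1000)

prior = np.ones_like(theta)              # a flat prior, for illustration only
prior /= np.trapz(prior, theta)          # normalise it

# Assume, purely for illustration, a Gaussian likelihood peaked at theta = 1.0
likelihood = np.exp(-0.5 * ((theta - 1.0) / 0.1) ** 2)

K = np.trapz(prior * likelihood, theta)  # the integral that replaces the sum
posterior = prior * likelihood / K

print(theta[np.argmax(posterior)])       # the "best" value, here about 1.0

Because the prior in this sketch is flat, the peak of the posterior coincides with the peak of the likelihood; with a non-uniform prior the two would in general differ, which is precisely the point made below.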

From a frequentist stance the procedure is often instead to just maximize the likelihood. According to this approach the best theory is the one that makes the data most probable. This can be the same as the most probable theory, but only if the prior probability is constant; in general, the probability of a model given the data is not the same as the probability of the data given the model. I’m amazed how many practising scientists make this error on a regular basis.

The following figure might serve to illustrate the difference between the frequentist and Bayesian approaches. In the former case, everything is done in “data space” using likelihoods, while in the latter we work throughout with probabilities of hypotheses, i.e. we think in hypothesis space. I find it interesting to note that most theorists that I know who work in cosmology are Bayesians and most observers are frequentists!


As I mentioned above, it is the presence of the prior probability in the general formula that is the most controversial aspect of the Bayesian approach. The attitude of frequentists is often that this prior information is completely arbitrary or at least “model-dependent”. Being empirically-minded people, by and large, they prefer to think that measurements can be made and interpreted without reference to theory at all.

Assuming we can assign the prior probabilities in an appropriate way, what emerges from the Bayesian framework is a consistent methodology for scientific progress. The scheme starts with the hardest part – theory creation. This requires human intervention, since we have no automatic procedure for dreaming up hypotheses from thin air. Once we have a set of hypotheses, we need data against which theories can be compared using their relative probabilities. The experimental testing of a theory can happen in many stages: the posterior probability obtained after one experiment can be fed in, as a prior, to the next. The order of experiments does not matter. This all happens in an endless loop, as models are tested and refined by confrontation with experimental discoveries, and are forced to compete with new theoretical ideas. Often one particular theory emerges as most probable for a while, such as in particle physics where a “standard model” has been in existence for many years. But this does not make it absolutely right; it is just the best bet amongst the alternatives. Likewise, the Big Bang model does not represent the absolute truth, but is just the best available model in the face of the manifold relevant observations we now have concerning the Universe’s origin and evolution. The crucial point about this methodology is that it is inherently inductive: all the reasoning is carried out in “hypothesis space” rather than “observation space”. The primary form of logic involved is not deduction but induction. Science is all about inverse reasoning.

For comments on induction versus deduction in another context, see here.

So what are the main differences between the Bayesian and frequentist views?

First, I think it is fair to say that the Bayesian framework is enormously more general than is allowed by the frequentist notion that probabilities must be regarded as relative frequencies in some ensemble, whether that is real or imaginary. In the latter interpretation, a proposition is at once true in some elements of the ensemble and false in others. It seems to me to be a source of great confusion to substitute a logical AND for what is really a logical OR. The Bayesian stance is also free from problems associated with the failure to incorporate in the analysis any information that can’t be expressed as a frequency. Would you really trust a doctor who said that 75% of the people she saw with your symptoms required an operation, but who did not bother to look at your own medical files?

As I mentioned above, frequentists tend to talk about “random variables”. This takes us into another semantic minefield. What does “random” mean? To a Bayesian there are no random variables, only variables whose values we do not know. A random process is simply one about which we only have sufficient information to specify probability distributions rather than definite values.

More fundamentally, it is clear, from the fact that the combination rules for probabilities were derived by Cox uniquely from the requirement of logical consistency, that any departure from these rules will, generally speaking, involve logical inconsistency. Many of the standard statistical data analysis techniques – including the simple “unbiased estimator” – used when the data consist of repeated samples of a variable having a definite but unknown value, are not equivalent to Bayesian reasoning. These methods can, of course, give good answers, but they can all be made to look completely silly by suitable choice of dataset.

By contrast, I am not aware of any example of a paradox or contradiction that has ever been found using the correct application of Bayesian methods, although the methods can of course be applied incorrectly. Furthermore, in order to deal with unique events like the weather, frequentists are forced to introduce the notion of an ensemble, a perhaps infinite collection of imaginary possibilities, to allow them to retain the notion that probability is a proportion. Provided the calculations are done correctly, the results of these calculations should agree with the Bayesian answers. On the other hand, frequentists often talk about the ensemble as if it were real, and I think that is very dangerous…



The Cosmic Web

Posted in The Universe and Stuff on November 23, 2009 by telescoper

When I was writing my recent (typically verbose) post about chaos on a rainy Saturday afternoon, I cut out a bit about astronomy because I thought it was too long even by my standards of prolixity. However, walking home this evening I realised I could actually use it in a new post inspired by a nice email I got after my Herschel lecture in Bath. More of that in a minute, but first the couple of paras I edited from the chaos item…

Astronomy provides a nice example that illustrates how easy it is to make things too complicated to solve. Suppose we have two massive bodies orbiting in otherwise empty space. They could be the Earth and Moon, for example, or a binary star system. Each of the bodies exerts a gravitational force on the other that causes it to move. Newton himself showed that the orbit followed by each of the bodies is an ellipse, and that both bodies orbit around their common centre of mass. The Earth is much more massive than the Moon, so the centre of mass of the Earth-Moon system is rather close to the centre of the Earth. Although the Moon appears to do all the moving, the Earth orbits too. If the two bodies have equal masses, they each orbit the mid-point of the line connecting them, like two dancers doing a waltz.
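To put a number on that (my addition here, but easily checked): the centre of mass satisfies m_E r_E = m_M r_M, and since the Earth is roughly 81 times more massive than the Moon, the barycentre of the Earth-Moon system lies only about 4,700 km from the centre of the Earth, well inside the planet itself, whose radius is about 6,400 km.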

Now let us add one more body to the dance. It doesn’t seem like too drastic a complication to do this, but the result is a mathematical disaster. In fact there is no general analytical solution for the gravitational three-body problem, apart from a few special cases where some simplifying symmetry helps us out. The same applies to the N-body problem for any N bigger than 2. We cannot solve the equations for systems of gravitating particles except by using numerical techniques and very big computers. We can do this very well these days, however, because computer power is cheap.

Computational cosmologists can “solve” the N-body problem for billions of particles, by starting with an input list of positions and velocities of all the particles. From this list the forces on each of them due to all the other particles can be calculated. Each particle is then moved a little according to Newton’s laws, thus advancing the system by one time-step. Then the forces are all calculated again and the system inches forward in time. At the end of the calculation, the solution obtained is simply a list of the positions and velocities of each of the particles. If you would like to know what would have happened with a slightly different set of initial conditions you need to run the entire calculation again. There is no elegant formula that can be applied for any input: each laborious calculation is specific to its initial conditions.
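For anyone curious what such a calculation looks like in code, here is a deliberately tiny sketch in Python of the time-stepping loop just described, using direct summation of the forces. The particle number, units, time-step and softening length are all arbitrary choices for illustration; real cosmological codes use far more sophisticated algorithms and, unlike this toy, include the expansion of the Universe.

import numpy as np

G, dt, soft = 1.0, 0.01, 0.05        # units, time-step and softening are arbitrary here
n = 100                              # a real simulation would use billions of particles
pos = np.random.randn(n, 3)          # the input list of positions...
vel = np.zeros((n, 3))               # ...and velocities
mass = np.ones(n)

def accelerations(pos):
    # Force on each particle due to all the other particles (direct O(N^2) summation)
    d = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]
    r2 = (d ** 2).sum(axis=-1) + soft ** 2
    return G * (mass[np.newaxis, :, np.newaxis] * d / r2[..., np.newaxis] ** 1.5).sum(axis=1)

for step in range(1000):
    acc = accelerations(pos)
    vel += acc * dt                  # move each particle a little according to Newton's laws...
    pos += vel * dt                  # ...advancing the system by one time-step

# The "solution" is simply the final list of positions and velocities.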

Now back to the Herschel lecture I gave, called The Cosmic Web, the name given to the frothy texture of the large-scale structure of the Universe revealed by galaxy surveys such as the 2dFGRS.

One of the points I tried to get across in the lecture was that we can explain the pattern – quite accurately – in the framework of the Big Bang cosmology by a process known as gravitational instability. Small initial irregularities in the density of the Universe tend to get amplified as time goes on. Regions just a bit denser than average tend to pull in material from their surroundings faster, getting denser and denser until they collapse in on themselves, thus forming bound objects.

This  Jeans instability  is the dominant mechanism behind star formation in molecular clouds, and it leads to the rapid collapse of blobby extended structures  to tightly bound clumps. On larger scales relevant to cosmological structure formation we have to take account of the fact that the universe is expanding. This means that gravity has to fight against the expansion in order to form structures, which slows it down. In the case of a static gas cloud the instability grows exponentially with time, whereas in an expanding background it is a slow power-law.
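To be slightly more quantitative (a standard result, quoted here for concreteness): in a static cloud a small density contrast δ grows exponentially,

\delta \propto \exp(t/\tau),

whereas in a matter-dominated expanding universe (the simplest, Einstein-de Sitter, case) it grows only as a power law,

\delta \propto t^{2/3} \propto a(t),

where a(t) is the cosmic scale factor.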

This actually helps us in cosmology because the process of structure formation is not so fast that it destroys all memory of the initial conditions, which is what happens when stars form. When we look at the large-scale structure of the galaxy distribution we are therefore seeing something which contains a memory of where it came from. I’ve blogged before about what started the whole thing off here.

Here’s a (very low-budget) animation of the formation of structure in the expanding universe as computed by an N-body code. The only subtlety in this is that it is in comoving coordinates, which expand with the universe: the box should really be getting bigger but is continually rescaled with the expansion to keep it the same size on the screen.

You can see that filaments form in profusion but these merge and disrupt in such a way that the characteristic size of the pattern evolves with time. This is called hierarchical clustering.

One of the questions I got by email after the talk was basically that if the same gravitational instability produced stars and large-scale structure, why wasn’t the whole universe just made of enormous star-like structures rather than all these strange filaments and things?

Part of the explanation is that the filaments are relatively transient things. The dominant picture is one in which the filaments and clusters become incorporated in larger-scale structures, but really dense concentrations, such as the spiral galaxies, which do indeed look a bit like big solar systems, are relatively slow to form.

When a non-expanding cloud of gas collapses to form a star there is also some transient filamentary structure  but the processes involved go so rapidly that it is all swept away quickly. Out there in the expanding universe we can still see the cobwebs.

A Well Placed Lecture

Posted in The Universe and Stuff on September 18, 2009 by telescoper

I noticed that the UK government has recently dropped its ban on product placement in television programmes. I wanted to take this opportunity to state Virgin Airlines that I will not be taking this as a Carling cue to introduce subliminal Coca Cola advertising of any Corby Trouser Press form into this blog.

This week I’ve been giving Marks and Spencer lectures every AIG afternoon to groups of 200 sixth form Samsung students on the subject of the Burger King Big Bang. The talks seemed to go down BMW quite well although I had Betfair trouble sometimes cramming all the Sainsbury things I wanted to talk about in the Northern Rock 30 minutes I was allotted. Anyway, I went through the usual stuff about the Carlsberg cosmic microwave background (CMB), even showing the noise on a Sony television screen to explain that a bit of the Classic FM signal came from the edge of the Next Universe.  The CMB played an Emirates important role in the talk as it is the Marlboro smoking gun of the Big Bang and established our Standard Life model of L’Oreal cosmology.

The timing of these lectures was Goodfella’s Pizza excellent because I was able to include Crown Paints references to the Hubble Ultra Deep Kentucky Fried Chicken Field and the Planck First Direct initial results that I’ve blogged about in the past week or so.

Now that’s all over, Thank God It’s Friday and  I’m getting ready to go to the Comet Sale Now On Opera. ..

Lessening Anomalies

Posted in Cosmic Anomalies, The Universe and Stuff on September 15, 2009 by telescoper

An interesting paper caught my eye on today’s ArXiv and I thought I’d post something here because it relates to an ongoing theme on this blog about the possibility that there might be anomalies in the observed pattern of temperature fluctuations in the cosmic microwave background (CMB). See my other posts here, here, here, here and here for related discussions.

One of the authors of the new paper, John Peacock, is an occasional commenter on this blog. He was also the Chief Inquisitor at my PhD (or rather DPhil) examination, which took place 21 years ago. The four-and-a-half hours of grilling I went through that afternoon reduced me to a gibbering wreck but the examiners obviously felt sorry for me and let me pass anyway. I’m not one to hold a grudge so I’ll resist the temptation to be churlish towards my erstwhile tormentor.

The most recent paper is about the possible contribution of the integrated Sachs-Wolfe (ISW) effect to these anomalies. The ISW mechanism generates temperature variations in the CMB because photons travel along a line of sight through a time-varying gravitational potential between the last-scattering surface and the observer. The integrated effect is zero if the potential does not evolve, because the energy a photon gains falling into a well exactly balances what it loses climbing out again. If in transit the well gets a bit deeper, however, there is a net contribution.
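Schematically (in my own notation, not taken from the paper), the ISW temperature perturbation along a given line of sight is an integral of the time derivative of the gravitational potential \Phi along the photon path:

\Delta T/T = \frac{2}{c^2}\int_{\rm ls}^{\rm obs} \frac{\partial \Phi}{\partial t}\, dt,

which makes it explicit that a potential that does not evolve contributes nothing.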

The specific thing about the ISW effect that makes it measurable is that the temperature variations it induces should correlate with the pattern of structure in the galaxy distribution, as it is this structure that generates the potential fluctuations through which CMB photons travel. Francis & Peacock try to assess the ISW contribution using data from the 2MASS all-sky survey of galaxies. This survey in itself contains important cosmological clues, but in the context of this particular question the ISW signal it traces is a nuisance, like any other foreground contamination, so they subtract it from the maps obtained from the Wilkinson Microwave Anisotropy Probe (WMAP) in an attempt to get a cleaner map of the primordial CMB sky.

The results are shown in the picture below, which presents the lowest-order spherical harmonic modes, the quadrupole (left) and octopole (right), for the ISW component (top), the WMAP data (middle) and, at the bottom, the cleaned CMB sky (i.e. the middle minus the top). The ISW subtraction doesn’t make a huge difference to the visual appearance of the CMB maps, but it is enough to substantially reduce the statistical significance of at least some of the reported anomalies I mentioned above. This reinforces how careful we have to be in analysing the data before jumping to cosmological conclusions.

(Figure: quadrupole and octopole maps of the ISW component, the WMAP data and the cleaned CMB sky.)

There should also be a further contribution from fluctuations beyond the depth of the 2MASS survey (about 0.3 in redshift).  The actual ISW effect could therefore  be significantly larger than this estimate.

Back Early…

Posted in The Universe and Stuff on September 11, 2009 by telescoper

As a very quick postscript to my previous post about the amazing performance of Hubble’s spanking new camera, let me just draw attention to a fresh paper on the ArXiv by Rychard Bouwens and collaborators, which discusses the detection of galaxies with redshifts around 8 in the Hubble Ultra Deep Field (shown below in an earlier image) using WFC3/IR observations that reveal galaxies fainter than the previous detection limits.

Amazing. I remember the days when a redshift z=0.5 was a big deal!

To put this in context and to give some idea of its importance, remember that the redshift z is defined in such a way that 1+z is the factor by which the wavelength of light is stretched out by the expansion of the Universe. Thus, a photon from a galaxy at redshift 8 started out on its journey towards us (or, rather, the Hubble Space Telescope) when the Universe was compressed in all directions relative to its present size by a factor of 9. The average density of stuff then was a factor 9^3 = 729 larger, so the Universe was a much more crowded place then compared to what it’s like now.

Translating the redshift into a time is trickier because it requires us to know how the expansion rate of the Universe varies with cosmic epoch. This requires solving the equations of a cosmological model or, more realistically for a Friday afternoon, plugging the numbers into Ned Wright’s famous cosmology calculator.
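If you prefer to script it rather than use a web page, the astropy package (assuming you have it installed) will do the same job in a few lines of Python; the parameter values below are representative concordance numbers chosen for illustration rather than a precise fit.

from astropy.cosmology import FlatLambdaCDM

# Representative concordance parameters: flat Universe, H0 = 70 km/s/Mpc, matter density 0.27
cosmo = FlatLambdaCDM(H0=70, Om0=0.27)

print(cosmo.age(8))            # age of the Universe at redshift 8: roughly 0.65 Gyr
print(cosmo.age(0))            # present age: roughly 13.7 Gyr
print(cosmo.lookback_time(8))  # how long ago the light set out: roughly 13 Gyr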

Using the best-estimate parameters for the current concordance cosmology reveals that at redshift 8, the Universe was only about 0.65 billion years old (i.e. light from the distant galaxies seen by HST set out only 650 million years after the Big Bang). Since the current age of the Universe is about 13.7 billion years (according to the same model), this means that the light Hubble detected set out on its journey towards us an astonishing 13 billion years ago.

More importantly for theories of galaxy formation and evolution, this means that at least some galaxies must have formed very early on, relatively speaking – within the first 5% of the time the Universe has existed up to now.

These observations are by no means certain as the redshifts have been determined only approximately using photometric techniques rather than the more accurate spectroscopic methods, but if they’re correct they could be extremely important.

At the very least they provide even stronger motivation for getting on with the next-generation space telescope, JWST.

Beginning Again

Posted in Books, Talks and Reviews, The Universe and Stuff on August 19, 2009 by telescoper

I keep finding old forgotten bits and pieces – especially book reviews – on my computer. This one is about five years old but I thought I might as well put it on here to save having to think of anything else for today. It’s also a little bit topical because the author, Simon Singh, has recently been the subject of much discussion on this blog (here and here).

This piece was eventually published in an edited form as Nature 432, 953-954 (23 December 2004) | doi:10.1038/432953b; Published online 22 December 2004.

BOOK REVIEWED – Big Bang: The Most Important Scientific Discovery of All Time and Why You Need to Know About It

by Simon Singh
Fourth Estate: 2004. 544 pp. £20, $27.95

When the British astrophysicist Fred Hoyle coined the phrase ‘Big Bang’ to describe the rival to his beloved ‘steady state’ theory of the Universe, he meant it to be disparaging. It was bad enough for Hoyle that his pet theory turned out to disagree with astronomical observations, but it must have been especially galling that his cosmological adversaries embraced his derisive name. The tag has since spread into the wider cultural domain — nowadays even politicians have heard of the Big Bang.

But what is the Big Bang? In a nutshell, it is the idea that our Universe — space, time and all its matter content — was born in a primordial fireball, from which the whole caboodle has been expanding and cooling ever since. Pioneering theorists such as Aleksander Friedmann and Georges Lemaître derived mathematical solutions of Einstein’s field equations that could be used to describe the evolution of a Big Bang Universe. These models involve a creation event, in which space-time and matter-energy sprang into existence to form our Universe. We are still in the dark about how this happened, but we think it took place about 14 billion years ago.

Edwin Hubble’s discovery of the recession of distant galaxies gave support to the idea that the Universe was expanding, but the notion that it might be evolving from a hot beginning was rejected by many theorists, including Hoyle. He favoured a model in which the origin of matter was not a single event but a continuous process in which atoms were created to fill in the gaps created by cosmic expansion. The battle between these competing views of creation raged until the accidental discovery in 1965 of the cosmic microwave background radiation, which marked the beginning of the end for the steady-state theory.

This conflict between the two theories plays a central role in Simon Singh’s book Big Bang. His previous books, Fermat’s Last Theorem and The Code Book, succeeded admirably in bringing difficult mathematical subjects to a popular readership, using a combination of accessible prose, a liberal sprinkling of jokes and a strong flavouring of biographical anecdotes. The recipe for his new book is similar.

In Big Bang, Singh uses the historical development of modern cosmological theory as a case study for how scientific theories are conceived, and how they win or lose acceptance. He rightly points out that science rarely proceeds in an objective, linear fashion. Correct theories are often favoured for the wrong reasons; observations and experiments are frequently misinterpreted; and sometimes force of personality holds sway over analytic reason. Because cosmology has such ambitious goals — to find a coherent explanation for the entire system of things and how it has evolved — these peculiarities are often exaggerated. In particular, cosmology has more than its fair share of eccentric characters, providing ample illustration of the role of personal creativity in scientific progress.

This very well written book conveys the ideas underpinning cosmological theory with great clarity. Taking nothing for granted of his readership, Singh delves into the background of every key scientific idea he discusses. This involves going into the history of astronomical observation, as well as explaining in non-technical language the principles of basic nuclear physics and relativity. The numerous snippets of biographical information are illuminating as well as amusing, and the narrative is driven along by the author’s own engaging personality.

However, even as a fan of Singh’s previous books, I have to admit that, although this one has many strengths, I found it ultimately rather disappointing. For one thing, there isn’t anything in this book that could be described as new. The book follows a roughly historical thread from pre-classical mythology to the middle of the twentieth century. This is a well-worn path for popular cosmology, and the whole thing is rather formulaic. Each chapter I read gave me the impression that I had read most of it somewhere before. It certainly lacks the ground-breaking character of Fermat’s Last Theorem.

The past ten years in cosmology have witnessed a revolution in observation that has, among many other things, convinced us of the existence of dark energy in the Universe. Theory has also changed radically over this period, largely through the introduction of ideas from high-energy physics, such as superstring theory. Indeed, some contemporary Big Bang models bear a remarkable resemblance to the steady-state universe, involving the continuous creation not of mere atoms, but of entire universes.

Frustratingly, virtually all the exciting recent developments are missing from this book, which leaves off just when things started to get interesting, with the COBE satellite in 1992. Readers who want to know what is going on now in this field should definitely look elsewhere. The processes of cosmic discovery and controversy are ongoing, not just relics of the past.

Why the Big Bang is Wrong…

Posted in Biographical, The Universe and Stuff on July 7, 2009 by telescoper

I suspect that I’m not the only physicist who has a filing cabinet filled with unsolicited correspondence from people with wacky views on everything from UFOs to Dark Matter. Being a cosmologist, I probably get more of this stuff than those working in less speculative branches of physics. Because I’ve written a few things that appeared in the public domain (and even appeared on TV and radio a few times), I probably even get more than most cosmologists (except the really  famous ones of course).

I would estimate that I get two or three items of correspondence of this kind per week. Many “alternative” cosmologists have now discovered email, but there are still a lot who send their ideas through regular post. In fact, whenever I get an envelope with an address on it that has been typed on an old-fashioned typewriter, it’s usually a dead giveaway that it’s going to be one of those. Sometimes they are just letters (typed or handwritten), but sometimes they are complete manuscripts, often with wonderfully batty illustrations. I have one in front of me now called Dark Matter, The Great Pyramid and the Theory of Crystal Healing. I might even go so far as to call that one bogus. I have an entire filing cabinet in my office at work filled with things like it. I could make a fortune if I set up a journal for these people. Alarmingly, electrical engineers figure prominently in my files. They seem particularly keen to explain why Einstein was wrong…

I never reply, of course. I don’t have time, for one thing.  I’m also doubtful whether there’s anything useful to be gained by trying to engage in a scientific argument with people whose grip on the basic concepts is so tenuous (as perhaps it is on reality). Even if they have some scientific training, their knowledge and understanding of physics is usually pretty poor.

I should explain that, whenever I can, if someone writes or emails with a genuine question about physics or astronomy – which often happens – I always reply. I think that’s a responsibility for anyone who gets taxpayers’ money. However, I don’t reply to letters that are confrontational or aggressive or which imply that modern science is some sort of conspiracy to conceal the real truth.

One particular correspondent started writing to me after the publication of my little book, Cosmology: A Very Short Introduction. I won’t give his name, but he was an individual who had some scientific training (not an electrical engineer, I hasten to add). This chap sent a terse letter to me pointing out that the Big Bang theory was obviously completely wrong. The reason was obvious to anyone who understood thermodynamics. He had spent a lifetime designing high-quality refrigeration equipment and therefore knew what he was talking about (or so he said).

His point was that, according to the Big Bang theory, the Universe cools as it expands. Its current temperature is about 3 Kelvin (-270 Celsius or thereabouts) and it is still expanding. Turning the clock back gives a Universe that was hotter when it was younger. He thought this was all wrong.

The argument is false, my correspondent asserted, because the Universe – by definition –  hasn’t got any surroundings and therefore isn’t expanding into anything. Since it isn’t pushing against anything it can’t do any work. The internal energy of the gas must therefore remain constant and since the internal energy of an ideal gas is only a function of its temperature, the expansion of the Universe must therefore be at a constant temperature (i.e. isothermal, rather than adiabatic, as in the Big Bang theory). He backed up his argument with bona fide experimental results on the free expansion of gases.

I didn’t reply and filed the letter away. Another came, and I did likewise. Increasingly overcome by some form of apoplexy, he sent letters that got ruder and ruder, eventually blaming me for the decline of the British education system and demanding that I be fired from my job. Finally, he wrote to the President of the Royal Society demanding that I be “struck off” – not that I’ve ever been “struck on” – and forbidden (on grounds of incompetence) ever to teach thermodynamics in a University.

Actually, I’ve never taught thermodynamics in any University anyway, but I’ve kept the letter (which was cc-ed to me) in case I am ever asked. It’s much better than a sick note….

This is a good example of a little knowledge being a dangerous thing. My correspondent clearly knew something about thermodynamics. But, obviously, I don’t agree with him that the Big Bang is wrong.

Although I never actually replied to this question myself, I thought it might be fun to turn this into a little competition, so here’s a challenge for you: provide the clearest and most succinct explanation of why the temperature of the expanding Universe does fall with time, despite what my correspondent thought.

Answers via the comment box please, in language suitable for a nutter non-physicist.

How Loud was the Big Bang?

Posted in The Universe and Stuff on April 26, 2009 by telescoper

The other day I was giving a talk about cosmology at Cardiff University’s Open Day for prospective students. I was talking, as I usually do on such occasions, about the cosmic microwave background, what we have learnt from it so far and what we hope to find out from it from future experiments, assuming they’re not all cancelled.

Quite a few members of staff listened to the talk too and, afterwards, some of them expressed surprise at what I’d been saying, so I thought it would be fun to try to explain it on here in case anyone else finds it interesting.

As you probably know, the Big Bang theory involves the assumption that the entire Universe – not only the matter and energy but also space-time itself – had its origins in a single event a finite time in the past and it has been expanding ever since. The earliest mathematical models of what we now call the Big Bang were derived independently by Alexander Friedmann and Georges Lemaître in the 1920s. The term “Big Bang” was later coined by Fred Hoyle as a derogatory description of an idea he couldn’t stomach, but the phrase caught on. Strictly speaking, though, the Big Bang was a misnomer.

Friedmann and Lemaître had made mathematical models of universes that obeyed the Cosmological Principle, i.e. in which the matter was distributed in a completely uniform manner throughout space. Sound consists of oscillating fluctuations in the pressure and density of the medium through which it travels. These are longitudinal “acoustic” waves that involve successive compressions and rarefactions of matter, in other words departures from the purely homogeneous state required by the Cosmological Principle. The Friedmann-Lemaître models contained no sound waves, so they did not really describe a Big Bang at all, let alone how loud it was.

However, as I have blogged about before, newer versions of the Big Bang theory do contain a mechanism for generating sound waves in the early Universe and, even more importantly, these waves have now been detected and their properties measured.

The above image shows the variations in temperature of the cosmic microwave background as charted by the Wilkinson Microwave Anisotropy Probe about five years ago. The average temperature of the sky is about 2.73 K but there are variations across the sky that have an rms value of about 0.08 milliKelvin. This corresponds to a fractional variation of a few parts in a hundred thousand relative to the mean temperature. It doesn’t sound like much, but this is evidence for the existence of primordial acoustic waves and therefore of a Big Bang with a genuine “Bang” to it.

A full description of what causes these temperature fluctuations would be very complicated but, roughly speaking, the variation in temperature you see corresponds directly to variations in density and pressure arising from sound waves.

So how loud was it?

The waves we are dealing with have wavelengths up to about 200,000 light years and the human ear can only actually hear sound waves with wavelengths up to about 17 metres. In any case the Universe was far too hot and dense for there to have been anyone around listening to the cacophony at the time. In some sense, therefore, it wouldn’t have been loud at all because our ears can’t have heard anything.

Setting aside these rather pedantic objections – I’m never one to allow dull realism to get in the way of a good story – we can get a reasonable value for the loudness in terms of the familiar language of decibels. This defines the level of sound (L) logarithmically in terms of the rms pressure level of the sound wave P_rms relative to some reference pressure level P_ref

L = 20 \log_{10}[P_{\rm rms}/P_{\rm ref}]

(the 20 appears because of the fact that the energy carried goes as the square of the amplitude of the wave; in terms of energy there would be a factor 10).

There is no absolute scale for loudness because this expression involves the specification of the reference pressure. We have to set this level by analogy with everyday experience. For sound waves in air this is taken to be about 20 microPascals, or about 2×10^{-10} times the ambient atmospheric air pressure, which is about 100,000 Pa. This reference is chosen because the limit of audibility for most people corresponds to pressure variations of this order and these consequently have L=0 dB. It seems reasonable to set the reference pressure of the early Universe to be about the same fraction of the ambient pressure then, i.e.

P_{\rm ref} \sim 2\times 10^{-10}\, P_{\rm amb}

The physics of how primordial variations in pressure translate into observed fluctuations in the CMB temperature is quite complicated, and the actual sound of the Big Bang contains a mixture of wavelengths with slightly different amplitudes, so it all gets a bit messy if you want to do it exactly, but it’s quite easy to get a rough estimate. We simply take the rms pressure variation to be the same fraction of the ambient pressure as the average temperature variation is of the average CMB temperature, i.e.

P_{\rm rms} \sim \mbox{a few} \times 10^{-5}\, P_{\rm amb}

If we do this, scaling both pressures in the logarithm in proportion to the ambient pressure, the ambient pressure cancels out in the ratio; the fractional amplitude involved turns out to be a few times 10^{-5}.
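For the record, here is that back-of-the-envelope estimate as a few lines of Python. The fractional amplitudes are the values quoted above, and the reference fraction is the 2×10^{-10} chosen by analogy with air; because both pressures are expressed as fractions of the ambient pressure, the ambient value cancels in the ratio, exactly as described.

import numpy as np

frac_ref = 2e-10                      # reference pressure as a fraction of the ambient pressure
for frac_rms in (1e-5, 3e-5, 1e-4):   # rms pressure variation as a fraction of the ambient pressure
    L = 20 * np.log10(frac_rms / frac_ref)
    print(frac_rms, round(L), "dB")   # roughly 94, 104 and 114 dB: a peak just over 110 dB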

(Figure: audiogram chart of typical sound levels, including the “speech banana” of normal speech and the threshold of pain.)

With our definition of the decibel level we find that waves corresponding to variations of one part in a hundred thousand of the ambient pressure give roughly L=100 dB, while one part in ten thousand gives about L=120 dB. The sound of the Big Bang therefore peaks at levels just over 110 dB. As you can see in the Figure above, this is close to the threshold of pain, but it’s perhaps not as loud as you might have guessed in response to the initial question. Many rock concerts are actually louder than the Big Bang, at least near the speakers!

A useful yardstick is the amplitude at which the fluctuations in pressure are comparable to the mean pressure. This would give a factor of about 10^{10} in the logarithm and is pretty much the limit at which sound waves can propagate without distortion. These would have L≈190 dB. It is estimated that the 1883 Krakatoa eruption produced a sound level of about 180 dB at a range of 100 miles. By comparison the Big Bang was little more than a whimper.

PS. If you would like to read more about the actual sound of the Big Bang, have a look at John Cramer’s webpages. You can also download simulations of the actual sound. If you listen to them you will hear that it’s more of  a “Roar” than a “Bang” because the sound waves don’t actually originate at a single well-defined event but are excited incoherently all over the Universe.

From Here to Eternity

Posted in Books, Talks and Reviews, The Universe and Stuff on February 3, 2009 by telescoper

I posted an item about astronomy and poetry a couple of days ago that used a phrase I vaguely remember having used somewhere else before. I’ve only just remembered where. It was in this book review I did for Nature some time ago. Since I’m quite keen on recycling, I’d thought I’d put it on here.

How do physicists cope with the concept of infinity in an expanding Universe?

BOOK REVIEWED – The Infinite Cosmos: Questions from the Frontiers of Cosmology

by Joseph Silk

Oxford University Press: 2006. 256 pp. £18.99, $29.95

Scientists usually have an uncomfortable time coping with the concept of infinity. Over the past century, physicists have had a particularly difficult relationship with the notion of boundlessness. In most cases this has been symptomatic of deficiencies in the theoretical foundations of the subject. Think of the ‘ultraviolet catastrophe’ of classical statistical mechanics, in which the electromagnetic radiation produced by a black body at a finite temperature is calculated to be infinitely intense at infinitely short wavelengths; this signalled the failure of classical statistical mechanics and ushered in the era of quantum mechanics about a hundred years ago. Quantum field theories have other forms of pathological behaviour, with mathematical components of the theory tending to run out of control to infinity unless they are healed using the technique of renormalization. The general theory of relativity predicts that singularities in which physical properties become infinite occur in the centre of black holes and in the Big Bang that kicked our Universe into existence. But even these are regarded as indications that we are missing a piece of the puzzle, rather than implying that somehow infinity is a part of nature itself.

The exception to this rule is the field of cosmology. Somehow it seems natural at least to consider the possibility that our cosmos might be infinite in extent or duration. If the Universe is defined as everything that exists, why should it necessarily be finite? Why should there be some underlying principle that restricts it to a size our human brains can cope with?

But even if cosmologists are prepared to ponder the reality of endlessness, and to describe it mathematically, they still have problems finding words to express these thoughts. Physics is fundamentally prosaic, but physicists have to resort to poetry when faced with the measureless grandeur of the heavens.

In The Infinite Cosmos, Joe Silk takes us on a whistle-stop tour of modern cosmology, focusing on what we have learned about the size and age of the Universe, how it might have begun, and how it may or may not end. This is a good time to write this book, because these most basic questions may have been answered by a combination of measurements from satellites gathering the static buzz of microwaves left over from the Big Bang, from telescopes finding and monitoring the behaviour of immensely distant supernova explosions, and from painstaking surveys of galaxy positions yielding quantitative information about the fallout from the primordial fireball. Unless we are missing something of fundamental importance, these observations indicate that our expanding Universe is about 14 billion years old, contains copious quantities of dark matter in some unidentified form, and is expanding at an accelerating rate.

According to the standard model of cosmology that emerges, the Universe has a finite past and (perhaps) an infinite future. But is our observable Universe (our ‘Hubble bubble’) typical of all there is? Perhaps there is much more to the cosmos than will ever meet our eyes. Our local patch of space-time may have its origin in just one of an infinite and timeless collection of Big Bangs, so the inferences we draw from observations of our immediate neighbourhood may never tell us anything much about the whole thing, even if we correctly interpret all the data available to us.

What is exciting about this book is not so much that it is anchored by the ramifications of infinity, but that it packs so much into a decidedly finite space. Silk covers everything you might hope to find in a book by one of the world’s leading cosmologists, and much more besides. Black holes, galaxy formation, dark matter, time travel, string theory and the cosmic microwave background all get a mention.

The style is accessible and informative. The book also benefits from having a flexible structure, free from the restrictions of the traditional historical narrative. Instead there are 20 short chapters arranged in a way that brings out the universality of the underlying physical concepts without having too much of a textbook feel. The explanations are nicely illustrated and do not involve any mathematics, so the book is suitable for the non-specialist.

If I have any criticisms of this book at all, they are only slight ones. The conflation of the ‘expanding Universe’ concept with the Big Bang theory, as opposed to its old ‘steady state’ rival, is both surprising and confusing. The steady-state model also describes an expanding Universe, but one in which there is continuous creation of matter to maintain a constant density against the diluting effect of the expansion. In the Big Bang, there is only one creation event, so the density of the expanding Universe changes with time. I also found the chapter about God in cosmology to be rather trite, but then my heart always sinks when I find myself lured into theological territory in which I am ill-equipped to survive.

A New Theory of the Universe

Posted in The Universe and Stuff on January 24, 2009 by telescoper

Yesterday I went on the train to London to visit my old friends in Mile End. I worked at the place that is now called Queen Mary, University of London for nearly a decade and missed it quite a lot when I moved to Nottingham. More recently I’ve had a bit more time and plausible excuses to visit London, including yesterday’s invitation to give a seminar at the Astronomy Unit. Although we were a bit late starting, owing to extremely slow service in the restaurant where we had lunch before the talk, it all seemed to go quite well. Afterwards we had a few beers and a nice chat before I took the train back to Cardiff again.

In the pub (which was the Half Moon, formerly the Half Moon Theatre,  a place of great historical interest) I remembered a joke I sometimes make during cosmology talks but had forgotten to do in the one I had just given.  I’m not sure it will work in written form, but here goes anyway.

I’ve blogged before about the current state of cosmology, but it’s probably a good idea to give a quick reminder before going any further. We have a standard cosmological model, known as the concordance cosmology, which accounts for most relevant observations in a pretty convincing way and is based on the idea that the Universe began with a Big Bang.  However, there are a few things about this model that are curious, to say the least.

First, there is the spatial geometry of the Universe. According to Einstein’s general theory of relativity, universes come in three basic shapes: closed, open and flat. These are illustrated to the right. The flat space has “normal” geometry in which the interior angles of a triangle add up to 180 degrees. In a closed space the sum of the angles is greater than 180 degrees, and  in an open space it is less. Of course the space we live in is three-dimensional but the pictures show two-dimensional surfaces.

But you get the idea.

The point is that the flat space is very special. The two curved spaces are much more general because they can be described by a parameter called their curvature which could in principle take any value (either positive for a closed space, or negative for an open space). In other words the sphere at the top could have any radius from very small (large curvature) to very large (small curvature). Likewise with the “saddle” representing an open space. The flat space must have exactly zero curvature. There are many ways to be curved, but only one way to be flat.

Yet, as near as dammit, our Universe appears to be flat. So why, with all the other options theoretically available to it, did the Universe decide to choose the most special one, which also happens, in my opinion, to be the most boring?

Then there is the way the Universe is put together. In order to be flat there must be an exact balance between the energy contained in the expansion of the Universe (positive kinetic energy) and the energy involved in the gravitational interactions between everything in it (negative potential energy). In general relativity, you see, the curvature relates to the total amount of energy.
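In equations (a standard result, added here for concreteness), the Friedmann equation reads

H^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{a^2},

so a spatially flat Universe (k = 0) requires the total density to take exactly the critical value \rho_c = 3H^2/8\pi G, or equivalently \Omega = \rho/\rho_c = 1.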

On the left you can see the breakdown of the various components involved in the standard model, with the whole pie representing a flat Universe. You see there’s a very strange mixture dominated by dark energy (which we don’t understand) and dark matter (which we don’t understand). The bit we understand a little bit better (because we can sometimes see it directly) is only 4% of the whole thing. The proportions look very peculiar.

And then finally, there is the issue that I talked about in my seminar in London and have actually blogged about (here and there) previously, which is why the Universe appears to be a bit lop-sided and asymmetrical when we’d like it to be a bit more aesthetically pleasing.

All these curiosities are naturally accounted for in my New Theory of the Universe, which asserts that the Divine Creator actually bought  the entire Cosmos  in IKEA.

This hypothesis immediately explains why the Universe is flat. Absolutely everything in IKEA comes in flat packs. Curvature is not allowed.

But this is not the only success of my theory. When God got home he obviously opened the flat pack, found the instructions and read the dreaded words “EASY SELF-ASSEMBLY”. Even the omnipotent would struggle to follow the bizarre set of cartoons and diagrams that accompany even the simplest IKEA furniture. The result is therefore predictable: strange pieces that don’t seem to fit together, bits left over whose purpose is not at all clear, and an overall appearance that is not at all like one would have expected.

It’s clear  where the lop-sidedness comes in too. Probably some of the parts were left out so the whole thing isn’t  held together properly and is probably completely unstable. This sort of thing happens all the time with IKEA stuff. And why is it you can never find the right size Allen Key to sort it out?

So there you have it. My new Theory of the Universe. Some details need to be worked out, but it is as good an explanation of these issues as I have heard. I claim my Nobel Prize.

If anything will ever get me a trip to Sweden, this will.