Yesterday, CERN scientists announced that they had "clear signs of a new particle," after a lengthy search for confirmation of the Higgs boson. Although scientists don't know how all of the properties of this new particle will match up with theoretical predictions, everyone's excited about this discovery.

Researchers from both the CMS and ATLAS experiments emphasized that these were "5-sigma signals," at 125 and 126 GeV for CMS and ATLAS, respectively. But what does a 5-sigma result mean, and why do particle physicists use this as a benchmark for discoveries?

To answer these questions, we'll have to look at one of the statistician's oldest friends and C-student's worst enemies: the normal distribution or bell curve.

The normal distribution is the most familiar class of statistical distributions, and it consistently rears its head in a number of sciences including physics. When scientists announce 3-sigma (or 4-sigma or 5-sigma) results, they're talking about standard deviations from an expected value.

As you can see in the graph above, the normal distribution takes the shape of a bell that quickly tapers off. The curve has a mean (or average) value at the center, and as you move away from the center, the likelihood of seeing these outer values decreases. The standard deviation, or sigma, measures how widely values are spread around the mean, and so how unlikely a value far from the center is.

There's an approximately 68 percent chance that a value will fall within one standard deviation of the mean (1-sigma); the darkest shaded blue section in the graph above represents this first range. Here are the corresponding approximate percentages for the next sigmas:

**2-sigma:** 95.5 percent

**3-sigma:** 99.73 percent

**4-sigma:** 99.993 percent

**5-sigma:** 99.99994 percent

That means purely statistical fluctuations will give you a result at least 5 sigma away from the mean, in either direction, only about 0.00006 percent of the time.
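These percentages can be checked with a few lines of Python. The sketch below assumes an ideal normal distribution and uses the standard identity that the probability of landing within k standard deviations of the mean is erf(k/√2); the function name `coverage_within` is just an illustrative choice.

```python
import math

def coverage_within(k_sigma: float) -> float:
    """Probability that a normally distributed value falls within
    k_sigma standard deviations of the mean (both sides counted)."""
    return math.erf(k_sigma / math.sqrt(2))

for k in range(1, 6):
    inside = coverage_within(k)
    outside = 1 - inside
    print(f"{k}-sigma: {100 * inside:.5f} percent inside, "
          f"{100 * outside:.5f} percent outside")
```

The "outside" column is the two-sided chance of a fluctuation at least that large in either direction.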

When physicists announce that they have a 5-sigma result, that means that a statistical fluctuation alone would produce a signal at least that strong only about once in 3.5 million tries (counting only upward fluctuations, since they are hunting for an excess). Particle physicists working on the CMS and ATLAS experiments are looking for "bumps" in their data that stand out from the background. When these bumps reach the 5-sigma level, they have very good reason to believe that they've discovered or observed a new particle.
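The 1 in 3.5 million figure can be reproduced from the one-sided tail of the normal distribution, which counts only fluctuations above the expected value. This is a minimal sketch under that assumption; `one_sided_p_value` is a hypothetical helper name, not anything from the CMS or ATLAS analyses.

```python
import math

def one_sided_p_value(k_sigma: float) -> float:
    """Chance that a normally distributed value lands at least
    k_sigma standard deviations ABOVE the mean (upper tail only)."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2))

for k in (3, 5):
    p = one_sided_p_value(k)
    print(f"{k}-sigma: p = {p:.3g}, roughly 1 in {1 / p:,.0f}")
```

Counting both tails instead doubles the probability, which is where the 0.00006 percent figure comes from.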

In fact, physicists have relied on this benchmark for years to assign varying degrees of confidence to their results. For publication in *Physical Review Letters*, for instance, 5-sigma results are generally called "observations" or "discoveries," according to Jack Sandweiss, an editor of the journal and physicist at Yale University.

Sandweiss added that 3-sigma results usually warrant calling the result "evidence" of a new particle, but scientists need more statistical certainty to include terms like "discovery" in a technical paper. Previously, scientists may have placed more weight on 3-sigma results, but many promising ones have turned out to be nothing more than statistical fluctuations: once more data was collected, they disappeared into the background.
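One rough way to see why 3-sigma bumps so often fade is to ask how likely at least one fluke becomes when many searches are run. The sketch below assumes the searches are independent and picks a purely hypothetical count of 500; neither number nor function name comes from a real analysis.

```python
import math

def one_sided_p_value(k_sigma: float) -> float:
    """Upper-tail probability of a normal distribution beyond k_sigma."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2))

def chance_of_any_fluke(k_sigma: float, n_searches: int) -> float:
    """Chance that at least one of n independent background-only searches
    fluctuates up by k_sigma or more: 1 - (1 - p)^n."""
    p = one_sided_p_value(k_sigma)
    # expm1/log1p keep the arithmetic accurate when p is tiny.
    return -math.expm1(n_searches * math.log1p(-p))

N_SEARCHES = 500  # hypothetical number of independent searches

for k in (3, 5):
    chance = chance_of_any_fluke(k, N_SEARCHES)
    print(f"At least one {k}-sigma fluke in {N_SEARCHES} searches: {chance:.4g}")
```

Under these assumptions a 3-sigma fluke somewhere is close to a coin flip, while a 5-sigma fluke remains very unlikely.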

So do the 5-sigma Higgs-like particle results imply that scientists are 99.99994 percent sure that they've found the Higgs boson? Not exactly.

**More than Sigmas**

Although statistical significance can be a good guideline for many physics experiments, scientists can't base their results solely on these benchmarks. In fact, other errors can creep into the data and contaminate entire datasets, even very promising ones.

Remember when neutrinos were supposedly traveling faster than light late last year? That result reached a 6-sigma level of confidence, even higher than the 5-sigma convention required for new particle discoveries. But we learned earlier this year that neutrinos do indeed obey the universal speed limit, so what went wrong?

Most crucially, the faster-than-light neutrino experiment suffered from a systematic error that affected all of the data; faulty cables consistently gave the researchers bad readings. No matter how many times physicists repeated the experiments, they would get the same inaccurate results.

This situation is akin to measuring someone's height with a meter stick that is several inches longer than it should be. Even if you take hundreds of measurements and average all of the tiny human errors and approximations, you'll never avoid the fact that your meter stick is giving you consistently bad results.
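A quick simulation makes the point concrete. This is only a sketch with made-up numbers: a true height of 175 cm, a ruler that reads a constant 5 cm off, and random reading noise of 1 cm. All names and values here are assumptions chosen for illustration.

```python
import random
import statistics

def average_measurement(true_height_cm, offset_cm, noise_cm, n, seed=0):
    """Average n simulated readings, each with the same systematic offset
    plus independent random (Gaussian) noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    readings = [true_height_cm + offset_cm + rng.gauss(0, noise_cm)
                for _ in range(n)]
    return statistics.fmean(readings)

TRUE_HEIGHT_CM = 175.0  # hypothetical true value
OFFSET_CM = 5.0         # hypothetical systematic error of the ruler
NOISE_CM = 1.0          # hypothetical random error per reading

for n in (10, 1_000, 100_000):
    avg = average_measurement(TRUE_HEIGHT_CM, OFFSET_CM, NOISE_CM, n)
    print(f"{n:>7} readings: average {avg:.3f} cm, "
          f"error {avg - TRUE_HEIGHT_CM:+.3f} cm")
```

More readings shrink the random scatter, but the average settles on the offset value rather than the true height.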

So how do scientists make sure they avoid this problem when statistical analyses can't account for it? Part of the answer is using independent experiments, like CMS and ATLAS, because systematic errors are less likely to affect experiments with different designs.

This is part of the reason why scientists are so excited about the recent results. Scientists are seeing not only very high sigma bumps in the data but also similar bumps from two independent experiments.

For now, plenty of evidence has piled up to support the discovery of a Higgs-like particle. Next, scientists have to look at the data more closely to uncover this new particle's properties. Maybe they'll find something entirely unexpected.

-------------------------------------------------------------------------------------

If you want to keep up with Hyperspace, AKA Brian, you can follow him on Twitter.

Even if the experiment is conducted perfectly, there is no inference from 5-sigma to 99.99994 percent confidence. 5-sigma gives you the probability of getting certain data on the assumption that the null hypothesis is true (that it is just a statistical aberration), not the probability of the experimental hypothesis being true.

I understand statistics, but I don't see how this graph could possibly show a five sigma result.

http://blogs.discovermagazine.com/badastronomy/2012/07/04/higgs/

If the black data points are supposed to be on the red curve, then the green bound cannot possibly be two sigmas; the points have much too much average deviation for that.

What they label as 1 sigma and 2 sigma is the uncertainty on the expected background.

You can say that to get this many signal events, you would need a 5-sigma deviation at that point.

Now, I think it should be OK. Also, note that getting a 2-sigma fluctuation is very easy, but a 5-sigma one would be really bad luck, as mentioned in the text.

Daaaaaaaaaaaayuuuuuuuuuuuuuuuuum... BIIIIIIIIIITCHED™!

@Eric C. - You're correct, however the "certain data" you're talking about is key, because that data gives them a probability percentage that the null hypothesis is correct.

So let's say that they get a probability of, oh, I don't know, 0.00006 that the null hypothesis (this is not a higgs boson) is true...wouldn't that indicate that they are 99.99994% sure that the null hypothesis is invalid, and that they have in fact discovered (in this case) the higgs boson?

<~ Actually studying physics, instead of just reading wikipedia and spouting it back onto the internet.

friend, instead of studying physics, best you study some statistics. your description of a p-value as the probability of the null hypothesis being correct is a profound error, and one i doubt any statistics textbook in the world fails to take time and effort to dispel.

Does anyone know exactly how many two-photon events were seen?

What is the current Sigma?
