
Data Mining for Dark Matter

They say that to see is to believe. Nowhere is that more true than for astronomers who, despite their ongoing attempts to understand what dark matter is, have witnessed countless examples of the elusive stuff in action.

What astronomers have observed so far is that the only way in which dark matter reveals itself is through its gravitational influence on visible matter. Therefore, to better understand dark matter, astronomers are in the process of mapping where in our universe it disrupts the visible matter around it.

Considering that dark matter accounts for about 26 percent of the mass and energy in our universe, there are quite a few locations where it leaves its gravitational mark – too many for an individual to count.

Computers can make quick work of large data sets, but some of the tracking software for spotting evidence of dark matter is not as efficient as astronomers might like and is by no means ready for the deluge of data that will come from telescopes like the Large Synoptic Survey Telescope. In an attempt to solve this problem, astronomer David Harvey approached a company called Kaggle.

The long streaks of light stretched across the photo are images of galaxies whose light has been bent by gravity. The process of gravity bending light and distorting how distant objects appear from Earth is called gravitational lensing. Dark matter is one of the causes of gravitational lensing. Credit: NASA Goddard Space Flight Center

Kaggle deals in data mining and crowdsourcing. It sets up competitions that dare data scientists across the globe to design better software than what companies like Allstate, Ford, Facebook, GE and NASA currently use to analyze certain data sets.

Kaggle works like this: a company approaches it with a data set and instructions on what the company wants to extract from that data. Kaggle then uploads the data and instructions to its website, where anyone can download them.

Kaggle awards a cash prize, which can reach tens of thousands of dollars, to whoever develops a model that beats the current software. In a way, Kaggle competitions are similar to Galaxy Zoo in that they use crowdsourcing. But in other ways they are different, explained Harvey, who is a postgraduate student at the University of Edinburgh in Edinburgh, UK.

“It’s like citizen crowdsourcing science but it’s with experts,” Harvey said. “Not…everyone can do it, it requires computer literature to do it. The person who won it had a PhD in physics and the person who was second was an expert in computer science.”

Harvey, with help from colleagues in the US, Netherlands and Portugal, launched the Kaggle competition “Observing Dark Worlds” in October 2012. The competitor who submitted the best overall code was handsomely rewarded with a $20,000 prize.

Competitors mined through synthetic images of galaxy clusters that Harvey had simulated. The images were not just of galaxy clusters, however. They included the gravitational footprints of dark matter halos.

Dark matter halos are large collections of dark matter that astronomers think encompass galaxies and galaxy clusters -- like a bubble surrounding a pocket of air. If light from a distant object passes too close to a galaxy or cluster of galaxies, the gravitational tug from the dark matter halo surrounding that galaxy or cluster (as well as the visible matter) will bend the light’s path.

This process is called gravitational lensing and distorts images of objects as they appear from Earth. A similar trick of the eye occurs when light bends, or refracts, upon entering water, causing submerged objects to look larger.
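For a rough sense of how strongly gravity bends light, the sketch below uses the standard point-mass deflection formula from general relativity, alpha = 4GM / (c^2 b), where b is the closest distance between the light ray and the mass. Real dark matter halos are spread out rather than point-like, so this is only an order-of-magnitude guide. The function name and the cluster-scale mass and distance are assumptions chosen for illustration and do not come from Harvey's work.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
M_SUN = 1.989e30         # solar mass, kg
R_SUN = 6.957e8          # solar radius, m
KPC = 3.0857e19          # one kiloparsec, m
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def deflection_arcsec(mass_kg, impact_parameter_m):
    """Bending angle of light passing a point mass, in arcseconds.

    Uses alpha = 4 G M / (c^2 b), valid for small angles.
    """
    alpha_rad = 4.0 * G * mass_kg / (C**2 * impact_parameter_m)
    return alpha_rad * RAD_TO_ARCSEC


# Sanity check: light grazing the edge of the Sun.
sun = deflection_arcsec(M_SUN, R_SUN)

# Illustrative (assumed) cluster: 1e14 solar masses, ray passing at 100 kpc.
cluster = deflection_arcsec(1e14 * M_SUN, 100 * KPC)

print(f"Sun: {sun:.2f} arcsec, toy cluster: {cluster:.1f} arcsec")
```

The Sun's value comes out near the classic 1.75 arcseconds, while the toy cluster bends light by tens of arcseconds, which is why massive clusters can visibly stretch the images of background galaxies.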

Examples of images in the Kaggle competition "Observing Dark Worlds." Captions and images courtesy of David Harvey of the University of Edinburgh.

Sometimes, gravitational lensing will make galaxies appear more elliptical than they actually are, and it is these distortions that Kaggle competitors were after. Their goal was to develop an algorithm that could reconstruct the positions of dark matter halos from simulated images of clusters of galaxies. A scoring system helped competitors monitor the accuracy and progress of their models.
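To make the idea of recovering a halo position from galaxy shapes more concrete, here is a minimal toy sketch. It is not any competitor's algorithm and not how LENSTOOL works. It assumes each galaxy has a position and two ellipticity components (e1, e2), and it scans a grid of candidate positions looking for the one around which the galaxies are, on average, most stretched tangentially. The field size, the 1/r distortion profile, the noise level and all function names are assumptions made up for this example.

```python
import numpy as np


def tangential_ellipticity(x, y, e1, e2, hx, hy):
    """Part of each galaxy's ellipticity that is tangential to (hx, hy)."""
    phi = np.arctan2(y - hy, x - hx)
    return -(e1 * np.cos(2 * phi) + e2 * np.sin(2 * phi))


def find_halo(x, y, e1, e2, size=4200.0, n_bins=21):
    """Grid search: return the cell centre with the largest mean tangential signal."""
    centers = (np.arange(n_bins) + 0.5) * size / n_bins
    best_signal, best_pos = -np.inf, None
    for hx in centers:
        for hy in centers:
            signal = tangential_ellipticity(x, y, e1, e2, hx, hy).mean()
            if signal > best_signal:
                best_signal, best_pos = signal, (float(hx), float(hy))
    return best_pos


def simulate_sky(halo=(1500.0, 2700.0), n_gal=500, size=4200.0,
                 noise=0.2, seed=0):
    """Toy sky: random galaxies, tangentially stretched around one halo."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, size, n_gal)
    y = rng.uniform(0, size, n_gal)
    dx, dy = x - halo[0], y - halo[1]
    phi = np.arctan2(dy, dx)
    # Assumed toy profile: distortion falls off as 1/r and is capped at 0.5.
    strength = 60.0 / np.maximum(np.hypot(dx, dy), 120.0)
    # Intrinsic galaxy shapes act as random noise on top of the lensing signal.
    e1 = -strength * np.cos(2 * phi) + rng.normal(0, noise, n_gal)
    e2 = -strength * np.sin(2 * phi) + rng.normal(0, noise, n_gal)
    return x, y, e1, e2


if __name__ == "__main__":
    sky = simulate_sky()
    print("estimated halo position:", find_halo(*sky))
```

With the intrinsic shape noise switched off (`noise=0.0`), the search lands on the grid cell containing the true halo. With realistic noise the estimate wanders, which hints at why the real problem was hard enough to merit a competition.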

The two-month competition, which was popular compared with other Kaggle challenges, attracted 357 competitors, who worked from synthetic images of 120 simulated galaxy clusters. The top three algorithms, which Harvey and his colleagues discuss in their recent publication in the journal Astronomy and Computing, improved upon one of the benchmark gravitational lensing programs, LENSTOOL, by 30 percent.

Harvey said that although the top codes that came from the Kaggle competition are an improvement, they are not yet ready to replace tools like LENSTOOL. One reason is that the codes are based on simulated images instead of the real thing. Moreover, the codes sometimes achieve the desired end result but through a process that would not necessarily work on real data.

“The extra work between the end of a competition and getting algorithms through takes lots of time and effort and collaboration… I’m not sure if the algorithm will turn into something,” said Harvey, who explained that the contacts he made through the competition were as important as the code he and his colleagues gained. “It’s about getting contacts as well as algorithms,” he said.

Harvey recently finished another crowdsourcing project with Galaxy Zoo, the data from which he hopes to use to develop algorithms that classify galaxies.


  1. Dark Matter is nothing but aether which is proved to be existing in the published paper "Michelson-Morley Experiment: A Misconceived & Misinterpreted Experiment" available at journal site Besides Big Bang Theory including Cosmic Inflation has been mathematically, theoretically & experimentally proved as baseless in the published paper "Experimental & Theoretical Evidences of Fallacy of Space-time Concept and Actual State of Existence of the Physical Universe" which is available at the journal site at
    Here is the beginning of revolution in physics. Physics, according to which God cannot exist, shall have to be soon discarded. Read
    Absence of Black Holes means Stephen Hawking has finally accepted that there are serious problems with both Newton's perspective of Gravity & Einstein's General Theory of Relativity because both require Black Holes at the center of the galaxies.
    This justifies standing open challenge to the adopted paradigm of physics which is at

  2. Incorrect, as Hawking does not actually dismiss black holes, but rather reevaluates the event horizon.

    As for your other reference, the Indian Journal of Science and Technology is not considered a reputable journal.

  3. An interesting thing can be seen by watching the collision of galaxy clusters known as the Bullet Cluster. Maybe this is proof that dark matter exists. Within the cluster there are galaxies and gas. During the collision, the gas slowed more than the galaxies. Measurements showed that large amounts of mass that should have fallen behind with the gas continued ahead with the galaxies.

