Exploring the Universe: U of T Students’ Journey with the Galaxy Zoo Project

August 7, 2024 by Kal Romain

The University of Toronto's involvement in the groundbreaking Galaxy Zoo project offers a unique opportunity for undergraduate students to delve into the world of cutting-edge astronomical research. This collaborative effort, involving the Euclid space satellite, aims to map and classify millions of galaxies, and our students, Khalid Edris and Junbo Li, are at the forefront of this exciting venture.

Project Overview and Student Involvement

Khalid and Junbo were introduced to the project by Mike Walmsley, a postdoctoral fellow at the Dunlap Institute for Astronomy and Astrophysics, and Professor Josh Speagle, a cross-appointed researcher with the Department of Statistical Sciences. The Euclid telescope has captured millions of galaxy images, far too many to annotate by hand. Instead, volunteers will annotate around 100,000 galaxies, and those annotations will train an AI model to label the remaining millions.

Khalid explained, "Our job is to figure out which 100,000 of these galaxies the volunteers should label to optimize the machine-learning process when labelling the remaining Euclid galaxies. We have been working on a variety of different active learning acquisition functions to determine which one performs the best in similar conditions."
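To illustrate what an acquisition function does, here is a minimal sketch of one common choice, maximum predictive entropy, which picks the galaxies the model is least sure about. It is not necessarily one of the functions the students tested; the function names, galaxy ids, class labels, and probabilities are all hypothetical.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one galaxy's predicted class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labelling(predictions, budget):
    """Return the ids of the `budget` galaxies with the highest entropy."""
    ranked = sorted(
        predictions,
        key=lambda gid: predictive_entropy(predictions[gid]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical model outputs: P(smooth), P(disk), P(artifact) per galaxy.
predictions = {
    "galaxy_a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "galaxy_b": [0.40, 0.35, 0.25],  # very uncertain, high entropy
    "galaxy_c": [0.70, 0.25, 0.05],  # somewhat uncertain
}

chosen = select_for_labelling(predictions, budget=2)
print(chosen)  # ['galaxy_b', 'galaxy_c']
```

With a budget of two, the sketch sends the two most ambiguous galaxies to volunteers and leaves the confident one to the model. Other acquisition functions rank the same pool by different scores.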

Working on this project has significantly enhanced the academic experience for both students. Junbo shared, "It’s given us a chance to apply the theoretical knowledge we've learned in class to real-world situations. Before this, we spent a lot of time on theory and sometimes wondered, ‘Why are we learning this? What’s the point?’ But now, we’re using what we’ve learned to solve real problems."

Support and Resources at U of T

Both students highlighted the supportive research culture they found while pursuing an undergraduate research project. Khalid mentioned, "It is very easy to ask for help from your supervisors who are always online and respond right away. In biweekly meetings with other NSERC students, everyone is friendly and wants to learn more about your projects, making the environment very welcoming."

Junbo added, "Since I am an undergraduate statistics student, I was introduced to this opportunity through an email from the Department of Statistical Sciences. Otherwise, I would recommend going to or emailing professors you might know and asking them if there are any opportunities they might have for you." More information on summer undergraduate research opportunities can be found on the Department of Statistical Sciences website.

Impact on Future Careers

The experience has opened new avenues for both students. Khalid shared, "This was the first time that either of us was involved in any machine learning work before. So completing this project has not only opened our eyes to this field but also helped shape our future. I am now planning to continue my studies in statistics but now with a focus on machine learning." Junbo echoed the sentiment: "I am now looking forward to graduate opportunities that are more involved with machine learning."

Galaxy Zoo Project

Thanks to a new Galaxy Zoo project launched earlier this month, you can help identify the shapes of thousands of galaxies in images taken by the Euclid telescope. These classifications will help astronomers answer questions about how the shapes of galaxies have changed over time, and what caused these changes.

What is Galaxy Zoo?

The Galaxy Zoo project was first launched in 2007 and asked members of the public to help classify the shapes of a million galaxies from images taken by the Sloan Digital Sky Survey. In the past 17 years, more than 400,000 people have classified images from telescopes including the Hubble Space Telescope and the James Webb Space Telescope. These classifications revolutionized astrophysicists' ideas about how galaxies evolve. The classifications are useful not only for their immediate scientific potential but also as a training set for machine-learning algorithms. Without being taught what to look for by humans, AI algorithms struggle to classify galaxies, but together, humans and AI can accurately classify limitless numbers of galaxies.

What is the Euclid Space Telescope?

The European Space Agency’s (ESA) Euclid space telescope launched in July 2023 and has begun its survey of the sky. Splitting the sky up into chunks, Euclid aims to image each chunk and stitch the images into a mosaic, producing the most detailed map of the Universe ever obtained. Euclid has been designed to look at a much larger region of the sky than the Hubble Space Telescope or the James Webb Space Telescope, meaning it can capture a wide range of different objects all in the same image – from faint to bright, from distant to nearby, from the most massive of galaxy clusters to the smallest nearby stars. With Euclid, we will get both a very detailed and very wide view (more than one-third of the sky) all at once. The 2600+ members of the Euclid Collaboration aim to use this data to constrain the nature of dark matter and dark energy, and to answer fundamental questions on how galaxies form and evolve.

In November 2023 and May 2024, the world got its first glimpse of the quality of Euclid’s images with Euclid’s Early Release Observations, which targeted a variety of astronomical objects, from nearby nebulae to distant clusters of galaxies. Galaxy Zoo: Euclid is the first chance for the public to see images from Euclid’s main survey. Euclid captures so many images (the Galaxy Zoo team selected and prepared 820,000 through the ESA Datalabs digital platform) that almost every image will have never been seen before. The scientists hope that among these images are new and strange discoveries waiting to be found.

Photo credit: ESA–S. Corvaja

A big data problem: AI and humans working together

Euclid is set to send back 100 GB of data per day for six years. It would take a century for humans alone to label that much data. So the team at Galaxy Zoo have developed an AI algorithm, called Zoobot, which learns from Galaxy Zoo volunteers to predict what they would say for new galaxies. After being trained on human answers, Zoobot will provide detailed classifications for hundreds of millions of galaxies, creating the largest detailed galaxy catalogue to date and enabling groundbreaking scientific analysis on topics like supermassive black holes, merging galaxies, and more.
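For a sense of scale, the short calculation below turns the quoted rate into a rough mission total. It is a back-of-the-envelope sketch that assumes a constant daily rate, 365-day years, and decimal terabytes; the variable names are illustrative.

```python
GB_PER_DAY = 100     # daily data rate quoted above
DAYS_PER_YEAR = 365  # assumption: ignores leap days and any downtime
MISSION_YEARS = 6    # survey duration quoted above

total_gb = GB_PER_DAY * DAYS_PER_YEAR * MISSION_YEARS
total_tb = total_gb / 1000  # decimal terabytes (1 TB = 1000 GB)

print(f"{total_gb:,} GB is about {total_tb:.0f} TB")  # 219,000 GB is about 219 TB
```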

Zoobot will sift through the Euclid images first in order to classify the “easier” galaxies, for which we already have many examples from previous telescopes. However, when Zoobot is not confident in its classification of a galaxy, perhaps because the galaxy is unusual, it will send that image to volunteers on Galaxy Zoo to get their human classifications, which then help Zoobot to learn more. The Galaxy Zoo: Euclid project will therefore see AI and humans working together to learn more about our Universe.
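The division of labour described above can be sketched as a simple confidence threshold. This is an illustration only, not Zoobot's actual interface or decision rule; the function name, the 0.8 threshold, and the example probabilities are all assumptions.

```python
def route_galaxies(predictions, threshold=0.8):
    """Split galaxies into machine-labelled and volunteer-bound groups.

    `predictions` maps a galaxy id to a dict of class probabilities.
    The threshold value is a hypothetical choice for this sketch.
    """
    automatic = {}       # galaxy id -> label accepted from the model
    for_volunteers = []  # galaxy ids the model is unsure about
    for galaxy_id, probs in predictions.items():
        best_label = max(probs, key=probs.get)
        if probs[best_label] >= threshold:
            automatic[galaxy_id] = best_label
        else:
            for_volunteers.append(galaxy_id)
    return automatic, for_volunteers

# Hypothetical model outputs for two galaxies.
predictions = {
    "galaxy_a": {"spiral": 0.95, "smooth": 0.05},  # confident
    "galaxy_b": {"spiral": 0.55, "smooth": 0.45},  # ambiguous
}

automatic, for_volunteers = route_galaxies(predictions)
print(automatic)       # {'galaxy_a': 'spiral'}
print(for_volunteers)  # ['galaxy_b']
```

In a loop like the one the article describes, the volunteer answers for the ambiguous galaxies would be added to the training set before the model is retrained.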

Caption: A small selection of the 820,000 galaxy images taken by the Euclid Space Telescope now showing on Galaxy Zoo (galaxyzoo.org) for volunteers to classify. Galaxy Zoo volunteers will be the first people to see these images.

Credit: ESA/Euclid/Euclid Consortium/NASA.

Caption: This incredible snapshot from Euclid is a revolution for astronomy. The image shows 1,000 galaxies belonging to the Perseus Cluster, and more than 100,000 additional galaxies farther away in the background, each containing up to hundreds of billions of stars. Many of these faint galaxies were previously unseen. Some of them are so distant that their light has taken 10 billion years to reach us. The Galaxy Zoo: Euclid project asks members of the public to classify the shapes of these galaxies in Euclid images such as this one. By mapping the distribution and shapes of these galaxies, astrophysicists will be able to find out more about the processes that shaped the Universe that we see today.

Original image credit: ESA/Euclid/Euclid Consortium/NASA, image processing by J.-C. Cuillandre (CEA Paris-Saclay), G. Anselmi. Zoom rectangle added by Euclid ECEPO team.