Understanding Images Through Programming

Finding the spiral galaxies

One of your favorite songs plays in the background on your radio while you work in your lab. The lyrics ring out, “Cuz in a sky, cuz in a sky full of stars... I think I see you.” You’re an astrophysicist, and the song fits the task that the senior scientist recently assigned to you: study the sky full of stars and classify the spiral galaxies. A spiral galaxy has a definite bulge of old stars at its center, with spiral arms protruding out of it. Here’s one spiral galaxy, M61 (credit: ESA/Hubble & NASA, ESO, J. Lee and the PHANGS-HST Team):

An image of a spiral galaxy

You know how to code in Python

Being a physicist is one thing, but this is a different kind of problem. You’ll be given an image of a galaxy, and you’ll have to tell whether the galaxy is spiral or not. That’s easy for one image. But what if there are hundreds of thousands of galaxies to classify? This is where you can put your Python coding skills to the test.

The idea that comes to mind is this: your Python program takes one image that definitely shows a spiral galaxy and compares it with an image of another galaxy. If the two images match, you classify the other galaxy as a spiral as well.

You’re very excited about your idea. But how do you compare two images? As a problem solver, you’re confident that what you don’t know yet won’t hold you back. You decide to start by comparing two small lists. If you can write code to match two lists, it might help you figure out how to compare two images as well.

Comparing two lists

We have a list of numbers: list1 = [1, 2, 3, 4, 5]. We want to compare other lists of the same length with this one. Let’s say list2 = [1, 2, 3, 5, 4]. As a problem solver, you have the following algorithm in your head:

[Animation, 4 slides: comparing list1 and list2 element by element]
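The animation itself is not reproduced here, so the following is only a sketch of one way to read the idea: walk through both lists position by position and stop at the first place where they differ. The helper name first_mismatch is made up for this illustration and is not part of the exercise below. It also assumes both lists have the same length.

```python
def first_mismatch(list_a, list_b):
    """Return the index of the first differing position, or None if there is none."""
    # Assumption: both lists have the same length, as in this lesson.
    for index in range(len(list_a)):
        if list_a[index] != list_b[index]:
            # Stop as soon as one pair of elements differs
            return index
    # Every pair matched
    return None


list1 = [1, 2, 3, 4, 5]
list2 = [1, 2, 3, 5, 4]

mismatch_index = first_mismatch(list1, list2)
print(mismatch_index)  # 3, because list1[3] is 4 but list2[3] is 5
```

Returning the position rather than a plain True or False makes it easier to see where the two lists part ways. The exercise that follows only asks for the True or False answer.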

Do it yourself

Complete the code for the compare_lists function that matches two lists:

def compare_lists(list_a, list_b):
    """
    Compare two lists element-wise and return True if they are identical,
    False otherwise.

    Args:
        list_a (list): The first list to compare.
        list_b (list): The second list to compare.

    Returns:
        bool: True if the lists are identical, False otherwise.
    """
    # Write your code here
    pass


def main():
    # Define the main list
    main_list = [1, 2, 3, 4, 5]

    # Define the test lists
    test_list_1 = [5, 4, 3, 2, 1]
    test_list_2 = [1, 2, 3, 4, 5]

    # Compare main_list with test_list_1
    comparison_result = compare_lists(main_list, test_list_1)
    if comparison_result:
        print("test_list_1 and main_list are identical.")
    else:
        print("test_list_1 and main_list are different.")

    # Compare main_list with test_list_2
    comparison_result = compare_lists(main_list, test_list_2)
    if comparison_result:
        print("test_list_2 and main_list are identical.")
    else:
        print("test_list_2 and main_list are different.")


main()

Comparing 2D lists

So, this gives you confidence, but you know that an image is not a one-dimensional object. It’s got rows and columns. It’s a two-dimensional object. If we can compare and match 1D lists, can we also match 2D lists? If we can, we must be able to automate the classification of galaxies through image matching too!

[Animation, 10 slides: comparing two 2D lists element by element]

Let’s write code for comparing two 2D lists. The comparison above can be done in a Pythonic way with the == operator, which compares nested lists element by element.

def compare_2d(list_a, list_b):
    """
    Function to compare two 2D lists element-wise.

    Args:
        list_a (list): First 2D list.
        list_b (list): Second 2D list.

    Returns:
        bool: True if all elements are equal, False otherwise.
    """
    # == on nested lists already yields True or False
    return list_a == list_b


def main():
    # Define the main list
    main_list_2d = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Define two test lists
    test_list_2d_1 = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
    test_list_2d_2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Compare the main list with the first test list
    comparison_result = compare_2d(main_list_2d, test_list_2d_1)
    if comparison_result:
        print("The two 2D lists are identical")
    else:
        print("The two 2D lists are not equal")

    # Compare the main list with the second test list
    comparison_result = compare_2d(main_list_2d, test_list_2d_2)
    if comparison_result:
        print("The two 2D lists are identical")
    else:
        print("The two 2D lists are not equal")


main()

Voilà, it works! You’re so happy that your choice to cover the basics of Python programming is paying off in your lab work.

A programmer ready to find spiral galaxies

It turns out that an image is just a 2D arrangement of pixels. In a black-and-white image, each pixel can be represented by a single number, so the image can be represented by a 2D list. Comparing two images is then as easy as comparing two 2D lists!

[Figure, 2 slides: Image as a collection of pixels]
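To make the idea concrete, here is a minimal sketch of a made-up 3x3 black-and-white image stored as a 2D list. The pixel values are invented for illustration. The sketch assumes the common convention that 0 means black and 255 means white, which may differ from how the galaxy images in this lesson are stored.

```python
# A hypothetical 3x3 grayscale "image": one bright pixel on a black background.
# Assumption: 0 is black and 255 is white.
tiny_image = [
    [0,   0, 0],
    [0, 255, 0],
    [0,   0, 0],
]

# An independent copy of the same image (copy each row so the lists are separate)
same_image = [row[:] for row in tiny_image]

num_rows = len(tiny_image)        # number of pixel rows
num_cols = len(tiny_image[0])     # number of pixels in each row
center_pixel = tiny_image[1][1]   # row 1, column 1 holds the bright pixel

# Comparing two images is just comparing two 2D lists
images_match = tiny_image == same_image

print(num_rows, num_cols, center_pixel, images_match)  # 3 3 255 True
```

A real galaxy image works the same way. It just has far more rows and columns.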

Successful matching

For the proof of concept, we’ll use a function with the same name, compare_2d(), this time written with explicit loops over rows and columns, and pass it original_image and comparison_image. We’ll use the same image for both to test if the function works.

import matplotlib.pyplot as plt

# NOTE: open_image_bw() is not defined in this lesson. It is assumed to be a
# helper supplied by the course environment that loads an image file as a
# 2D grid of black-and-white pixel values.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Open the two images (the same file both times)
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('galaxy_m61.png')

# Compare the two images using the 2D list comparison
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of the comparison
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)

Yay! It works. We have solved our problem, it seems.

One star has gone missing

You show the results from your code to your lab partner, and they give you another image to test. It appears to be the same spiral galaxy, something you could easily recognize, even blindfolded. On very close inspection, you notice that the test image is missing just one bright star near the lower right side. Surely your code is intelligent enough to cope with that, you think to yourself.

You confidently click the "Run" button, passing it the one-less-starred-but-definitely-still-a-spiral galaxy’s image.

import matplotlib.pyplot as plt

# NOTE: open_image_bw() is assumed to be a helper supplied by the course
# environment; it is not defined in this lesson.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Open the two images in black and white mode for comparison
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('m61_no_star.png')

# Compare the two 2D lists representing the images
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of the comparison
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)

But no! One tiny star has gone missing, and your program no longer classifies the clearly spiral galaxy as a spiral! What? Was that the most important star, the one that made the galaxy spiral? You keep wondering. As a human, you can easily tell that it is a spiral galaxy, so why can’t the computer and your code? It makes you think.
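A tiny made-up example shows why a single missing star is enough to break the match. The 4x4 pixel values below are invented for illustration and are not taken from the galaxy images. The helper count_different_pixels is also hypothetical, and it assumes both images have the same size.

```python
def count_different_pixels(image_a, image_b):
    """Count the positions where two equally sized 2D lists differ."""
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if pixel_a != pixel_b:
                count += 1
    return count


# Hypothetical 4x4 image: a bright core plus one bright "star" at the lower right
original = [
    [10,  10, 10,  10],
    [10, 200, 10,  10],
    [10,  10, 10,  10],
    [10,  10, 10, 255],
]

# Copy the image, then erase the star by setting it to the background value
no_star = [row[:] for row in original]
no_star[3][3] = 10

different = count_different_pixels(original, no_star)
total = len(original) * len(original[0])
exact_match = original == no_star

print(different, total, exact_match)  # 1 16 False
```

Only 1 of the 16 pixels changed, yet the exact comparison still answers False. An all-or-nothing check has no notion of two images being almost the same.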

Galaxy has gone a little dim

Maybe it was just a one-off. Let’s not be too hard on ourselves. You pick another image from the dataset. It is obviously another spiral galaxy. Actually, it is the same galaxy, just a little less bright than in the original image. You can still tell, even half-blindfolded, that this is a spiral galaxy. But can your code?

This time, you click the "Run" button, but your enthusiasm is also a bit dampened, like the galaxy's.

import matplotlib.pyplot as plt

# NOTE: open_image_bw() is assumed to be a helper supplied by the course
# environment; it is not defined in this lesson.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Open the two images in black and white mode for comparison
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('m61_dark.png')

# Compare the two 2D lists representing the images
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of the comparison
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)

Oh no! Same result again. Your code on your lab’s supercomputer failed to detect what even a child would be easily able to spot as a spiral galaxy.
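Here is a small sketch of what dimming does to the numbers. The 3x3 pixel values are invented for illustration, and dimming is assumed to mean scaling every pixel to 90% of its value. The actual dimmed image in this lesson may have been produced differently. The helper brightest_position is hypothetical as well.

```python
def brightest_position(image):
    """Return (row, column) of the largest value in a 2D list."""
    best_row, best_col = 0, 0
    for i, row in enumerate(image):
        for j, pixel in enumerate(row):
            if pixel > image[best_row][best_col]:
                best_row, best_col = i, j
    return best_row, best_col


# Hypothetical 3x3 image with a bright core in the middle
original = [
    [10,  50, 10],
    [50, 200, 50],
    [10,  50, 10],
]

# Assumption: dimming scales every pixel to 90% (integer arithmetic)
dimmed = [[pixel * 9 // 10 for pixel in row] for row in original]

exact_match = original == dimmed
same_core = brightest_position(original) == brightest_position(dimmed)

print(dimmed)       # [[9, 45, 9], [45, 180, 45], [9, 45, 9]]
print(exact_match)  # False
print(same_core)    # True
```

Every single number changed, so the exact comparison fails. The overall pattern, a bright core surrounded by fainter pixels, is still there. That pattern is what a human eye picks up and what the pixel-by-pixel check ignores.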

Solving a pattern-identification problem using programming

If we look closely at the problem of identifying the type of galaxy from an image, we notice that it is, in fact, a pattern-identification problem. All the galaxy images above display a similar spiral pattern, with tentacle-like structures rotating around a bright central blob of light. For us humans, naming the type of galaxy from the images above is a trivial task. However, our simple comparison program fails to identify the galaxy if the image is changed even slightly. This is because the program uses simple if-else logic to match the pixel values of the input image against those of the one image already stored in its memory.

When the pixel values of the input image changed, which happened when we removed the star from the image and when we reduced its brightness, our program saw different values, and the comparison returned False. This is a bit of an anticlimax, because images of the same galaxy captured at different times by the Hubble telescope (a space telescope that NASA launched into low Earth orbit in 1990; it is still operational and continues to supply us with fascinating stellar and galactic images) can also have slight variations in color and position.

So we have learned that identifying patterns in images and recognizing similar-looking objects comes easily to us humans. Our brain is designed to find similar-looking structures in what we see and observe, which helps us correctly identify similar objects around us, even if they look a bit different from each other. Simple logic-based computer programs, on the other hand, fail to do this because they look for exact matches instead of finding patterns.

To help our program understand images like humans do, we first need to figure out how humans learn from their surroundings. Perhaps then, we can make our machines and computers learn, too!