Understanding Images Through Programming
Learn how to identify the type of galaxy from a given set of images and write a program to identify the type of galaxy based on its image.
Finding the spiral galaxies
One of your favorite songs plays in the background on your radio while you work in your lab. The lyrics ring out, “Cuz in a sky, cuz in a sky full of stars... I think I see you.” You’re an astrophysicist, and the song fits the task that the senior scientist has just assigned to you: study the sky full of stars and classify the spiral galaxies. A spiral galaxy has a definite bulge of old stars at its center, with spiral arms protruding from it. Here’s one:
You know how to code in Python
Being a physicist is one thing, but this is a different problem. You’ll be given an image of a galaxy, and you’ll have to tell whether the galaxy is spiral or not. That’s easy for one image. But what if there are hundreds of thousands of galaxies to classify? This is where you can put your Python coding skills to the test.
Here is the idea that comes to mind: your Python program takes one image that is definitely a spiral galaxy and compares it with an image of another galaxy. If the two images match, you classify that other galaxy as a spiral as well.
You’re very excited about your idea. However, how can you compare two images? As a problem solver, you’re confident that what you don’t know yet won’t hold you back. You decide to start by comparing two small lists. If you can write code to match two lists, it might help you figure out how to compare two images as well.
Comparing two lists
We have a list of numbers: list1 = [1, 2, 3, 4, 5]. We want to compare other lists of the same length with this one. Let’s say list2 = [1, 2, 3, 5, 4]. As a problem solver, you have the following algorithm in your head:
Do it yourself
Complete the code for the compare_lists function that matches two lists:
def compare_lists(list_a, list_b):
    """
    Compare two lists element-wise and return True if they are identical,
    False otherwise.

    Args:
        list_a (list): The first list to compare.
        list_b (list): The second list to compare.

    Returns:
        bool: True if the lists are identical, False otherwise.
    """
    # Write your code here
    pass


def main():
    # Define the main list
    main_list = [1, 2, 3, 4, 5]

    # Define the test lists
    test_list_1 = [5, 4, 3, 2, 1]
    test_list_2 = [1, 2, 3, 4, 5]

    # Compare main_list with test_list_1
    comparison_result = compare_lists(main_list, test_list_1)
    if comparison_result:
        print("test_list_1 and main_list are identical.")
    else:
        print("test_list_1 and main_list are different.")

    # Compare main_list with test_list_2
    comparison_result = compare_lists(main_list, test_list_2)
    if comparison_result:
        print("test_list_2 and main_list are identical.")
    else:
        print("test_list_2 and main_list are different.")


main()
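If you get stuck, here is one possible way to complete the exercise. It is only a sketch of the element-by-element idea, not the only valid answer: it first checks the lengths (an extra safety step that the exercise does not strictly require, since the lists are assumed to be the same length) and then stops at the first mismatch.

```python
def compare_lists(list_a, list_b):
    """Return True if both lists hold the same elements in the same order."""
    # Lists of different lengths can never be identical.
    if len(list_a) != len(list_b):
        return False

    # Walk through both lists position by position.
    for i in range(len(list_a)):
        if list_a[i] != list_b[i]:
            # The first mismatch settles the answer.
            return False

    # No mismatch was found, so the lists are identical.
    return True


main_list = [1, 2, 3, 4, 5]
print(compare_lists(main_list, [5, 4, 3, 2, 1]))  # False
print(compare_lists(main_list, [1, 2, 3, 4, 5]))  # True
```

Returning as soon as a mismatch appears means the function never looks at more elements than it needs to.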
Comparing 2D lists
So, this gives you confidence, but you know that an image is not a one-dimensional object. It’s got rows and columns. It’s a two-dimensional object. If we can compare and match 1D lists, can we also match 2D lists? If we can, we must be able to automate the classification of galaxies through image matching too!
Let’s write code for comparing two 2D lists. This can be achieved in a Pythonic way using the == operator.
def compare_2d(list_a, list_b):
    """
    Function to compare two 2D lists element-wise.

    Args:
        list_a (list): First 2D list.
        list_b (list): Second 2D list.

    Returns:
        bool: True if all elements are equal, False otherwise.
    """
    # The == operator already evaluates to True or False
    return list_a == list_b


def main():
    # Define the main list
    main_list_2d = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Define two test lists
    test_list_2d_1 = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
    test_list_2d_2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Compare the main list with the first test list
    comparison_result = compare_2d(main_list_2d, test_list_2d_1)
    if comparison_result:
        print("The two 2D lists are identical")
    else:
        print("The two 2D lists are not equal")

    # Compare the main list with the second test list
    comparison_result = compare_2d(main_list_2d, test_list_2d_2)
    if comparison_result:
        print("The two 2D lists are identical")
    else:
        print("The two 2D lists are not equal")


main()
Voilà, it works! You’re so happy that your choice to cover the basics of Python programming is paying off in your lab work.
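To see why a single == is enough for 2D lists, it may help to try it on a few small cases. The sketch below uses made-up lists (the names are for illustration only) and shows that Python compares nested lists row by row and element by element, so a single changed value or a missing row makes the result False.

```python
grid = [[1, 2, 3], [4, 5, 6]]

same_values = [[1, 2, 3], [4, 5, 6]]   # every element matches
one_changed = [[1, 2, 3], [4, 5, 7]]   # only the last element differs
missing_row = [[1, 2, 3]]              # the second row is absent

print(grid == same_values)  # True
print(grid == one_changed)  # False
print(grid == missing_row)  # False
```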
A programmer ready to find spiral galaxies
It turns out that images are just a 2D arrangement of pixels. Each pixel can be represented by a number, and hence, an image can be represented by a 2D list. Comparing two images is as easy as comparing two 2D lists!
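As a small illustration, here is a made-up 5x5 grayscale "image" written as a 2D list. The pixel values and the convention that 0 means black and 255 means white are assumptions chosen for this sketch; the images used in the lesson are much larger, but the idea is the same.

```python
# A made-up 5x5 grayscale "image": 0 is black, 255 is white (assumed convention).
tiny_image = [
    [0,   0,   0,   0, 0],
    [0,  40, 120,  40, 0],
    [0, 120, 255, 120, 0],
    [0,  40, 120,  40, 0],
    [0,   0,   0,   0, 0],
]

height = len(tiny_image)     # number of rows
width = len(tiny_image[0])   # number of pixels in each row
print(height, width)         # 5 5

# The brightest pixel sits in the middle, like the central bulge of a galaxy.
print(tiny_image[2][2])      # 255

# An exact copy of the image compares equal, just like any other 2D list.
exact_copy = [row[:] for row in tiny_image]
print(tiny_image == exact_copy)  # True
```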
Successful matching
For the proof of concept, we'll pass original_image and comparison_image to a compare_2d() function, this time written with explicit loops. We’ll use the same image for both to test whether the function works.
import matplotlib.pyplot as plt

# Note: open_image_bw() is not defined in this snippet. It is assumed to be a
# helper supplied by the lesson environment that opens an image in black and
# white mode as a 2D grid of pixel values.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Opening the two images
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('galaxy_m61.png')

# Compare the two images using the 2D list comparison
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of comparison.
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed.
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)
Yay! It works. We have solved our problem, it seems.
One star has gone missing
You show the results from your code to your lab partner, and they give you another image to test. It appears to be the same spiral galaxy, something you could easily recognize, even blindfolded. On very close inspection, you notice that the test image on the right is missing just one bright star near the lower right side. Surely your code is intelligent enough to handle that, you think to yourself.
You confidently click the "Run" button, passing it the one-less-starred-but-definitely-still-a-spiral galaxy’s image.
import matplotlib.pyplot as plt

# Note: open_image_bw() is not defined in this snippet. It is assumed to be a
# helper supplied by the lesson environment that opens an image in black and
# white mode as a 2D grid of pixel values.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Opening the two images in black and white mode for comparison.
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('m61_no_star.png')

# Compare the two 2D lists representing the images.
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of comparison.
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed.
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)
But no! One tiny star has gone missing, and your program fails to classify what is clearly a spiral galaxy! What? Was that the most important star, the one that made the galaxy a spiral? As a human, you can easily tell that it is a spiral galaxy, so why can’t the computer and your code? It makes you think.
Galaxy has gone a little dim
Maybe it was just a one-off. Let’s not be too hard on ourselves. You pick another image from the dataset. It is obviously another spiral galaxy. Actually, it is the same galaxy, just a little less bright than in the original image. You can tell, even half-blindfolded, that this is still a spiral galaxy. But can your code?
This time, you click the "Run" button, but your enthusiasm is also a bit dampened, like the galaxy's.
import matplotlib.pyplot as plt

# Note: open_image_bw() is not defined in this snippet. It is assumed to be a
# helper supplied by the lesson environment that opens an image in black and
# white mode as a 2D grid of pixel values.


def compare_2d(list_a, list_b):
    # Iterate over each row in list_a
    for i in range(len(list_a)):
        # Iterate over each element in the row
        for j in range(len(list_a[i])):
            # If any pair of elements differs, the images do not match
            if list_a[i][j] != list_b[i][j]:
                return False
    # If all elements are equal, return True
    return True


# Opening the two images in black and white mode for comparison.
original_image = open_image_bw('galaxy_m61.png')
comparison_image = open_image_bw('m61_dark.png')

# Compare the two 2D lists representing the images.
comparison_result = compare_2d(original_image, comparison_image)

# Print the result of comparison.
if comparison_result:
    print("The image contains a spiral galaxy.")
else:
    print("The image does not contain a spiral galaxy.")

# Save the two compared images so they can be displayed.
plt.imsave('./output/original.png', original_image, cmap=plt.cm.gray_r)
plt.imsave('./output/comparison.png', comparison_image, cmap=plt.cm.gray_r)
Oh no! Same result again. Your code, running on your lab’s supercomputer, fails to detect what even a child could easily spot as a spiral galaxy.
Solving a pattern-identification problem using programming
If we look closely at the problem of identifying the type of galaxy from an image, we notice that it is, in fact, a pattern-identification problem. All the galaxy images above display a similar spiral pattern, with tentacle-like structures rotating around a bright central blob of light. For us humans, naming the type of galaxy from the images above is a trivial task. However, our simple comparison program fails to identify the galaxy if the image is changed even slightly. This is because the program uses simple if-else logic to match each pixel value of the input image against the corresponding pixel value of the reference image.
When the pixel values of the input image were changed, which happened when we removed the star from the image and reduced the brightness of the image, our program got different image values, and the comparison resulted in a "Not identical" response. This is a bit of an anticlimax because images of the same galaxy captured in real-time through the
So, we have learned that identifying patterns in images and recognizing similar-looking objects comes easily to us humans. Our brain is designed to find similar-looking structures in what we see and observe, which helps us correctly identify similar objects around us, even if they look a bit different from each other. Simple logic-based computer programs, on the other hand, fail to do this because they look for exact matches instead of finding patterns.
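The failure can be reproduced on a tiny scale. The sketch below uses a made-up 4x4 "image" and a hypothetical helper, count_different_pixels(), neither of which is part of the lesson's code. It shows that removing a single bright pixel or halving the brightness is enough to make an exact comparison return False, even though the overall pattern is unchanged.

```python
def compare_2d(list_a, list_b):
    """Exact comparison: True only if every pixel matches."""
    return list_a == list_b


def count_different_pixels(image_a, image_b):
    """Hypothetical helper: count positions where two equally sized images disagree."""
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if pixel_a != pixel_b:
                count += 1
    return count


# Made-up 4x4 image with a lone bright "star" in the lower-right corner.
original = [
    [0,   0,   0,   0],
    [0, 200, 100,   0],
    [0, 100, 200,   0],
    [0,   0,   0, 255],
]

# The same image with the star removed.
no_star = [row[:] for row in original]
no_star[3][3] = 0

# The same image at half brightness (integer division keeps whole numbers).
dimmed = [[pixel // 2 for pixel in row] for row in original]

print(compare_2d(original, no_star))              # False
print(count_different_pixels(original, no_star))  # 1
print(compare_2d(original, dimmed))               # False
print(count_different_pixels(original, dimmed))   # 5
```

In the first case, just 1 of the 16 pixels differs, yet the exact match fails. In the second case, every nonzero pixel changes, although the shape drawn by those pixels stays the same.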
To help our program understand images like humans do, we first need to figure out how humans learn from their surroundings. Perhaps then, we can make our machines and computers learn, too!