How to do Merge Sort in Haskell
Overview
Haskell is a functional programming language. It is a programming paradigm where we change data through expressions that don’t have side effects. These expressions are often called pure functions. Let’s see how we can implement one of the most popular sorting algorithms, Merge Sort, in Haskell.
Algorithm
Merge Sort uses the divide and conquer technique for sorting. We divide the input list into smaller lists and solve the sorting problem for each of them. Then, we combine the results to form the final solution. We use the following algorithm:
- Divide the input list into two halves until we reach the base case (list of size 1).
- Conquer by trying to sort the smaller lists when the base has reached.
- Combine the two sorted lists into one.
Program
Let’s translate the algorithm above into a Haskell program:
-- Divide a list into two halves (even and odd positions)divide :: [Int] -> ([Int], [Int])divide = \list ->case list of[] -> ([], [])x:xs ->let (odds, evens) = divide xsin (x:evens, odds)-- Merge two lists in a sorted fashionmerge :: [Int] -> [Int] -> [Int]merge = \s1 -> \s2 ->case s1 of[] -> s2x:xs ->case s2 of[] -> s1y:ys | x>y -> y:merge s1 ys_ -> x:merge xs s2-- Sort the list using the merge sort algorithmmergesort :: [Int] -> [Int]mergesort = \list ->case list of[] -> [][x] -> [x]_ ->let (evens, odds) = divide listin merge (mergesort evens) (mergesort odds)-- main = print (mergesort [4,4,1,8,9,2])main = print (merge [1,4] []) -- Testing merge
Explanation
The divide function
The divide function divides a list into two halves based on odd and even index positions.
- Line 2:
divide :: [Int] -> ([Int], [Int])is the type annotation for our function calleddivide. Haskell is statically typed, and every function or variable has a type. In this case,divideis a function that takes a list of theInttype and returns a tuple of lists (typeInt). - Line 3: We have the function definition, which shows that our function takes a list. It also takes a
caseexpression on the list that implements the recursive formulation of the problem. - Line 5: The first equation shows that if there is any empty list, then return a tuple of empty lists is returned.
- Line 6: The second equation implements the general case.
x:xsis a syntax in Haskell that neatly provides us with the head of the list,x, and the remainder listxs. We do a recursive call of thedividefunction on the listxs. We can safely assume it correctly provides the solution inoddsandevens. For the complete listx:xs, we attach the headxwithevensand return the tuple(evens, odds)for the final solution.
The merge function
The merge function solves our problem’s ‘conquer’ step. It merges two lists into one such that the returned list is sorted.
-
Line 11:
merge :: [Int] -> [Int] -> [Int]is the type annotation for the functionmerge. The type annotation shows that our function takes two lists of theInttype as arguments and returns a list of theInttype (the merged list). -
Line 12: We have the function definition, which shows that our functions takes two lists,
s1s2. This is followed by a nestedcaseexpression on these lists that implements the problem’s recursive formulation. -
Line 14: The first equation tells that if the first list is empty, we return the other list,
s2. This is because there’s nothing to merge. -
Line 17: We have the general case where the first list is not empty. This is followed by a
caseexpression on the second list,s2. If the second list is empty, the first lists1is returned. For the cases where both lists are not empty, we compare the head of both listsxandy, respectively. Ifxis greater thany, thenyshould be the head of the merged list. We call themergefunction again on lists1andys(the remainder of the second list) for the remaining elements. -
The last equation represents the flip case which follows the same idea as above.
xis the head of the merged list. We call themergefunction on the listss2andxs(the remainder of the second list) for the remaining elements.
Note: In this explanation, we avoid visualizing what happens during the program execution of a recursive function. Instead, we treat the recursive function as a black box to solve minor valid problems.
Visualizing the recursive workflow
The Merge Sort function
Let’s put all the pieces together in the final function.
-
Line 22:
mergesort :: [Int] -> [Int]is the type annotation for themergesortfunction. The type annotation shows that our function takes a list of theInttype as an argument. It also returns a list of theInttype (the sorted list). -
Line 23: We have the function definition that shows that our function takes a list. This is followed by a
caseexpression on the list that implements the problem’s recursive formulation. The first two equations implement simple cases. If the input list is empty, return an empty list. If a list contains a single element, the same list is returned. -
Line 27: The final equation represents the general solution for all lists whose size is greater than 1. We call our
dividefunction, which divides the list into two halves. We call ourmergesortfunction recursively on each list. We can safely assume that it correctly sorts each individual list. Finally, we call ourmergefunction that combines the two sorted lists into a single list and returns it.