How to remove duplicates by recursion in a Python list
It is quite common for Python lists to have duplicates. For some particular use cases, a user might need to remove them.
The illustration below shows the visual representation of how to remove duplicates from the list.
We can easily remove duplicates through recursion, whether it be a list of strings, integers, floating-point numbers, or any other data type. In this shot, we will remove duplicates through recursion.
Code
def remove_duplicates_recursion (dupList, temp):if len(dupList) == 0: #condition 0 --> base casereturn dupListif dupList[0] in temp: #condition 1temp.append(dupList[0])return remove_duplicates_recursion(dupList[1:], temp)else: #condition 2temp.append(dupList[0])return [dupList[0]] + remove_duplicates_recursion(dupList[1:], temp)def main():nums = [1,1,2,2]strings = ['hello', 'world', 'hello', 'world', 'hi']emptyList = []print("number list with duplicates", nums)print("number list without duplicates", remove_duplicates_recursion(nums, emptyList))print("strings list with duplicates", strings)print("string list without duplicates", remove_duplicates_recursion(strings, emptyList))main()
Explanation
The Python file below includes a function named remove_duplicates_recursion which takes two arguments: dupList and temp. dupList is the list possibly containing duplicate elements, and temp is initially an empty list.
The function first checks if the length of dupList is 1 or 0. If so, it would trivially imply that there is no possibility of any duplicates occurring.
Then it checks whether the first element of dupList, i.e. dupList[0], is present in the array temp.
- If the element is present then it will be appended at the end of
tempand the function will be recursively called without the first element ofdupList. - If the element is not present in
temp, it will be added to the list that the function has to return.
As the function proceeds recursively, it will not allow any element in temp to be stored in the list that it has to return, and temp will keep filling up with successive elements in dupList.
Example
For a simulation exercise, take the example of the array dupList = [1, 1, 2, 2].
Iteration 1
-
dupList = [1, 1, 2, 2] -
dupList[0] = 1 -
temp = []
Since the length of dupList is >= 1, and duplist[0] is not in temp, the block under condition # 2 will execute. It will add the first element of dupList in the list to return and call the remove_duplicates_recursion function again with dupList = [1, 2, 2] and temp = [1].
Iteration 2
-
dupList = [1, 2, 2] -
temp = [1] -
returning list = [1]
dupList[0] = 1
Now, as the value present in dupList[0] is also present in temp, condition # 1 will execute. The value in dupList[0] will not be added to the returning list, but the function will be called with all elements in dupList except for duplist[0]. This ensures that if a duplicate is found, it will not be added to the returning list, as the function checks for more duplicates further down dupList.
These same processes will repeat for iteration no. 3 and iteration no. 4 when the function gets an empty dupList as input. The returning list inside the stack will be stored in the nums variable in main.
This code is general for all data types.