What is the Python collections module?

Collections in Python are containers used to store data. They are commonly called data structures; examples include lists, tuples, sets, and dictionaries.

Python's standard library includes the collections module, which provides additional, specialized container types alongside these built-in ones.


Data structures in the collections module

The six most commonly used data structures in the collections module are described below.

1. Defaultdict

A defaultdict behaves like a regular Python dictionary, with one difference: accessing a key that does not exist does not raise a KeyError. Instead, the defaultdict calls the factory function it was created with (int in the example below), stores the result under that key, and returns it.

In the following code, the key 'four' was never assigned, yet accessing it returns 0, because int() returns 0.

Example:

from collections import defaultdict
nums = defaultdict(int)
nums['one'] = 1
nums['two'] = 2
nums['three'] = 3
print(nums['four'])
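
The default factory does not have to be int. As a minimal sketch (the word list and variable names here are illustrative, not from the original example), passing list as the factory makes it easy to group items without first checking whether a key exists:

```python
from collections import defaultdict

# Group words by their first letter; a missing key starts as an empty list.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

print(dict(groups))

# Merely reading a missing key inserts it with the default value.
print("z" in groups)   # False
_ = groups["z"]
print("z" in groups)   # True
```

Note the side effect shown in the last lines: looking up a missing key adds it to the dictionary. Using groups.get("z") or the in operator avoids that.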

2. Counter

Counter is a dictionary subclass that counts how many times each value occurs in a list or any other iterable. The values must be hashable.

The following code counts the occurrences of the value 2 in the given list and prints 4.

Example:

from collections import Counter
#avoid naming the variable "list", which would shadow the built-in type
values = [1,2,3,4,1,2,6,7,3,8,1,2,2]
answer = Counter(values)
print(answer[2])
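
Beyond looking up a single count, a Counter can rank values and be updated incrementally. The following sketch reuses the same numbers (the variable names are illustrative) to show most_common() and update():

```python
from collections import Counter

values = [1, 2, 3, 4, 1, 2, 6, 7, 3, 8, 1, 2, 2]
counts = Counter(values)

# The two most frequent values with their counts.
top_two = counts.most_common(2)
print(top_two)       # [(2, 4), (1, 3)]

# A value that never occurred has a count of 0 instead of raising KeyError.
print(counts[99])    # 0

# update() adds counts from another iterable.
counts.update([8, 8])
print(counts[8])     # 3
```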

3. Deque

A deque (double-ended queue) is a list-like sequence optimized for inserting and removing items at both ends. Appending or popping at either end takes constant time, whereas inserting or removing at the start of a list takes time proportional to its length.

In the following code, z is added at the end of the deque and g at the start; both are then removed again.

Example:

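from collections import deque
#initialization (avoid naming the variable "list", which shadows the built-in)
letters = ["a","b","c"]
deq = deque(letters)
print(deq)
#insertion: append() adds on the right, appendleft() on the left
deq.append("z")
deq.appendleft("g")
print(deq)
#removal: pop() removes from the right, popleft() from the left
deq.pop()
deq.popleft()
print(deq)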
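
Two other deque features are often useful: a maximum length and rotation. The sketch below, with illustrative variable names, shows a bounded deque that keeps only the most recent items, and rotate(), which shifts items around the ends:

```python
from collections import deque

# With maxlen set, appending to a full deque discards items from the other end.
recent = deque(maxlen=3)
for n in range(1, 6):
    recent.append(n)
print(recent)    # deque([3, 4, 5], maxlen=3)

# rotate(1) moves the last item to the front.
letters = deque(["a", "b", "c", "d"])
letters.rotate(1)
print(letters)   # deque(['d', 'a', 'b', 'c'])
```

A bounded deque is a simple way to keep a sliding window, such as the last few log lines or sensor readings.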

4. Namedtuple()

namedtuple() is a factory function that creates tuple subclasses with named fields. With a regular tuple, you have to remember the index of each field; a named tuple lets you access each position by name as well as by index, which makes code easier to read.

In the following code, the student's first name is printed through the fname attribute rather than through index 0.

Example:

from collections import namedtuple
Student = namedtuple('Student', 'fname, lname, age')
s1 = Student('Peter', 'James', 13)
print(s1.fname)
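
A named tuple is still a tuple, so indexing and unpacking keep working, and because it is immutable, changes produce a new object. The following sketch extends the Student example (the variable older is illustrative) using the documented _replace() and _asdict() helpers:

```python
from collections import namedtuple

Student = namedtuple("Student", ["fname", "lname", "age"])
s1 = Student("Peter", "James", 13)

# Index access and attribute access refer to the same field.
print(s1[0] == s1.fname)   # True

# Unpacking works as with any tuple.
fname, lname, age = s1

# Named tuples are immutable; _replace() returns a modified copy.
older = s1._replace(age=14)
print(older)               # Student(fname='Peter', lname='James', age=14)

# _asdict() converts the fields to a mapping of name to value.
print(s1._asdict())
```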

5. ChainMap

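ChainMap groups multiple dictionaries into a single view that behaves like one dictionary, with no restriction on the number of dictionaries. It does not copy them: the underlying dictionaries are kept in a list available through the maps attribute, and a key lookup searches them in order and returns the first match.

The following program uses ChainMap to combine two dictionaries and prints the list of underlying dictionaries.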

Example:

import collections
dictionary1 = { 'a' : 1, 'b' : 2 }
dictionary2 = { 'c' : 3, 'b' : 4 }
chain_Map = collections.ChainMap(dictionary1, dictionary2)
print(chain_Map.maps)
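
Both dictionaries above contain the key 'b', so it is worth seeing which value wins. The following sketch uses the same two dictionaries (the variable name chain is illustrative) to show that lookups return the first match, while writes go only to the first dictionary:

```python
from collections import ChainMap

dictionary1 = {"a": 1, "b": 2}
dictionary2 = {"c": 3, "b": 4}
chain = ChainMap(dictionary1, dictionary2)

# Lookups search the dictionaries in order; the first match wins.
print(chain["b"])        # 2
print(chain["c"])        # 3

# Writes and updates affect only the first dictionary.
chain["c"] = 30
print(dictionary1)       # {'a': 1, 'b': 2, 'c': 30}
print(dictionary2)       # {'c': 3, 'b': 4}
print(chain["c"])        # 30
```

This lookup order is what makes ChainMap convenient for layered settings, for example user options placed in front of defaults.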

6. OrderedDict

OrderedDict is a dictionary that remembers the order in which keys were inserted. Changing the value of an existing key later does not change its position. Since Python 3.7, the regular dict also preserves insertion order, so OrderedDict is mainly useful for its extra features, such as reordering methods and order-sensitive equality comparison.

Example:

from collections import OrderedDict
order = OrderedDict()
order['a'] = 1
order['b'] = 2
order['c'] = 3
print(order)
#regular dictionary (also preserves insertion order since Python 3.7)
regular=dict()
regular['a'] = 1
regular['b'] = 2
regular['c'] = 3
print("Regular dictionary", regular)
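
Where OrderedDict still differs from a regular dictionary is in its order-aware operations. The following sketch, with illustrative variable names, shows move_to_end(), popitem(last=False), and order-sensitive equality:

```python
from collections import OrderedDict

order = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Move a key to the end, or to the front with last=False.
order.move_to_end("a")
print(list(order))       # ['b', 'c', 'a']
order.move_to_end("c", last=False)
print(list(order))       # ['c', 'b', 'a']

# popitem(last=False) removes and returns the first item.
first = order.popitem(last=False)
print(first)             # ('c', 3)

# Two OrderedDicts are equal only if their order matches; regular dicts ignore order.
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))   # False
print(dict(a=1, b=2) == dict(b=2, a=1))                 # True
```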

Copyright ©2024 Educative, Inc. All rights reserved