10 Surprising Things You Can Do with Python’s collections Module

Image by Editor | ChatGPT

 

Introduction

 
Python’s standard library is extensive, offering a wide range of modules to perform common tasks efficiently.

Among these, the collections module is a standout example. It provides specialized container data types that can serve as alternatives to Python’s general-purpose built-in containers like dict, list, set, and tuple. While many developers are familiar with some of its components, the module hosts a variety of functionality that is surprisingly useful and can simplify code, improve readability, and boost performance.

This tutorial explores ten practical, and perhaps surprising, applications of the Python collections module.

 

1. Counting Hashable Objects Effortlessly with Counter

 
A common task in almost any data analysis project is counting the occurrences of items in a sequence. The collections.Counter class is designed specifically for this. It is a dictionary subclass where elements are stored as keys and their counts are stored as values.

from collections import Counter

# Count the frequency of words in a list
words = ['galaxy', 'nebula', 'asteroid', 'comet', 'gravitas', 'galaxy', 'stardust', 'quasar', 'galaxy', 'comet']
word_counts = Counter(words)

# Find the two most common words
most_common = word_counts.most_common(2)

# Output results
print(f"Word counts: {word_counts}")
print(f"Most common words: {most_common}")

 

Output:

Word counts: Counter({'galaxy': 3, 'comet': 2, 'nebula': 1, 'asteroid': 1, 'gravitas': 1, 'stardust': 1, 'quasar': 1})
Most common words: [('galaxy', 3), ('comet', 2)]
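
Two further Counter behaviors are worth knowing when counting in batches: a missing key reads as zero instead of raising KeyError, and update() adds to existing counts rather than replacing them. The sketch below uses made-up word lists (first_batch and second_batch are illustrative names, not part of the example above):

```python
from collections import Counter

# Hypothetical batches of words, invented for illustration
first_batch = ['galaxy', 'comet', 'galaxy']
second_batch = ['comet', 'quasar']

counts = Counter(first_batch)

# A missing key returns 0 rather than raising KeyError,
# and the lookup does not add the key to the Counter
missing = counts['nebula']

# update() adds the new occurrences to the existing counts
counts.update(second_batch)

print(missing)
print(counts)
```

This makes Counter convenient for streaming data, where counts accumulate over several passes.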

 

2. Creating Lightweight Classes with namedtuple

 
When you need a simple class just for grouping data, without methods, a namedtuple is a useful, memory-efficient option. It lets you create tuple-like objects whose fields are accessible by attribute lookup while remaining indexable and iterable. This makes your code more readable than using a regular tuple.

from collections import namedtuple

# Define a Book namedtuple
# Fields: title, author, year_published, isbn
Book = namedtuple('Book', ['title', 'author', 'year_published', 'isbn'])

# Create an instance of the Book
my_book = Book(
    title="The Hitchhiker's Guide to the Galaxy",
    author="Douglas Adams",
    year_published=1979,
    isbn='978-0345391803'
)

# Access fields by name
print(f"Book Title: {my_book.title}")
print(f"Author: {my_book.author}")
print(f"Year Published: {my_book.year_published}")
print(f"ISBN: {my_book.isbn}")

# Access the same fields by position
print("\n--- Accessing by index ---")
print(f"Title (by index): {my_book[0]}")
print(f"Author (by index): {my_book[1]}")
print(f"Year Published (by index): {my_book[2]}")
print(f"ISBN (by index): {my_book[3]}")

 

Output:

Book Title: The Hitchhiker's Guide to the Galaxy
Author: Douglas Adams
Year Published: 1979
ISBN: 978-0345391803

--- Accessing by index ---
Title (by index): The Hitchhiker's Guide to the Galaxy
Author (by index): Douglas Adams
Year Published (by index): 1979
ISBN (by index): 978-0345391803

 

You can think of a namedtuple as similar to an immutable C struct, or as a data class without methods. They definitely have their uses.
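
Because a namedtuple is still a tuple, its fields cannot be reassigned after creation. The sketch below illustrates this with a hypothetical Point type (not part of the example above) and shows two helper methods, _replace() and _asdict(), that produce a modified copy and a dictionary view respectively:

```python
from collections import namedtuple

# Hypothetical two-field type, used only for illustration
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)

# Assigning to a field raises AttributeError
try:
    p.x = 10
except AttributeError:
    immutable = True
else:
    immutable = False

# _replace() returns a new instance; the original is unchanged
moved = p._replace(x=10)

# _asdict() converts the fields to a dictionary
as_dict = p._asdict()

print(immutable, p, moved, as_dict)
```

If you need mutable fields or default values with type annotations, the dataclasses module may be a better fit.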

 

3. Handling Missing Dictionary Keys Gracefully with defaultdict

 
A common frustration when working with dictionaries is the KeyError raised when you try to access a key that doesn't exist. The collections.defaultdict is a convenient solution. It is a subclass of dict that calls a factory function to supply a default value for missing keys. This is especially useful for grouping items.

from collections import defaultdict

# Group a list of tuples by the first element
scores_by_round = [('contestantA', 8), ('contestantB', 7), ('contestantC', 5),
                   ('contestantA', 7), ('contestantB', 7), ('contestantC', 6),
                   ('contestantA', 9), ('contestantB', 5), ('contestantC', 4)]
grouped_scores = defaultdict(list)

for key, value in scores_by_round:
    grouped_scores[key].append(value)

print(f"Grouped scores: {grouped_scores}")

 

Output:

Grouped scores: defaultdict(<class 'list'>, {'contestantA': [8, 7, 9], 'contestantB': [7, 7, 5], 'contestantC': [5, 6, 4]})
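
The factory can be any zero-argument callable, so int works well for tallies. One detail to keep in mind is that merely reading a missing key with square brackets inserts it, whereas get() does not. A minimal sketch, with an invented input string:

```python
from collections import defaultdict

# int() returns 0, so every new key starts at zero
letter_counts = defaultdict(int)
for letter in "banana":
    letter_counts[letter] += 1

# Reading a missing key with [] calls the factory and stores the result
_ = letter_counts['z']
has_z = 'z' in letter_counts

# get() bypasses the factory and leaves the dictionary untouched
value = letter_counts.get('q')
has_q = 'q' in letter_counts

print(dict(letter_counts), has_z, value, has_q)
```

Use get() or an `in` check when you only want to test for a key without growing the dictionary.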

 

4. Implementing Fast Queues and Stacks with deque

 
Python lists can be used as stacks and queues, even though they are not optimized for both roles. Appending and popping from the end of a list is fast, but doing the same from the beginning is slow because all other elements have to be shifted. The collections.deque (double-ended queue) is designed for fast appends and pops from both ends.

First, here is an example of a queue using deque.

from collections import deque

# Create a queue
d = deque([1, 2, 3])
print(f"Original queue: {d}")

# Add to the right
d.append(4)
print("Adding item to queue: 4")
print(f"New queue: {d}")

# Remove from the left
print(f"Popping queue item (from left): {d.popleft()}")

# Output remaining queue
print(f"Remaining queue: {d}")

 

Output:

Original queue: deque([1, 2, 3])
Adding item to queue: 4
New queue: deque([1, 2, 3, 4])
Popping queue item (from left): 1
Remaining queue: deque([2, 3, 4])

 

And now let’s use deque to create a stack:

from collections import deque

# Create a stack
d = deque([1, 2, 3])
print(f"Original stack: {d}")

# Add to the right
d.append(5)
print("Adding item to stack: 5")
print(f"New stack: {d}")

# Remove from the right
print(f"Popping stack item (from right): {d.pop()}")

# Output remaining stack
print(f"Remaining stack: {d}")

 

Output:

Original stack: deque([1, 2, 3])
Adding item to stack: 5
New stack: deque([1, 2, 3, 5])
Popping stack item (from right): 5
Remaining stack: deque([1, 2, 3])

 

5. Remembering Insertion Order with OrderedDict

 
Before Python 3.7, standard dictionaries were not guaranteed to preserve the order in which items were inserted. To solve this, the collections.OrderedDict was used. While standard dicts now maintain insertion order, OrderedDict still has unique features, like the move_to_end() method, which is useful for tasks like creating a simple cache.

from collections import OrderedDict

# An OrderedDict remembers the order of insertion
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3

print(f"Start order: {list(od.keys())}")

# Move 'a' to the end
od.move_to_end('a')
print(f"Final order: {list(od.keys())}")

 

Output:

Start order: ['a', 'b', 'c']
Final order: ['b', 'c', 'a']
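
To make the cache idea concrete, here is a minimal sketch of a least-recently-used cache built on move_to_end() and popitem(last=False). The LRUCache class, its method names, and the capacity of 2 are all assumptions made for illustration; this is not a production-ready cache.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal least-recently-used cache (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        # Mark the key as most recently used
        self.data.move_to_end(key)
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        # Evict the least recently used entry when over capacity
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)

cache = LRUCache(capacity=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')       # 'a' becomes the most recently used key
cache.put('c', 3)    # evicts 'b', the least recently used key
print(list(cache.data.keys()))  # ['a', 'c']
```

For memoizing function calls specifically, the standard library's functools.lru_cache decorator is usually the simpler choice.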

 

6. Combining Multiple Dictionaries with ChainMap

 
The collections.ChainMap class provides a way to link multiple dictionaries together so they can be treated as a single unit. Because no data is copied, creating one is often much faster than creating a new dictionary and running multiple update() calls. Lookups search the underlying mappings one by one until a key is found.

Let’s create a ChainMap named chain and query it for keys.

from collections import ChainMap

# Create dictionaries
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 3, 'c': 4}

# Create a ChainMap
chain = ChainMap(dict1, dict2)

# Print dictionaries
print(f"dict1: {dict1}")
print(f"dict2: {dict2}")

# Query ChainMap for keys and return values
print("\nQuerying ChainMap for keys")
print(f"a: {chain['a']}")
print(f"c: {chain['c']}")
print(f"b: {chain['b']}")

 

Output:

dict1: {'a': 1, 'b': 2}
dict2: {'b': 3, 'c': 4}

Querying ChainMap for keys
a: 1
c: 4
b: 2

 

Note that, in the above scenario, 'b' is found first in dict1, the first dictionary in chain, and so the value associated with that key in dict1 is the one returned.
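
Lookups search every mapping, but writes only ever touch the first one. That asymmetry makes ChainMap a natural fit for layered settings. The sketch below uses hypothetical settings dictionaries (defaults and user_settings are invented names) and the new_child() method to add a temporary layer in front:

```python
from collections import ChainMap

# Hypothetical configuration layers, invented for illustration
defaults = {'theme': 'light', 'verbose': False}
user_settings = {'theme': 'dark'}

settings = ChainMap(user_settings, defaults)

# Writes go to the first mapping only; defaults is left untouched
settings['verbose'] = True

# new_child() returns a new ChainMap with an extra mapping in front
temp = settings.new_child({'theme': 'high-contrast'})

print(settings['theme'], temp['theme'])
print(user_settings, defaults)
```

Since the ChainMap holds references rather than copies, later changes to the underlying dictionaries are visible through it immediately.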

 

7. Keeping a Limited History with deque’s maxlen

 
A deque can be created with a fixed maximum length using the maxlen argument. If more items are added than the maximum length allows, items from the opposite end are automatically discarded. This is perfect for keeping a history of the last N items.

from collections import deque

# Keep a history of the last 3 items
history = deque(maxlen=3)
history.append("cd ~")
history.append("ls -l")
history.append("pwd")
print(f"Start history: {history}")

# Add a new item, pushing out the left-most item
history.append("mkdir data")
print(f"Final history: {history}")

 

Output:

Start history: deque(['cd ~', 'ls -l', 'pwd'], maxlen=3)
Final history: deque(['ls -l', 'pwd', 'mkdir data'], maxlen=3)
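
The same sliding-window behavior works for numeric data. As a rough sketch, the hypothetical helper below (moving_averages is an illustrative name) computes a running average over the most recent values, letting the deque drop old values automatically:

```python
from collections import deque

def moving_averages(values, window):
    """Average of the last `window` values seen so far (illustrative helper)."""
    recent = deque(maxlen=window)
    averages = []
    for v in values:
        recent.append(v)  # the oldest value falls off once the window is full
        averages.append(sum(recent) / len(recent))
    return averages

result = moving_averages([10, 20, 30, 40], window=2)
print(result)  # [10.0, 15.0, 25.0, 35.0]
```

Note that the first averages are computed over fewer than `window` values, since the deque has not yet filled up.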

 

8. Creating Nested Dictionaries Easily with defaultdict

 
Building on defaultdict, you can create nested or tree-like dictionaries with ease. By providing a factory function that returns another defaultdict, you can create dictionaries of dictionaries on the fly.

from collections import defaultdict
import json

# A function that returns a defaultdict whose default values are more trees
def tree():
    return defaultdict(tree)

# Create a nested dictionary
nested_dict = tree()
nested_dict['users']['user1']['name'] = 'Felix'
nested_dict['users']['user1']['email'] = '[email protected]'
nested_dict['users']['user1']['phone'] = '515-KL5-5555'

# Output formatted JSON to console
print(json.dumps(nested_dict, indent=2))

 

Output:

{
  "users": {
    "user1": {
      "name": "Felix",
      "email": "[email protected]",
      "phone": "515-KL5-5555"
    }
  }
}
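
When the nesting depth is fixed rather than arbitrary, a lambda is often enough. The sketch below assumes a two-level structure of invented page-visit counts and then converts the result to plain dictionaries, which avoids accidentally creating keys later on:

```python
from collections import defaultdict

# Two fixed levels: each outer key maps to an inner counter
visits = defaultdict(lambda: defaultdict(int))

# Hypothetical data, invented for illustration
visits['alice']['home'] += 1
visits['alice']['home'] += 1
visits['bob']['about'] += 1

# Convert to plain dicts once the structure is built
plain = {user: dict(pages) for user, pages in visits.items()}
print(plain)  # {'alice': {'home': 2}, 'bob': {'about': 1}}
```

One caveat: a defaultdict built from a lambda cannot be pickled, so a named factory function is the safer choice if the structure needs to be serialized that way.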

 

9. Performing Arithmetic Operations on Counters

 
News flash: you can perform arithmetic operations, such as addition, subtraction, intersection, and union, on Counter objects. This is a powerful tool for comparing and combining frequency counts from different sources. Note that all four operators drop any result whose count is zero or negative.

from collections import Counter

c1 = Counter(a=4, b=2, c=0, d=-2)
c2 = Counter(a=1, b=2, c=3, d=4)

# Add counters -> adds counts for common keys
print(f"c1 + c2 = {c1 + c2}")

# Subtract counters -> keeps only positive counts
print(f"c1 - c2 = {c1 - c2}")

# Intersection -> takes minimum of counts
print(f"c1 & c2 = {c1 & c2}")

# Union -> takes maximum of counts
print(f"c1 | c2 = {c1 | c2}")

 

Output:

c1 + c2 = Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2})
c1 - c2 = Counter({'a': 3})
c1 & c2 = Counter({'b': 2, 'a': 1})
c1 | c2 = Counter({'a': 4, 'd': 4, 'c': 3, 'b': 2})
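
If you need to keep zero or negative counts, the subtract() method is the alternative to the - operator: it modifies the Counter in place and preserves every result. A short sketch with invented stock figures:

```python
from collections import Counter

# Hypothetical inventory numbers, invented for illustration
stock = Counter(apples=3, pears=1)
sold = Counter(apples=1, pears=4)

# The - operator drops results that are zero or negative
with_operator = stock - sold

# subtract() works in place and keeps negative counts
in_place = Counter(stock)
in_place.subtract(sold)

# Unary + strips zero and negative counts again
positives_only = +in_place

print(with_operator)
print(dict(in_place))
print(positives_only)
```

Here the negative count for pears signals that more were sold than were in stock, information the - operator would silently discard.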

 

10. Efficiently Rotating Elements with deque

 
The deque object has a rotate() method that lets you rotate the elements efficiently, in place. A positive argument rotates elements to the right; a negative one, to the left. This is much faster than slicing and re-joining lists or tuples.

from collections import deque

d = deque([1, 2, 3, 4, 5])
print(f"Original deque: {d}")

# Rotate 2 steps to the right
d.rotate(2)
print(f"After rotating 2 to the right: {d}")

# Rotate 3 steps to the left
d.rotate(-3)
print(f"After rotating 3 to the left: {d}")

 

Output:

Original deque: deque([1, 2, 3, 4, 5])
After rotating 2 to the right: deque([4, 5, 1, 2, 3])
After rotating 3 to the left: deque([2, 3, 4, 5, 1])
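
For comparison, here is the slicing approach that rotate() replaces. The sketch assumes 0 < n < len(items); the slicing expression builds a new list, while rotate() modifies the deque in place:

```python
from collections import deque

items = [1, 2, 3, 4, 5]
n = 2  # number of steps to rotate right (assumed 0 < n < len(items))

# In-place rotation with deque
d = deque(items)
d.rotate(n)

# Equivalent result with list slicing, which builds a new list
sliced = items[-n:] + items[:-n]

print(list(d))  # [4, 5, 1, 2, 3]
print(sliced)   # [4, 5, 1, 2, 3]
```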

 

Wrapping Up

 
The collections module in Python is a killer collection of specialized, high-performance container datatypes. From counting items with Counter to building efficient queues with deque, these tools can make your code cleaner, more efficient, and more Pythonic. By familiarizing yourself with these surprising and powerful features, you can solve common programming problems in a more elegant and effective way.
 
 

Matthew Mayo (@mattmayo13) holds a master’s degree in computer science and a graduate diploma in data mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.