Techno Blender
Digitally Yours.

Python Inheritance: Should You Inherit From dict or UserDict? | by Marcin Kozak | May, 2023

0 9


In his fantastic book Fluent Python. 2nd ed., Luciano Ramalho explains why you should not create custom classes inheriting from dict. The reason behind this rule, strange at the first glance, is simple but critical: dict is a highly optimized type implemented in C, and it wouldn’t call the methods you overload in your subclass of dict.

That would be a nasty surprise, wouldn’t it? Let’s see this in an example. Imagine you want to create a dictionary-like class in which the provided values will be converted to their string representation. Let’s try to do this by subclassing the dict built-in type:

class StringDict(dict):
def __setitem__(self, key, value):
super().__setitem__(key, str(value))

This seems like perfectly valid Python code. Let’s see how this works:

>>> class StringDict(dict):
... def __setitem__(self, key, value):
... super().__setitem__(key, str(value))
...
>>> mydict = StringDict(first=1, second=2, third=3)
>>> mydict
{'first': 1, 'second': 2, 'third': 3}

Well, it does not work at all — or rather, this __setitem__ method does not work at all. We wanted to get values converted to strings but they are not. Nonetheless, we don’t see any error; the class itself works somehow — in fact, it works just like a regular dictionary would. (Or rather, it provides the same results but slower; we will discuss this later on.)

To get what you want, you should subclass UserDict:

>>> from collections import UserDict
>>> class StringUserDict(UserDict):
... def __setitem__(self, key, value):
... super().__setitem__(key, str(value))
...
>>> mydict = StringUserDict(first=1, second=2, third=3)
>>> mydict
{'first': '1', 'second': '2', 'third': '3'}

As you see the only thing we changed in the definition is subclassing UserDict instead of dict.

So now you know. It’s enough to use UserDict. Great.

Is it?

Hold on. Let’s think. There are a couple of things we should consider before deciding that using UserDict instead of dict is so great.

First, we know that Python’s built-in types are highly optimized because they are implemented in C, and that this implementation itself is highly optimized.

Second, we know that we shouldn’t subclass dict because its methods implemented in C will not call the overwritten methods implemented in Python.

And third, an easy thing to check, collections.UserDict is implemented in Python. In Linux, you will find its definition here:

usr > lib > python3.9 > collections > __init__.py > UserDict
Localization of UserDict definition in Linux. Screenshot from VS Code. Image by author

In this context, the following question asks itself: If so, will my custom class subclassing UserDict be performant?

My immediate guess is that no, it won’t. The optimization of dict comes from C implementation, and UserDict is implemented in Python. Why should it be optimized whatsoever? We will check this in the following section.

For benchmarks, we will use the standard-library timeit module. You can read about it here:

To make benchmarks a little simpler and structurized, let us define a simple function to run time benchmarks for two or more code snippets:


import rounder
import timeit
import pprint

def compare(
__snippet1,
__snippet2,
*args,
number=10_000_000,
repeat=7,
setup="from collections import UserDict"):
snippets = [__snippet1, __snippet2, *args]
results = {}
for i, snippet in enumerate(snippets):
name = snippet if len(snippet) < 30 else f"snippet {i + 1}"
results[name] = min(timeit.repeat(
snippet, number=number, repeat=repeat, setup=setup
)) / number,
results = rounder.signif_object(results, digits=4)
pprint.pprint(results)

A couple of things:

  • The function uses the rounder package, to round all numbers in the dictionary to four significant digits; you can read more about it here:
  • __snippet1 and __snippet2 are positional-only arguments, so you cannot call them by name. This is thanks to the double-underscore prefix.
  • Thanks to *args after the two snippet arguments, you can provide more snippets, also as positional keywords; you can use as many of them as you want.
  • All the remaining arguments are keyword-only. Here, this is achieved by putting them after *args
  • The function report the results as the mean of all calls from the quickest of seven runs. Hence, all the results are directly comparable, even if the timeit.repeat() function was used with different number values.
  • The function return implicit None and prints a short report of the benchmarks, using the pprint() function from the standard-library pprint module. More often than not, avoid replacing return statements with prints², unless your function/method is a one that prints.

Okay, we’ll see the function in action in a second. Let’s start off with comparing how quickly dict() and UserDict() create instances. Nonetheless, we can instantiate a regular dictionary using two methods, dict() and (significantly faster) {}, so we’ll include them both:

>>> compare("UserDict()", "dict()", "{}")
{'UserDict()': (1.278e-07,), 'dict()': (3.826e-08,), '{}': (1.518e-08,)}

For all benchmarks in this article, I used Python 3.11 on a Windows 10 machine, in WSL 1, 32 GM of RAM and four physical (eight logical) cores. The benchmark shows that to create a new instance, UserDict is twice slower than dict.

As mentioned above, the values we see in the results dictionary represent the time of creating one instance of either UserDict or regular dict (created using two methods). Clearly, to create an instance of UserDict, you need more time, about 1.3e-07 of a second — while {} needs about 1.5e-08. Not a big difference? Note when you need to create a single instance, but imagine creating millions of dictionaries. So, to create a regular dictionary, you need more or less 3–8 times less time than to create a UserDict, depending on the instantiation method.

Let’s see how this works with bigger dictionaries. We’ll create a simple numerical dictionary via a dictionary comprehension. Since UserDict does not enable us to use the dictcomp syntax (another drawback), the only thing we can do is to create first a regular dictionary using the dictcomp syntax and then convert it to a UserDict instance:

>>> compare(
"UserDict({i: i**2 for i in range(1000)})",
"{i: i**2 for i in range(1000)}",
number=100_000)
{'snippet 1': (0.0001316,), 'snippet 2': (5.027e-05,)}

A regular dictionary is about 2.5 times faster. That seems quite amazing given that creating an empty dictionary was even faster. We have to remember that the results of such benchmarks can vary from run to run. But we also have to remember that when we use many repetitions (here, a hundred thousand — we could use more), the variation should be relatively small.

Will the size matter when we compare lookup time? Basically key lookup is unrelated to dictionary size, so supposedly, we should be getting similar results irrespective of dict size.

First, a small dictionary:

>>> setup = """from collections import UserDict
... d = {'x': 1, 'y': 2, 'z': 3}
... ud = UserDict(d)
... """
>>> compare("ud['x']", "d['x']", setup=setup)
{"ud['x']": (4.754e-08,), "d['x']": (1.381e-08,)}

Okay, so about 3.5 times slower. Now, for a bigger one, of 10_000 key-value pairs:

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10_000)}
... ud = UserDict(d)
... """
>>> compare("ud['9999']", "d['9999']", setup=setup, number=1_000_000)
{"ud['9999']": (7.785e-08,), "d['9999']": (2.787e-08,)}

and for one of 10 million elements:

>>> compare("ud['9999']", "d['9999']", setup=setup, number=100_000)
{"ud['9999']": (6.662e-08,), "d['9999']": (2.499e-08,)}

So, the size indeed does not matter, and every time, dict was more or less 3–3.5 times faster. Let’s see, this time just for medium-sized dictionaries, how it works with an unexisting key:

>>> compare(
... "ud.get('a', None)",
... "d.get('a', None)",
... setup=setup,
... number=1_000_000)
{"d.get('a', None)": (4.318e-08,), "ud.get('a', None)": (4.525e-07,)}

This time it’s a bigger difference, dict being over 10 times faster.

What about checking if a key is in the dictionary?

>>> compare("'a' in ud", "'a' in d", setup=setup, number=1_000_000)
{"'a' in d": (1.465e-08,), "'a' in ud": (4.562e-08,)}

So, again 3–3.5 times faster.

Now, let’s benchmark a frequent operation, that is, iterating over a dictionary; again, let’s check various sizes of the dictionary:

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=1_000_000
... )
{'for i, v in d.items(): pass': (1.726e-07,),
'for i, v in ud.items(): pass': (1.235e-06,)}

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10_000)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=10_000
... )
{'for i, v in d.items(): pass': (0.0001255,),
'for i, v in ud.items(): pass': (0.00112,)}

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(100_000)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=10_000
... )
{'for i, v in d.items(): pass': (0.001772,),
'for i, v in ud.items(): pass': (0.01718,)}

Okay, so for small dictionaries, dict is about 7 times faster to loop over its key-value pairs available via the .items() method. For medium-sized dictionaries (of 10 thousand elements in our experiment), about 9 times faster. For bigger dictionaries (of 100_000 elements), the result was similar, so looping itself — once it started — does not seem to depend on the type of dictionary.

Since it was a rather minor benchmark, we can conclude that a regular dictionary should be around 5–10 times faster to loop over its items than UserDict.

To conclude the benchmarks

Maybe let’s stop here. We could conduct more benchmarks, but that’s not the point. I did not want to conduct a full comparison of dict and UserDict in terms of execution time; if you’re interested, you can play with the code to conduct a set of solid benchmarks. Instead, I wanted to shed some light on the issue and check if, as I expected based on the knowledge of UserDict and dict’s implementations, the former is significantly slower than the latter.

And it is — unless you consider 5–10 times slower as an insignificant number. So, if you only can, consider using regular dictionaries, not those subclassing UserDict, unless you must change the behavior of dict.

Ah… Why can’t we just subclass dict?! Why?

Or… Can we?

Perhaps you’ve noticed that the rule of not subclassing dict is related to the dict methods implemented in C, which would not call the built-in dict methods overloaded in Python. But what if you just want to add some functionalities to dict, leaving the methods already implemented in C untouched?

That’s a very good question. And the answer is short and simple: Yes, you can do it! You can inherit from dict; simply don’t overload dict’s methods, that’s all.

The question is, will such a dict-based class be as performant as dict? Or rather as collections.UserDict? To answer this question, we will have to run some more benchmarks.

Let’s imagine we keep some data in a dictionary, and we want to add method .summarize() that calculates some summary statistics for the data. It could be something like this (just an example):

from collections.abc import Sequence
from typing import Callable

def try_calculate(func: Callable, *args, **kwargs):
"""Try calculations; when data are incorrect, return nan."""
try:
return func(*args, **kwargs)
except TypeError:
return float("nan")

class RichDict(dict):
measures = {
"sum": sum,
"n": len,
"mean": lambda x: sum(x) / len(x),
}

def summarize(self):
statistics = {}
for k, v in self.items():
if isinstance(v, str):
statistics[k] = {"n": len(v)}
elif isinstance(v, Sequence):
statistics[k] = {
name: try_calculate(func, v)
for name, func
in self.measures.items()
}
return statistics

RichDict is a dict with one more method: .summarize(). The method does the following:

  • It iterates over key-value pairs of the data (reached via the .items method).
  • When a value is a string, statistics includes only length and is returned as a dictionary with one key, n.
  • When a value is a Sequence, the main summary statistics are calculated. The measures are kept as callables in a class attribute RichDict.measures, which is a dictionary.
  • The method protects calculations: if a measure cannot be calculated, an exception is caught and float("nan") (for not-a-number) is returned as the results of calculations. That way, Python will not throw an error when attempting to calculate the mean of an empty list, for instance.

If you want to add a measure, you can easily do it:

RichDict.measures["min"] = min
RichDict.measures["max"] = max

If the function is more complicated, you can use a lambda function:

RichDict.measures["max-to-mean"] = lambda x: max(x) / min(x)

or, preferably, first define a function and then assign it here:

def max_to_min(x: float) -> float:
return max(x) / min(x)

RichDict.measures["max-to-mean"] = max_to_min

Note that since .measures is a class attribute, all instances of RichDict (those to be created, but also those already existing) will have the extended measures, with min and max statistics.

Just one example of RichDict in action:

>>> d = RichDict(x=[1,4,5,7],
... y=[1,"1",2],
... z="Shout Bamalama!",
... f=10)
>>>
>>> stats = d.summarize()
>>> stats # doctest: NORMALIZE_WHITESPACE
{'x': {'sum': 17, 'n': 4, 'mean': 4.25, 'min': 1, 'max': 7, 'max-to-min': 7},
'y': {'sum': nan, 'n': 3, 'mean': nan, 'min': nan, 'max': nan, 'max-to-min': nan},
'z': {'n': 15}}

Above, the RichDict class has one class attribute with measures to be used for sequence data; for strings, the .summarize() methods calculates only one measure. Update the class so that it has two class attributes, measures_seq and measures_str, designed as measures above. For strings, the .summarize() method should calculate the measures the way it does for sequences, that is, using measures_str.

You will find the solution in Appendix 1.

In the code, I used the standard-library doctests module for documentation testing. If you’re interested in learning more about this interesting module, you can do so from this article:

Okay, now that we know that RichDict works, we know we can subclass dict. What we want to learn now is whether RichDict’s added functionality, defined in Python (not in C, like the base code for dict), adds some overhead time for the regular behavior of dict. To this end, we will benchmark such behavior as creating a new RichDict versus creating a new dict, key lookup, and the like.

Let’s conduct similar benchmarks to those above, conducted for UserDict. You can find the code for them in this GitHib gist. You will find there the setup value used below.

>>> compare("UserDict()", "RichDict()", "dict()", setup=setup)
{'UserDict()': (2.236e-07,), 'RichDict()': (1.073e-07,), 'dict()': (5.892e-08,)}

As shown above, when it comes to creating an empty instance, RichDict is significantly faster than UserDict (about 2 times faster) and slower than dict (about 2 times slower).

>>> compare(
... "UserDict({i: i**2 for i in range(1000)})",
... "RichDict({i: i**2 for i in range(1000)})",
... "{i: i**2 for i in range(1000)}",
... number=100_000,
... setup=setup)
{'snippet 1': (0.0001765,), # UserDict
'snippet 2': (6.845e-05,), # RichDict
'snippet 3': (5.388e-05,)} # dict

This time, RichDict is about 2.5 times faster than UserDict but a little slower (around 1.3 times) than dict.

Below, you will find examples of more benchmarks, separated by blank lines for your convenience:

>>> setup += """d = {'x': 1, 'y': 2, 'z': 3}
... ud = UserDict(d)
... rd = RichDict(d)
... """
>>> compare("ud['x']", "rd['x']", "d['x']", setup=setup)
{"ud['x']": (5.111e-08,), rd['x']": (3.024e-08,), "d['x']": (1.475e-08,)}

>>> compare(
... "'a' in ud",
... "'a' in rd",
... "'a' in d",
... setup=setup,
... number=1_000_000)
{"'a' in d": (1.366e-08,), # dict
"'a' in rd": (2.228e-08,), # RichDict
"'a' in ud": (4.436e-08,)} # UserDict

>>> compare(
... "ud.get('a', None)",
... "rd.get('a', None)",
... "d.get('a', None)",
... setup=setup,
... number=1_000_000)
{"d.get('a', None)": (1.935e-08,), # dict
"rd.get('a', None)": (3.016e-08,), # RichDict
"ud.get('a', None)": (5.125e-07,)} # UserDict

>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in rd.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=1_000
... )
{'for i, v in d.items(): pass': (0.001783,),
'for i, v in rd.items(): pass': (0.001743,),
'for i, v in ud.items(): pass': (0.01627,)}

The numbers themselves say what we need, and so I will leave you with them for the moment.

To conclude the benchmarks

RichDict is usually slower than dict (though sometimes only very little) but faster than UserDict.

Thus, if you just want to add some functionalities to dict, without overwriting its built-in methods, you definitely can subclass dict. I’d say this should be your preferred method, instead of subclassing collections.UserDict, as the latter is significantly slower. Do remember that we’re talking about the situation when you do not need to change the regular behavior of a dictionary, just to add some new behavior.

Remember also that you will have to pay a tax for using a built-in type that way: Your class (in our example, RichDict) will be slower than dict. Still, however, it will be faster than UserDict, which was created for this very aim: to let you inherit from… well, not from dict, so rather to let you create a new type (class) that will have the same behavior as dict. Using UserDict, unfortunately, is rather costly, as its performance is significantly worse than that of dict.

Let’s summarize our discussion about subclassing dict and UserDict. We’ve learned we have three options to choose from:

  1. Inherit from UserDict, when you want to overwrite the built-in behavior of dict. This will be the slowest option.
  2. Inherit from dict, when you don’t want to overwrite the built-in behavior of dict, but to add new functionalities (methods). This will be a faster option than option 1.
  3. Use the built-in dict type, without creating a custom class. If you need custom functionalities, you can implement them in functions that take a dict instance as an argument. This is the fastest option (see below).

We have not discussed the third option yet, as it does not include subclassing. Not much to discuss, though, as this is the simplest approach, one that uses a more procedural approach rather than OOP. On the one hand, using the summarize() function in this approach would be only a little (if any at all) faster than using the RichDict.summarize() method in option 2. This gist contains the corresponding benchmark code; in my machine, it provided a small but stable (from run to run) improvement in performance. On the other hand, all the other behavior of regular dict, as we know, is significantly faster than that of RichDict. Hence, generally, option 3 offers the fastest way to handle a dictionary with additional functionalities.

Thus, if performance matters, the most sensible seems to be the third option — that is, using a regular dictionary and implementing the additional required behavior in external functions. Depending on the situation, it can also be an option with the clearest code, as it does not require a custom data structure but uses a dictionary, one of the most common data structures in Python, combined with a function. More often than not, this means clearer code.

The second option means worse performance, as adding methods to dict leads to overhead time of its behavior. As we know, option 3 gets rid of this overhead by moving the method outside the dictionary.

The first option is definitely the worst one in terms of performance. I think the only situations in which this option makes sense are those in which each of the following three conditions is met:

  • performance does not matter
  • you need to overwrite one or more built-in dict methods
  • the code will be clearer and easier to use thanks to creating a class that will join all the required functionalities

¹ I am planning to write a dedicated article to positional-only and keyword-only arguments. Once it’s published, I will link it here.

² By the way, in an interactive session, a return from the function would have the same effect (of course, when the results are not assigned). I ran the benchmarks in a script, not an interactive session, however.

Solution to the exercise

You can do it in various way. The below solution avoids repetition, but also makes it easy to add yet another type (to Sequence and str) to calculations.

from collections.abc import Sequence
from typing import Callable

class RichDict(dict):
measures_seq = {
"sum": sum,
"n": len,
"mean": lambda x: sum(x) / len(x),
}
measures_str = {
"n": len,
}

def summarize(self):
statistics = {}
for k, v in self.items():
if isinstance(v, str):
measures = self.measures_str
elif isinstance(v, Sequence):
measures = self.measures_seq
else:
continue
statistics[k] = {
name: try_calculate(func, v)
for name, func
in measures.items()
}
return statistics

NaN = float("nan")

def try_calculate(func: Callable, *args, **kwargs):
"""Try calculations and when the data are incorrect, return nan."""
try:
return func(*args, **kwargs)
except TypeError:
return NaN


In his fantastic book Fluent Python. 2nd ed., Luciano Ramalho explains why you should not create custom classes inheriting from dict. The reason behind this rule, strange at the first glance, is simple but critical: dict is a highly optimized type implemented in C, and it wouldn’t call the methods you overload in your subclass of dict.

That would be a nasty surprise, wouldn’t it? Let’s see this in an example. Imagine you want to create a dictionary-like class in which the provided values will be converted to their string representation. Let’s try to do this by subclassing the dict built-in type:

class StringDict(dict):
def __setitem__(self, key, value):
super().__setitem__(key, str(value))

This seems like perfectly valid Python code. Let’s see how this works:

>>> class StringDict(dict):
... def __setitem__(self, key, value):
... super().__setitem__(key, str(value))
...
>>> mydict = StringDict(first=1, second=2, third=3)
>>> mydict
{'first': 1, 'second': 2, 'third': 3}

Well, it does not work at all — or rather, this __setitem__ method does not work at all. We wanted to get values converted to strings but they are not. Nonetheless, we don’t see any error; the class itself works somehow — in fact, it works just like a regular dictionary would. (Or rather, it provides the same results but slower; we will discuss this later on.)

To get what you want, you should subclass UserDict:

>>> from collections import UserDict
>>> class StringUserDict(UserDict):
... def __setitem__(self, key, value):
... super().__setitem__(key, str(value))
...
>>> mydict = StringUserDict(first=1, second=2, third=3)
>>> mydict
{'first': '1', 'second': '2', 'third': '3'}

As you see the only thing we changed in the definition is subclassing UserDict instead of dict.

So now you know. It’s enough to use UserDict. Great.

Is it?

Hold on. Let’s think. There are a couple of things we should consider before deciding that using UserDict instead of dict is so great.

First, we know that Python’s built-in types are highly optimized because they are implemented in C, and that this implementation itself is highly optimized.

Second, we know that we shouldn’t subclass dict because its methods implemented in C will not call the overwritten methods implemented in Python.

And third, an easy thing to check, collections.UserDict is implemented in Python. In Linux, you will find its definition here:

usr > lib > python3.9 > collections > __init__.py > UserDict
Localization of UserDict definition in Linux. Screenshot from VS Code. Image by author

In this context, the following question asks itself: If so, will my custom class subclassing UserDict be performant?

My immediate guess is that no, it won’t. The optimization of dict comes from C implementation, and UserDict is implemented in Python. Why should it be optimized whatsoever? We will check this in the following section.

For benchmarks, we will use the standard-library timeit module. You can read about it here:

To make benchmarks a little simpler and structurized, let us define a simple function to run time benchmarks for two or more code snippets:


import rounder
import timeit
import pprint

def compare(
__snippet1,
__snippet2,
*args,
number=10_000_000,
repeat=7,
setup="from collections import UserDict"):
snippets = [__snippet1, __snippet2, *args]
results = {}
for i, snippet in enumerate(snippets):
name = snippet if len(snippet) < 30 else f"snippet {i + 1}"
results[name] = min(timeit.repeat(
snippet, number=number, repeat=repeat, setup=setup
)) / number,
results = rounder.signif_object(results, digits=4)
pprint.pprint(results)

A couple of things:

  • The function uses the rounder package, to round all numbers in the dictionary to four significant digits; you can read more about it here:
  • __snippet1 and __snippet2 are positional-only arguments, so you cannot call them by name. This is thanks to the double-underscore prefix.
  • Thanks to *args after the two snippet arguments, you can provide more snippets, also as positional keywords; you can use as many of them as you want.
  • All the remaining arguments are keyword-only. Here, this is achieved by putting them after *args
  • The function report the results as the mean of all calls from the quickest of seven runs. Hence, all the results are directly comparable, even if the timeit.repeat() function was used with different number values.
  • The function return implicit None and prints a short report of the benchmarks, using the pprint() function from the standard-library pprint module. More often than not, avoid replacing return statements with prints², unless your function/method is a one that prints.

Okay, we’ll see the function in action in a second. Let’s start off with comparing how quickly dict() and UserDict() create instances. Nonetheless, we can instantiate a regular dictionary using two methods, dict() and (significantly faster) {}, so we’ll include them both:

>>> compare("UserDict()", "dict()", "{}")
{'UserDict()': (1.278e-07,), 'dict()': (3.826e-08,), '{}': (1.518e-08,)}

For all benchmarks in this article, I used Python 3.11 on a Windows 10 machine, in WSL 1, 32 GM of RAM and four physical (eight logical) cores. The benchmark shows that to create a new instance, UserDict is twice slower than dict.

As mentioned above, the values we see in the results dictionary represent the time of creating one instance of either UserDict or regular dict (created using two methods). Clearly, to create an instance of UserDict, you need more time, about 1.3e-07 of a second — while {} needs about 1.5e-08. Not a big difference? Note when you need to create a single instance, but imagine creating millions of dictionaries. So, to create a regular dictionary, you need more or less 3–8 times less time than to create a UserDict, depending on the instantiation method.

Let’s see how this works with bigger dictionaries. We’ll create a simple numerical dictionary via a dictionary comprehension. Since UserDict does not enable us to use the dictcomp syntax (another drawback), the only thing we can do is to create first a regular dictionary using the dictcomp syntax and then convert it to a UserDict instance:

>>> compare(
"UserDict({i: i**2 for i in range(1000)})",
"{i: i**2 for i in range(1000)}",
number=100_000)
{'snippet 1': (0.0001316,), 'snippet 2': (5.027e-05,)}

A regular dictionary is about 2.5 times faster. That seems quite amazing given that creating an empty dictionary was even faster. We have to remember that the results of such benchmarks can vary from run to run. But we also have to remember that when we use many repetitions (here, a hundred thousand — we could use more), the variation should be relatively small.

Will the size matter when we compare lookup time? Basically key lookup is unrelated to dictionary size, so supposedly, we should be getting similar results irrespective of dict size.

First, a small dictionary:

>>> setup = """from collections import UserDict
... d = {'x': 1, 'y': 2, 'z': 3}
... ud = UserDict(d)
... """
>>> compare("ud['x']", "d['x']", setup=setup)
{"ud['x']": (4.754e-08,), "d['x']": (1.381e-08,)}

Okay, so about 3.5 times slower. Now, for a bigger one, of 10_000 key-value pairs:

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10_000)}
... ud = UserDict(d)
... """
>>> compare("ud['9999']", "d['9999']", setup=setup, number=1_000_000)
{"ud['9999']": (7.785e-08,), "d['9999']": (2.787e-08,)}

and for one of 10 million elements:

>>> compare("ud['9999']", "d['9999']", setup=setup, number=100_000)
{"ud['9999']": (6.662e-08,), "d['9999']": (2.499e-08,)}

So, the size indeed does not matter, and every time, dict was more or less 3–3.5 times faster. Let’s see, this time just for medium-sized dictionaries, how it works with an unexisting key:

>>> compare(
... "ud.get('a', None)",
... "d.get('a', None)",
... setup=setup,
... number=1_000_000)
{"d.get('a', None)": (4.318e-08,), "ud.get('a', None)": (4.525e-07,)}

This time it’s a bigger difference, dict being over 10 times faster.

What about checking if a key is in the dictionary?

>>> compare("'a' in ud", "'a' in d", setup=setup, number=1_000_000)
{"'a' in d": (1.465e-08,), "'a' in ud": (4.562e-08,)}

So, again 3–3.5 times faster.

Now, let’s benchmark a frequent operation, that is, iterating over a dictionary; again, let’s check various sizes of the dictionary:

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=1_000_000
... )
{'for i, v in d.items(): pass': (1.726e-07,),
'for i, v in ud.items(): pass': (1.235e-06,)}

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(10_000)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=10_000
... )
{'for i, v in d.items(): pass': (0.0001255,),
'for i, v in ud.items(): pass': (0.00112,)}

>>> setup = """from collections import UserDict
... d = {str(i): i for i in range(100_000)}
... ud = UserDict(d)
... """
>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=10_000
... )
{'for i, v in d.items(): pass': (0.001772,),
'for i, v in ud.items(): pass': (0.01718,)}

Okay, so for small dictionaries, dict is about 7 times faster to loop over its key-value pairs available via the .items() method. For medium-sized dictionaries (of 10 thousand elements in our experiment), about 9 times faster. For bigger dictionaries (of 100_000 elements), the result was similar, so looping itself — once it started — does not seem to depend on the type of dictionary.

Since it was a rather minor benchmark, we can conclude that a regular dictionary should be around 5–10 times faster to loop over its items than UserDict.

To conclude the benchmarks

Maybe let’s stop here. We could conduct more benchmarks, but that’s not the point. I did not want to conduct a full comparison of dict and UserDict in terms of execution time; if you’re interested, you can play with the code to conduct a set of solid benchmarks. Instead, I wanted to shed some light on the issue and check if, as I expected based on the knowledge of UserDict and dict’s implementations, the former is significantly slower than the latter.

And it is — unless you consider 5–10 times slower as an insignificant number. So, if you only can, consider using regular dictionaries, not those subclassing UserDict, unless you must change the behavior of dict.

Ah… Why can’t we just subclass dict?! Why?

Or… Can we?

Perhaps you’ve noticed that the rule of not subclassing dict is related to the dict methods implemented in C, which would not call the built-in dict methods overloaded in Python. But what if you just want to add some functionalities to dict, leaving the methods already implemented in C untouched?

That’s a very good question. And the answer is short and simple: Yes, you can do it! You can inherit from dict; simply don’t overload dict’s methods, that’s all.

The question is, will such a dict-based class be as performant as dict? Or rather as collections.UserDict? To answer this question, we will have to run some more benchmarks.

Let’s imagine we keep some data in a dictionary, and we want to add method .summarize() that calculates some summary statistics for the data. It could be something like this (just an example):

from collections.abc import Sequence
from typing import Callable

def try_calculate(func: Callable, *args, **kwargs):
"""Try calculations; when data are incorrect, return nan."""
try:
return func(*args, **kwargs)
except TypeError:
return float("nan")

class RichDict(dict):
measures = {
"sum": sum,
"n": len,
"mean": lambda x: sum(x) / len(x),
}

def summarize(self):
statistics = {}
for k, v in self.items():
if isinstance(v, str):
statistics[k] = {"n": len(v)}
elif isinstance(v, Sequence):
statistics[k] = {
name: try_calculate(func, v)
for name, func
in self.measures.items()
}
return statistics

RichDict is a dict with one more method: .summarize(). The method does the following:

  • It iterates over key-value pairs of the data (reached via the .items method).
  • When a value is a string, statistics includes only length and is returned as a dictionary with one key, n.
  • When a value is a Sequence, the main summary statistics are calculated. The measures are kept as callables in a class attribute RichDict.measures, which is a dictionary.
  • The method protects calculations: if a measure cannot be calculated, an exception is caught and float("nan") (for not-a-number) is returned as the results of calculations. That way, Python will not throw an error when attempting to calculate the mean of an empty list, for instance.

If you want to add a measure, you can easily do it:

RichDict.measures["min"] = min
RichDict.measures["max"] = max

If the function is more complicated, you can use a lambda function:

RichDict.measures["max-to-mean"] = lambda x: max(x) / min(x)

or, preferably, first define a function and then assign it here:

def max_to_min(x: float) -> float:
return max(x) / min(x)

RichDict.measures["max-to-mean"] = max_to_min

Note that since .measures is a class attribute, all instances of RichDict (those to be created, but also those already existing) will have the extended measures, with min and max statistics.

Just one example of RichDict in action:

>>> d = RichDict(x=[1,4,5,7],
... y=[1,"1",2],
... z="Shout Bamalama!",
... f=10)
>>>
>>> stats = d.summarize()
>>> stats # doctest: NORMALIZE_WHITESPACE
{'x': {'sum': 17, 'n': 4, 'mean': 4.25, 'min': 1, 'max': 7, 'max-to-min': 7},
'y': {'sum': nan, 'n': 3, 'mean': nan, 'min': nan, 'max': nan, 'max-to-min': nan},
'z': {'n': 15}}

Above, the RichDict class has one class attribute with measures to be used for sequence data; for strings, the .summarize() methods calculates only one measure. Update the class so that it has two class attributes, measures_seq and measures_str, designed as measures above. For strings, the .summarize() method should calculate the measures the way it does for sequences, that is, using measures_str.

You will find the solution in Appendix 1.

In the code, I used the standard-library doctests module for documentation testing. If you’re interested in learning more about this interesting module, you can do so from this article:

Okay, now that we know that RichDict works, we know we can subclass dict. What we want to learn now is whether RichDict’s added functionality, defined in Python (not in C, like the base code for dict), adds some overhead time for the regular behavior of dict. To this end, we will benchmark such behavior as creating a new RichDict versus creating a new dict, key lookup, and the like.

Let’s conduct similar benchmarks to those above, conducted for UserDict. You can find the code for them in this GitHib gist. You will find there the setup value used below.

>>> compare("UserDict()", "RichDict()", "dict()", setup=setup)
{'UserDict()': (2.236e-07,), 'RichDict()': (1.073e-07,), 'dict()': (5.892e-08,)}

As shown above, when it comes to creating an empty instance, RichDict is significantly faster than UserDict (about 2 times faster) and slower than dict (about 2 times slower).

>>> compare(
... "UserDict({i: i**2 for i in range(1000)})",
... "RichDict({i: i**2 for i in range(1000)})",
... "{i: i**2 for i in range(1000)}",
... number=100_000,
... setup=setup)
{'snippet 1': (0.0001765,), # UserDict
'snippet 2': (6.845e-05,), # RichDict
'snippet 3': (5.388e-05,)} # dict

This time, RichDict is about 2.5 times faster than UserDict but a little slower (around 1.3 times) than dict.

Below, you will find examples of more benchmarks, separated by blank lines for your convenience:

>>> setup += """d = {'x': 1, 'y': 2, 'z': 3}
... ud = UserDict(d)
... rd = RichDict(d)
... """
>>> compare("ud['x']", "rd['x']", "d['x']", setup=setup)
{"ud['x']": (5.111e-08,), rd['x']": (3.024e-08,), "d['x']": (1.475e-08,)}

>>> compare(
... "'a' in ud",
... "'a' in rd",
... "'a' in d",
... setup=setup,
... number=1_000_000)
{"'a' in d": (1.366e-08,), # dict
"'a' in rd": (2.228e-08,), # RichDict
"'a' in ud": (4.436e-08,)} # UserDict

>>> compare(
... "ud.get('a', None)",
... "rd.get('a', None)",
... "d.get('a', None)",
... setup=setup,
... number=1_000_000)
{"d.get('a', None)": (1.935e-08,), # dict
"rd.get('a', None)": (3.016e-08,), # RichDict
"ud.get('a', None)": (5.125e-07,)} # UserDict

>>> compare(
... "for i, v in ud.items(): pass",
... "for i, v in rd.items(): pass",
... "for i, v in d.items(): pass",
... setup=setup,
... number=1_000
... )
{'for i, v in d.items(): pass': (0.001783,),
'for i, v in rd.items(): pass': (0.001743,),
'for i, v in ud.items(): pass': (0.01627,)}

The numbers themselves say what we need, and so I will leave you with them for the moment.

To conclude the benchmarks

RichDict is usually slower than dict (though sometimes only very little) but faster than UserDict.

Thus, if you just want to add some functionalities to dict, without overwriting its built-in methods, you definitely can subclass dict. I’d say this should be your preferred method, instead of subclassing collections.UserDict, as the latter is significantly slower. Do remember that we’re talking about the situation when you do not need to change the regular behavior of a dictionary, just to add some new behavior.

Remember also that you will have to pay a tax for using a built-in type that way: Your class (in our example, RichDict) will be slower than dict. Still, however, it will be faster than UserDict, which was created for this very aim: to let you inherit from… well, not from dict, so rather to let you create a new type (class) that will have the same behavior as dict. Using UserDict, unfortunately, is rather costly, as its performance is significantly worse than that of dict.

Let’s summarize our discussion about subclassing dict and UserDict. We’ve learned we have three options to choose from:

  1. Inherit from UserDict, when you want to overwrite the built-in behavior of dict. This will be the slowest option.
  2. Inherit from dict, when you don’t want to overwrite the built-in behavior of dict, but to add new functionalities (methods). This will be a faster option than option 1.
  3. Use the built-in dict type, without creating a custom class. If you need custom functionalities, you can implement them in functions that take a dict instance as an argument. This is the fastest option (see below).

We have not discussed the third option yet, as it does not include subclassing. Not much to discuss, though, as this is the simplest approach, one that uses a more procedural approach rather than OOP. On the one hand, using the summarize() function in this approach would be only a little (if any at all) faster than using the RichDict.summarize() method in option 2. This gist contains the corresponding benchmark code; in my machine, it provided a small but stable (from run to run) improvement in performance. On the other hand, all the other behavior of regular dict, as we know, is significantly faster than that of RichDict. Hence, generally, option 3 offers the fastest way to handle a dictionary with additional functionalities.

Thus, if performance matters, the most sensible seems to be the third option — that is, using a regular dictionary and implementing the additional required behavior in external functions. Depending on the situation, it can also be an option with the clearest code, as it does not require a custom data structure but uses a dictionary, one of the most common data structures in Python, combined with a function. More often than not, this means clearer code.

The second option means worse performance, as adding methods to dict leads to overhead time of its behavior. As we know, option 3 gets rid of this overhead by moving the method outside the dictionary.

The first option is definitely the worst one in terms of performance. I think the only situations in which this option makes sense are those in which each of the following three conditions is met:

  • performance does not matter
  • you need to overwrite one or more built-in dict methods
  • the code will be clearer and easier to use thanks to creating a class that will join all the required functionalities

¹ I am planning to write a dedicated article to positional-only and keyword-only arguments. Once it’s published, I will link it here.

² By the way, in an interactive session, a return from the function would have the same effect (of course, when the results are not assigned). I ran the benchmarks in a script, not an interactive session, however.

Solution to the exercise

You can do it in various way. The below solution avoids repetition, but also makes it easy to add yet another type (to Sequence and str) to calculations.

from collections.abc import Sequence
from typing import Callable

class RichDict(dict):
measures_seq = {
"sum": sum,
"n": len,
"mean": lambda x: sum(x) / len(x),
}
measures_str = {
"n": len,
}

def summarize(self):
statistics = {}
for k, v in self.items():
if isinstance(v, str):
measures = self.measures_str
elif isinstance(v, Sequence):
measures = self.measures_seq
else:
continue
statistics[k] = {
name: try_calculate(func, v)
for name, func
in measures.items()
}
return statistics

NaN = float("nan")

def try_calculate(func: Callable, *args, **kwargs):
"""Try calculations and when the data are incorrect, return nan."""
try:
return func(*args, **kwargs)
except TypeError:
return NaN

FOLLOW US ON GOOGLE NEWS

Read original article here

Denial of responsibility! Techno Blender is an automatic aggregator of the all world’s media. In each content, the hyperlink to the primary source is specified. All trademarks belong to their rightful owners, all materials to their authors. If you are the owner of the content and do not want us to publish your materials, please contact us by email – [email protected]. The content will be deleted within 24 hours.

Leave a comment