
How To Improve The Performance of Python Functions



Speeding up frequently called functions in Python

Photo by Esteban Lopez on Unsplash

In today’s world, where the amount of data being processed is growing at an unprecedented rate, having efficient and optimized code has become more important than ever. Python, being a popular programming language, offers several built-in tools to optimize the performance of your code. One of these tools is the lru_cache decorator, which can be used to cache the results of a function, thereby improving its performance.

In this article, we will explore the benefits of using the lru_cache decorator in Python, and how it can help you write faster and more efficient code for frequently called functions.

Caching in a nutshell

Caching is a widely adopted approach to enhance the performance of computer programs. It involves temporarily storing information that is costly to compute or frequently accessed within a specified time frame. Through caching, developers can efficiently store the results of previous computations and reduce the time and computational resources required to recompute them. This process can considerably improve the response time and overall performance of the system.

Caching can be implemented using various data structures such as arrays, hash tables, and linked lists, and can also be managed through different policies such as Least Recently Used (LRU) and First In First Out (FIFO).

Managing the size of a cache can be a crucial aspect of its implementation. Failure to put in place a mechanism that regularly shrinks the cache could result in an ever-increasing memory usage, leading to an eventual application crash. To overcome this challenge, programmers must carefully consider the appropriate policy or strategy that best suits their specific requirements.

In some cases, adaptive policies, which adapt to changing workload patterns, may be more suitable. Moreover, cache size management can be done through different techniques, such as setting a maximum size, implementing time-based expiry, or using a hybrid approach that combines multiple policies.
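
To make one of these techniques concrete, below is a minimal, hypothetical sketch of time-based expiry: results are stored together with a timestamp and recomputed once they are older than a fixed ttl_seconds. The names (ttl_cache, load_config) are illustrative only, and the sketch is intentionally simplified: it performs no cleanup of stale entries that are never requested again and is not thread-safe.

import time
from functools import wraps

def ttl_cache(ttl_seconds):
    """Cache results for a fixed time window (simplified illustration)."""
    def decorator(func):
        cache = {}  # maps argument tuples to (timestamp, result)

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            if args in cache:
                stored_at, result = cache[args]
                if now - stored_at < ttl_seconds:
                    return result  # entry is still fresh: reuse it
            result = func(*args)  # entry is missing or stale: recompute
            cache[args] = (now, result)
            return result
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=60)
def load_config(path):
    # Stand-in for an expensive or I/O bound operation
    with open(path) as f:
        return f.read()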

Ultimately, selecting an appropriate cache management policy requires careful consideration of factors such as performance, memory usage, and the specific needs of the application. Overall, caching is an effective technique that allows programmers to optimize their code and improve the user experience.

Creating Least Recently Used cache with lru_cache decorator

The lru_cache decorator is part of the language’s standard library and can be imported from the functools module.

The decorator uses a strategy called Least Recently Used (LRU) to prioritize objects that were used most recently. Whenever an object is accessed using a cache that uses the LRU strategy, the cache places it at the top while shifting all the other objects down one position. However, if the cache reaches its maximum size, the least recently used object is removed to create room for new objects. This ensures that frequently used objects stay in the cache while infrequently used ones are gradually pushed out.

.. frequently used objects stay in the cache while infrequently used ones are gradually pushed out

Caches implementing LRU come in handy when dealing with computationally expensive or I/O bound functions that are repeatedly called with the same arguments.

The lru_cache decorator in Python operates by maintaining a thread-safe cache in the form of a dictionary. Upon invocation of the decorated function, a new entry is created in the cache, where the function call arguments correspond to the dictionary key and the returned value to the dictionary value. Subsequent calls to the function with the same arguments will not compute a new result; rather, the cached value will be retrieved. This feature avoids redundant computations, thereby improving the overall performance.
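
To make these mechanics concrete, here is a deliberately simplified sketch of how such a cache could be built on top of an OrderedDict: the key is derived from the call arguments, a hit moves the entry to the “most recently used” end, and a miss that pushes the cache past maxsize evicts the oldest entry. This is an illustration only (the name simple_lru_cache is mine); the real functools implementation is thread-safe and also handles keyword arguments and the typed flag.

from collections import OrderedDict
from functools import wraps

def simple_lru_cache(maxsize=128):
    """A stripped-down illustration of LRU caching, not a replacement for functools.lru_cache."""
    def decorator(func):
        cache = OrderedDict()  # ordered from least to most recently used

        @wraps(func)
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)    # cache hit: mark as most recently used
                return cache[args]
            result = func(*args)           # cache miss: compute the value
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)  # evict the least recently used entry
            return result
        return wrapper
    return decorator

@simple_lru_cache(maxsize=3)
def add_numbers(a, b):
    return a + b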

Since a dictionary is used to cache results, the positional and keyword arguments to the function must be hashable

– Python Documentation

The size of the lru_cache can be adjusted through the maxsize argument, which defaults to 128. This value specifies the maximum number of entries that the cache can hold at any given time. Once the cache reaches its maximum size and a new call is made, the decorator discards the least recently used entry to make room for the most recent one. This ensures that the cache does not grow beyond the specified limit, preventing excessive memory consumption. If maxsize is set to None, the LRU feature is disabled, which means the cache can grow indefinitely.
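
The effect of maxsize can be observed through the cache_info() method that lru_cache attaches to the decorated function; it reports the number of hits and misses, the configured maxsize, and the current number of cached entries. A small sketch (the function square is just an illustrative example):

from functools import lru_cache

@lru_cache(maxsize=2)
def square(n):
    return n * n

square(1)  # miss
square(2)  # miss
square(1)  # hit
square(3)  # miss: the cache is full, so the least recently used entry (2) is evicted
square(2)  # miss again, because it was evicted

print(square.cache_info())
# CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)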

It is important to note that when two function calls have the same keyword arguments but in a different order, two separate cache entries will be created. For instance, calling func(x=10, y=5) and func(y=5, x=10) will result in two distinct entries in the cache, even if both calls return the same value.
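
We can confirm this with cache_info(): two calls that differ only in keyword order register as two misses and produce two cache entries. (Using @lru_cache without parentheses, as below, requires Python 3.8 or newer.)

from functools import lru_cache

@lru_cache
def func(x, y):
    return x + y

func(x=10, y=5)
func(y=5, x=10)  # same arguments, different keyword order

print(func.cache_info())
# CacheInfo(hits=0, misses=2, maxsize=128, currsize=2)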

Furthermore, the lru_cache decorator accepts another argument called typed that defaults to False.

If typed is set to true, function arguments of different types will be cached separately. If typed is false, the implementation will usually regard them as equivalent calls and only cache a single result.

Python Documentation

For instance, when typed=True, calling func(a=4) (where a holds an integer) and func(a=4.) (where a now holds a float) will create two distinct entries in the cache. Note, however, that some types such as str and int may be cached in separate entries even when typed is set to False.
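
A quick sketch of the typed behaviour, again using cache_info() to inspect the cache (the function double is illustrative only):

from functools import lru_cache

@lru_cache(maxsize=None, typed=True)
def double(a):
    return a * 2

double(4)    # int argument: one cache entry
double(4.0)  # float argument: a separate entry, because typed=True

print(double.cache_info())
# CacheInfo(hits=0, misses=2, maxsize=None, currsize=2)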

Now let’s assume we have a very simple Python function that accepts two arguments and returns their sum

def add_numbers(a, b):
    return a + b

Now let’s assume that the function is decorated with the lru_cache and gets called several times, as outlined below:

from functools import lru_cache

@lru_cache(maxsize=3)
def add_numbers(a, b):
    return a + b

add_numbers(a=10, b=15)
add_numbers(a=10, b=10)
add_numbers(a=3, b=15)
add_numbers(a=20, b=21)
add_numbers(a=3, b=15)

The image below illustrates how an LRU cache with maxsize=3 is maintained over time: how it behaves on a cache hit or a cache miss, and what happens once the cache reaches its maximum size.

LRU cache will prioritize most recently used function calls — Source: Author
  1. Initially, when the function is called with arguments a=10 and b=15, the cache is empty and a cache miss occurs. The function is executed, and both the input arguments and the resulting value are stored in the cache.
  2. On the next call, when the function is called with arguments a=10 and b=10, a cache miss occurs again since this combination of arguments is not found in the cache. The function is executed, and a new entry is created in the cache and placed at the top.
  3. Subsequently, when the function is called with arguments a=3 and b=15, another cache miss occurs. The input arguments and the resulting value are stored in the cache after the function is executed.
  4. On the next call, when the function is called with arguments a=20 and b=21, a cache miss occurs, and the cache has reached its maximum size. This means that the least recently used entry is removed, and the remaining entries are shifted down by one position. After executing the function, the latest entry is added to the cache and placed at the top.
  5. Finally, when the function is called again with arguments a=3 and b=15, a cache hit occurs since this entry is now stored in the cache. The returned value is fetched from the cache instead of executing the function. The cache entry is then moved to the top.
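
The same sequence of events can be verified programmatically with cache_info(): after the five calls there should be four misses (the first four calls), one hit (the repeated call with a=3 and b=15), and three entries left in the cache.

from functools import lru_cache

@lru_cache(maxsize=3)
def add_numbers(a, b):
    return a + b

add_numbers(a=10, b=15)
add_numbers(a=10, b=10)
add_numbers(a=3, b=15)
add_numbers(a=20, b=21)
add_numbers(a=3, b=15)

print(add_numbers.cache_info())
# CacheInfo(hits=1, misses=4, maxsize=3, currsize=3)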

How and when to use lru_cache decorator

Having gained a solid understanding of cache and the LRU caching strategy, it’s time to delve deeper into lru_cache and learn how to utilize it to its fullest potential, while also assessing its impact.

For the purposes of this tutorial, we will examine a more complex function that will greatly benefit from caching, and from the lru_cache decorator in particular. Our fibonacci function can be used to compute Fibonacci numbers.

In mathematics, the Fibonacci numbers, commonly denoted Fn , form a sequence, the Fibonacci sequence, in which each number is the sum of the two preceding ones. The sequence commonly starts from 0 and 1, although some authors start the sequence from 1 and 1 or sometimes (as did Fibonacci) from 1 and 2. Starting from 0 and 1, the first few values in the sequence are:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144.

Wikipedia

It is worth noting that the function below computes the Fibonacci number recursively, without any optimization. Although we could optimize the solution further by adding memoization by hand, that is beyond the scope of this tutorial. For the sake of simplicity, I will keep things straightforward so that even those with less experience can follow along.

The fact that the implementation is not fully optimized serves our purpose well, as it demonstrates how lru_cache can enhance the performance of computationally expensive function calls.

def fibonacci(n):
    if n < 0:
        raise ValueError('Invalid input')

    # Base case
    if n <= 1:
        return n

    return fibonacci(n - 1) + fibonacci(n - 2)

To evaluate the time it takes for our function to produce results, we can use the timeit module. We’ll time fibonacci(30) with timeit.repeat: each timing executes the call number=5 times, the measurement is repeated 5 times (the default), and we keep the minimum timing.

[When measuring execution time] Use the min() rather than the average of the timings. That is a recommendation from me, from Tim Peters, and from Guido van Rossum. The fastest time represents the best an algorithm can perform when the caches are loaded and the system isn’t busy with other tasks. All the timings are noisy — the fastest time is the least noisy. It is easy to show that the fastest timings are the most reproducible and therefore the most useful when timing two different implementations.

— Raymond Hettinger on StackOverflow

# module test.py

from timeit import repeat

setup = """
def fibonacci(n):
    if n < 0:
        raise ValueError('Invalid input')

    # Base case
    if n <= 1:
        return n

    return fibonacci(n - 1) + fibonacci(n - 2)
"""

min_timing = min(repeat(setup=setup, stmt='fibonacci(30)', number=5))
print(f'Min Timing: {round(min_timing, 2)}s')

In essence, our script measures the time elapsed to compute the 30th Fibonacci number with fibonacci. Out of the 5 repetitions, the minimum time it took on my machine was 1.39 seconds.

$ python3 test.py 
Min Timing: 1.39s

The fibonacci function is deterministic, meaning that it will always produce the same output given the same input. Therefore, we can take advantage of the concept of caching. By adding the @lru_cache decorator, the function’s output for a given input is now cached, and if the function is called again with the same input, it will return the cached result instead of recomputing it. This can significantly speed up the function’s execution time, particularly when called repeatedly with the same input values.

from functools import lru_cache

@lru_cache
def fibonacci(n):
    if n < 0:
        raise ValueError('Invalid input')

    # Base case
    if n <= 1:
        return n

    return fibonacci(n - 1) + fibonacci(n - 2)

Now let’s time the new version of the fibonacci function:

# module test.py

from timeit import repeat

setup = """
from functools import lru_cache

@lru_cache
def fibonacci(n):
    if n < 0:
        raise ValueError('Invalid input')

    # Base case
    if n <= 1:
        return n

    return fibonacci(n - 1) + fibonacci(n - 2)
"""

min_timing = min(repeat(setup=setup, stmt='fibonacci(30)', number=5))
print(f'Min Timing: {min_timing}s')

On my machine, the minimum timing was close to zero. fibonacci(30) was executed only during the first iteration. For the subsequent iterations, the cached result of this function call was retrieved, which is an inexpensive operation.

$ python3 test.py 
Min Timing: 7.790978997945786e-06s
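
Part of the reason the timing above is so low is that the result of the very first call survives across the remaining timing iterations. If we want each measurement to start from an empty cache, we can call the cache_clear() method that lru_cache provides before each invocation. Even then, the cached version should remain dramatically faster than the original, because every recursive subproblem within a single call is computed only once. The sketch below is a hypothetical variant of test.py; no exact timings are claimed for it.

# module test_cold.py

from timeit import repeat

setup = """
from functools import lru_cache

@lru_cache
def fibonacci(n):
    if n < 0:
        raise ValueError('Invalid input')

    # Base case
    if n <= 1:
        return n

    return fibonacci(n - 1) + fibonacci(n - 2)
"""

# Clearing the cache before each call measures a cold call rather than a cached lookup.
stmt = """
fibonacci.cache_clear()
fibonacci(30)
"""

min_timing = min(repeat(setup=setup, stmt=stmt, number=5))
print(f'Min Timing (cold calls): {min_timing}s')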

Final Thoughts

In conclusion, optimizing the performance of your code has become more important than ever in today’s data-driven world. One way to achieve this in Python is by using the lru_cache decorator, which can cache the results of frequently called functions and improve their performance.

Caching, in general, involves temporarily storing information that is costly to compute or frequently accessed, reducing the time and computational resources required to recompute them. The decorator uses a strategy called Least Recently Used (LRU) to prioritize objects that were used most recently and discard the least recently used objects once the cache reaches its maximum size.

The lru_cache decorator is an efficient tool for optimizing Python code performance by caching the results of computationally expensive or I/O-bound functions.
