
Decorator Tricks for Data Scientists | by Diego Barba | Jun, 2022



If you are not using Python decorators yet, you should. Pure syntactic sugar.


I remember the first time I saw an “@” sign on top of a function in Python code. I felt compelled to research what this weird syntax was. It marked a before and after, that is for sure. The “@” sign on top of a function applies a decorator: a function that takes the function it decorates as its argument.

You can spend years as a data scientist and not use decorators. Or maybe you have used them but have not learned how to code your own. This story builds concrete decorator examples that are useful in many data science tasks and, in the process, shows how to code decorators step by step.

The primary purposes for decorators are:

  • add functionality to existing functions and classes
  • serve as a note left in the code when doing something that might not otherwise pass a code review

This story will focus on the former, adding functionality to existing functions and classes.

To illustrate this point, imagine that you have many functions that make queries to a database. After coding the functions to query everything you need from the database, you realize that one of every ten attempts fails due to some random connection error. We all know how it is to work with databases.

You will need to add some retry logic to all the functions. Said logic will take shape as a loop for several attempts with possibly some waiting time between them — boilerplate code. However, the essential logic is already coded in the functions as they are.

So you have two options:

  • Add the boilerplate to all functions; this will require copy-pasting on an industrial scale and altering each function’s logic to accommodate the retry. Messy and smelly code.
  • Code a retry decorator and add it on top of the database functions (one line of code); otherwise, leave the functions’ logic intact. Clean and elegant code.

I certainly know which option I would choose.

The story is structured as follows:

  • Decorator basics
  • The retry decorator
  • Class registry decorator
  • Periodic execution decorator
  • Final Words

Decorator basics

The simplest decorator is a function that takes the function it decorates as the argument and returns another function which in turn returns whatever the decorated function is supposed to return. Tongue twister.

It is easier to learn by example:
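A minimal sketch of such a decorator, using the names the text refers to (decorator_no_info, _wrapper, func, RT). The exact body is an assumption; only the names and the printed message come from the story.

```python
from typing import Callable, TypeVar

RT = TypeVar("RT")  # return type of the decorated function


def decorator_no_info(func: Callable[..., RT]) -> Callable[..., RT]:
    """Wrap func without preserving its name or docstring."""

    def _wrapper(*args, **kwargs) -> RT:
        print("this is the decorator wrapper")
        # Call the decorated function and hand its result straight back.
        return func(*args, **kwargs)

    # The decorator returns the wrapper, which replaces the original function.
    return _wrapper
```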

What decorator_no_info actually returns is the _wrapper function. That is the essence of decorators: replacing one function with another. The _wrapper function takes whatever arguments and keyword arguments the decorated function (func) takes and returns whatever func returns. An additional print in _wrapper shows you that the code block ran.

Regarding the type hints, the purpose of RT (return type) is to show that whatever the decorated function returns is also returned by the decorator.

Let’s take a look at an example of the decorator:
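A sketch of how the decorator might be applied. The function test_deco is named in the text; its body (returning 2.0) is inferred from the printed output and is an assumption.

```python
from typing import Callable, TypeVar

RT = TypeVar("RT")


def decorator_no_info(func: Callable[..., RT]) -> Callable[..., RT]:
    def _wrapper(*args, **kwargs) -> RT:
        print("this is the decorator wrapper")
        return func(*args, **kwargs)

    return _wrapper


@decorator_no_info
def test_deco() -> float:
    return 2.0


print(test_deco())         # runs _wrapper, which calls the original function
print(test_deco.__name__)  # the name now belongs to the wrapper
```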

prints:

this is the decorator wrapper
2.0
_wrapper

Here we see that the _wrapper ran and returned what test_deco was supposed to return. So far, so good. However, the decorated function’s name has changed! It is _wrapper instead of test_deco!

To fix that, we use the wraps decorator from the functools module. It copies the original function’s metadata (its name and docstring, among others) onto the _wrapper function:
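A sketch of the same decorator with functools.wraps applied. The decorator name used here (decorator) is a placeholder chosen for illustration, not necessarily the name in the original code.

```python
from functools import wraps
from typing import Callable, TypeVar

RT = TypeVar("RT")


def decorator(func: Callable[..., RT]) -> Callable[..., RT]:
    @wraps(func)  # copy __name__, __doc__, etc. from func onto _wrapper
    def _wrapper(*args, **kwargs) -> RT:
        print("this is the decorator wrapper")
        return func(*args, **kwargs)

    return _wrapper
```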

Now, if we run the examples, we see that the names are correct and that arguments from the decorated function are passed to the _wrapper function.

prints:

this is the decorator wrapper
2.0
test_deco
this is the decorator wrapper
5.0
test_deco_add

Next on the list is a decorator that can take arguments. In this case, we create an outer function that will take the decorator arguments and return the same decorator as in the case of no arguments:
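A sketch of the three-layer structure. The argument name deco_arg_str comes from the printed output; the outer function name decorator_with_args and the inner name _decorator are placeholders.

```python
from functools import wraps
from typing import Callable, TypeVar

RT = TypeVar("RT")


def decorator_with_args(
    deco_arg_str: str,
) -> Callable[[Callable[..., RT]], Callable[..., RT]]:
    # Outer layer: receives the decorator arguments.
    def _decorator(func: Callable[..., RT]) -> Callable[..., RT]:
        # Middle layer: the same decorator as before, receiving the function.
        @wraps(func)
        def _wrapper(*args, **kwargs) -> RT:
            # Inner layer: can see both the decorator argument and func.
            print(f"this is the decorator _wrapper, {deco_arg_str=}")
            return func(*args, **kwargs)

        return _wrapper

    return _decorator
```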

For example:

prints:

this is the decorator _wrapper, deco_arg_str='foo'
5.0
test_deco_add

It works as expected: the decorator argument is passed to the _wrapper function.

The retry decorator

At last, the first example. Here we implement the retry decorator discussed in the introduction. The decorator structure is the same as in the simpler examples of the previous section; the only addition is the retry logic.

This decorator executes the decorated function and returns whatever it is supposed to return, but it catches a given exception type and retries up to n_retry times. The sleep time increases with each retry.

We use a logger to know whether there was a retry attempt. A logger can be provided as an argument, or a new logger will be created if none is supplied.
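A minimal sketch of a retry decorator along these lines. The names ExceptionToRetry, n_retry and logger come from the text; the decorator name retry, the sleep_s parameter and the linear back-off are assumptions.

```python
import logging
import time
from functools import wraps
from typing import Callable, Optional, Type, TypeVar

RT = TypeVar("RT")


def retry(
    ExceptionToRetry: Type[Exception],
    n_retry: int = 3,
    sleep_s: float = 1.0,  # hypothetical parameter: base wait in seconds
    logger: Optional[logging.Logger] = None,
) -> Callable[[Callable[..., RT]], Callable[..., RT]]:
    # Fall back to a module-level logger when none is supplied.
    log = logger if logger is not None else logging.getLogger(__name__)

    def _decorator(func: Callable[..., RT]) -> Callable[..., RT]:
        @wraps(func)
        def _wrapper(*args, **kwargs) -> RT:
            for attempt in range(1, n_retry + 1):
                try:
                    return func(*args, **kwargs)
                except ExceptionToRetry:
                    log.exception("attempt %d of %d failed", attempt, n_retry)
                    time.sleep(sleep_s * attempt)  # wait grows with each retry
            # Retries exhausted: one last call whose exception propagates.
            return func(*args, **kwargs)

        return _wrapper

    return _decorator
```

Any exception that is not an instance of ExceptionToRetry is not caught and propagates immediately.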

In this example, we want to catch ValueError. The decorator catches and logs the exception on the first two attempts without crashing; on the attempt after that, the exception is raised and the program crashes.

In this example, the function raises ValueError and crashes on the first attempt, because the decorator is set to catch RuntimeError only.
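The two cases above can be sketched together as follows. The retry decorator is a compact version of the kind described in this section; the function names, the call counter and the sleep_s parameter are hypothetical, and the final exception is caught here only so that the script keeps running.

```python
import logging
import time
from functools import wraps

log = logging.getLogger(__name__)


def retry(ExceptionToRetry, n_retry=3, sleep_s=1.0):
    def _decorator(func):
        @wraps(func)
        def _wrapper(*args, **kwargs):
            for attempt in range(1, n_retry + 1):
                try:
                    return func(*args, **kwargs)
                except ExceptionToRetry:
                    log.exception("attempt %d failed", attempt)
                    time.sleep(sleep_s * attempt)
            return func(*args, **kwargs)  # final attempt, may raise

        return _wrapper

    return _decorator


calls = {"retried": 0, "not_retried": 0}


@retry(ValueError, n_retry=2, sleep_s=0.01)
def fails_and_is_retried() -> None:
    calls["retried"] += 1
    raise ValueError("simulated failure")


@retry(RuntimeError, n_retry=2, sleep_s=0.01)
def fails_and_is_not_retried() -> None:
    calls["not_retried"] += 1
    raise ValueError("simulated failure")  # not a RuntimeError: no retry


for fn in (fails_and_is_retried, fails_and_is_not_retried):
    try:
        fn()
    except ValueError:
        print(f"{fn.__name__} finally raised ValueError")

print(calls)  # {'retried': 3, 'not_retried': 1}
```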

To catch all exceptions (excluding system-exiting ones), use Exception as ExceptionToRetry, since all non-system-exiting exceptions are subclasses of Exception.

Class registry decorator

The second example is a class decorator. Yes, classes can be decorated too, not just functions. To decorate a class, the decorator returns a function, but this function (_wrapper in our case) returns the class itself rather than an instance of it.

This particular example implements a class registry. It is helpful for plugins or interface implementations. Sometimes, we do not know about the existence of objects until runtime.

Think about a plugin object; the object adheres to an interface, but the main program does not know about the plugins, nor does it need them to run. The main program loads whatever plugins are available. Hence we can add as many plugins as we want without changing the main code.

Without further ado, here is the class registry decorator example:
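A minimal sketch of a registry decorator of this kind. The names register, query_registry and _REGISTRY are hypothetical; only _wrapper and the tuple-based indexing come from the text.

```python
from typing import Callable, Dict, Tuple, Type, TypeVar

T = TypeVar("T")

# Maps (registry name, class name) -> class object.
_REGISTRY: Dict[Tuple[str, str], type] = {}


def register(registry_name: str) -> Callable[[Type[T]], Type[T]]:
    def _wrapper(cls: Type[T]) -> Type[T]:
        # Registering the same class name twice in one registry overwrites
        # the earlier entry instead of creating a duplicate.
        _REGISTRY[(registry_name, cls.__name__)] = cls
        return cls  # return the class itself, not an instance

    return _wrapper


def query_registry(registry_name: str) -> Dict[str, type]:
    """Return {class name: class} for every class in the given registry."""
    return {
        cls_name: cls
        for (reg_name, cls_name), cls in _REGISTRY.items()
        if reg_name == registry_name
    }
```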

The decorator only takes the name of the registry as the argument. To avoid duplicates, the registry is indexed by a tuple of the registry’s name and the class’s name.

Additionally, there is a function to query the registry (this is how the main program would know about plugins).

For example, let’s add these classes to the registry:

We query the registry:
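A usage sketch that registers two classes and then queries the registry. The class names Foo and Baz appear in the output; the registry name "plugins", the helper names and the exact queries are assumptions.

```python
from typing import Dict, Tuple

_REGISTRY: Dict[Tuple[str, str], type] = {}


def register(registry_name: str):
    def _wrapper(cls):
        _REGISTRY[(registry_name, cls.__name__)] = cls
        return cls

    return _wrapper


def query_registry(registry_name: str) -> Dict[str, type]:
    return {
        cls_name: cls
        for (reg_name, cls_name), cls in _REGISTRY.items()
        if reg_name == registry_name
    }


@register("plugins")
class Foo:
    pass


@register("plugins")
class Baz:
    pass


plugins = query_registry("plugins")
print(plugins["Foo"].__name__)  # look up a single class by name
print(plugins)                  # every class registered under "plugins"
```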

prints:

Foo
{'Foo': <class '__main__.Foo'>, 'Baz': <class '__main__.Baz'>}

The classes are there. We are golden.

Periodic execution decorator

The last example is a periodic function decorator. Sometimes we want to add some scheduling logic to a function within our application, for example, to make an HTTP request that fetches some data every n minutes.

As with the retry decorator, we do not want to mess with the logic of the actual function by adding the scheduling logic. So instead, we create a decorator that does it for us. It’s cleaner.

The only requirement is that the decorated function is void, i.e., it returns None. We will be executing the code inside it and are not concerned with the function’s return value.
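A minimal sketch of a periodic decorator built on the threading module. The names periodic, period_s and max_runs are hypothetical; max_runs and returning the thread from the wrapper are additions made here so the loop can be stopped and waited on.

```python
import threading
import time
from functools import wraps
from typing import Callable, Optional


def periodic(period_s: float, max_runs: Optional[int] = None):
    def _decorator(func: Callable[..., None]) -> Callable[..., threading.Thread]:
        @wraps(func)
        def _wrapper(*args, **kwargs) -> threading.Thread:
            def _loop() -> None:
                runs = 0
                # Run forever when max_runs is None, else stop after max_runs.
                while max_runs is None or runs < max_runs:
                    func(*args, **kwargs)
                    runs += 1
                    time.sleep(period_s)

            # Daemon thread: it does not keep the interpreter alive on exit.
            thread = threading.Thread(target=_loop, daemon=True)
            thread.start()
            return thread  # returned so the caller can join() if desired

        return _wrapper

    return _decorator
```

This simple loop sleeps a fixed period after each run, so the interval drifts by however long the function itself takes.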

As you can see, we use the threading module to run the scheduled function in a separate thread and not block the main thread. This way, you can run the function, and the scheduling loop will not block:
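A usage sketch under the same assumptions (periodic, period_s and max_runs are hypothetical names). The period is shortened to 0.1 seconds and the run count capped at five so the script finishes quickly.

```python
import threading
import time
from functools import wraps
from typing import Optional


def periodic(period_s: float, max_runs: Optional[int] = None):
    def _decorator(func):
        @wraps(func)
        def _wrapper(*args, **kwargs) -> threading.Thread:
            def _loop() -> None:
                runs = 0
                while max_runs is None or runs < max_runs:
                    func(*args, **kwargs)
                    runs += 1
                    time.sleep(period_s)

            thread = threading.Thread(target=_loop, daemon=True)
            thread.start()
            return thread

        return _wrapper

    return _decorator


@periodic(period_s=0.1, max_runs=5)
def foo() -> None:
    print("foo", time.time())


thread = foo()            # returns immediately; the loop runs in the background
print("after periodic")   # printed while the loop is still running
thread.join()             # wait here only so the script does not exit early
```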

prints:

foo 1656078663.629107
after periodic
foo 1656078664.6331909
foo 1656078665.6351948
foo 1656078666.635317
foo 1656078667.638187

As you can see, “after periodic” gets printed before the remaining periodic executions of the function, i.e., the call is non-blocking.

Final Words

This story was a quick tour of decorators and some applications in data science projects. Hopefully, you will code more of your own decorators and use them in your code. I strongly encourage you to start playing with them until you are confident enough to use them in production code.

