
Cython for absolute beginners: 30x faster code in two simple steps | by Mike Huls | May, 2022



Easy Python code compilation for blazingly fast applications

Let’s speed up our code (image by Abed Ismail on Unsplash)

Python is very easy to work with: its clear syntax, the interpreter and duck typing allow you to develop very quickly. There are some downsides though: because you don’t have to adhere to strict rules such as declaring types, Python has to do extra work at run-time to get your code to run. As a result some functions execute very slowly, because Python has to do all those checks again and again.

Combine the ease and speed of developing with Python with the speed of C: the best of both worlds

In this article we’ll take a “slow” function from a vanilla Python project and make it 30x faster. We do this by using a package called Cython that will convert our Python-code to a compiled, superfast piece of C-code that we can directly import into our project again.

A package called CythonBuilder will automatically Cythonize Python code for us in just 2 steps. With CythonBuilder we’ll Cythonize a function from an example project we’ll define below. Let’s code!

For those unfamiliar with Cython and CythonBuilder we’ll answer some exploratory questions. Then we’ll define our example project. We’re going to be using the command line, so read up on that if you’re unfamiliar.

What is Cython / why use Cython?

Cython converts Python code to C code, which is then compiled to a file that contains instructions for the CPU. The Python interpreter doesn’t have to perform as many checks on this file anymore; it can just run it. This results in a major performance increase. Check the article below for more detailed information on how Python works under the hood and how it compares to C.

When you Cythonize a piece of code you add extra information to your code, for example by defining types. Then the code is compiled so that Python doesn’t have to perform the extra checks. Again, check the article above for a more in-depth analysis.

How does Cython work?

Just like you write Python code in a .py file, you write Cython code in a .pyx file. Cython will then convert these files to either a .so file or a .pyd file (depending on your OS). These files can be directly imported into your Python project again:

Cythonizing a pyx file (image by author)

Can all code be optimized by compiling?

Not all code is better off compiled. Awaiting the response of an API, for example, is not any faster in a C package, because the time is spent waiting rather than calculating. In short: we focus on _CPU-heavy_ tasks that require a lot of calculating. Read more in the article below for a clearer distinction.

CythonBuilder — automating Cythonizing

How do you actually Cythonize your .pyx files? Doing it by hand is fairly involved: you have to create a setup.py, define all your packages and then run some commands (see the article below). Instead, we’re going to use CythonBuilder, a package that automates all of this for us and builds your .pyx files with one command!

Your code after the two steps (image by Sammy Wong on Unsplash)

Our project contains a function that, for some reason, calculates a number of primes. This function takes a lot of computations that we can optimize.
Let’s first install CythonBuilder with pip install cythonbuilder and then define the regular prime-calculating function.

Preparation — the vanilla Python prime-calculation function

The function is pretty simple: we’ll pass a number to the function and it returns the number of prime numbers between 0 and the target number:
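The original listing is not included in this copy of the article, so here is a minimal sketch of what such a function could look like. It assumes plain trial division; the names count_primes and limit are taken from the text further down, while count, num and divisor are hypothetical.

```python
def count_primes(limit: int) -> int:
    """Count the primes in the range 0..limit (inclusive) by trial division."""
    count = 0
    # 0 and 1 are not prime, so start checking at 2
    for num in range(2, limit + 1):
        # deliberately naive: try every possible divisor below num
        for divisor in range(2, num):
            if num % divisor == 0:
                break  # found a divisor, so num is not prime
        else:
            # the inner loop finished without a break: num is prime
            count += 1
    return count


print(count_primes(100))
```

The inner loop is intentionally unoptimized (it does not stop at the square root, for instance), which matches the goal of having a function that performs a lot of calculations.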

This function is pure Python. It could be optimized a bit more but the goal is to have a function that performs a lot of calculations. Let’s check out how long it takes this function to find the number of primes between 0 and 100,000:

PurePython: 29.4883812 seconds
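A measurement like the one above could be taken with a small harness along these lines. This is only a sketch: the article does not show how the timings were produced, so the helper time_call is hypothetical, and count_primes is a stand-in trial-division implementation. The limit is kept small here so the example finishes quickly.

```python
import time


def count_primes(limit):
    """Stand-in trial-division prime counter (an assumption, not the article's listing)."""
    count = 0
    for num in range(2, limit + 1):
        for divisor in range(2, num):
            if num % divisor == 0:
                break
        else:
            count += 1
    return count


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


# use 100_000 to mirror the article; expect that to take much longer
result, elapsed = time_call(count_primes, 5_000)
print(f"PurePython: {elapsed:.7f} seconds ({result} primes found)")
```

time.perf_counter is used because it is a monotonic, high-resolution clock intended for measuring short durations. Absolute numbers will differ per machine.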

Step 1. — Cythonize

In this part we’ll introduce Cython. We’ll copy the code of our function and save it into a file called cy_count_primes.pyx (notice the .pyx).

Next we cd projectfolder and call cythonbuilder build. This will find all of the pyx-files in the projectfolder and build them. The result is a .pyd file on Windows or a .so file on Linux. This file is a compiled version of our Python function that we can directly import in our project:

from someplace.cy_count_primes import count_primes
print(count_primes(100_000))

Let’s check out how it performs:

PurePython: 29.4883812 seconds
CyPython: 14.0540504 seconds (2.0982x faster than PurePython)

Already over 2 times faster! Notice that we haven’t actually changed anything in our Python code. Let’s optimize the code.

Interfaces:
As you’ll notice, even your IDE can inspect the imported files. It knows which functions are present and which arguments are required even though the file is compiled. This is possible because CythonBuilder also builds .pyi files; these are interface files that provide the IDE with information about the compiled files.

Step 2 — add types

In this part we add types to the cy_count_primes.pyx file and then build it again:

As you see we define our function with cpdef (accessible from both C and Python), tell it to return an int (the int before count_primes) and declare that it expects the limit argument to be an int.

Next, on lines 2, 3 and 4, we define the types for some of the variables that are used in the loops; nothing fancy.
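The typed listing is missing from this copy of the article, so the following is a reconstruction sketch based on the description above, not the author’s exact code. The cpdef signature with an int return type and an int limit argument comes from the text; the variable names count, num and divisor are hypothetical. It would be saved as cy_count_primes.pyx.

```cython
cpdef int count_primes(int limit):
    cdef int count = 0   # number of primes found so far
    cdef int num         # candidate number being tested
    cdef int divisor     # trial divisor for the candidate
    for num in range(2, limit + 1):
        for divisor in range(2, num):
            if num % divisor == 0:
                break    # num has a divisor, so it is not prime
        else:
            count += 1   # no divisor found: num is prime
    return count
```

Because the loop variables are declared as C integers, Cython can translate both range loops into plain C loops instead of iterating over Python objects.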

Now we can run cythonbuilder build again and time our function again:

PurePython: 29.4883812 seconds
CyPython: 14.0540504 seconds (2.0982x faster than PurePython)
Cy+Types: 1.1600970 seconds (25.419x faster than PurePython)

That’s a very impressive speedup!
The reason why this is so much faster is not within the scope of this article, but it has to do with the way Python stores its variables in memory. That is pretty inefficient compared to C, so our C-compiled code can run much faster. Check out this article that dives deep into how Python and C differ from each other under the hood (and why C is so much faster).

Bonus — compilation options

We’ve already improved code execution by 25x but I think we can squeeze a bit more out of it. We’ll do this with compiler directives. These take a little bit of explanation:

Because Python is an interpreted language it has to perform a lot of checks at run-time, for example whether your program divides by zero. C, a compiled language, does far less of this while the program runs: type errors are caught when compiling, and many other checks (such as for division by zero or out-of-range indexes) are simply not performed. The advantage is that your program can run more efficiently since it doesn’t have to perform these checks at run-time.
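The run-time checks described above are easy to see in plain Python. The sketch below is purely illustrative; the helper names divide and item_at are hypothetical and not part of the article’s project.

```python
def divide(a, b):
    """Return a / b, or None when Python's run-time check rejects b == 0."""
    try:
        return a / b
    except ZeroDivisionError:
        # Python checked the divisor at run-time and raised an exception
        return None


def item_at(items, index):
    """Return items[index], or None when the index is out of range."""
    try:
        return items[index]
    except IndexError:
        # Python checked the index against the list length at run-time
        return None


numbers = [10, 20, 30]
print(divide(1, 0))          # division by zero is caught, not a crash
print(item_at(numbers, 5))   # out-of-range index is caught
print(item_at(numbers, -1))  # negative index wraps around to the last item
```

Each of these behaviours costs a little work on every operation, which is the overhead that compiler directives let you switch off.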

With compiler directives we can disable all these checks, but only if we know we don’t need them. In the example below we upgrade our previous code with 4 decorators that:

  • prevent checks for ZeroDivisionError
  • prevent checks for IndexError (calling myList[5] when the list only contains 3 items)
  • prevent checks for None
  • disable wraparound: skip the extra checks that are required for indexing a list relative to its end (like myList[-5])
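The decorated listing is not present in this copy of the article. As a sketch, and assuming the four items above map onto Cython’s cdivision, boundscheck, nonecheck and wraparound directives, the typed function could be decorated like this (variable names are again hypothetical):

```cython
cimport cython

@cython.cdivision(True)      # use C division semantics: no ZeroDivisionError check
@cython.boundscheck(False)   # no IndexError check on indexing
@cython.nonecheck(False)     # no check for None on typed objects
@cython.wraparound(False)    # no support for negative (relative-to-end) indexes
cpdef int count_primes(int limit):
    cdef int count = 0
    cdef int num
    cdef int divisor
    for num in range(2, limit + 1):
        for divisor in range(2, num):
            if num % divisor == 0:
                break
        else:
            count += 1
    return count
```

Only disable a check when you are sure the situation cannot occur: with these directives, code that would have raised a Python exception can instead crash or silently return wrong results.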

Let’s re-build our code (cythonbuilder build) and see how much time skipping all these checks saves:

PurePython: 29.4883812 seconds
CyPython: 14.0540504 seconds (2.0982x faster than PurePython)
Cy+Types: 1.1600970 seconds (25.419x faster than PurePython)
Cy+Options: 0.9562186 seconds (30.838x faster than PurePython)
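The speedup factors in this table are simply the pure-Python time divided by the time of each variant. A small sketch of that arithmetic, using the timings reported above (the helper name speedup is hypothetical):

```python
def speedup(baseline_seconds, variant_seconds):
    """How many times faster the variant is than the baseline."""
    return baseline_seconds / variant_seconds


# timings as reported in the article
pure_python = 29.4883812
timings = {
    "CyPython": 14.0540504,
    "Cy+Types": 1.1600970,
    "Cy+Options": 0.9562186,
}

for name, seconds in timings.items():
    factor = speedup(pure_python, seconds)
    print(f"{name}: {seconds:.7f} seconds ({factor:.3f}x faster than PurePython)")
```

The printed factors may differ from the table in the last digit, depending on whether values are rounded or truncated.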

We’ve shaved off another 0.2 seconds!

The final results (lower is better) (image by author)

Even more speed?

It is possible to speed up our code even more by making use of our multiple cores. Check out the article below for a guide on how to apply multi-processing and threading in Python programs. Also check out this article that shows you how to multi-process Cython code and explains Cython’s annotation files: graphical overviews of which parts of your code can be further optimized. Very convenient!

CythonBuilder makes it easy to speed up our Python code using Cython.
As we’ve seen, just copying our Python code and building it doubles the execution speed! The greatest speed increase comes from adding types, resulting in a 25x speed increase relative to vanilla Python.

I hope everything was as clear as I intended, but if not, please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike

P.S: like what I’m doing? Follow me!


