Techno Blender
Digitally Yours.

Python Documentation Testing, doctest, Marcin Kozak



doctest allows for documentation, unit and integration testing, and test-driven development

Four bookcases spelling out the word “READ”.
doctest allows for keeping up-to-date code documentation. Photo by Ishaq Robin on Unsplash

Code testing does not have to be difficult. What’s more, testing makes coding easier and faster — and even, at least for some developers, more pleasurable. For testing to be pleasurable, however, the testing framework we use needs to be user-friendly.

Python offers several testing frameworks; three of the most popular are currently pytest and the built-in unittest and doctest. The first two focus on unit testing. doctest is different: its main purpose, though definitely not the only one, is documentation testing.

doctest is also the main subject of this article. I will introduce this particularly interesting tool, hoping you will see that despite its simplicity, it’s very useful. To be honest, doctest is my favorite Python package, as I stressed in this “PyDev of the Week” interview. Above, I wrote that if we want testing to be pleasurable, we need a user-friendly tool. To say that doctest is user-friendly is quite an understatement: it’s the simplest and most user-friendly testing tool I’ve ever used.

Having said that, I find it puzzling that most Pythonistas I know seldom use doctest or do not use it at all. I hope this article will convince them, and you, that it’s worth introducing doctest into your daily coding routine.

I mentioned doctest is simple. Indeed, it’s so simple that reading this short article will be enough for you to use it!

doctest is a standard-library testing framework, so it comes with your Python installation. It means you do not have to install it. While it isn’t designed to be used in complicated unit testing scenarios, it is simple to use and yet powerful in most situations.

The doctest module offers what neither pytest nor unittest offers: documentation testing. To put it simply, documentation testing checks whether your code documentation is up-to-date. That’s quite some functionality, particularly in big projects: with just one command, you can check whether all examples are correct. It thus replaces reading through the code examples in the documentation over and over again, after each new commit, merge and release. You can also add doctesting to a CI/CD pipeline.
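As a rough illustration of the CI/CD idea, here is a minimal sketch of a script a pipeline could call. It relies only on the standard `doctest.testfile()` function; the helper names `run_doc_files()` and `main()` are my own, hypothetical choices, and the documentation files to check would be passed in by the pipeline.

```python
import doctest


def run_doc_files(paths):
    """Run the doctests in each file; return the total number of failed examples."""
    failed = 0
    for path in paths:
        # module_relative=False: treat the path as a regular file-system path.
        results = doctest.testfile(path, module_relative=False)
        failed += results.failed
    return failed


def main(argv):
    """Return an exit code: 0 when every example passes, 1 otherwise."""
    return 1 if run_doc_files(argv) else 0


# In a real script you would finish with something like:
#     import sys; sys.exit(main(sys.argv[1:]))
# so that a non-zero exit code makes the CI job fail.
```

A pipeline step would then call this script with the documentation files as arguments and fail whenever any example is out of date.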

doctest relies on what we call regression testing. In principle, you run the code from the examples, save the output, and in the future you can run doctest to check whether the examples still produce the same output, which means they work the way they should. As you see, this limits the use of this tool to output that does not change from session to session. So, you cannot use this method to test random elements of the output. One example is objects’ ids, which will not be the same from session to session. This limitation is not doctest-specific, but it often seems more significant in this testing framework than in others. Fortunately, doctest provides tools to deal with this limitation.

As with any testing tool, you can, and should, run doctest after making any changes to your code, to check whether the changes affected the output. Sometimes they do, due to their character (e.g., you fixed a bug and now the function works correctly, but differently); if this is the case, you should update the corresponding tests. If the changes shouldn’t affect the output but they do, as indicated by failing tests, something is wrong with the code, and so it is the code that needs further changes.

To write doctests, you need to provide both the code and the expected output. In practice, you write the tests in a similar way to what the code and its output look like in the Python interpreter.

Let me picture this using a simple example. Consider the following function, made part of a doctest:

>>> def foo():
...     return 42

(You can import this function, but we will return to this later.) And this is how I see it in my Python interpreter:

Code from Python interpreter — def foo(): return 42
The definition of the foo() function in the Python interpreter. Image by author

We see a minor difference. When you define a function in an interpreter, you need to hit the Enter key twice, which is why we see the additional ellipsis at the end of the function. doctest code does not need this additional ellipsis line. What’s more, when you code in the interpreter, you do not have to add an additional white space before each ellipsis line (after the ellipsis), while you need to do it in doctest code; otherwise, the indentation will look as though it was composed of three spaces, not four. And one more difference — and a nice one, actually — is that in the Python interpreter, you will see no code highlighting. You may see it when you write doctests, for instance, in Markdown or reStructuredText files.

I think you’ll agree these are not big differences, and that they are good differences. Thanks to them, doctest code looks nice.

The >>> and ... (the Python ellipsis), thus, constitute the essential parts of tests, as doctest uses them to recognize that a new command has just started (>>>) and that the command from the previous line is being continued (...).

The above code snippet defining the foo() function is still not a valid doctest, as it only defines the function. To write a test, we need to call the function and include its expected output:

>>> foo()
42

And that’s it! This is our first doctest — and you’ve already learned most of what you need to know about this tool! Altogether, this doctest will look as follows:

>>> def foo():
...     return 42
>>> foo()
42

You can run tests from the shell. You can also run them from the Python interpreter, but since that requires the same code as one of the shell methods, I will not focus on it here; you will seldom need it anyway.
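For completeness, this is a minimal sketch of running a doctest file from Python rather than from the shell, using the standard `doctest.testfile()` function. To keep the sketch self-contained, it first writes the test to a temporary folder; the helper `run_doc()` is a hypothetical name, and in practice you would simply pass the path of your existing file.

```python
import doctest
import os
import tempfile

# The doctest from this article, stored as text so the sketch is self-contained.
DOC = """\
>>> def foo():
...     return 42
>>> foo()
42
"""


def run_doc(text, filename="doctest_post_1.md"):
    """Write the text to a temporary file and run its doctests."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, filename)
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        # module_relative=False: the path is a regular file-system path.
        return doctest.testfile(path, module_relative=False)


results = run_doc(DOC)
print(results.failed, results.attempted)  # prints: 0 2
```

Both the function definition and the call count as examples, hence two attempted examples and no failures.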

Assume the test is saved as doctest_post_1.md. To run it, open the shell, navigate to the folder in which the test file is located, and run the following command (it will work in both Windows and Linux):

$ python -m doctest doctest_post_1.md

If the test passes, you will see nothing. If it fails, you will see a failure report in the shell. To see how it works, let’s change 42 in the test to 43:

>>> foo()
43

This will be the output:

Output of a doctest that failed.
Output from a doctest (which failed) run in shell. Image by author
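If you want to reproduce such a failure report yourself, the sketch below runs the failing test in-process and captures the report as text. It uses the standard `doctest.DocTestParser` and `doctest.DocTestRunner` classes; the helper `failure_report()` is a hypothetical name, and the exact wording of the report may differ slightly between Python versions.

```python
import doctest
import io

FAILING_DOC = """\
>>> def foo():
...     return 42
>>> foo()
43
"""


def failure_report(text, name="doctest_post_1.md"):
    """Run the doctests in the text; return (results, report_text)."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, name, 0)
    runner = doctest.DocTestRunner(verbose=False)
    buffer = io.StringIO()
    # out= redirects the report from the shell into our buffer.
    results = runner.run(test, out=buffer.write)
    return results, buffer.getvalue()


results, report = failure_report(FAILING_DOC)
print(report)
```

The report names the failed example (`foo()`), the expected output (43) and the output actually obtained (42).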

You can do it in another way. Consider the following module:

# mathmodule.py
"""The simplest math module ever.

You can use two functions:
>>> add(2, 5.01)
7.01
>>> multiply(2, 5.01)
10.02
"""

def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    """
    return x + y

def multiply(x, y):
    """Multiply two values.

    >>> multiply(1, 1)
    1
    >>> multiply(-1, 1)
    -1
    """
    return x * y

if __name__ == "__main__":
    import doctest

    doctest.testmod()

With this name-main block, you can simplify the shell command:

$ python mathmodule.py

This runs the module, and running it means running all its doctests.

When you want to see a detailed output, add a -v flag. Below, I used this flag, which led to the following output:

$ python mathmodule.py -v
Trying:
    add(2, 5.01)
Expecting:
    7.01
ok
Trying:
    multiply(2, 5.01)
Expecting:
    10.02
ok
Trying:
    add(1, 1)
Expecting:
    2
ok
Trying:
    add(-1.0001, 1.0001)
Expecting:
    0.0
ok
Trying:
    multiply(1, 1)
Expecting:
    1
ok
Trying:
    multiply(-1, 1)
Expecting:
    -1
ok
3 items passed all tests:
   2 tests in __main__
   2 tests in __main__.add
   2 tests in __main__.multiply
6 tests in 3 items.
6 passed and 0 failed.
Test passed.

It’s rather cluttered, and to be honest, I almost never use it. Its main advantage is what we see at the end, that is, a summary of the tests. When developing, however, I do not need it, since I usually run the tests to check whether a particular function passes them or not. Thus, and this is important, doctests should be rather fast, as you normally do not request a particular test; you simply run all tests from a module.
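That said, if you ever do want to run the examples of just one function, the standard library offers `doctest.run_docstring_examples()`. The sketch below is only an illustration of that option, not the workflow I normally use; the functions are simplified copies of those from mathmodule.py.

```python
import doctest


def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    """
    return x + y


def multiply(x, y):
    """Multiply two values.

    >>> multiply(-1, 1)
    -1
    """
    return x * y


# Run only the examples from multiply()'s docstring. The second argument
# provides the names the examples are allowed to use. The function returns
# nothing; with verbose=True it reports each example it tries.
doctest.run_docstring_examples(
    multiply, {"multiply": multiply}, verbose=True, name="multiply"
)
```

With `verbose=False` (the default), passing examples produce no output at all, just like in the shell.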

Above, we showed how to write and run doctests, so it’s time to consider the practicalities, with slightly more complex examples. There are two general locations of doctest tests:

  • in function/class/method docstrings
  • in documentation files

Since they can significantly differ, let’s discuss them one by one.

doctests in docstrings

Consider the following function:

# mean_module.py
from typing import List, Union

def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)

This is how you can write doctest tests in a docstring. Here, we did it in a function docstring, but you can write them in any docstring, in any of the four doctest locations:

  • a module docstring
  • a function docstring (like in the example above)
  • a class docstring
  • a method docstring
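To illustrate the class and method locations, here is a minimal sketch with a made-up `Counter` class (it is not part of any module discussed in this article). The helper `run_doctests()` is also hypothetical; it uses the standard `doctest.DocTestFinder` and `doctest.DocTestRunner` classes to collect and run the examples from the class docstring and from its methods.

```python
import doctest


class Counter:
    """Count things.

    >>> c = Counter()
    >>> c.value
    0
    """

    def __init__(self):
        self.value = 0

    def increment(self, by=1):
        """Increase the counter and return the new value.

        >>> Counter().increment()
        1
        >>> Counter().increment(by=5)
        5
        """
        self.value += by
        return self.value


def run_doctests(obj, globs):
    """Collect and run doctests from obj and its members; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False: do not try to determine the defining module, just recurse.
    for test in finder.find(obj, module=False, globs=globs):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


print(run_doctests(Counter, {"Counter": Counter}))  # prints: (0, 4)
```

Two examples come from the class docstring and two from the `increment()` method; `__init__()` has no docstring, so it contributes nothing.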

Using docstrings is, I think, the most common use of doctest. Remember that whatever a function’s docstring contains, it will occur in its help page, which you can see in the Python interpreter by calling the help() function. For example:

>>> from mean_module import mean
>>> help(mean)

The interpreter will disappear, and instead you will see the below help page of the mean() function:

Help on function mean in module mean_module: … and here the help is written
The help page of the mean() function. Image by author

This is another thing I appreciate in doctest tests: combined with docstrings, they make help pages, and so code documentation, clean and up-to-date. Sometimes functions or methods are so complex that it’s very difficult to see how they should be used. Then, oftentimes, something as small as a doctest test or two added to a docstring will be more informative than the type hints and the code and the rest of the docstring together.

Remember, however, not to overload docstrings with doctests. While it may be tempting to include many such tests in them, you should not do that. Reading such docstrings is unpleasant. It’s better to keep crucial tests in a docstring and move the remaining ones elsewhere. You can either move them to a dedicated doctest file (or files) or translate them into pytest tests. I’ve used both solutions.
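As a sketch of the second option, this is how a few extra cases for the mean() function could look once moved out of the docstring into pytest-style test functions. The specific cases are my own illustrative assumptions; I use plain assert statements and a try/except block so that the snippet has no dependencies, although with pytest installed you would probably prefer `pytest.raises`.

```python
import math
from typing import List, Union


def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)


# The crucial example stays in the docstring; the remaining cases live here.
def test_mean_single_value():
    assert mean([3]) == 3.0


def test_mean_mixed_ints_and_floats():
    assert math.isclose(mean([1, 2.5, 4]), 2.5)


def test_mean_empty_list_raises():
    try:
        mean([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError for an empty list")
```

pytest collects functions whose names start with `test_`, so saving this in a file such as test_mean.py (a hypothetical name) is enough for it to find them.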

doctests in documentation files

You can write doctests in files of various types. I prefer Markdown (.md), but you can also use reStructuredText (.rst) and even plain-text (.txt) files. Do not try to do that in files that use a special encoding or non-text markup, however. For instance, doctest does not work with .rtf files.

The below code block presents an example of a Markdown file with doctests. To save space, I will include here only basic tests, but they are enough to show how to create such files.

Consider the following README file (README.md) of the mathmodule module:

# Introduction

You will find here documentation of the mathmodule Python module.

It contains as many as two functions, `add()` and `multiply()`.

First, let's import the functions:

```python
>>> from mathmodule import add, multiply

```

## Using `add()`

You can use the `add()` function in the following way:

```python
>>> add(1, 1)
2
>>> add(2, 5.01)
7.01

```

## Using `multiply()`

You can use the `multiply()` function in the following way:

```python
>>> multiply(1, 1)
1
>>> multiply(-1, 1)
-1

```

As you see, there is nothing complicated here: you add doctests in typical code blocks. There are two things you should remember:

  • Add an empty line before finishing a code block. Otherwise, doctest will treat ``` as part of the output.
  • Don’t forget to import all the objects you’re going to use in the test file; here, these are the add() and multiply() functions. This may seem to you a basic mistake, maybe even too basic. Even if it is basic, I’ve made it quite often; I made it even here, when writing the above README file.

As you can see, I included all the tests inside code blocks, but even tests written outside of code blocks would be run. I do not see any point in doing so, however.
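The first of the two points above is easy to check with the standard `doctest.DocTestParser` class, which shows what doctest treats as the expected output. In this minimal sketch, the helper `expected_outputs()` is a hypothetical name, and the Markdown fence marker is built programmatically only to keep the snippet easy to embed.

```python
import doctest

FENCE = "`" * 3  # the Markdown code-fence marker

# The same code block, with and without an empty line before the closing fence.
with_blank_line = f"{FENCE}python\n>>> 1 + 1\n2\n\n{FENCE}\n"
without_blank_line = f"{FENCE}python\n>>> 1 + 1\n2\n{FENCE}\n"

parser = doctest.DocTestParser()


def expected_outputs(text):
    """Return the expected output doctest extracts for each example."""
    return [example.want for example in parser.get_examples(text)]


print(expected_outputs(with_blank_line))     # only '2' is expected
print(expected_outputs(without_blank_line))  # the fence marker is expected, too
```

Without the empty line, the closing fence becomes part of the expected output, so the test fails even though the code is correct.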

Above, we’ve learned the very basic use of doctest. We can do more with it, however. Most of doctest’s additional functionality is available through so-called directives.

Directives are added directly after the code being tested, using the specific syntax shown below. When the command that needs a directive is split into several lines, you can add the directive to any of them (I will show an example in a second). Directives change the behavior of doctest; for instance, the test can ignore part of the output, normalize white spaces, catch exceptions, and the like.

Ellipsis

Perhaps the most important and most frequently used directive is the ellipsis: # doctest: +ELLIPSIS. Below, you can see two tests for the above multiply function, one without and the next with the ELLIPSIS directive:

>>> multiply(2.500056, 1/2.322)
1.0766821705426355
>>> multiply(2.500056, 1/2.322) # doctest: +ELLIPSIS
1.076...

So, you need to add the directive after the tested code and the ellipsis inside the output, and whatever is printed in the output where the ellipsis is will be ignored. This is another example:

>>> multiply # doctest: +ELLIPSIS
<function multiply at 0x7...>

In the example above, you can use the closing > character, but you don’t have to.
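The same idea works for any output that contains a memory address. The sketch below uses a plain `object()` and compares three situations: the directive added to the example, no directive, and the ELLIPSIS option switched on for the whole run via the standard `optionflags` argument. The helper `run()` is a hypothetical name used only in this illustration.

```python
import doctest


def run(text, optionflags=0):
    """Run doctest examples given as text; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, "example", "example", 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # The lambda swallows the failure report, as we only need the counts.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


with_directive = """
>>> object()  # doctest: +ELLIPSIS
<object object at 0x...>
"""

without_directive = """
>>> object()
<object object at 0x...>
"""

print(run(with_directive))                                   # (0, 1)
print(run(without_directive))                                # (1, 1)
print(run(without_directive, optionflags=doctest.ELLIPSIS))  # (0, 1)
```

Without the directive or the option, the three dots are compared literally with the actual address, so the example fails.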

Long lines in output

Long lines can be a burden. We have two methods to deal with overly long lines in output: (i) \ and (ii) the NORMALIZE_WHITESPACE directive.

Method (i): using \. Like in Python code, we can use a backslash (\) to split the output into several lines, like here:

>>> s1 = "a"*20
>>> s2 = "b"*40
>>> add(s1, s2)
'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'

This test will indeed pass. The above format of the output is pretty ugly, but it will be much worse when we need to split lines in a function, class or method docstring:

def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    >>> s1 = "a"*20
    >>> s2 = "b"*40
    >>> add(s1, s2)
    'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
    """
    return x + y

As you see, splitting the line into two required us to move the second line to its very beginning; otherwise, doctest would see the following output:

'aaaaaaaaaaaaaaaaaaaa    bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'

and so the test would fail.

Note that this is the only method to deal with splitting strings between lines. When you need to split long output (e.g., a list, a tuple and the like), the next method will work better.

Method (ii): the NORMALIZE_WHITESPACE directive. This method does not use the ugly backslash. Instead, it uses the NORMALIZE_WHITESPACE directive:

>>> multiply([1, 2], 10) #doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

This test would not pass without the directive:

Failed example:
multiply([1, 2], 10)
Expected:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
Got:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

but it passes with the directive. Whenever I can, I use this directive instead of the backslash, except for splitting long strings.
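To see the difference for yourself, here is a minimal sketch that runs the same example with and without the directive. The helper `run()` is a hypothetical name; it relies on the standard `doctest.DocTestParser` and `doctest.DocTestRunner` classes.

```python
import doctest


def multiply(x, y):
    """Multiply two values."""
    return x * y


def run(text):
    """Run doctest examples given as text; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        text, {"multiply": multiply}, "example", "example", 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # The lambda swallows the failure report, as we only need the counts.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


with_directive = """
>>> multiply([1, 2], 10)  # doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
"""

without_directive = """
>>> multiply([1, 2], 10)
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
"""

print(run(with_directive))     # (0, 1)
print(run(without_directive))  # (1, 1)
```

With the directive, any run of white space, including the line break, is treated as a single space when the outputs are compared.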

Exceptions

Above, we discussed the ellipsis directive. The ellipsis, however, also helps in exception handling, that is, when we want to test an example that raises an error:

>>> add(1, "bzik")
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: 'int' and 'str'

The output clearly shows that an exception has been raised and that we test whether it was the expected exception.

Do you see a difference between this example of the ellipsis and the previous ones? Note that even though we do use the ellipsis here, we do not have to provide the ellipsis directive when testing the output with an exception. I suppose this is because catching exceptions is so frequent in testing that the doctest creator decided to build this use of the ellipsis into the doctest syntax, without the need to add the ellipsis directive. In fact, doctest ignores the traceback stack between the traceback header and the exception line altogether.
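A minimal sketch can make this behavior concrete. Below, the first example uses the ellipsis in place of the traceback stack, the second puts arbitrary indented text there, and the third expects a wrong error message. The helper `run()` is a hypothetical name; the IGNORE_EXCEPTION_DETAIL option used at the end is a standard doctest option that compares only the exception type.

```python
import doctest


def add(x, y):
    """Add two values."""
    return x + y


def run(text, optionflags=0):
    """Run doctest examples given as text; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        text, {"add": add}, "example", "example", 0
    )
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # The lambda swallows the failure report, as we only need the counts.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


with_ellipsis = """
>>> add(1, "bzik")
Traceback (most recent call last):
    ...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

with_other_text = """
>>> add(1, "bzik")
Traceback (most recent call last):
    whatever is written here is ignored
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

wrong_message = """
>>> add(1, "bzik")
Traceback (most recent call last):
    ...
TypeError: some other message
"""

print(run(with_ellipsis))    # (0, 1)
print(run(with_other_text))  # (0, 1)
print(run(wrong_message))    # (1, 1)
print(run(wrong_message, optionflags=doctest.IGNORE_EXCEPTION_DETAIL))  # (0, 1)
```

So the exception type and message are compared, while the stack part is not, which is why no ellipsis directive is needed.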


Debugging

doctest offers several mechanisms for debugging. Since we’re talking about the basics¹, I will present only one mechanism, which is the simplest but also, at least for me, both the most intuitive and the most useful. To be honest, ever since I’ve been using doctest, I use only this method, and it’s more than enough for my needs.

The method uses the built-in pdb module and, also built-in, the breakpoint() function. We can debug using doctest in two ways:

  1. Debug a particular function tested via doctest.
  2. Debug a doctest session.

Ad 1. Debug a particular function tested via doctest. In this case, you set a breakpoint, using breakpoint(), in a function and run doctest. It’s nothing more than standard debugging using breakpoint(), but started by running doctest.

Let’s use doctest to debug the following module, named play_debug.py, which contains only one function.

def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> play_the_debug_game(10, 200)
    142
    """
    x *= 2
    breakpoint()
    x += 22
    y *= 2
    y -= 300
    return x + y

Below, you will find the shell output of a debug session started by running doctest on the module. In the output, I replaced the path with the ellipsis.

$ python -m doctest play_debug.py
> .../play_debug.py(9)play_the_debug_game()
-> x += 22
(Pdb) l
4 >>> play_the_debug_game(10, 200)
5 142
6 """
7 x *= 2
8 breakpoint()
9 -> x += 22
10 y *= 2
11 y -= 300
12 return x + y
[EOF]
(Pdb) x
20
(Pdb) n
> .../play_debug.py(10)play_the_debug_game()
-> y *= 2
(Pdb) x
42
(Pdb) c

So, this is a standard use of the pdb debugger. It is not really doctest debugging, but debugging of the code that is run via doctest. I use this approach often during code development, especially when the code is not yet enclosed in a working application.

Ad 2. Debug a doctest session. Unlike the previous method, this one means debugging the actual testing session. You do it in a similar way, but this time, you add a breakpoint in a test, not in a function. For instance:

def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> game_1 = play_the_debug_game(10, 200)
    >>> game_2 = play_the_debug_game(1, 2)
    >>> breakpoint()
    >>> game_1, game_2
    (142, -272)
    """
    x *= 2
    x += 22
    y *= 2
    y -= 300
    return x + y

And now a session:

$ python -m doctest play_debug.py
--Return--
> <doctest play_debug.play_the_debug_game[2]>(1)<module>()->None
-> breakpoint()
(Pdb) game_1
142
(Pdb) game_2
-272
(Pdb) game_1 + game_2
-130
(Pdb) c

This type of debugging can be useful whenever you want to check what’s going on in the testing session, for instance, when a test behaves in an unexpected way.

In the previous sections, I showed simple examples that use the basic doctest tool. As already mentioned, I seldom use more advanced tools. This is because doctest is designed in a way that indeed its basic tools enable one to achieve a lot. Maybe this is why I like this testing framework that much.

As an example of the advanced use of doctest, I will use two files, one presenting documentation and another presenting unit tests, from one of my Python packages, perftester. If you want to see more examples, you can go to the package’s GitHub repository. As you can read in the tests/ folder’s README, I treated perftester as an experiment:

The use of doctesting as the only testing framework in perftester is an experiment. Tests in perftester are abundant, and are collected in four locations: the main README, docstrings in the main perftester module, the docs folder, and this folder.

Note that the main README of the package is full of doctests, but that’s just the beginning. To see more, visit the docs/ folder, which is full of Markdown files, each being a doctesting file, too. While these are testable documentation files, the files in the tests/ folder constitute actual test files. In other words, I used doctest as the only testing framework, for documentation, unit and integration testing. In my eyes, the experiment was successful, and the final conclusion was that when you do not need complex unit and integration tests in your project, you can use doctest as the only testing framework.

That I did not use pytest in this project does not mean I decided to stop using pytest altogether. My regular approach is joining doctest and pytest. I use doctest for test-based development, but once a function, class or method is ready, I move some of the tests to pytest files, keeping only some of the tests as doctests. I usually write READMEs with doctests, and if I need to add a doc file, I often make it a Markdown file with doctest tests.

Let me show you a fragment (about half) of one of the documentation files from the perftester’s docs folder, docs/most_basic_use_time.md. In the code below, I wrapped two lines, to shorten them.

# Basic use of `perftester.time_test()`

```python
>>> import perftester as pt
>>> def preprocess(string):
...     return string.lower().strip()
>>> test_string = " Oh oh the young boy, this YELLOW one, wants to sing a song about the sun.\n"
>>> preprocess(test_string)[:19]
'oh oh the young boy'

```

## First step - checking performance

We will first benchmark the function, to learn how it performs:

```python
>>> first_run = pt.time_benchmark(preprocess, string=test_string)

```

`first_run` gives the following results:

```python
# pt.pp(first_run)
{'raw_times': [2.476e-07, 2.402e-07, 2.414e-07, 2.633e-07, 3.396e-07],
'raw_times_relative': [3.325, 3.226, 3.242, 3.536, 4.56],
'max': 3.396e-07,
'mean': 2.664e-07,
'min': 2.402e-07,
'min_relative': 3.226}
```

Fine, no need to change the settings, as the raw times are rather short,
and the relative time ranges from 3.3 to 4.6.

# Raw time testing

We can define a simple time-performance test, using raw values, as follows:

```python
>>> pt.time_test(preprocess, raw_limit=2e-06, string=test_string)

```

As is with the `assert` statement, no output means that the test has passed.

If you overdo with the limit so that the test fails, you will see the following:

```python
>>> pt.time_test(preprocess, raw_limit=2e-08, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
raw_limit = 2e-08
minimum run time = ...

```

# Relative time testing

Alternatively, we can use relative time testing, which will be more or
less independent of a machine on which it's run:

```python
>>> pt.time_test(preprocess, relative_limit=10, string=test_string)

```

If you overdo with the limit so that the test fails, you will see the following:

```python
>>> pt.time_test(preprocess, relative_limit=1, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
relative_limit = 1
minimum time ratio = ...

```

Note the following:

  • The only directive I used is the ellipsis directive. I did not need anything more advanced.
  • I wanted to present the output of the command # pt.pp(first_run). Since it contained a lot of random values (which represented benchmarking results), I did not make this code block a doctest test. I simply presented the results.

Below, I present one section (Defaults) of a test file tests/doctest_config.md. Unlike the above doc file, this one includes unit tests, not documentation. Hence, you will not see too much text in the whole file — and no text in this section. These are unit tests and nothing more:

## Defaults

```python
>>> import perftester as pt
>>> pt.config.defaults
{'time': {'number': 100000, 'repeat': 5}, 'memory': {'repeat': 1}}

>>> original_defaults = pt.config.defaults
>>> pt.config.set_defaults("time", number=100)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 5}, 'memory': {'repeat': 1}}

>>> pt.config.set_defaults("time", repeat=20)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 20}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=2, number=7)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 1}}

>>> pt.config.set_defaults("memory", repeat=100)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 100}}

>>> pt.config.set_defaults("memory", number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.set_defaults("memory", number=100, repeat=5)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.set_defaults("memory", repeat=5, number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.defaults = original_defaults

```

Note that in this fragment, the only other tool than mere output I used is the ellipsis used to represent exceptions thrown after some of the commands.

Maybe you think that I did not show you too much of doctest… And you’re right. But what you learned will be enough for you to write even quite advanced tests; and in my opinion, that’s most of what you need to know about doctesting.

This does not mean it’s the end of our doctest journey. We will return to it in the next articles, as doctesting offers more advanced tools, which, perhaps, you will want to use. We will then learn, for instance, how to use doctest in Test-Driven Development (TDD), and how to use it effectively in PoC projects.

¹ In the case of doctest, basic does not mean for beginners. Although I am using this package almost daily, I seldom use more advanced tools than those I presented in this article.


doctest allows for documentation, unit and integration testing, and test-driven development

Four bookcases spelling out the word “READ”.
doctest allows for keeping up-to-date code documentation. Photo by Ishaq Robin on Unsplash

Code testing does not have to be difficult. What’s more, testing makes coding easier and faster — and even, at least for some developers, more pleasurable. For testing to be pleasurable, however, the testing framework we use needs to be user-friendly.

Python offers several testing frameworks, currently three of the most popular being pytest and built-in unittest and doctest. The first two are focused on unit testing. doctest is different, its main purpose — but definitely not the only one — being documentation testing.

doctest is also the main purpose of this article. I will introduce this particularly interesting tool, hoping you will see that despite its simplicity, it’s very useful. To be honest, doctest is my favorite Python package, as I stressed in this “PyDev of the Week” interview. Above, I wrote that if we want testing to be pleasurable, we need a user-friendly tool. To say that doctest is user-friendly is quite an understatement — it’s the simplest and user-friendliest testing tool I’ve ever met.

Having said that, I consider it a puzzle why most Pythonistas I know use doctest seldom or do not use it at all. I hope this article will convince them — and you — that it’s worth introducing doctest to daily coding routine.

I mentioned doctest is simple. Indeed, it’s so simple that reading this short article will be enough for you to use it!

doctest is a standard-library testing framework, so it comes with your Python installation. It means you do not have to install it. While it isn’t designed to be used in complicated unit testing scenarios, it is simple to use and yet powerful in most situations.

The doctest module offers what neither pytest nor unittest offers: documentation testing. To put it simply, documentation testing is useful in testing whether your code documentation is up-to-date. That’s quite some functionality, particularly in big projects: with just one command, you can check whether all examples are correct. Therefore, it replaces reading through code examples in the documentation over and over again, after each new commit, merge and release. You can also add doctesting into a CI/CD pipeline.

The approach doctest takes uses what we call regression testing. In principle, you run the code from examples, save the output, and in the future you can run doctest to check whether the examples provide the same output — which means they work the way they should. As you see, this limits the use of this tool to objects that do not change from session to session. So, you cannot use this method to test random elements of the output. One example is objects’ ids, which will not be the same from session to session. This limitation is not doctest-specific, but often it seems to be more significant in this testing framework than in others. Fortunately, doctest provides tools to deal with this limitation.

As with any testing tool, you can — and should — run doctest after making any changes to your code, in order to check out whether the changes did not affect the output. Sometimes they do, due to their character (e.g., you fixed a bug and now the function works correctly — but differently); if this is the case, you should update the corresponding tests. If the changes shouldn’t affect the output but they do, as indicated by failing tests, something is wrong with the code, and so it is the code that needs further changes.

To write doctests, you need to provide both the code and the expected output. In practice, you write the tests in a similar way to what the code and its output look like in the Python interpreter.

Let me picture this using a simple example. Consider the following function, made part of a doctest:

>>> def foo():
... return 42

(You can import this function, but we will return to this later.) And this is how I see it in my Python interpreter:

Code from Python interpreter — def foo(): return 42
The definition of the foo() function in the Python interpreter. Image by author

We see a minor difference. When you define a function in an interpreter, you need to hit the Enter key twice, which is why we see the additional ellipsis at the end of the function. doctest code does not need this additional ellipsis line. What’s more, when you code in the interpreter, you do not have to add an additional white space before each ellipsis line (after the ellipsis), while you need to do it in doctest code; otherwise, the indentation will look as though it was composed of three spaces, not four. And one more difference — and a nice one, actually — is that in the Python interpreter, you will see no code highlighting. You may see it when you write doctests, for instance, in Markdown or reStructuredText files.

I think you’ll agree these are not big differences, and that they are good differences. Thanks to them, doctest code looks nice.

The >>> and ... (the Python ellipsis), thus, constitute the essential parts of tests, as doctest uses them to recognize that a new command has just started (>>>) and that the command from the previous line is being continued (...).

The above cope snippet defining the foo() function still is not a valid doctest, as it only defines the function. To write a test, we need to call the function and include its expected output:

>>> foo()
42

And that’s it! This is our first doctest — and you’ve already learned most of what you need to know about this tool! Altogether, this doctest will look as follows:

>>> foo()
42
>>> def foo():
... return 42

You can run tests from shell. You can also do it from the Python interpreter, but since it needs the same code as one of the shell methods, I will not focus on this; you will seldom need it, too. ???unclear sentence!!!

Assume the test is saved as doctest_post_1.md. To run it, open shell, navigate to the folder in which the test file is located, and run the following command (it will work in both Windows and Linux):

$ python -m doctest doctest_post_1.md

If the test passes, you will see nothing. If it fails, you will see this in the shell. To see how it works, let’s change 42 in the test to 43:

>>> foo()
43

This will be the output:

Output from a doctest (which failed) run in shell. Image by author
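Since the screenshot is an image, the following sketch may help to reproduce a failure report as text. It assumes a throwaway file with a hypothetical name and captures what doctest prints to standard output; the exact wording of the report can differ slightly between Python versions, so no literal output is shown here.

```python
import contextlib
import doctest
import io
import tempfile
from pathlib import Path

# The same kind of test, but with a wrong expected value (43 instead of 42).
DOC = """\
>>> def foo():
...     return 42
>>> foo()
43
"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "failing_example.md"  # hypothetical file name
    path.write_text(DOC, encoding="utf-8")
    buffer = io.StringIO()
    # doctest prints its failure report to stdout; capture it for inspection.
    with contextlib.redirect_stdout(buffer):
        results = doctest.testfile(str(path), module_relative=False)

report = buffer.getvalue()
print(report)
print(results.failed, results.attempted)  # 1 2
```

The report names the failed example and shows an "Expected:" and a "Got:" section, which is usually all you need to locate the problem.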

You can also run doctests in another way: from within the module that contains them. Consider the following module:

# mathmodule.py
"""The simplest math module ever.

You can use two functions:
>>> add(2, 5.01)
7.01
>>> multiply(2, 5.01)
10.02
"""

def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    """
    return x + y

def multiply(x, y):
    """Multiply two values.

    >>> multiply(1, 1)
    1
    >>> multiply(-1, 1)
    -1
    """
    return x * y

if __name__ == "__main__":
    import doctest

    doctest.testmod()

With this name-main block, you can simplify the shell command:

$ python mathmodule.py

This runs the module, and running it means running all its doctests.
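One detail may matter when such a module is run automatically, for example in a CI/CD pipeline: `doctest.testmod()` only returns the counts of failed and attempted examples, so a script that merely calls it still finishes normally when a doctest fails. The sketch below shows one possible way to turn the result into an exit code. The helper name `doctest_exit_code()` and the in-memory `demo` module are assumptions made for illustration.

```python
import doctest
import types

SOURCE = '''
def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    """
    return x + y
'''

def doctest_exit_code(module):
    """Return 0 when all doctests in `module` pass, 1 otherwise."""
    results = doctest.testmod(module)
    return 1 if results.failed else 0

# Build a tiny stand-in module in memory, so the sketch runs anywhere.
demo = types.ModuleType("demo")
exec(SOURCE, demo.__dict__)

print(doctest_exit_code(demo))  # 0
```

In a real module, the name-main block could pass such a value to `sys.exit()`.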

When you want to see a detailed output, add a -v flag. Below, I used this flag, which led to the following output:

$ python mathmodule.py -v
Trying:
    add(2, 5.01)
Expecting:
    7.01
ok
Trying:
    multiply(2, 5.01)
Expecting:
    10.02
ok
Trying:
    add(1, 1)
Expecting:
    2
ok
Trying:
    add(-1.0001, 1.0001)
Expecting:
    0.0
ok
Trying:
    multiply(1, 1)
Expecting:
    1
ok
Trying:
    multiply(-1, 1)
Expecting:
    -1
ok
3 items passed all tests:
   2 tests in __main__
   2 tests in __main__.add
   2 tests in __main__.multiply
6 tests in 3 items.
6 passed and 0 failed.
Test passed.

It’s rather cluttered, and to be honest, I almost never use it. Its main advantage is what we see at the end, that is, a summary of the tests. When developing, however, I do not need it, since I usually run the tests to check whether a particular function passes them. Thus, and this is important, doctests should be rather fast: you normally do not request a particular test but simply run all tests from a module.
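Should you nevertheless want to run the examples of a single function, the standard library offers `doctest.run_docstring_examples()`. The following is only a sketch of that option, not part of the workflow described in this article; the two functions are shortened copies of those from the module above.

```python
import doctest

def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    """
    return x + y

def multiply(x, y):
    """Multiply two values.

    >>> multiply(1, 1)
    1
    """
    return x * y

# Run only the examples from add()'s docstring. The second argument is the
# namespace in which the examples are executed. Nothing is printed when
# all examples pass.
doctest.run_docstring_examples(add, {"add": add}, verbose=False, name="add")
```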

Above, we showed how to write and run doctests, so it’s time to consider the practicalities, with perhaps a little more complex examples. There are two general locations of doctest tests:

  • in function/class/method docstrings
  • in documentation files

Since they can significantly differ, let’s discuss them one by one.

doctests in docstrings

Consider the following function:

# mean_module.py
from typing import List, Union

def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)

This is how you can write doctest tests in a docstring. Here, we did it in a function docstring, but you can write them in any docstring, in any of the four doctest locations:

  • a module docstring
  • a function docstring (like in the example above)
  • a class docstring
  • a method docstring

Using docstrings is, I think, the most common use of doctest. Remember that whatever a function’s docstring contains, it will occur in its help page, which you can see in the Python interpreter by calling the help() function. For example:

>>> from mean_module import mean
>>> help(mean)

The interpreter will disappear, and instead you will see the below help page of the mean() function:

The help page of the mean() function, starting with “Help on function mean in module mean_module:”. Image by author

This is another thing I appreciate in doctest tests: combined with docstrings, they make help pages, and so code documentation, clean and up-to-date. Sometimes functions or methods are so complex that it’s very difficult to see how they should be used. Then, oftentimes, something as small as a doctest test or two added to a docstring will be more informative than the type hints and the code and the rest of the docstring together.

Remember, however, to not overwhelm docstrings with doctests. While it may be tempting to include many such tests in them, you should not do that. Reading such docstrings is unpleasant. It’s better to keep crucial tests in a docstring and move the remaining ones elsewhere. You can either move them to a dedicated doctest file (or files) or translate them into pytest tests. I’ve used both solutions.
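To illustrate the second option, this is roughly how the doctest from the mean() docstring could look after being translated into pytest-style tests. It is a sketch under assumptions: the file name is hypothetical, the extra edge-case test is made up for illustration, and mean() is repeated here only to keep the snippet self-contained (in a real project you would import it from mean_module).

```python
# test_mean.py (hypothetical file name; pytest collects test_*.py files)
from typing import List, Union

def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of numbers."""
    return sum(x) / len(x)

def test_mean_of_floats():
    # The example that originally lived in the docstring.
    x = [1.12, 1.44, 1.123, 6.56, 2.99]
    assert round(mean(x), 2) == 2.65

def test_mean_of_single_value():
    # An edge case that would only clutter the docstring.
    assert mean([5]) == 5.0
```

Plain assert statements are enough here, so the file does not even need to import pytest.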

doctests in documentation files

You can write doctests in files of various types. I prefer Markdown (.md), but you can also use reStructuredText (.rst) and even plain-text (.txt) files. Do not try to do that in files that use their own specific encoding, however. For instance, doctest does not work with .rtf files.

The below code block presents an example of a Markdown file with doctests. To save space, I will include here only basic tests, but they are enough to show how to create such files.

Consider the following README file (README.md) of the mathmodule module:

# Introduction

You will find here documentation of the mathmodule Python module.

It contains as many as two functions, `add()` and `multiply()`.

First, let's import the functions:

```python
>>> from mathmodule import add, multiply

```

## Using `add()`

You can use the `add()` function in the following way:

```python
>>> add(1, 1)
2
>>> add(2, 5.01)
7.01

```

## Using `multiply()`

You can use the `multiply()` function in the following way:

```python
>>> multiply(1, 1)
1
>>> multiply(-1, 1)
-1

```

As you see, there is nothing complicated here: you add doctests in typical code blocks. There are two things you should remember:

  • Add an empty line before finishing a code block. Otherwise, doctest will treat ``` as part of the output.
  • Don’t forget to import all the objects you’re going to use in the test file; here, these are the add() and multiply() functions. This may seem to you a basic mistake, maybe even too basic. Even if it is basic, I’ve made it quite often; I made it even here, when writing the above README file.

As you can see, I included all the tests inside code blocks, but even tests written outside of code blocks would be run. I do not see any point in doing so, however.
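The first rule from the list above can be checked with doctest’s own parser. The sketch below feeds two tiny Markdown fragments to `doctest.DocTestParser` and inspects the expected output it extracts. The fence is built programmatically (the `FENCE` constant is just a helper for this snippet) so that the example can itself be shown inside a code block.

```python
import doctest

FENCE = "`" * 3  # three backticks, assembled here to avoid a literal fence

# The same doctest, without and with an empty line before the closing fence.
without_blank = f"{FENCE}python\n>>> 1 + 1\n2\n{FENCE}\n"
with_blank = f"{FENCE}python\n>>> 1 + 1\n2\n\n{FENCE}\n"

parser = doctest.DocTestParser()
bad = parser.get_examples(without_blank)[0]
good = parser.get_examples(with_blank)[0]

# `want` is the expected output doctest will compare against.
print(good.want == "2\n")                # True
print(bad.want == "2\n" + FENCE + "\n")  # True: the fence leaked into the output
```

Without the empty line, the closing fence becomes part of the expected output, so the test can never pass.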

Above, we’ve learned the very basic use of doctest. We can do more with it, however. Most of doctest’s additional functionality is available through so-called directives.

Directives are added directly after the code being tested, using the specific syntax shown below. When the command that needs a directive is split into several lines, you can add the directive to any of them. Directives change the behavior of doctest; for instance, the test can ignore part of the output, normalize white spaces, catch exceptions, and the like.

Ellipsis

Perhaps the most important and most frequently used directive is the ellipsis: # doctest: +ELLIPSIS. Below, you can see two tests for the above multiply function, one without and the next with the ELLIPSIS directive:

>>> multiply(2.500056, 1/2.322)
1.0766821705426355
>>> multiply(2.500056, 1/2.322) # doctest: +ELLIPSIS
1.076...

So, you need to add the directive after the tested code and the ellipsis inside the output, and whatever is printed in the output where the ellipsis is will be ignored. This is another example:

>>> multiply # doctest: +ELLIPSIS
<function multiply at 0x7...>

In the example above, you can use the closing > character, but you don’t have to.
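If you want to see the matching rule in isolation, you can ask doctest’s output checker directly. This is a small sketch using `doctest.OutputChecker.check_output()`; the memory address in the second pair of strings is made up for illustration.

```python
import doctest

checker = doctest.OutputChecker()

got = "1.0766821705426355\n"  # what the interpreter prints
want = "1.076...\n"           # what the doctest expects

print(checker.check_output(want, got, 0))                 # False
print(checker.check_output(want, got, doctest.ELLIPSIS))  # True

# A function repr with a made-up memory address.
got_repr = "<function multiply at 0x7f1a2b3c4d50>\n"
want_repr = "<function multiply at 0x...>\n"
print(checker.check_output(want_repr, got_repr, doctest.ELLIPSIS))  # True
```

The third argument holds the option flags; passing 0 means that no directive is active, and then the three dots are compared literally.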

Long lines in output

Long lines can be a burden. We have two methods to deal with too long lines in output: (i) \ and (ii) the NORMALIZE_WHITESPACE directive.

Method (i): using \. Like in Python code, we can use a backslash (\) to split the output into several lines, like here:

>>> s1 = "a"*20
>>> s2 = "b"*40
>>> add(s1, s2)
'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'

This test will indeed pass. The above format of the output is pretty ugly, but it will be much worse when we need to split lines in a function, class or method docstring:

def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    >>> add(s1, s2)
    'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
    """
    return x + y

As you see, splitting the line into two required us to move the second line to its very beginning; otherwise, doctest would see the following output:

'aaaaaaaaaaaaaaaaaaaa    bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'

and so the test would fail.

Note that this is the only method to deal with splitting strings between lines. When you need to split long output (e.g., a list, a tuple and the like), the next method will work better.

Method (ii): the NORMALIZE_WHITESPACE directive. This method does not use the ugly backslash. Instead, it uses the NORMALIZE_WHITESPACE directive:

>>> multiply([1, 2], 10) #doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

This test would not pass without the directive:

Failed example:
    multiply([1, 2], 10)
Expected:
    [1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
    1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
Got:
    [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]

but it passes with the directive. Whenever I can, I use this directive instead of the backslash, except for splitting long strings.
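The same flag can also be switched on for a whole run instead of a single example, through the `optionflags` argument. The sketch below demonstrates this with doctest’s parser and runner classes; the helper `count_failures()` and the test name `wrapped_list` are hypothetical names used only in this illustration.

```python
import doctest

def multiply(x, y):
    """Multiply two values."""
    return x * y

TEXT = """
>>> multiply([1, 2], 10)
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
"""

def count_failures(optionflags=0):
    """Run TEXT as a doctest with the given flags; return the number of failures."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(TEXT, {"multiply": multiply}, "wrapped_list", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # out=... silences the failure report; only the counts matter here.
    return runner.run(test, out=lambda text: None).failed

print(count_failures())                              # 1
print(count_failures(doctest.NORMALIZE_WHITESPACE))  # 0
```

`doctest.testmod()` and `doctest.testfile()` accept the same `optionflags` argument, which may be handy when many examples in a file need wrapped output.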

Exceptions

Above, we discussed the ellipsis directive. The ellipsis, however, also helps in exception handling, that is, when we want to test an example that raises an error:

>>> add(1, "bzik")
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: 'int' and 'str'

The output clearly shows that an exception has been raised and that we test whether it was the expected exception.

Do you see a difference between this example of the ellipsis and the previous ones? Note that even though we do use the ellipsis here, we do not have to provide the ellipsis directive when testing output that contains an exception. I suppose this is because catching exceptions is so frequent in testing that the doctest creator decided to build this use of the ellipsis into the doctest syntax, without the need to add the ellipsis directive.
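You can verify this behavior with a short experiment. In the sketch below, the example is run with no option flags at all, so the ellipsis directive is definitely switched off; doctest ignores the traceback stack between the header line and the final line and compares only the exception type and message. The test name `exception_example` is a hypothetical label.

```python
import doctest

def add(x, y):
    """Add two values."""
    return x + y

TEXT = """
>>> add(1, "bzik")
Traceback (most recent call last):
    ...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

test = doctest.DocTestParser().get_doctest(
    TEXT, {"add": add}, "exception_example", None, 0
)
# No optionflags are passed, so ELLIPSIS is switched off.
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 1
```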

Debugging

doctest offers several mechanisms for debugging. Since we’re talking about the basics¹, I will present only one mechanism, which is the simplest but also, at least for me, both the most intuitive and the most useful. To be honest, ever since I started using doctest, I have used only this method, and it’s more than enough for my needs.

The method uses the built-in pdb module and, also built-in, the breakpoint() function. We can debug using doctest in two ways:

  1. Debug a particular function tested via doctest.
  2. Debug a doctest session.

Ad 1. Debug a particular function tested via doctest. In this case, you set a breakpoint in a function, using breakpoint(), and run doctest. It’s nothing more than standard debugging using breakpoint(), only started by running doctest.

Let’s use doctest to debug the following module, named play_debug.py, which contains only one function.

def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> play_the_debug_game(10, 200)
    142
    """
    x *= 2
    breakpoint()
    x += 22
    y *= 2
    y -= 300
    return x + y

Below, you will find a shell session recorded after running doctest on the module. In the output, I replaced the path with the ellipsis.

$ python -m doctest play_debug.py
> .../play_debug.py(9)play_the_debug_game()
-> x += 22
(Pdb) l
4 >>> play_the_debug_game(10, 200)
5 142
6 """
7 x *= 2
8 breakpoint()
9 -> x += 22
10 y *= 2
11 y -= 300
12 return x + y
[EOF]
(Pdb) x
20
(Pdb) n
> .../play_debug.py(10)play_the_debug_game()
-> y *= 2
(Pdb) x
42
(Pdb) c

So, this is a standard use of the pdb debugger. It is not really doctest debugging, but debugging of the code that is run via doctest. I use this approach often during code development, especially when the code is not yet enclosed in a working application.

Ad 2. Debug a doctest session. Unlike the previous method, this one means debugging the actual testing session. You do it in a similar way, but this time, you add a breakpoint in a test, not in a function. For instance:

def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> game_1 = play_the_debug_game(10, 200)
    >>> game_2 = play_the_debug_game(1, 2)
    >>> breakpoint()
    >>> game_1, game_2
    (142, -272)
    """
    x *= 2
    x += 22
    y *= 2
    y -= 300
    return x + y

And now a session:

$ python -m doctest play_debug.py
--Return--
> <doctest play_debug.play_the_debug_game[2]>(1)<module>()->None
-> breakpoint()
(Pdb) game_1
142
(Pdb) game_2
-272
(Pdb) game_1 + game_2
-130
(Pdb) c

This type of debugging is useful whenever you want to check what’s going on in the testing session, for instance, when a test behaves in an unexpected way.
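When an interactive session is not an option, one of the other standard-library helpers may be of use: `doctest.script_from_examples()` converts the examples of a doctest into an ordinary Python script, which you can then read or run step by step with any tool you like. This is only a sketch of that alternative, which goes beyond the single mechanism presented in this article; the docstring below is copied from the function above, without the breakpoint.

```python
import doctest

DOCSTRING = """Let's play and debug.

>>> game_1 = play_the_debug_game(10, 200)
>>> game_2 = play_the_debug_game(1, 2)
>>> game_1, game_2
(142, -272)
"""

# Prose and expected output become comments; the examples become plain code.
script = doctest.script_from_examples(DOCSTRING)
print(script)
```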

In the previous sections, I showed simple examples that use the basic doctest tool. As already mentioned, I seldom use more advanced tools. This is because doctest is designed in a way that indeed its basic tools enable one to achieve a lot. Maybe this is why I like this testing framework that much.

As an example of the advanced use of doctest, I will use two files from one of my Python packages, perftester: one presenting documentation and another presenting unit tests. If you want to see more examples, you can go to the package’s GitHub repository. As you can read in the tests/ folder’s README, I treated perftester as an experiment:

The use of doctesting as the only testing framework in perftester is an experiment. Tests in perftester are abundant, and are collected in four locations: the main README, docstrings in the main perftester module, the docs folder, and this folder.

Note that the main README of the package is full of doctests, but that’s just the beginning. To see more, visit the docs/ folder, which is full of Markdown files, each being a doctesting file, too. While these are testable documentation files, the files in the tests/ folder constitute actual test files. In other words, I used doctest as the only testing framework, for documentation, unit and integration testing. In my eyes, the experiment was successful, and the final conclusion was that when you do not need complex unit and integration tests in your project, you can use doctest as the only testing framework.

That I did not use pytest in this project does not mean I decided to stop using pytest altogether. My regular approach is joining doctest and pytest. I use doctest for test-based development, but once a function, class or method is ready, I move some of the tests to pytest files, keeping only some of the tests as doctests. I usually write READMEs with doctests, and if I need to add a doc file, I often make it a Markdown file with doctest tests.

Let me show you a fragment (about half) of one of the documentation files from perftester’s docs folder, docs/most_basic_use_time.md. In the code below, I wrapped two lines, to shorten them.

# Basic use of `perftester.time_test()`

```python
>>> import perftester as pt
>>> def preprocess(string):
...     return string.lower().strip()
>>> test_string = " Oh oh the young boy, this YELLOW one, wants to sing a song about the sun.\n"
>>> preprocess(test_string)[:19]
'oh oh the young boy'

```

## First step - checking performance

We will first benchmark the function, to learn how it performs:

```python
>>> first_run = pt.time_benchmark(preprocess, string=test_string)

```

`first_run` gives the following results:

```python
# pt.pp(first_run)
{'raw_times': [2.476e-07, 2.402e-07, 2.414e-07, 2.633e-07, 3.396e-07],
'raw_times_relative': [3.325, 3.226, 3.242, 3.536, 4.56],
'max': 3.396e-07,
'mean': 2.664e-07,
'min': 2.402e-07,
'min_relative': 3.226}
```

Fine, no need to change the settings, as the raw times are rather short,
and the relative time ranges from 3.3 to 4.6.

# Raw time testing

We can define a simple time-performance test, using raw values, as follows:

```python
>>> pt.time_test(preprocess, raw_limit=2e-06, string=test_string)

```

As is with the `assert` statement, no output means that the test has passed.

If you overdo with the limit so that the test fails, you will see the following:

```python
>>> pt.time_test(preprocess, raw_limit=2e-08, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
raw_limit = 2e-08
minimum run time = ...

```

# Relative time testing

Alternatively, we can use relative time testing, which will be more or
less independent of a machine on which it's run:

```python
>>> pt.time_test(preprocess, relative_limit=10, string=test_string)

```

If you overdo with the limit so that the test fails, you will see the following:

```python
>>> pt.time_test(preprocess, relative_limit=1, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
relative_limit = 1
minimum time ratio = ...

```

Note the following:

  • The only directive I used is the ellipsis directive. I did not need anything more advanced.
  • I wanted to present the output of the command # pt.pp(first_run). Since it contained a lot of random values (which represented benchmarking results), I did not make this code block a doctest test. I simply presented the results.

Below, I present one section (Defaults) of a test file tests/doctest_config.md. Unlike the above doc file, this one includes unit tests, not documentation. Hence, you will not see too much text in the whole file — and no text in this section. These are unit tests and nothing more:

## Defaults

```python
>>> import perftester as pt
>>> pt.config.defaults
{'time': {'number': 100000, 'repeat': 5}, 'memory': {'repeat': 1}}

>>> original_defaults = pt.config.defaults
>>> pt.config.set_defaults("time", number=100)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 5}, 'memory': {'repeat': 1}}

>>> pt.config.set_defaults("time", repeat=20)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 20}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=2, number=7)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 1}}

>>> pt.config.set_defaults("memory", repeat=100)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 100}}

>>> pt.config.set_defaults("memory", number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.set_defaults("memory", number=100, repeat=5)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.set_defaults("memory", repeat=5, number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.

>>> pt.config.defaults = original_defaults

```

Note that in this fragment, the only tool I used other than plain output is the ellipsis, which represents the tracebacks of exceptions raised by some of the commands.

Maybe you think that I did not show you too much of doctest… And you’re right. But what you learned will be enough for you to write even quite advanced tests; and in my opinion, that’s most of what you need to know about doctesting.

This does not mean it’s the end of our doctest journey. We will return to it in future articles, as doctesting offers more advanced tools, which, perhaps, you will want to use. We will then learn, for instance, how to use doctest in Test-Driven Development (TDD), and how to use it effectively in PoC projects.

¹ In the case of doctest, basic does not mean for beginners. Although I am using this package almost daily, I seldom use more advanced tools than those I presented in this article.
