Python Documentation Testing, doctest, Marcin Kozak
PYTHON PROGRAMMING
doctest allows for documentation, unit and integration testing, and test-driven development
Code testing does not have to be difficult. What’s more, testing makes coding easier and faster — and even, at least for some developers, more pleasurable. For testing to be pleasurable, however, the testing framework we use needs to be user-friendly.
Python offers several testing frameworks; three of the most popular are currently pytest and the built-in unittest and doctest. The first two focus on unit testing. doctest is different: its main purpose, though definitely not the only one, is documentation testing.
doctest is also the main topic of this article. I will introduce this particularly interesting tool, hoping you will see that despite its simplicity, it’s very useful. To be honest, doctest is my favorite Python package, as I stressed in this “PyDev of the Week” interview. Above, I wrote that if we want testing to be pleasurable, we need a user-friendly tool. To say that doctest is user-friendly is quite an understatement: it’s the simplest and most user-friendly testing tool I’ve ever met.
Having said that, I find it puzzling that most Pythonistas I know use doctest seldom or not at all. I hope this article will convince them, and you, that it’s worth introducing doctest into your daily coding routine.
I mentioned doctest is simple. Indeed, it’s so simple that reading this short article will be enough for you to use it!
doctest is a standard-library testing framework, so it comes with your Python installation. It means you do not have to install it. While it isn’t designed to be used in complicated unit testing scenarios, it is simple to use and yet powerful in most situations.
The doctest module offers what neither pytest nor unittest offers: documentation testing. To put it simply, documentation testing checks whether your code documentation is up-to-date. That’s quite some functionality, particularly in big projects: with just one command, you can check whether all examples are correct. Therefore, it replaces reading through code examples in the documentation over and over again, after each new commit, merge and release. You can also add doctesting to a CI/CD pipeline.
The approach doctest takes is what we call regression testing. In principle, you run the code from examples, save the output, and in the future you can run doctest to check whether the examples provide the same output, which means they work the way they should. As you see, this limits the use of this tool to objects that do not change from session to session. So, you cannot use this method to test random elements of the output. One example is objects’ ids, which will not be the same from session to session. This limitation is not doctest-specific, but it often seems more significant in this testing framework than in others. Fortunately, doctest provides tools to deal with this limitation.
As with any testing tool, you can, and should, run doctest after making any changes to your code, to check whether the changes affected the output. Sometimes they do, due to their character (e.g., you fixed a bug and now the function works correctly, but differently); if this is the case, you should update the corresponding tests. If the changes shouldn’t affect the output but they do, as indicated by failing tests, something is wrong with the code, and so it is the code that needs further changes.
To write doctests, you need to provide both the code and the expected output. In practice, you write the tests in a similar way to what the code and its output look like in the Python interpreter.
Let me picture this using a simple example. Consider the following function, made part of a doctest:
>>> def foo():
...     return 42
(You can import this function, but we will return to this later.) And this is how I see it in my Python interpreter:
We see a minor difference. When you define a function in an interpreter, you need to hit the Enter key twice, which is why we see the additional ellipsis at the end of the function. doctest code does not need this additional ellipsis line. What’s more, when you code in the interpreter, you do not have to add an additional white space before each ellipsis line (after the ellipsis), while you need to do it in doctest code; otherwise, the indentation will look as though it was composed of three spaces, not four. And one more difference, and a nice one, actually, is that in the Python interpreter, you will see no code highlighting. You may see it when you write doctests, for instance, in Markdown or reStructuredText files.
I think you’ll agree these are not big differences, and that they are good differences. Thanks to them, doctest code looks nice.
The >>> and ... (the Python ellipsis), thus, constitute the essential parts of tests, as doctest uses them to recognize that a new command has just started (>>>) and that the command from the previous line is being continued (...).
The above code snippet defining the foo() function still is not a valid doctest, as it only defines the function. To write a test, we need to call the function and include its expected output:
>>> foo()
42
And that’s it! This is our first doctest, and you’ve already learned most of what you need to know about this tool! Altogether, this doctest will look as follows:
>>> def foo():
...     return 42
>>> foo()
42
You can run tests from the shell. You can also do it from the Python interpreter, but since this requires the same code as one of the shell methods, I will not focus on it; you will seldom need it anyway.
Assume the test is saved as doctest_post_1.md. To run it, open the shell, navigate to the folder in which the test file is located, and run the following command (it will work in both Windows and Linux):
$ python -m doctest doctest_post_1.md
If the test passes, you will see nothing. If it fails, you will see this in the shell. To see how it works, let’s change 42 in the test to 43:
>>> foo()
43
This will be the output:
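A failure report names the failing example and shows the expected and the actual output side by side. The following is a minimal sketch for reproducing such a report without a separate file, using doctest's own parser and runner classes; the helper run_doctest_text() and the test name are hypothetical and introduced only for illustration.

```python
import doctest
import io

# The failing test from the text: foo() returns 42, but 43 is expected.
FAILING_TEST = """
>>> def foo():
...     return 42
>>> foo()
43
"""


def run_doctest_text(text, name="doctest_post_1"):
    """Run doctest examples given as a string; return (results, report)."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, None, 0)
    buffer = io.StringIO()
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=buffer.write)
    return results, buffer.getvalue()


results, report = run_doctest_text(FAILING_TEST)
print(report)   # the failure report, with "Expected:" and "Got:" sections
print(results)  # a named tuple with the numbers of failed and attempted examples
```

The definition of foo() counts as one attempted example and the call as another, so only the second one fails.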
You can do it in another way. Consider the following module:
# mathmodule.py
"""The simplest math module ever.

You can use two functions:
>>> add(2, 5.01)
7.01
>>> multiply(2, 5.01)
10.02
"""


def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    """
    return x + y


def multiply(x, y):
    """Multiply two values.

    >>> multiply(1, 1)
    1
    >>> multiply(-1, 1)
    -1
    """
    return x * y


if __name__ == "__main__":
    import doctest
    doctest.testmod()
With this name-main block, you can simplify the shell command:
$ python mathmodule.py
This runs the module, and running it means running all its doctests.
When you want to see a detailed output, add the -v flag. Below, I used this flag, which led to the following output:
$ python mathmodule.py -v
Trying:
    add(2, 5.01)
Expecting:
    7.01
ok
Trying:
    multiply(2, 5.01)
Expecting:
    10.02
ok
Trying:
    add(1, 1)
Expecting:
    2
ok
Trying:
    add(-1.0001, 1.0001)
Expecting:
    0.0
ok
Trying:
    multiply(1, 1)
Expecting:
    1
ok
Trying:
    multiply(-1, 1)
Expecting:
    -1
ok
3 items passed all tests:
   2 tests in __main__
   2 tests in __main__.add
   2 tests in __main__.multiply
6 tests in 3 items.
6 passed and 0 failed.
Test passed.
It’s rather cluttered, and to be honest, I almost never use it. Its main advantage is what we see at the end, that is, a summary of the tests. When developing, however, I do not need it, since I usually run the tests in order to check whether a particular function passes them. Thus, and this is important, doctests should be rather fast, as you normally do not request a particular test; you simply run all tests from a module.
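If you do want to check a single function during development, the standard library offers doctest.run_docstring_examples(), which runs only the examples found in one object's docstring. A minimal sketch, assuming two toy functions modeled on mathmodule; the check_one() helper is hypothetical and introduced only for illustration.

```python
import contextlib
import doctest
import io


def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    """
    return x + y


def multiply(x, y):
    """Multiply two values.

    >>> multiply(-1, 1)
    -1
    """
    return x * y


def check_one(func):
    """Run only the doctests in func's docstring; return the failure report."""
    buffer = io.StringIO()
    # run_docstring_examples() prints failures to stdout and returns None,
    # so the report is captured by redirecting stdout.
    with contextlib.redirect_stdout(buffer):
        doctest.run_docstring_examples(
            func, {func.__name__: func}, name=func.__name__
        )
    return buffer.getvalue()


report = check_one(add)  # the doctests of multiply() are not run here
print(report or "add: all doctests passed")
```

As with the shell command, an empty report means that all examples passed.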
Above, we showed how to write and run doctests, so it’s time to consider the practicalities, with perhaps slightly more complex examples. There are two general locations of doctest tests:
- in function/class/method docstrings
- in documentation files
Since they can significantly differ, let’s discuss them one by one.
doctests in docstrings
Consider the following function:
# mean_module.py
from typing import List, Union


def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)
This is how you can write doctest tests in a docstring. Here, we did it in a function docstring, but you can write them in any docstring, in any of the four doctest locations:
- a module docstring
- a function docstring (like in the example above)
- a class docstring
- a method docstring
Using docstrings is, I think, the most common use of doctest. Remember that whatever a function’s docstring contains will appear in its help page, which you can see in the Python interpreter by calling the help() function. For example:
>>> from mean_module import mean
>>> help(mean)
The interpreter prompt will disappear, and instead you will see the following help page of the mean() function:
This is another thing I appreciate in doctest tests: combined with docstrings, they make help pages, and so code documentation, clean and up-to-date. Sometimes functions or methods are so complex that it’s very difficult to see how they should be used. Then, oftentimes, something as small as a doctest test or two added to a docstring will be more informative than the type hints and the code and the rest of the docstring together.
Remember, however, not to overwhelm docstrings with doctests. While it may be tempting to include many such tests in them, you should not do that. Reading such docstrings is unpleasant. It’s better to keep crucial tests in a docstring and move the remaining ones elsewhere. You can either move them to a dedicated doctest file (or files) or translate them into pytest tests. I’ve used both solutions.
doctests in documentation files
You can write doctests in files of various types. I prefer Markdown (.md), but you can also do it in reStructuredText (.rst) and even plain text (.txt) files. Do not try to do that in files that use their own encoding, however. For instance, doctest does not work with .rtf files.
The below code block presents an example of a Markdown file with doctests. To save space, I will include here only basic tests, but they are enough to show how to create such files.
Consider the following README file (README.md) of the mathmodule module:
# Introduction

You will find here the documentation of the mathmodule Python module.
It contains as many as two functions, `add()` and `multiply()`.
First, let's import the functions:
```python
>>> from mathmodule import add, multiply
```
## Using `add()`
You can use the `add()` function in the following way:
```python
>>> add(1, 1)
2
>>> add(2, 5.01)
7.01
```
## Using `multiply()`
You can use the `multiply()` function in the following way:
```python
>>> multiply(1, 1)
1
>>> multiply(-1, 1)
-1
```
As you see, there is nothing complicated here: you add doctests in typical code blocks. There are two things you should remember:
- Add an empty line before finishing a code block. Otherwise, doctest will treat ``` as part of the output.
- Don’t forget to import all the objects you’re going to use in the test file; here, these are the add() and multiply() functions. This may seem to you a basic mistake, maybe even too basic. Even if it is basic, I’ve made it quite often; I even made it here, when writing the above README file.
As you can see, I included all the tests inside code blocks, but even tests written outside of code blocks would be run. I do not see any point in doing so, however.
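The first rule can be checked with a short experiment. The sketch below builds two tiny Markdown code blocks, one with and one without an empty line before the closing fence, and runs the examples in them through doctest's parser and runner; the helpers markdown_block() and count_failures() are hypothetical names used only for this illustration.

```python
import doctest

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the listing readable


def add(x, y):
    return x + y


def markdown_block(blank_before_close):
    """Return a Markdown code block holding one doctest."""
    lines = [FENCE + "python", ">>> add(1, 1)", "2"]
    if blank_before_close:
        lines.append("")  # the empty line ends the expected output
    lines.append(FENCE)
    return "\n".join(lines) + "\n"


def count_failures(text):
    """Run the doctest examples found in text; return the number of failures."""
    test = doctest.DocTestParser().get_doctest(text, {"add": add}, "README", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda report: None).failed


print(count_failures(markdown_block(blank_before_close=True)))   # 0
print(count_failures(markdown_block(blank_before_close=False)))  # 1
```

Without the empty line, the closing fence becomes the second line of the expected output, so the example fails even though add(1, 1) does return 2.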
Above, we’ve learned the very basic use of doctest. We can do more with it, however. We can access most of the additional doctest functionality through so-called directives.
Directives are added directly after the code being tested, using the specific syntax shown below. When the command that needs a directive is split into several lines, you can add the directive to any of them (I will show an example in a second). Directives change the behavior of doctest; for instance, the test can ignore part of the output, normalize white spaces, catch exceptions, and the like.
Ellipsis
Perhaps the most important and most frequently used directive is the ellipsis: # doctest: +ELLIPSIS. Below, you can see two tests for the above multiply function, one without and the other with the ELLIPSIS directive:
>>> multiply(2.500056, 1/2.322)
1.0766821705426355
>>> multiply(2.500056, 1/2.322) # doctest: +ELLIPSIS
1.076...
So, you need to add the directive after the tested code and the ellipsis inside the output, and whatever is printed in the output where the ellipsis is will be ignored. This is another example:
>>> multiply # doctest: +ELLIPSIS
<function multiply at 0x7...>
In the example above, you can use the closing > character, but you don’t have to.
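The effect of the directive is easy to verify. The sketch below runs the same kind of example with and without it; count_failures() is a hypothetical helper, and the last call uses the optionflags argument of doctest.DocTestRunner, which turns the option on for all examples at once instead of marking each of them.

```python
import doctest


def multiply(x, y):
    return x * y


WITH_DIRECTIVE = """
>>> multiply(2.500056, 1/2.322)  # doctest: +ELLIPSIS
1.076...
>>> multiply  # doctest: +ELLIPSIS
<function multiply at 0x...>
"""

WITHOUT_DIRECTIVE = """
>>> multiply(2.500056, 1/2.322)
1.076...
"""


def count_failures(text, optionflags=0):
    """Run the doctest examples found in text; return the number of failures."""
    globs = {"multiply": multiply}
    test = doctest.DocTestParser().get_doctest(text, globs, "ellipsis", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    return runner.run(test, out=lambda report: None).failed


print(count_failures(WITH_DIRECTIVE))                                  # 0
print(count_failures(WITHOUT_DIRECTIVE))                               # 1
print(count_failures(WITHOUT_DIRECTIVE, optionflags=doctest.ELLIPSIS))  # 0
```

Without the directive, the three dots are compared literally against the full float, and the example fails.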
Long lines in output
Long lines can be a burden. We have two methods to deal with overly long lines in output: (i) \ and (ii) the NORMALIZE_WHITESPACE directive.
Method (i): using \. Like in Python code, we can use a backslash (\) to split the output into several lines, like here:
>>> s1 = "a"*20
>>> s2 = "b"*40
>>> add(s1, s2)
'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
This test will indeed pass. The above format of the output is pretty ugly, but it will be much worse when we need to split lines in a function, class or method docstring:
def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    >>> s1 = "a"*20
    >>> s2 = "b"*40
    >>> add(s1, s2)
    'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
    """
    return x + y
As you see, splitting the line into two required us to move the second line to its very beginning; otherwise, doctest would see the following output:
'aaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
and so the test would fail.
Note that this is the only method to deal with splitting strings between lines. When you need to split long output (e.g., a list, a tuple and the like), the next method will work better.
Method (ii): the NORMALIZE_WHITESPACE directive. This method avoids the ugly backslash. With the directive, doctest treats any run of white space, line breaks included, as a single space:
>>> multiply([1, 2], 10) #doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
This test would not pass without the directive:
Failed example:
multiply([1, 2], 10)
Expected:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
Got:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
but it passes with the directive. Whenever I can, I use this directive instead of the backslash, except for splitting long strings.
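The same comparison can be made programmatically. In the sketch below, the wrapped expected output is checked once with and once without the directive; count_failures() is again a hypothetical helper built on doctest's parser and runner.

```python
import doctest


def multiply(x, y):
    return x * y


DIRECTIVE = "  # doctest: +NORMALIZE_WHITESPACE"

WRAPPED = """
>>> multiply([1, 2], 10)  # doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
"""

# The same test with the directive stripped from the command.
WRAPPED_NO_DIRECTIVE = WRAPPED.replace(DIRECTIVE, "")


def count_failures(text):
    """Run the doctest examples found in text; return the number of failures."""
    globs = {"multiply": multiply}
    test = doctest.DocTestParser().get_doctest(text, globs, "wrapping", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda report: None).failed


print(count_failures(WRAPPED))               # 0
print(count_failures(WRAPPED_NO_DIRECTIVE))  # 1
```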
Exceptions
Above, we discussed the ellipsis directive. The ellipsis, however, also helps in exception handling, that is, when we want to test an example that throws an error:
>>> add(1, "bzik")
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
The output clearly shows that an exception has been raised and that we test whether it was the expected exception.
Do you see a difference between this example of the ellipsis and the previous ones? Note that even though we do use the ellipsis here, we do not have to provide the ellipsis directive when testing the output with an exception. I suppose this is because catching exceptions is so frequent in testing that the doctest creator decided to build this use of the ellipsis into the doctest syntax, without the need to add the ellipsis directive.
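A quick check confirms that no directive is needed here, and that doctest still compares the exception type and message. The sketch below is an illustration only; count_failures() is a hypothetical helper, and the second test deliberately expects the wrong exception type.

```python
import doctest


def add(x, y):
    return x + y


MATCHING = """
>>> add(1, "bzik")
Traceback (most recent call last):
    ...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
"""

WRONG_TYPE = """
>>> add(1, "bzik")
Traceback (most recent call last):
    ...
ValueError: unsupported operand type(s) for +: 'int' and 'str'
"""


def count_failures(text):
    """Run the doctest examples found in text; return the number of failures."""
    test = doctest.DocTestParser().get_doctest(text, {"add": add}, "exceptions", None, 0)
    runner = doctest.DocTestRunner(verbose=False)  # no option flags at all
    return runner.run(test, out=lambda report: None).failed


print(count_failures(MATCHING))    # 0
print(count_failures(WRONG_TYPE))  # 1
```

The traceback body between the header and the last line is skipped, but the final line with the exception type and message has to match.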
Debugging
doctest offers several mechanisms for debugging. Since we’re talking about the basics¹, I will present only one mechanism, which is the simplest but also, at least for me, both the most intuitive and the most useful. To be honest, ever since I started using doctest, I have used only this method, and it’s more than enough for my needs.
The method uses the built-in pdb module and the breakpoint() function, also built-in. We can debug using doctest in two ways:
1. Debug a particular function tested via doctest.
2. Debug a doctest session.
Ad 1. Debug a particular function tested via doctest. In this case, you set a breakpoint, using breakpoint(), in a function and run doctest. It’s nothing more than standard debugging using breakpoint(), but started by running doctest.
Let’s use doctest to debug the following module, named play_debug.py, which contains only one function.
def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> play_the_debug_game(10, 200)
    142
    """
    x *= 2
    breakpoint()
    x += 22
    y *= 2
    y -= 300
    return x + y
Below, you will find a shell session with the debugging started by running doctest on the module. In the output, I replaced the path with an ellipsis.
$ python -m doctest play_debug.py
> .../play_debug.py(9)play_the_debug_game()
-> x += 22
(Pdb) l
4 >>> play_the_debug_game(10, 200)
5 142
6 """
7 x *= 2
8 breakpoint()
9 -> x += 22
10 y *= 2
11 y -= 300
12 return x + y
[EOF]
(Pdb) x
20
(Pdb) n
> .../play_debug.py(10)play_the_debug_game()
-> y *= 2
(Pdb) x
42
(Pdb) c
So, this is a standard use of the pdb debugger. It is not really doctest debugging, but debugging of the code that is run via doctest. I use this approach often during code development, especially when the code is not yet enclosed in a working application.
Ad 2. Debug a doctest session. Unlike the previous method, this one means debugging the actual testing session. You do it in a similar way, but this time, you add a breakpoint in a test, not in a function. For instance:
def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> game_1 = play_the_debug_game(10, 200)
    >>> game_2 = play_the_debug_game(1, 2)
    >>> breakpoint()
    >>> game_1, game_2
    (142, -272)
    """
    x *= 2
    x += 22
    y *= 2
    y -= 300
    return x + y
And now a session:
$ python -m doctest play_debug.py
--Return--
> <doctest play_debug.play_the_debug_game[2]>(1)<module>()->None
-> breakpoint()
(Pdb) game_1
142
(Pdb) game_2
-272
(Pdb) game_1 + game_2
-130
(Pdb) c
This type of debugging can be useful whenever you want to inspect the state of the testing session, for instance, when a test behaves in an unexpected way.
In the previous sections, I showed simple examples that use the basic doctest tools. As already mentioned, I seldom use more advanced tools. This is because doctest is designed in such a way that its basic tools alone enable one to achieve a lot. Maybe this is why I like this testing framework that much.
As an example of the advanced use of doctest, I will use two files, one presenting documentation and another presenting unit tests, from one of my Python packages, perftester. If you want to see more examples, you can go to the package’s GitHub repository. As you can read in the tests/ folder’s README, I treated perftester as an experiment:
The use of doctesting as the only testing framework in perftester is an experiment. Tests in perftester are abundant, and are collected in four locations: the main README, docstrings in the main perftester module, the docs folder, and this folder.
Note that the main README of the package is full of doctests, but that’s just the beginning. To see more, visit the docs/ folder, which is full of Markdown files, each being a doctesting file, too. While these are testable documentation files, the files in the tests/ folder constitute actual test files. In other words, I used doctest as the only testing framework, for documentation, unit and integration testing. In my eyes, the experiment was successful, and the final conclusion was that when you do not need complex unit and integration tests in your project, you can use doctest as the only testing framework.
That I did not use pytest in this project does not mean I decided to stop using pytest altogether. My regular approach is to combine doctest and pytest. I use doctest for test-based development, but once a function, class or method is ready, I move some of the tests to pytest files, keeping only some of them as doctests. I usually write READMEs with doctests, and if I need to add a doc file, I often make it a Markdown file with doctest tests.
Let me show you a fragment (about half) of one of the documentation files from perftester’s docs folder, docs/most_basic_use_time.md. In the code below, I wrapped two lines, to shorten them.
# Basic use of `perftester.time_test()`

```python
>>> import perftester as pt
>>> def preprocess(string):
...     return string.lower().strip()
>>> test_string = " Oh oh the young boy, this YELLOW one, wants to sing a song about the sun.\n"
>>> preprocess(test_string)[:19]
'oh oh the young boy'
```
## First step - checking performance
We will first benchmark the function, to learn how it performs:
```python
>>> first_run = pt.time_benchmark(preprocess, string=test_string)
```
`first_run` gives the following results:
```python
# pt.pp(first_run)
{'raw_times': [2.476e-07, 2.402e-07, 2.414e-07, 2.633e-07, 3.396e-07],
'raw_times_relative': [3.325, 3.226, 3.242, 3.536, 4.56],
'max': 3.396e-07,
'mean': 2.664e-07,
'min': 2.402e-07,
'min_relative': 3.226}
```
Fine, no need to change the settings, as the raw times are rather short,
and the relative time ranges from 3.3 to 4.6.
# Raw time testing
We can define a simple time-performance test, using raw values, as follows:
```python
>>> pt.time_test(preprocess, raw_limit=2e-06, string=test_string)
```
As is with the `assert` statement, no output means that the test has passed.
If you overdo with the limit so that the test fails, you will see the following:
```python
>>> pt.time_test(preprocess, raw_limit=2e-08, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
raw_limit = 2e-08
minimum run time = ...
```
# Relative time testing
Alternatively, we can use relative time testing, which will be more or
less independent of a machine on which it's run:
```python
>>> pt.time_test(preprocess, relative_limit=10, string=test_string)
```
If you overdo with the limit so that the test fails, you will see the following:
```python
>>> pt.time_test(preprocess, relative_limit=1, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
relative_limit = 1
minimum time ratio = ...
```
Note the following:
- The only directive I used is the ellipsis directive. I did not need anything more advanced.
- I wanted to present the output of the command # pt.pp(first_run). Since it contained a lot of random values (which represented benchmarking results), I did not make this code block a doctest test. I simply presented the results.
Below, I present one section (Defaults) of a test file, tests/doctest_config.md. Unlike the above doc file, this one includes unit tests, not documentation. Hence, you will not see too much text in the whole file, and no text in this section. These are unit tests and nothing more:
## Defaults

```python
>>> import perftester as pt
>>> pt.config.defaults
{'time': {'number': 100000, 'repeat': 5}, 'memory': {'repeat': 1}}
>>> original_defaults = pt.config.defaults
>>> pt.config.set_defaults("time", number=100)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 5}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=20)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 20}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=2, number=7)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("memory", repeat=100)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 100}}
>>> pt.config.set_defaults("memory", number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.set_defaults("memory", number=100, repeat=5)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.set_defaults("memory", repeat=5, number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.defaults = original_defaults
```
Note that in this fragment, the only tool I used beyond plain output is the ellipsis, which stands for the traceback of the exceptions raised by some of the commands.
Maybe you think that I did not show you too much of doctest… And you’re right. But what you learned will be enough for you to write even quite advanced tests; and in my opinion, that’s most of what you need to know about doctesting.
This does not mean it’s the end of our doctest journey. We will return to it in the next articles, as doctesting offers more advanced tools, which, perhaps, you will want to use. We will then learn, for instance, how to use doctest in Test-Driven Development (TDD), and how to use it effectively in PoC projects.
¹ In the case of doctest, basic does not mean for beginners. Although I use this package almost daily, I seldom use tools more advanced than those I presented in this article.
PYTHON PROGRAMMING
doctest allows for documentation, unit and integration testing, and test-driven development
Code testing does not have to be difficult. What’s more, testing makes coding easier and faster — and even, at least for some developers, more pleasurable. For testing to be pleasurable, however, the testing framework we use needs to be user-friendly.
Python offers several testing frameworks, currently three of the most popular being pytest
and built-in unittest
and doctest
. The first two are focused on unit testing. doctest
is different, its main purpose — but definitely not the only one — being documentation testing.
doctest
is also the main purpose of this article. I will introduce this particularly interesting tool, hoping you will see that despite its simplicity, it’s very useful. To be honest, doctest
is my favorite Python package, as I stressed in this “PyDev of the Week” interview. Above, I wrote that if we want testing to be pleasurable, we need a user-friendly tool. To say that doctest
is user-friendly is quite an understatement — it’s the simplest and user-friendliest testing tool I’ve ever met.
Having said that, I consider it a puzzle why most Pythonistas I know use doctest
seldom or do not use it at all. I hope this article will convince them — and you — that it’s worth introducing doctest
to daily coding routine.
I mentioned doctest
is simple. Indeed, it’s so simple that reading this short article will be enough for you to use it!
doctest
is a standard-library testing framework, so it comes with your Python installation. It means you do not have to install it. While it isn’t designed to be used in complicated unit testing scenarios, it is simple to use and yet powerful in most situations.
The doctest
module offers what neither pytest
nor unittest
offers: documentation testing. To put it simply, documentation testing is useful in testing whether your code documentation is up-to-date. That’s quite some functionality, particularly in big projects: with just one command, you can check whether all examples are correct. Therefore, it replaces reading through code examples in the documentation over and over again, after each new commit, merge and release. You can also add doctest
ing into a CI/CD pipeline.
The approach doctest
takes uses what we call regression testing. In principle, you run the code from examples, save the output, and in the future you can run doctest
to check whether the examples provide the same output — which means they work the way they should. As you see, this limits the use of this tool to objects that do not change from session to session. So, you cannot use this method to test random elements of the output. One example is objects’ id
s, which will not be the same from session to session. This limitation is not doctest
-specific, but often it seems to be more significant in this testing framework than in others. Fortunately, doctest
provides tools to deal with this limitation.
As with any testing tool, you can — and should — run doctest
after making any changes to your code, in order to check out whether the changes did not affect the output. Sometimes they do, due to their character (e.g., you fixed a bug and now the function works correctly — but differently); if this is the case, you should update the corresponding tests. If the changes shouldn’t affect the output but they do, as indicated by failing tests, something is wrong with the code, and so it is the code that needs further changes.
To write doctest
s, you need to provide both the code and the expected output. In practice, you write the tests in a similar way to what the code and its output look like in the Python interpreter.
Let me picture this using a simple example. Consider the following function, made part of a doctest
:
>>> def foo():
... return 42
(You can import this function, but we will return to this later.) And this is how I see it in my Python interpreter:
We see a minor difference. When you define a function in an interpreter, you need to hit the Enter key twice, which is why we see the additional ellipsis at the end of the function. doctest
code does not need this additional ellipsis line. What’s more, when you code in the interpreter, you do not have to add an additional white space before each ellipsis line (after the ellipsis), while you need to do it in doctest
code; otherwise, the indentation will look as though it was composed of three spaces, not four. And one more difference — and a nice one, actually — is that in the Python interpreter, you will see no code highlighting. You may see it when you write doctests, for instance, in Markdown or reStructuredText files.
I think you’ll agree these are not big differences, and that they are good differences. Thanks to them, doctest
code looks nice.
The >>>
and ...
(the Python ellipsis), thus, constitute the essential parts of tests, as doctest
uses them to recognize that a new command has just started (>>>
) and that the command from the previous line is being continued (...
).
The above cope snippet defining the foo()
function still is not a valid doctest
, as it only defines the function. To write a test, we need to call the function and include its expected output:
>>> foo()
42
And that’s it! This is our first doctest
— and you’ve already learned most of what you need to know about this tool! Altogether, this doctest
will look as follows:
>>> foo()
42
>>> def foo():
... return 42
You can run tests from shell. You can also do it from the Python interpreter, but since it needs the same code as one of the shell methods, I will not focus on this; you will seldom need it, too. ???unclear sentence!!!
Assume the test is saved as doctest_post_1.md
. To run it, open shell, navigate to the folder in which the test file is located, and run the following command (it will work in both Windows and Linux):
$ python -m doctest doctest_post_1.md
If the test passes, you will see nothing. If it fails, you will see this in the shell. To see how it works, let’s change 42 in the test to 43:
>>> foo()
43
This will be the output:
You can do it in another way. Consider the following module:
# mathmodule.py
"""The simplest math module ever.You can use two functions:
>>> add(2, 5.01)
7.01
>>> multiply(2, 5.01)
10.02
"""
def add(x, y):
"""Add two values.
>>> add(1, 1)
2
>>> add(-1.0001, 1.0001)
0.0
"""
return x + y
def multiply(x, y):
"""Multiple two values.
>>> multiply(1, 1)
1
>>> multiply(-1, 1)
-1
"""
return x * y
if __name__ == "__main__":
import doctest
doctest.testmod()
With this name-main block, you can simplify the shell command:
$ python doctest_post_1.md
This runs the module, and running it means running all its doctest
s.
When you want to see detailed output, add the -v flag. Below, I used this flag, which led to the following output:
$ python mathmodule.py -v
Trying:
    add(2, 5.01)
Expecting:
    7.01
ok
Trying:
    multiply(2, 5.01)
Expecting:
    10.02
ok
Trying:
    add(1, 1)
Expecting:
    2
ok
Trying:
    add(-1.0001, 1.0001)
Expecting:
    0.0
ok
Trying:
    multiply(1, 1)
Expecting:
    1
ok
Trying:
    multiply(-1, 1)
Expecting:
    -1
ok
3 items passed all tests:
   2 tests in __main__
   2 tests in __main__.add
   2 tests in __main__.multiply
6 tests in 3 items.
6 passed and 0 failed.
Test passed.
It’s rather cluttered, and to be honest, I almost never use it. Its main advantage is what we see at the end, that is, a summary of the tests. When developing, however, I do not need it, since I usually run the tests to check whether a particular function passes them. Thus, and this is important, doctests should be rather fast, as you normally do not request a particular test; you simply run all tests from a module.
Above, we showed how to write and run doctests, so it’s time to consider the practicalities, with perhaps slightly more complex examples. There are two general locations of doctest tests:
- in function/class/method docstrings
- in documentation files
Since they can significantly differ, let’s discuss them one by one.
doctests in docstrings
Consider the following function:
# mean_module.py
from typing import List, Union


def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)
This is how you can write doctest tests in a docstring. Here, we did it in a function docstring, but you can write them in any docstring, placed in one of the four doctest locations:
- a module docstring
- a function docstring (like in the example above)
- a class docstring
- a method docstring
Using docstrings is, I think, the most common use of doctest. Remember that whatever a function’s docstring contains will appear on its help page, which you can see in the Python interpreter by calling the help() function. For example:
>>> from mean_module import mean
>>> help(mean)
The interpreter will disappear, and instead you will see the below help page of the mean()
function:
This is another thing I appreciate in doctest
tests: combined with docstrings, they make help pages, and so code documentation, clean and up-to-date. Sometimes functions or methods are so complex that it’s very difficult to see how they should be used. Then, oftentimes, something as small as a doctest
test or two added to a docstring will be more informative than the type hints and the code and the rest of the docstring together.
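Since the help page is built from the docstring, you can look at the same text without opening the interactive pager. The sketch below is only an illustration of that point; it uses inspect.getdoc() from the standard library, and the doc_text name is my own.

```python
import inspect
from typing import List, Union


def mean(x: List[Union[float, int]]) -> float:
    """Calculate the mean of a list of floats.

    >>> x = [1.12, 1.44, 1.123, 6.56, 2.99]
    >>> round(mean(x), 2)
    2.65
    """
    return sum(x) / len(x)


# getdoc() returns the docstring with its indentation cleaned up;
# the doctest examples are part of it, just as on the help page.
doc_text = inspect.getdoc(mean)
print(doc_text)
```

The printed text contains both the one-line description and the two doctest examples, which is why a well-chosen doctest doubles as usage documentation.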
Remember, however, to not overwhelm docstrings with doctests. While it may be tempting to include many such tests in them, you should not do that. Reading such docstrings is unpleasant. It’s better to keep crucial tests in a docstring and move the remaining ones elsewhere. You can either move them to a dedicated doctest
file (or files) or translate them into pytest
tests. I’ve used both solutions.
doctests in documentation files
You can write doctests in files of various types. I prefer Markdown (.md) but you can do it in reStructuredText (.rst) and even text (.txt) files. Do not try to do that in files that use format-specific coding, however, such as that of the end of file (EOF). For instance, doctest does not work with .rtf files.
The below code block presents an example of a Markdown file with doctests. To save space, I will include here only basic tests, but they are enough to show how to create such files.
Consider the following README file (README.md) of the mathmodule
module:
# Introduction

You will find here documentation of the mathmodule Python module.
It contains as many as two functions, `add()` and `multiply()`.

First, let's import the functions:

```python
>>> from mathmodule import add, multiply

```

## Using `add()`

You can use the `add()` function in the following way:

```python
>>> add(1, 1)
2
>>> add(2, 5.01)
7.01

```

## Using `multiply()`

You can use the `multiply()` function in the following way:

```python
>>> multiply(1, 1)
1
>>> multiply(-1, 1)
-1

```
As you see, there is nothing complicated here: you add doctests in typical code blocks. There are two things you should remember:
- Add an empty line before finishing a code block. Otherwise, doctest will treat ``` as part of the output.
- Don’t forget to import all the objects you’re going to use in the test file; here, these are the add() and multiply() functions. This may seem to you a basic mistake, maybe even too basic. Even if it is basic, I’ve made it quite often; I made it even here, when writing the above README file.
As you can see, I included all the tests inside code blocks, but even tests written outside of code blocks would be run. I do not see any point in doing so, however.
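The first of the two points above is easy to check with doctest's own parser. The following is a minimal sketch: the two Markdown snippets and all variable names are my assumptions, and the code fence is built from three backticks in code only to keep this example readable.

```python
import doctest

FENCE = "`" * 3  # a Markdown code fence

# A code block that is closed right after the expected output.
without_blank_line = f"""\
{FENCE}python
>>> 1 + 1
2
{FENCE}
"""

# The same block with an empty line before the closing fence.
with_blank_line = f"""\
{FENCE}python
>>> 1 + 1
2

{FENCE}
"""

parser = doctest.DocTestParser()
want_without = parser.get_examples(without_blank_line)[0].want
want_with = parser.get_examples(with_blank_line)[0].want

print(repr(want_without))  # the closing fence became part of the expected output
print(repr(want_with))     # only the real expected output is left
```

Expected output ends at the first blank line (or at the next prompt), so without the empty line the closing fence is read as something the command is supposed to print.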
Above, we’ve learned the very basic use of doctest. We can do more with it, however. We can use most of the additional doctest functionalities through so-called directives.
Directives are added directly after the code being tested, using the specific syntax shown below. When the command that needs a directive is split into several lines, you can add the directive to any of them. Directives change the behavior of doctest; for instance, the test can ignore part of the output, normalize white spaces, catch exceptions, and the like.
Ellipsis
Perhaps the most important and most frequently used directive is the ellipsis: # doctest: +ELLIPSIS
. Below, you can see two tests for the above multiply
function, one without and the next with the ELLIPSIS
directive:
>>> multiply(2.500056, 1/2.322)
1.0766821705426355
>>> multiply(2.500056, 1/2.322) # doctest: +ELLIPSIS
1.076...
So, you need to add the directive after the tested code and the ellipsis inside the output; whatever is printed in the place of the ellipsis will be ignored. This is another example:
>>> multiply # doctest: +ELLIPSIS
<function multiply at 0x7...>
In the example above, you can use the closing >
character, but you don’t have to.
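A directive does not have to sit on the first line of a command. Here is a minimal sketch of a command split into several lines, with the ellipsis directive placed on one of the continuation lines. The small harness built from DocTestParser and DocTestRunner is only my way of making the example check itself; normally you would simply run doctest on the module.

```python
import doctest


def multiply(x, y):
    """Multiply two values.

    >>> multiply(
    ...     2.500056,
    ...     1 / 2.322,  # doctest: +ELLIPSIS
    ... )
    1.076...
    """
    return x * y


# Build a test from the docstring and run it quietly.
test = doctest.DocTestParser().get_doctest(
    multiply.__doc__, {"multiply": multiply}, "multiply", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

The whole multi-line command counts as one example, and the directive applies to that example only.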
Long lines in output
Long lines can be a burden. We have two methods to deal with overly long lines in output: (i) \ and (ii) the NORMALIZE_WHITESPACE directive.
Method (i): using \
. Like in Python code, we can use a backslash (\
) to split the output into several lines, like here:
>>> s1 = "a"*20
>>> s2 = "b"*40
>>> add(s1, s2)
'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
This test will indeed pass. The above format of the output is pretty ugly, but it will be much worse when we need to split lines in a function, class or method docstring:
def add(x, y):
    """Add two values.

    >>> add(1, 1)
    2
    >>> add(-1.0001, 1.0001)
    0.0
    >>> s1 = "a"*20
    >>> s2 = "b"*40
    >>> add(s1, s2)
    'aaaaaaaaaaaaaaaaaaaa\
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
    """
    return x + y
As you see, splitting the line into two required us to move the second line to its very beginning; otherwise, doctest
would see the following output:
'aaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'
and so the test would fail.
Note that this is the only method to deal with splitting strings between lines. When you need to split long output (e.g., a list, a tuple and the like), the next method will work better.
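Why does the backslash work in a docstring at all? As far as I can tell, it is Python's string-literal rules, not doctest, that join the two lines: in a regular (non-raw) string, a backslash followed by a newline is dropped. The sketch below illustrates this with shorter strings; the harness around it is only my assumption for making the example self-checking.

```python
import doctest


def add(x, y):
    """Add two values.

    >>> add("a" * 3, "b" * 4)
    'aaa\
bbbb'
    """
    return x + y


# Python joined the two physical lines while reading the string literal,
# so the docstring seen by doctest holds the expected output on one line.
print("'aaabbbb'" in add.__doc__)
print("\\" in add.__doc__)

test = doctest.DocTestParser().get_doctest(add.__doc__, {"add": add}, "add", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

This also explains why the second line has to start at the very beginning: any indentation placed there would become part of the joined string. Note that in a raw docstring (one prefixed with r) the backslash would stay in the text.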
Method (ii): the NORMALIZE_WHITESPACE
directive. This method does not use the ugly backslash. Instead, it uses the NORMALIZE_WHITESPACE
directive:
>>> multiply([1, 2], 10) #doctest: +NORMALIZE_WHITESPACE
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
This test would not pass without the directive:
Failed example:
multiply([1, 2], 10)
Expected:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,
1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
Got:
[1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
but it passes with the directive. Whenever I can, I use this directive instead of the backslash, except for splitting long strings.
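If you are curious what the directive changes, you can compare the expected and actual output directly with doctest's OutputChecker. This is only a sketch under my own naming (want, got, strict, normalized); the flag and the checker come from the standard doctest module.

```python
import doctest


def multiply(x, y):
    return x * y


# What the interpreter would actually print, on a single line.
got = repr(multiply([1, 2], 10)) + "\n"
# The expected output as written in the test, wrapped into two lines.
want = (
    "[1, 2, 1, 2, 1, 2, 1, 2, 1, 2,\n"
    " 1, 2, 1, 2, 1, 2, 1, 2, 1, 2]\n"
)

checker = doctest.OutputChecker()
# Without the flag, the line break makes the two texts differ.
strict = checker.check_output(want, got, 0)
# With the flag, every run of whitespace (newlines included) counts as one space.
normalized = checker.check_output(want, got, doctest.NORMALIZE_WHITESPACE)
print(strict, normalized)
```

The comparison fails in strict mode and succeeds once whitespace is normalized, which is exactly what the directive switches on for a single example.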
Exceptions
Above, we discussed the ellipsis directive. The ellipsis, however, also helps in exception handling, that is, when we want to test an example that raises an error:
>>> add(1, "bzik")
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for +: 'int' and 'str'
The output clearly shows that an exception has been raised and that we test whether it was the expected exception.
Do you see a difference between this example of the ellipsis and the previous ones? Note that even though we do use the ellipsis here, we do not have to provide the ellipsis directive when testing the output with an exception. I suppose this is because catching exceptions is so frequent in testing that the doctest creator decided to build this use of the ellipsis into the doctest syntax, without the need to add the ellipsis directive.
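Here is a minimal sketch that runs the exception test above without any directive and checks that it passes. As far as I understand, doctest compares only the exception type and message and ignores the traceback stack between the header and the last line, so the ellipsis there is more of a convention than a pattern. The harness is my assumption, used to make the example self-checking.

```python
import doctest


def add(x, y):
    """Add two values.

    >>> add(1, "bzik")
    Traceback (most recent call last):
        ...
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    """
    return x + y


# No ELLIPSIS directive anywhere; the test is run with default options.
test = doctest.DocTestParser().get_doctest(add.__doc__, {"add": add}, "add", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

The test passes because the raised exception is a TypeError with the same message as the one written in the expected output.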
Debugging
doctest
offers several mechanisms for debugging. Since we’re talking about the basics¹, I will present only one mechanism, which is the simplest but also, at least for me, both the most intuitive and the most useful. To be honest, ever since I’ve been using doctest
, I use only this method, and it’s more than enough for my needs.
The method uses the built-in pdb
module and, also built-in, the breakpoint()
function. We can debug using doctest
in two ways:
1. Debug a particular function tested via doctest.
2. Debug a doctest session.
Ad 1. Debug a particular function tested via doctest. In this case, you set a breakpoint, using breakpoint(), in a function and run doctest. It’s nothing more than standard debugging using breakpoint(), but started by running doctest.
Let’s use doctest to debug the following module, named play_debug.py, which contains only one function.
def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> play_the_debug_game(10, 200)
    142
    """
    x *= 2
    breakpoint()
    x += 22
    y *= 2
    y -= 300
    return x + y
Below, you will find the shell output of a debug session started by running doctest on the module. In the output, I replaced the path with the ellipsis.
$ python -m doctest play_debug.py
> .../play_debug.py(9)play_the_debug_game()
-> x += 22
(Pdb) l
4 >>> play_the_debug_game(10, 200)
5 142
6 """
7 x *= 2
8 breakpoint()
9 -> x += 22
10 y *= 2
11 y -= 300
12 return x + y
[EOF]
(Pdb) x
20
(Pdb) n
> .../play_debug.py(10)play_the_debug_game()
-> y *= 2
(Pdb) x
42
(Pdb) c
So, this is a standard use of the pdb
debugger. It is not really doctest
debugging, but debugging of the code that is run via doctest
. I use this approach often during code development, especially when the code is not yet enclosed in a working application.
Ad 2. Debug a doctest session. Unlike the previous method, this one means debugging the actual testing session. You do it in a similar way, but this time, you add a breakpoint in a test, not in a function. For instance:
def play_the_debug_game(x, y):
    """Let's play and debug.

    >>> game_1 = play_the_debug_game(10, 200)
    >>> game_2 = play_the_debug_game(1, 2)
    >>> breakpoint()
    >>> game_1, game_2
    (142, -272)
    """
    x *= 2
    x += 22
    y *= 2
    y -= 300
    return x + y
And now a session:
$ python -m doctest play_debug.py
--Return--
> <doctest play_debug.play_the_debug_game[2]>(1)<module>()->None
-> breakpoint()
(Pdb) game_1
142
(Pdb) game_2
-272
(Pdb) game_1 + game_2
-130
(Pdb) c
This type of debugging can be useful whenever you want to check what’s going on in the testing session, for instance, when a test behaves in an unexpected way.
In the previous sections, I showed simple examples that use the basic doctest
tool. As already mentioned, I seldom use more advanced tools. This is because doctest
is designed in a way that indeed its basic tools enable one to achieve a lot. Maybe this is why I like this testing framework that much.
As an example of the advanced use of doctest, I will use two files (one presenting documentation and another presenting unit tests) from one of my Python packages, perftester. If you want to see more examples, you can go to the package’s GitHub repository. As you can read in the tests/ folder’s README, I treated perftester as an experiment:
The use of doctesting as the only testing framework in perftester is an experiment. Tests in perftester are abundant, and are collected in four locations: the main README, docstrings in the main perftester module, the docs folder, and this folder.
Note that the main README of the package is full of doctests, but that’s just the beginning. To see more, visit the docs/ folder, which is full of Markdown files, each being a doctesting file, too. While these are testable documentation files, the files in the tests/ folder constitute actual test files. In other words, I used doctest as the only testing framework, for documentation, unit and integration testing. In my eyes, the experiment was successful, and the final conclusion was that when you do not need complex unit and integration tests in your project, you can use doctest as the only testing framework.
That I did not use pytest in this project does not mean I decided to stop using pytest altogether. My regular approach is to combine doctest and pytest. I use doctest for test-based development, but once a function, class or method is ready, I move some of the tests to pytest files, keeping only some of the tests as doctests. I usually write READMEs with doctests, and if I need to add a doc file, I often make it a Markdown file with doctest tests.
Let me show you a fragment (about half) of one of the documentation files from the perftester
’s docs folder, docs/most_basic_use_time.md
. In the code below, I wrapped two lines, to shorten them.
# Basic use of `perftester.time_test()`
```python
>>> import perftester as pt
>>> def preprocess(string):
... return string.lower().strip()
>>> test_string = " Oh oh the young boy, this YELLOW one, wants to sing a song about the sun.\n"
>>> preprocess(test_string)[:19]
'oh oh the young boy'
```
## First step - checking performance
We will first benchmark the function, to learn how it performs:
```python
>>> first_run = pt.time_benchmark(preprocess, string=test_string)
```
`first_run` gives the following results:
```python
# pt.pp(first_run)
{'raw_times': [2.476e-07, 2.402e-07, 2.414e-07, 2.633e-07, 3.396e-07],
'raw_times_relative': [3.325, 3.226, 3.242, 3.536, 4.56],
'max': 3.396e-07,
'mean': 2.664e-07,
'min': 2.402e-07,
'min_relative': 3.226}
```
Fine, no need to change the settings, as the raw times are rather short,
and the relative time ranges from 3.3 to 4.6.
# Raw time testing
We can define a simple time-performance test, using raw values, as follows:
```python
>>> pt.time_test(preprocess, raw_limit=2e-06, string=test_string)
```
As is with the `assert` statement, no output means that the test has passed.
If you overdo with the limit so that the test fails, you will see the following:
```python
>>> pt.time_test(preprocess, raw_limit=2e-08, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
raw_limit = 2e-08
minimum run time = ...
```
# Relative time testing
Alternatively, we can use relative time testing, which will be more or
less independent of a machine on which it's run:
```python
>>> pt.time_test(preprocess, relative_limit=10, string=test_string)
```
If you overdo with the limit so that the test fails, you will see the following:
```python
>>> pt.time_test(preprocess, relative_limit=1, string=test_string) #doctest: +ELLIPSIS
Traceback (most recent call last):
...
perftester.perftester.TimeTestError: Time test not passed for function preprocess:
relative_limit = 1
minimum time ratio = ...
```
Note the following:
- The only directive I used is the ellipsis directive. I did not need anything more advanced.
- I wanted to present the output of the command # pt.pp(first_run). Since it contained a lot of random values (which represented benchmarking results), I did not make this code block a doctest test. I simply presented the results.
Below, I present one section (Defaults) of a test file tests/doctest_config.md
. Unlike the above doc file, this one includes unit tests, not documentation. Hence, you will not see too much text in the whole file — and no text in this section. These are unit tests and nothing more:
## Defaults
```python
>>> import perftester as pt
>>> pt.config.defaults
{'time': {'number': 100000, 'repeat': 5}, 'memory': {'repeat': 1}}
>>> original_defaults = pt.config.defaults
>>> pt.config.set_defaults("time", number=100)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 5}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=20)
>>> pt.config.defaults
{'time': {'number': 100, 'repeat': 20}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("time", repeat=2, number=7)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 1}}
>>> pt.config.set_defaults("memory", repeat=100)
>>> pt.config.defaults
{'time': {'number': 7, 'repeat': 2}, 'memory': {'repeat': 100}}
>>> pt.config.set_defaults("memory", number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.set_defaults("memory", number=100, repeat=5)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.set_defaults("memory", repeat=5, number=100)
Traceback (most recent call last):
...
perftester.perftester.IncorrectArgumentError: For memory tests, you can only set repeat, not number.
>>> pt.config.defaults = original_defaults
```
Note that in this fragment, the only tool I used other than plain output is the ellipsis, which stands for the tracebacks of the exceptions raised by some of the commands.
Maybe you think that I did not show you too much of doctest
… And you’re right. But what you learned will be enough for you to write even quite advanced tests; and in my opinion, that’s most of what you need to know about doctest
ing.
This does not mean it’s the end of our doctest journey. We will return to it in the next articles, as doctesting offers more advanced tools, which, perhaps, you will want to use. We will then learn, for instance, how to use doctest in Test-Driven Development (TDD) and how to use it effectively in PoC projects.
¹ In the case of doctest
, basic does not mean for beginners. Although I am using this package almost daily, I seldom use more advanced tools than those I presented in this article.