Python: Type Aliases, Type Variables, New Types

By Jessie Hobb On Apr 26, 2023

PYTHON PROGRAMMING

See type aliases, type variables and new types in action

Python offers type hints. The choice is still yours. Photo by William Felker on Unsplash

As I wrote in an earlier article, if you want to use type hints in Python, do it the right way.

What is the right way? Simply put, one that makes your code readable and correct from the point of view of static type checkers. So, two things: readable and correct.

Among the things I mentioned in that article was that creating type aliases is a great way of increasing readability. We will start our discussion with them, focusing on when they can actually help. Then, we move on to using type variables (typing.TypeVar) and new types (typing.NewType), which will help us achieve what we wouldn’t be able to achieve using regular type aliases.

I will use Python 3.11 and mypy version 1.2.0.

Put simply, the point of using type hints is twofold:

1. to let the user know, in a relatively simple way, what type an argument should have (should, as we’re still talking about type hints), and
2. to make static checkers happy.

Making static checkers happy should make us happy, too: An unhappy type checker usually means errors, or at the very least some inconsistencies.

For some users, point 2 is the only one worth mentioning — since static checking is the only reason they use type hints. It helps them avoid mistakes.

Sure, that’s great — but that’s not all. Type hints can help us do more than that. And note that if our only aim is to satisfy static checkers, type aliases would have no use, as they don’t help static checkers at all. They help the user.

For me, both points are equally important. These days, when reading a function, I pay close attention to its annotations. When written well, they help me understand the function. When written poorly — not to mention incorrectly — they make the function less readable than had it been defined without any annotations whatsoever.

Let’s start off with type aliases. I will show you their two main use cases. Then, we will see that type aliases can help in rather simple situations, but sometimes we need something more. In our case, type variables and new types will come to the rescue.

Type aliases offer a simple yet powerful tool to make type hints clearer. I will reuse here a nice and convincing example from Python documentation on type aliases:

from collections.abc import Sequence

ConnectionOptions = dict[str, str]
Address = tuple[str, int]
Server = tuple[Address, ConnectionOptions]

def broadcast_message(
    message: str,
    servers: Sequence[Server]
) -> None:
    ...

As the documentation says, the above type signature for servers is exactly equivalent to the one used below:

def broadcast_message(
    message: str,
    servers: Sequence[tuple[tuple[str, int], dict[str, str]]]
) -> None:
    ...

As you see, the equivalence is not complete: while the two signatures are indeed equivalent in terms of code, they do differ in readability. The point here lies in this type signature:

servers: Sequence[tuple[tuple[str, int], dict[str, str]]]

Although it’s difficult to read and understand, after redefining to Sequence[Server] using several type aliases, it has become much clearer. What helps is the information conveyed by the type aliases used in the signature. Good naming can do wonders.
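To see that the two signatures really are equivalent in terms of code, here is a minimal sketch that compares the resolved annotations of both versions. The function names broadcast_aliased() and broadcast_raw() are hypothetical, introduced only so that both definitions can live side by side; the sketch assumes Python 3.9 or newer.

```python
from collections.abc import Sequence
from typing import get_type_hints

ConnectionOptions = dict[str, str]
Address = tuple[str, int]
Server = tuple[Address, ConnectionOptions]


def broadcast_aliased(message: str, servers: Sequence[Server]) -> None:
    ...


def broadcast_raw(
    message: str,
    servers: Sequence[tuple[tuple[str, int], dict[str, str]]],
) -> None:
    ...


# An alias is just another name for the same type expression,
# so both functions carry equal annotations.
same_hints = get_type_hints(broadcast_aliased) == get_type_hints(broadcast_raw)
print(same_hints)
```

The aliases change nothing for Python or for a static checker; the only difference is what the reader sees.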

Note that we could make this type signature a little different, by adding one more type alias:

Servers = Sequence[Server]

servers: Servers

but to me, Sequence[Server] is much better than Servers, as I immediately see that I am dealing with an object that implements the Sequence protocol. It can be a list, for instance. Besides, we already have the argument’s name servers, so creating a type alias Servers seems redundant.

Surely, understanding this very signature to its last detail using these type aliases is not simple:

ConnectionOptions = dict[str, str]
Address = tuple[str, int]
Server = tuple[Address, ConnectionOptions]
servers: Sequence[Server]

but thanks to the type aliases ConnectionOptions, Address, and Server and their clear meaning, it’s much simpler than understanding this signature:

servers: Sequence[tuple[tuple[str, int], dict[str, str]]]

Simply put, with such complex types, raw type signatures will make static checkers happy, but they are unlikely to make the users’ lives easier. Type aliases can help achieve that: they communicate additional information about a variable, function, class or method to the user. They act as a communication tool.

Type aliases as a communication tool: Further considerations

Okay, let’s jump into another example. This time, we will try to utilize type aliases in order to improve our communication with the user in a simpler situation than before.

As we saw, the most important such communication tool is good naming. A good function, class or method name should clearly indicate its responsibility. When you see a name of calculate_distance(), you know that this function will, well, calculate distance; you would be surprised to see that such a function returns a tuple of two strings. When you see a City class, you know the class will somehow represent a city — not an animal, a car or a beaver.

Annotations can help convey even more information than the function’s (class’s, method’s) and its arguments’ names. In other words, we want type hints not only to hint what types should be used, but also to help the user understand our functions and classes — and to help them provide correct values. As already mentioned, this can be done thanks to well-named type aliases.

Let’s start off with a simple example, this time using a variable type hint. Imagine we have something like this:

length = 73.5

Sure, we know the variable represents a length of something. But that’s all we know. First, of what? A better name could help:

length_of_parcel = 73.5

Clear now. Imagine you’re a delivery guy and you’re to decide if the parcel will fit into your car. Well, will it?

If one has made one’s decision based on the above knowledge, one is either an I-will-handle-any-parcel kind of guy or a better-not-to-risk one. In neither case was this an informed decision. We miss the units, don’t we?

length_of_parcel = 73.5 # in cm

Better! But this is still just a comment, and it would be better if the code itself provided this information. It does not above, but it does here:

Cm = float
length_of_parcel: Cm = 73.5

We have used a type alias again. But remember, this is just a type alias, and for Python, length_of_parcel is still just a float, nothing else. It means far more to us, though — that the parcel is 73.5 cm long.

Let’s move on to a more complicated situation, that is, from variable annotation to function annotation. Imagine we want to implement a function that calculates the circumference of a rectangle. Let’s start with no annotation:

def get_rectangle_circumference(x, y):
    return 2*x + 2*y

Simple. Pythonic¹. Correct.

We’re already familiar with the problem: without annotations, the user does not know what sort of data the function expects. Centimeters? Inches? Meters? Kilometers? In fact, the function will work with strings:

>>> get_rectangle_circumference("a", "X")
'aaXX'

Khm. Surely, this works — but makes no sense. Do we want the user to be able to use our function for stuff like that? Do we want the user to say:

Hey, their function told me that when I create a rectangle with the side lengths of "a" and "X", this rectangle has the circumference of "aaXX", haha!

Nah, better not. Sure, the function’s name does say what the function does, but it would help to let the user know what sort of data the function expects. Then we could respond:

Hey, can’t you read? Don’t you see the function expects floating-point numbers? Or maybe you think a string is a floating-point number, haha?

I think it’s always better to avoid such haha-discussions. So, it’s a big yes to type hints. Let’s go on.

Okay, so we have a rectangle, it has four sides, and x and y are their lengths. It doesn’t matter which unit the user provides, as the function works for any length unit; it can be centimeters, inches, kilometers, anything that is a length unit. What does matter, however, and in fact makes all the difference, is that both x and y be provided in the same units. Otherwise, the function will not work correctly. This is fine:

>>> x = 10                  # in cm
>>> y = 200                 # in cm
>>> get_rectangle_circumference(x, y) # in cm
420

But this is not:

>>> x = 10                  # in cm
>>> y = 2                   # in m
>>> get_rectangle_circumference(x, y) # incorrect!
24

The problem is that even though this call makes no sense and we know it, it is correct from a Python perspective — both

dynamically: we will get 24; and
statically: x and y are both numbers, which a static checker accepts wherever a float is expected.

The problem is, we did not let the user — and Python, for that matter — know that the two arguments, x and y, should be in the same units, just that they should use floating-point numbers. For Python, a float is a float, and it does not distinguish kilometers from inches, not to mention kilograms.

Let’s check out if we can use type hinting to do something with this. In other words: Can we use type hints to let the user know that they should use the same type for both arguments, and that the return value would be of this type, too?

The simplest annotation would be one using floating-point numbers:

def get_rectangle_circumference(
    x: float,
    y: float) -> float:
    return 2*x + 2*y

This function signature is a little better than that without annotations, as at least the user is informed they should use floats. But again, inches? Centimeters? Meters? And actually, why not kilograms?
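Keep in mind that these annotations are hints only. As a small illustration, the sketch below calls the annotated function with the same strings as before: a static checker should flag the call, but Python itself still runs it without complaint.

```python
def get_rectangle_circumference(x: float, y: float) -> float:
    return 2*x + 2*y


# Annotations are not enforced at runtime, so this call still "works".
# A static checker is expected to report it, since str is not float.
result = get_rectangle_circumference("a", "X")
print(result)  # prints aaXX
```

So the annotated version informs the user and the static checker, but it does not guard the function at runtime.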

So, let’s try a type alias:

Cm = float

def get_rectangle_circumference(x: Cm, y: Cm) -> Cm:
    return 2*x + 2*y

Clear, isn’t it? mypy will clap:

So will Pylance. The user knows that they should provide centimeters, and that the function will return the circumference in centimeters. Cm is a type alias, which basically means it’s still float, and there is no difference between Cm and float. But the point is, the user knows.

Static checkers, however, will not be too helpful in this case. You can provide an additional type alias of float, and it will be treated just the same as Cm and as any float:

Cm = float
M = float

def get_rectangle_circumference(x: Cm, y: Cm) -> Cm:
    return 2*x + 2*y

x: Cm = 10
y: M = 10
get_rectangle_circumference(x, y)

The type checker is fully okay with this, as both Cm and M are just aliases of the same type, that is, float. Basically, for static checkers Cm is equivalent not only to float, but also to M. Thus, if you want to use type aliases in such instances, you have to remember that they are merely… aliases — and nothing more!
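A quick runtime check makes this tangible. The sketch below only inspects the objects involved; it shows that both aliases are literally the same object as float, which is why nothing can tell them apart.

```python
Cm = float
M = float


def get_rectangle_circumference(x: Cm, y: Cm) -> Cm:
    return 2*x + 2*y


# Both aliases are the very same object as float.
print(Cm is float, M is float, Cm is M)

# The stored annotations mention float only; the alias names are gone.
print(get_rectangle_circumference.__annotations__)
```

Once the module is loaded, the names Cm and M leave no trace in the function’s annotations.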

I am sure you’ve noticed another big disadvantage of the above signature that used the Cm type alias. Why should the user provide x and y in centimeters when they have them in inches, or any other unit? Convert? And then what, convert back? That would be i-n-s-a-n-e!

Well… Maybe we could create a distance-related (or length-related) float alias?

DistanceUnit = float

def get_rectangle_circumference(
    x: DistanceUnit,
    y: DistanceUnit
) -> DistanceUnit:
    return 2*x + 2*y

mypy will clap again, as the only thing we’ve changed is the name. But this did not change anything else: The user still can make the very same error of providing values in different units, both of which will be DistanceUnits, like centimeters and inches. At least the user knows they should not provide kilograms.

As you see, type aliases will not help us solve this problem. On the one hand, I think we can assume that anyone using Python should know that when calculating the circumference of a rectangle, one should provide the lengths of the sides in the same units. This is not Python knowledge. This is simple maths.

However, in some other scenarios you might want to make things clear, as things are not always as clear as with calculating a rectangle’s circumference. We know type aliases will not help, so let’s move on to two other tools from typing: type variables (TypeVar) and new types (NewType). Will they help?

If you really want to implement such detailed type hinting, you can. Beware, however, that this will make the code more complex. To this end, typing.NewType and typing.TypeVar can help.

Let’s start with NewType. This is a typing tool to create new types with minimal runtime overhead (see Appendix 1). Types created that way offer minimal functionality, so you should prefer them when you need nothing more than clear type hints and a way to convert a value to and from this type. Its advantage is that it works with static checkers (as we will see in a moment). Its disadvantage, in my opinion quite a big one, is that a type created using typing.NewType is not treated as a type by isinstance (at least in Python 3.11.2; I hope this will change in future versions):

Screenshot from Python 3.11.2: typing.NewType types are not considered types by isinstance(). Image by author.

For me, this is a serious issue. But as you will see, typing.NewType types can still be very useful, with small overhead (as shown in Appendix 1).
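The behavior described above can be reproduced with a few lines. This is a minimal sketch; the fallback check against Cm.__supertype__ is only one possible workaround, not something the article prescribes.

```python
from typing import NewType

Cm = NewType("Cm", float)
length = Cm(73.5)

# At runtime the value is a plain float; Cm leaves no trace on it.
print(type(length) is float)

try:
    isinstance(length, Cm)
except TypeError as error:
    isinstance_failed = True
    print(f"isinstance() rejected Cm: {error}")
else:
    isinstance_failed = False

# One possible workaround: check against the underlying type instead.
print(isinstance(length, Cm.__supertype__))
```

Note that the workaround can only tell you the value is a float; it cannot distinguish centimeters from any other float-based new type.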

So, we want to create types representing our distance-related units. The problem is, we will have to create as many types as there are units we want to take into account. For simplicity, let’s limit them to a few of the most important length units based on the International System of Units (SI). This is how you would proceed when working on your own project, in which the number of types is limited. When working on a framework to be used by others, however, you should create more types.

Four types will do in our case:

from typing import NewType

Mm = NewType("Mm", float)
Cm = NewType("Cm", float)
M = NewType("M", float)
Km = NewType("Km", float)

NewType creates subtypes — so, Mm, Cm, M and Km are all subtypes of float. They can be used anywhere where float can be used, but a static checker will not accept a regular float value where any of these four subtypes are to be used. You will need to convert such a float value to the type of choice; for example, you could do distance = Km(30.24), meaning that the distance in question is equal to 30 km and 240 m.

Let’s see the types used to annotate this simple function:

def km_to_mm(x: Km) -> Mm:
    return x * 1_000_000

Pylance screams:

Pylance error: Expression of type “float” cannot be assigned to return type “Mm”. Screenshot from Pylance in VSCode. Image by author.

This is because x * 1_000_000 gives a float while we indicated that the function returns a value of the Mm type. To fix this, we need to convert the returned value to the expected type:

def km_to_mm(x: Km) -> Mm:
    return Mm(x * 1_000_000)

As you see, types created using typing.NewType can be called in order to convert a value to their type (before Python 3.10, NewType was a function that returned a function; since 3.10, NewType is a class and the types it creates are callable instances of it). This is very convenient in situations like this one.
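What such a call does at runtime is worth seeing once. In the minimal sketch below (the variable names are just for illustration), the “conversion” turns out to hand back the very same object, which is why the overhead is so small.

```python
from typing import NewType

Mm = NewType("Mm", float)

raw = 12.5
converted = Mm(raw)

# A NewType type can be called like a function...
print(callable(Mm))

# ...but the call simply returns its argument unchanged.
print(converted is raw)
print(type(converted))
```

In other words, the conversion exists for the static checker; at runtime nothing is created or copied.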

But how will this help us with our get_rectangle_circumference() function? We still have four different subtypes of float and we want to make the function return the very type that its x and y arguments have.

It’s time to introduce a new typing tool, type variables, or typing.TypeVar. As it turns out, a type variable can help us achieve what we need:

from typing import NewType, TypeVar

Mm = NewType("Mm", float)
Cm = NewType("Cm", float)
M = NewType("M", float)
Km = NewType("Km", float)

DistanceUnit = TypeVar("DistanceUnit", Mm, Cm, M, Km)

def get_rectangle_circumference(
    x: DistanceUnit,
    y: DistanceUnit) -> DistanceUnit:
    t = type(x)
    return t(2*x + 2*y)

Unlike before, when we used type aliases, this time you cannot mix up different types. Let’s see how a static type checker, Pylance, treats three different calls of this function:

Floats will not work:

(1) Floats don’t work. Image by author.

You can’t mix up different types:

(2) Two different types don’t work. Image by author.

The only way for the function to pass a static check is to use the same type for both lengths:

(3) Only the same type for both arguments works. Image by author.
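The three cases can be written out as code. This is a sketch: the concrete values are hypothetical, and the comments about the static checker restate what the captions above report rather than quote actual checker output. At runtime, all three calls execute, since nothing is enforced.

```python
from typing import NewType, TypeVar

Mm = NewType("Mm", float)
Cm = NewType("Cm", float)
M = NewType("M", float)
Km = NewType("Km", float)

DistanceUnit = TypeVar("DistanceUnit", Mm, Cm, M, Km)


def get_rectangle_circumference(
    x: DistanceUnit,
    y: DistanceUnit) -> DistanceUnit:
    t = type(x)
    return t(2*x + 2*y)


# (1) Plain floats: expected to be rejected by the static checker.
from_floats = get_rectangle_circumference(10.0, 200.0)

# (2) Two different units: expected to be rejected by the static checker.
mixed = get_rectangle_circumference(Cm(10.0), M(2.0))

# (3) The same unit for both arguments: expected to pass.
same = get_rectangle_circumference(Cm(10.0), Cm(200.0))

print(from_floats, mixed, same)
```

Only the static check separates the correct call from the two incorrect ones, so this approach pays off only when a checker is actually part of your workflow.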

Of course, the type of the return value will match the type of the two arguments — so, for instance, when you provide meters, you will get meters. This is why we needed the t = type(x) line. We could make the function a little shorter:

A shorter version of the function, without assigning type(x) to t. Image by author.
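Going by the caption, the shorter version presumably looks like the sketch below: the same function, with type(x) called directly instead of being stored in t first. This is a reconstruction, not a copy of the original screenshot.

```python
from typing import NewType, TypeVar

Mm = NewType("Mm", float)
Cm = NewType("Cm", float)
M = NewType("M", float)
Km = NewType("Km", float)

DistanceUnit = TypeVar("DistanceUnit", Mm, Cm, M, Km)


def get_rectangle_circumference(
    x: DistanceUnit,
    y: DistanceUnit) -> DistanceUnit:
    # type(x) is evaluated and called in one expression.
    return type(x)(2*x + 2*y)


circumference = get_rectangle_circumference(M(1.5), M(2.5))
print(circumference)  # prints 8.0
```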

For intermediate and advanced Pythonistas, both versions will likely be equally readable; for a beginner, however, the former may be easier to understand.

Note that a DistanceUnit type alias would not work the same way:

A type alias for DistanceUnit does not work as required. Image by author.

Here, you can mix up different types in a call to get_rectangle_circumference(), something we wanted to avoid; something a type variable helped us achieve.

And so, here we are, we got what we wanted. Although the task did not seem overly complex, type aliases were not enough to achieve our aim. Nevertheless, typing’s type variables (TypeVar) and new types (NewType) came to the rescue.

Type hints are not required in Python; they are optional. Sometimes it’s better to omit them altogether. When you’re forced to use them, however, you should use them wisely: let them be a help to you and your code users, not a hindrance.

I hope you’re now ready to use typing’s type aliases, type variables and new types in your own projects, at least in similar, rather simple, scenarios. When doing so, do remember not to overuse these tools. To be honest, I seldom decide to use type variables and new types. So, before deciding to open these doors, think twice. Your code will definitely be much more complicated, so you must have a good reason for doing this.

We’ve covered the basic ideas of using type aliases, type variables and new types in the Python type-hinting system. There’s much more to the topic, as Python’s static checking system is still developing, but going further comes at the cost of much greater complexity. Let’s leave it as is for today; we’ll return to the topic some other day, when we’re ready to focus on more advanced aspects of Python type hinting.

¹ If someone wants to scream at me that this is not Pythonic because the function is not annotated, let me remind this person that type hints are optional in Python. If something is optional, it cannot be a decisive factor behind a claim that code is or is not Pythonic.

Appendix 1

The time overhead of typing.NewType is visibly smaller than that of, for example, a float-based custom class. The simple snippet below uses perftester to answer two questions:

Is creating a new type faster using typing.NewType or a custom class?
Which of the two kinds of types is quicker to use (specifically, to convert a float value to this type)?

import perftester
from typing import NewType

def typing_type_create():
    TypingFloat = NewType("TypingFloat", float)

def class_type_create():
    class ClassFloat(float): ...

TypingFloat = NewType("TypingFloat", float)
class ClassFloat(float): ...

def typing_type_use(x):
    return TypingFloat(x)

def class_type_use(x):
    return ClassFloat(x)

if __name__ == "__main__":
    perftester.config.set_defaults("time", Number=1_000_000)

    t_typing_create = perftester.time_benchmark(typing_type_create)
    t_class_create = perftester.time_benchmark(class_type_create)
    t_typing_use = perftester.time_benchmark(
        typing_type_use, x=10.0034
    )
    t_class_use = perftester.time_benchmark(
        class_type_use, x=10.0034
    )

    perftester.pp(dict(
        create=dict(typing=t_typing_create["min"],
                    class_=t_class_create["min"]),
        use=dict(typing=t_typing_use["min"],
                 class_=t_class_use["min"]),
    ))

Here are the results I got on my machine:

Benchmark results: the typing-based approach is faster. Image by author.

Clearly, typing.NewType creates a new type significantly — an order of magnitude — faster than a custom class does. They are, however, more or less similarly fast in creating a new class instance.

The above benchmark code is simple and shows that perftester offers a very simple API. If you want to learn more about it, I cover it in a separate article.

You can of course use the timeit module for such benchmarks:
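A minimal sketch of such a timeit-based benchmark could look as follows. It is not the code behind the results reported above: the best_time() helper, the NUMBER and REPEAT constants, and the much lower repetition count are assumptions made here so that the script finishes quickly.

```python
import timeit
from typing import NewType

NUMBER = 10_000  # assumption: far fewer runs than in the perftester benchmark
REPEAT = 3       # assumption: number of repeated measurements


def typing_type_create():
    return NewType("TypingFloat", float)


def class_type_create():
    class ClassFloat(float): ...
    return ClassFloat


TypingFloat = NewType("TypingFloat", float)


class ClassFloat(float): ...


def typing_type_use(x):
    return TypingFloat(x)


def class_type_use(x):
    return ClassFloat(x)


def best_time(func, *args):
    """Return the best mean time (in seconds) of a single call."""
    timings = timeit.repeat(lambda: func(*args), number=NUMBER, repeat=REPEAT)
    return min(timings) / NUMBER


results = {
    "create": {
        "typing": best_time(typing_type_create),
        "class_": best_time(class_type_create),
    },
    "use": {
        "typing": best_time(typing_type_use, 10.0034),
        "class_": best_time(class_type_use, 10.0034),
    },
}
print(results)
```

The lambda wrapper adds a small constant cost to every measurement, so the numbers are best read as a comparison between the two approaches rather than as absolute timings.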