Add some new posts about typing & update the old one

ItsDrike 2024-09-05 08:25:38 +02:00
parent 26a1bf83ff
commit 3c8e6b8d65
Signed by: ItsDrike
GPG key ID: FA2745890B7048C0
4 changed files with 1266 additions and 670 deletions

@ -0,0 +1,549 @@
---
title: Typing generics and variance
date: 2021-10-04
tags: [programming, python, typing]
aliases:
- /posts/typing-variance-of-generics
changelog:
2024-09-05:
- Complete overhaul of the article, rewrite most things
- Focus more on explaining generics as a concept too, not just their variance
- Rename the article from 'Variance of typing generics (covariance, contravariance and invariance)' to 'Typing generics and variance'
- Move section about type vars into its own [article]({{< ref "posts/type-vars" >}})
---
Generics and variance are advanced concepts in Python's type system that offer powerful ways to write code that is
flexible and reusable across different types. By understanding these concepts, you can write more robust, maintainable
and less repetitive code, making the most of Python's type hinting capabilities.
Even if you don't work in Python, the basic concepts of generics and especially their variance carry over to all kinds
of programming languages, making them useful to understand no matter what you're coding in.
_**Pre-requisites**: This article assumes that you already have a [basic knowledge of python typing]({{< ref
"posts/python-type-checking"
>}}) and [type-vars]({{< ref "posts/type-vars" >}})._
## What are Generics
Generics allow you to define functions and classes that operate on types in a flexible yet type-safe manner. They are a
way to specify that a function or class works with multiple types, without being restricted to a single type.
### Basic generic classes
Essentially, when a class is generic, it declares that something inside of it has a type that is chosen by whoever
uses the class. A good example is a list of integers: `list[int]` (or in older python versions: `typing.List[int]`).
Here, we've specified that our list will be holding elements of the `int` type.
Generics like this can be used for many things, for example with a dict, we actually provide 2 types, first is the type
of the keys and second is the type of the values: `dict[str, int]` would be a dict with `str` keys and `int` values.
Here's a list of some definable generic types that are currently present in python 3.12:
{{< table >}}
| Type | Description |
|-------------------|-----------------------------------------------------|
| list[str] | List of `str` objects |
| tuple[int, int] | Tuple of two `int` objects (immutable) |
| tuple[int, ...] | Tuple of arbitrary number of `int` (immutable) |
| dict[str, int] | Dictionary with `str` keys and `int` values |
| Iterable[int] | Iterable object containing ints |
| Sequence[bool] | Sequence of booleans (immutable) |
| Mapping[str, int] | Mapping from `str` keys to `int` values (immutable) |
{{< /table >}}
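As a small illustration of how these annotations read in practice, here is a sketch that uses a few of them. The
function and variable names (`total_score`, `first_name`, `scores`, ...) are made up for this example.

```python
from collections.abc import Mapping, Sequence


def total_score(scores: Mapping[str, int]) -> int:
    # Mapping[str, int] accepts any read-only mapping, including a plain dict
    return sum(scores.values())


def first_name(names: Sequence[str]) -> str:
    # Sequence[str] accepts both lists and tuples of strings
    return names[0]


scores: dict[str, int] = {"alice": 3, "bob": 5}
point: tuple[int, int] = (2, 7)
readings: tuple[int, ...] = (1, 2, 3, 4)

print(total_score(scores))
print(first_name(["alice", "bob"]))
print(first_name(("carol", "dave")))
```

Annotating parameters with the abstract types (`Mapping`, `Sequence`) rather than `dict` or `list` keeps the functions
usable with any compatible container.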
### Custom generics
In python, we can even make up our own generics with the help of `typing.Generic`:
```python
from typing import TypeVar, Generic

T = TypeVar("T")


class Person: ...
class Student(Person): ...


# If we specify a type-hint for our building like Building[Student]
# it will mean that the `inhabitants` variable will be of type: `tuple[Student, ...]`
class Building(Generic[T]):
    def __init__(self, *inhabitants: T):
        self.inhabitants = inhabitants


people = [Person() for _ in range(10)]
my_building: Building[Person] = Building(*people)

students = [Student() for _ in range(10)]
my_dorm: Building[Student] = Building(*students)

# We now know that `my_building` will contain inhabitants of `Person` type,
# while `my_dorm` will only have `Student`(s) as its inhabitants.
```
I'll go deeper into creating our custom generics later, after we learn the differences between covariance,
contravariance and invariance. For now, this is just a very simple illustrative example.
## Variance
The concept of variance tells us about whether a generic of certain type can be assigned to a generic of another type.
So for example, variance tackles a question like: Can a value of type `Building[Student]` be assigned to a variable of
type `Building[Person]`? Let's see the different kinds of generic variances.
### Covariance
The first concept of generic variance is **covariance**, the definition of which looks like this:
> If a generic `G[T]` is covariant in `T` and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`. This means
> that every variable of `G[A]` type can be assigned as having the `G[B]` type.
In other words, if a generic is covariant, a value of type `G[A]` can be used wherever a `G[B]` is expected, as long
as `A` is a subtype of `B`. The subtype relationship between the type parameters carries over to the generic itself.
I know that this definition can sound really complicated, but it's actually not that hard. It just needs some code
examples.
#### Tuple
As an example, I'll use a `tuple`, which is an immutable sequence in python. If we have a tuple of `Car` type
(`tuple[Car, ...]`), `Car` being a subclass of `Vehicle`, can we assign this type to a tuple of Vehicles
(`tuple[Vehicle, ...]`)? The answer here is yes, so a tuple of cars is a subtype of tuple of vehicles.
This indicates that the generic type parameter for tuple is covariant.
Let's explore this further with some proper python code example:
```python
class Vehicle: ...
class Boat(Vehicle): ...
class Car(Vehicle): ...
my_vehicle = Vehicle()
my_boat = Boat()
my_car_1 = Car()
my_car_2 = Car()
vehicles: tuple[Vehicle, ...] = (my_vehicle, my_car_1, my_boat)
cars: tuple[Car, ...] = (my_car_1, my_car_2)
# This line assigns a variable with the type of 'tuple of cars' to a 'tuple of vehicles' type
# this makes sense because a tuple of vehicles can hold cars, cars are vehicles
x: tuple[Vehicle, ...] = cars
# This line however tries to assign a tuple of vehicles to a tuple of cars type.
# That however doesn't make sense because not all vehicles are cars, a tuple of
# vehicles can also contain other non-car vehicles, such as boats. These may lack
# some of the functionalities of cars, so a type checker would complain here
x: tuple[Car, ...] = vehicles
# In here, both of these assignments are valid because both cars and vehicles will
# implement all of the logic that a basic `object` class needs (everything in python
# falls under the object type). This means that this assignment is also valid
# for a generic that's covariant.
x: tuple[object, ...] = cars
x: tuple[object, ...] = vehicles
```
#### Return type
Another example of a covariant type would be the return value of a function. In python, the `collections.abc.Callable` type
(or `typing.Callable`) represents anything that supports being called, for example a function.
If specified like `Callable[[A, B], R]`, it denotes a function that takes in 2 parameters, first with type `A`, second
with type `B`, and returns a type `R` (`def func(x: A, y: B) -> R`).
In this case, the return type for our function is also covariant, because we can return a more specific type (subtype)
as a return type.
Consider the following:
```python
import random
from collections.abc import Callable


class Car: ...
class WolkswagenCar(Car): ...
class AudiCar(Car): ...


def get_car() -> Car:
    # The type of this function is Callable[[], Car]
    # yet we can return a more specific type (a subtype) of the Car type from this function
    r = random.randint(1, 2)
    if r == 1:
        return WolkswagenCar()
    return AudiCar()


def get_wolkswagen_car() -> WolkswagenCar:
    # The type of this function is Callable[[], WolkswagenCar]
    return WolkswagenCar()


# In the line below, we define a callable `x` which is expected to have a type of
# Callable[[], Car], meaning it's a function that returns a Car.
# Here, we don't mind that the actual function will be returning a more specific
# WolkswagenCar type, since that type is fully compatible with the less specific Car type.
x: Callable[[], Car] = get_wolkswagen_car

# However this wouldn't really make sense the other way around.
# We can't assign a function which returns any kind of Car to a variable which is expected to
# hold a function that's supposed to return a specific type of a car. This is because not
# every car is a WolkswagenCar, we may get an AudiCar from this function, and that may not
# support everything WolkswagenCar does.
x: Callable[[], WolkswagenCar] = get_car
```
All of this probably seemed fairly trivial. Covariance is very intuitive, and it's what you would assume a generic
parameter to be in most cases.
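Return-type covariance also shows up when overriding methods. The following is a minimal sketch with made-up `Garage`
classes (they aren't part of the examples above): a subclass narrows the return type of an overridden method, which
type checkers accept for exactly the reason described in this section.

```python
class Car: ...
class AudiCar(Car): ...


class Garage:
    def get_car(self) -> Car:
        return Car()


class AudiGarage(Garage):
    # Narrowing the return type in an override is fine, because return
    # types are covariant: an AudiCar is always a valid Car.
    def get_car(self) -> AudiCar:
        return AudiCar()


def take_car(garage: Garage) -> Car:
    # Callers only rely on getting *some* Car back
    return garage.get_car()


car = take_car(AudiGarage())
print(type(car).__name__)  # AudiCar
```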
### Contravariance
Another concept is known as **contravariance**. It is essentially a complete opposite of **covariance**.
> If a generic `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`. This
> means that every variable of `G[B]` type can be assigned as having the `G[A]` type.
In this case, this means that if we have a generic of some type, we can assign it to a generic type of some subtype
(e.g. `G[Car]` can be assigned to `G[AudiCar]`).
In all likelihood, this will feel very confusing, since it isn't at all obvious when a relation like this would make
sense. To answer this, let's look at the other portion of the `Callable` type, which contains the arguments to a
function.
```python
from collections.abc import Callable


# The attributes and methods on these classes are minimal placeholder stubs,
# they only exist so that the functions below have something to call.
class Car:
    def start_engine(self) -> None: ...
    def drive(self) -> None: ...


class WolkswagenCar(Car):
    wolkswagen_id: str = "placeholder-id"
    def login(self, wolkswagen_id: str) -> None: ...


class AudiCar(Car):
    license_plate_number: str = "placeholder-plate"
    def contact_audi(self, license_plate_number: str) -> None: ...


# The type of this function is Callable[[Car], None]
def drive_car(car: Car) -> None:
    car.start_engine()
    car.drive()
    print(f"Driving {car.__class__.__name__} car.")


# The type of this function is Callable[[WolkswagenCar], None]
def drive_wolkswagen_car(wolkswagen_car: WolkswagenCar) -> None:
    # We need to login to our wolkswagen account with the wolkswagen ID,
    # in order to be able to drive it.
    wolkswagen_car.login(wolkswagen_car.wolkswagen_id)
    drive_car(wolkswagen_car)


# The type of this function is Callable[[AudiCar], None]
def drive_audi_car(audi_car: AudiCar) -> None:
    # All audi cars need to report back with their license plate
    # to Audi servers before driving is enabled
    audi_car.contact_audi(audi_car.license_plate_number)
    drive_car(audi_car)


# In here, we try to assign a function that takes a wolkswagen car to a variable
# which is declared as a callable taking any car. However this is a problem,
# because now we can call x with any car, including an AudiCar, but x is assigned
# to a function that only works with wolkswagen cars!
#
# So, G[WolkswagenCar] is not a subtype of G[Car], that means this type parameter
# isn't covariant
x: Callable[[Car], None] = drive_wolkswagen_car

# On the other hand, in this example, we're assigning a function that can take any
# car to a variable that is defined as a callable that only takes wolkswagen cars
# as arguments. This is fine, because x only allows us to pass in wolkswagen cars,
# and it is set to a function which accepts any kind of car, including wolkswagen cars.
#
# This means that G[Car] is a subtype of G[WolkswagenCar], so this type parameter
# is actually contravariant
x: Callable[[WolkswagenCar], None] = drive_car
```
So from this it should be clear that the type parameters for the arguments portion of the `Callable` type aren't
covariant and you should have a basic idea what contravariance is.
To solidify this understanding a bit more, let's see contravariance again, in a slightly different scenario:
```python
import random
from collections.abc import Callable


class Book:
    def read(self) -> None: ...

class FantasyBook(Book): ...
class DramaBook(Book): ...


class Library:
    def add(self, book: Book) -> None: ...
    def remove(self, book: Book) -> None: ...
    def submit_rating(self, rating: int) -> None: ...


def remove_while_used(
    func: Callable[[Library, FantasyBook], None],
) -> Callable[[Library, FantasyBook], None]:
    """This decorator removes a fantasy book from the library while `func` is running."""
    def wrapper(library: Library, book: FantasyBook) -> None:
        library.remove(book)
        func(library, book)
        library.add(book)
    return wrapper


# The `remove_while_used` decorator expects a function of type
# Callable[[Library, FantasyBook], None], but we're applying it to `read_book`,
# which has a type of Callable[[Library, Book], None].
#
# Obviously, there's no problem with Library, it's the same type. As for the book
# argument, the decorator will only ever call `func` with a FantasyBook, while
# `read_book` is able to handle any kind of Book. A FantasyBook meets all of the
# necessary criteria for a general Book, so `read_book` works just fine with it.
#
# Since this assignment is valid, it means that Callable[[Library, Book], None]
# is a subtype of Callable[[Library, FantasyBook], None]. Stripping the unnecessary
# parts, it means that G[Book] is a subtype of G[FantasyBook], even though Book isn't
# a subtype of FantasyBook, but rather its supertype.
#
# The opposite direction wouldn't be safe: a decorator expecting
# Callable[[Library, Book], None] could call `func` with a DramaBook, which
# a function that only accepts FantasyBook can't be trusted to handle.
@remove_while_used
def read_book(library: Library, book: Book) -> None:
    book.read()
    my_rating = random.randint(1, 10)
    # Rate the library
    library.submit_rating(my_rating)
```
Hopefully, this made the concept of contravariance pretty clear. An interesting thing is that contravariance doesn't
really come up anywhere else other than in function arguments. Even though you may see generic types with contravariant
type parameters, they are only contravariant because those parameters are being used as function arguments in that
generic type internally.
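For something you can run directly, here is a small self-contained sketch of the same idea using a callback. The
`Animal` and `Dog` classes and the `describe_*` functions are hypothetical, made up only for this illustration.

```python
from collections.abc import Callable


class Animal:
    def __init__(self, name: str) -> None:
        self.name = name


class Dog(Animal):
    def bark(self) -> str:
        return f"{self.name} says woof"


def describe_animal(animal: Animal) -> str:
    # Only uses what every Animal has
    return f"An animal called {animal.name}"


def describe_dog(dog: Dog) -> str:
    # Relies on Dog-specific behaviour
    return dog.bark()


def describe_all_dogs(dogs: list[Dog], describe: Callable[[Dog], str]) -> list[str]:
    # `describe` is only ever called with Dog instances here
    return [describe(dog) for dog in dogs]


dogs = [Dog("Rex"), Dog("Fido")]

# Both calls type-check: Callable[[Animal], str] is a subtype of Callable[[Dog], str]
print(describe_all_dogs(dogs, describe_dog))
print(describe_all_dogs(dogs, describe_animal))
```

A callback that accepts the more general `Animal` is always safe to use where one accepting `Dog` is required, since
it never asks for anything a `Dog` can't provide.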
### Invariance
The last type of variance is called **invariance**, and by now you may have already figured out what it means. Simply,
a generic is invariant in type when it's neither covariant nor contravariant.
> If a generic `G[T]` is invariant in `T` and `A` is a subtype of `B`, then `G[A]` is neither a subtype nor a supertype
> of `G[B]`. This means that any variable of `G[A]` type can never be assigned as having the `G[B]` type, and
> vice-versa.
This means that a generic with an invariant type parameter is only assignable to itself. As soon as the type parameter
differs, regardless of whether the new type is a subtype or a supertype of the original, the two generics are no longer
assignable to each other.
What can be a bit surprising is that the `list` datatype is actually invariant in its element type. While an
immutable sequence such as a `tuple` is covariant in the type of its elements, this isn't the case for mutable
sequences. This may seem weird, but there is a good reason for it. Let's take a look:
```python
class Person:
    def eat(self) -> None: ...

class Adult(Person):
    def work(self) -> None: ...

class Child(Person):
    def study(self) -> None: ...


person1 = Person()
person2 = Person()
adult1 = Adult()
adult2 = Adult()
child1 = Child()
child2 = Child()

people: list[Person] = [person1, person2, adult2, child1]
adults: list[Adult] = [adult1, adult2]

# First, it is important to establish that list isn't contravariant. This is perhaps
# quite intuitive, but it is important nevertheless. In here, we try to assign a list
# of people to `x`, which has a type of list of children. This obviously can't work,
# because a list of people can include other types than just `Child`, and these types
# can lack some of the features that children have, meaning lists can't be contravariant.
x: list[Child] = people
```
Now that we've established that a list isn't contravariant in its element type, let's see why it would be a bad idea
to make it covariant (like tuples). Essentially, the main difference is that a tuple is immutable, while a list isn't.
This means that you can add new elements to lists and alter them, but you can't do that with tuples. If you wanted to
add a new element there, you'd have to make a new tuple with those elements, so you wouldn't be altering an existing
one. Why does that matter? Well, let's see this in an actual example:
```python
def append_adult(adults: list[Person]) -> None:
    new_adult = Adult()
    adults.append(new_adult)


child1 = Child()
child2 = Child()
children: list[Child] = [child1, child2]

# This is where the covariant assignment happens, we assign a list of children
# to a list of people, `Child` being a subtype of `Person`. Which would imply that
# list is covariant in the type of its elements. A type-checker should complain
# about this line, so let's see why allowing it is a bad idea.
people: list[Person] = children

# Since we know that `people` is a list of `Person` type elements, we can obviously
# pass it over to `append_adult` function, which takes a list of `Person` type elements.
# After we called this function, our list got altered. It now includes an adult, which
# should be fine assuming list really is covariant, since this is a list of people, and
# `Adult` type is a subtype of `Person`.
append_adult(people)

# Let's go back to the `children` list now, let's loop over the elements and do some stuff with them
for child in children:
    # This will work fine, all people can eat, that includes adults and children
    child.eat()

    # Only children can study, but that's not an issue, because we're working with
    # a list of children, right? Oh wait, but we appended an Adult into `people`, which
    # also mutated `children` (it's the same list) and Adults can't study, uh-oh ...
    child.study()  # AttributeError, 'Adult' object doesn't have a 'study' attribute
```
As we can see from this example, the reason lists can't be covariant is that it would allow us to mutate them and add
elements of incompatible types that break our original list.
That said, if we copied the list, re-typing it to a supertype wouldn't be an issue:
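```python
class Game: ...
class BoardGame(Game): ...
class SportGame(Game): ...

tic_tac_toe, chess, monopoly = BoardGame(), BoardGame(), BoardGame()
volleyball = SportGame()

board_games: list[BoardGame] = [tic_tac_toe, chess, monopoly]
# `list(...)` makes a copy that the type-checker can infer as `list[Game]`,
# whereas `board_games.copy()` would still be typed as `list[BoardGame]`.
games: list[Game] = list(board_games)
games.append(volleyball)
```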
This is why immutable sequences are covariant: they don't make it possible to edit the original. Instead, if a change
is desired, a new object must be made. This is why `tuple` or other `typing.Sequence` types can be covariant, but lists
and `typing.MutableSequence` types need to be invariant.
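A practical consequence is that a function which only reads from a collection can annotate its parameter as a
`Sequence` instead of a `list`, which lets callers pass in lists of subtypes. Below is a minimal sketch of this; the
`names_of` function and the `name` attribute are made up for the example.

```python
from collections.abc import Sequence


class Person:
    def __init__(self, name: str) -> None:
        self.name = name


class Child(Person): ...


def names_of(people: Sequence[Person]) -> list[str]:
    # Sequence has no append/insert, so this function can't sneak
    # a non-Child element into the caller's list
    return [person.name for person in people]


children: list[Child] = [Child("Ann"), Child("Ben")]

# Accepted: list[Child] is a Sequence[Child], and since Sequence is covariant,
# it is also a Sequence[Person]
print(names_of(children))
```

Had the parameter been typed as `list[Person]`, a type-checker would reject the call, for the reasons shown above.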
### Recap
- if G[T] is covariant in T, and A (wolkswagen car) is a subtype of B (car), then G[A] is a subtype of G[B]
- if G[T] is contravariant in T, and A is a subtype of B, then G[B] is a subtype of G[A]
- if G[T] is invariant in T, and A is a subtype of B, then G[A] and G[B] don't have any subtype relation
## Creating Generics
Now that we know what it means for a generic to have a covariant/contravariant/invariant type, we can explore how to
make use of this knowledge and actually create some generics with these concepts in mind.
### Making invariant generics
```python
from typing import TypeVar, Generic
from collections.abc import Iterable

# We don't need to specify covariant=False nor contravariant=False, these are the default
# values (meaning all type-vars are invariant by default), I specify these parameters
# explicitly just to showcase them.
T = TypeVar("T", covariant=False, contravariant=False)


class Student: ...
class EngineeringStudent(Student): ...
class ComputerEngineeringStudent(EngineeringStudent): ...


class University(Generic[T]):
    students: list[T]

    def __init__(self, students: Iterable[T]) -> None:
        self.students = [s for s in students]

    def add_student(self, student: T) -> None:
        self.students.append(student)


engineering_students = [EngineeringStudent() for _ in range(10)]

x: University[EngineeringStudent] = University(engineering_students)
y: University[Student] = x  # NOT VALID! University isn't covariant
z: University[ComputerEngineeringStudent] = x  # NOT VALID! University isn't contravariant
```
In this case, our University generic type is invariant in the student type, meaning that
if we have a `University[Student]` type and `University[EngineeringStudent]` type, neither
is a subtype of the other.
### Making covariant generics
In the example below, we create a covariant `TypeVar` called `T_co`, which we then use in our custom generic. This name
for the type-var follows a common convention for covariant type-vars, so it's a good idea to stick to it if you can.
```python
from collections.abc import Iterable, Sequence
from typing import Generic, TypeVar, overload

T_co = TypeVar("T_co", covariant=True)


class Matrix(Sequence[Sequence[T_co]], Generic[T_co]):
    _rows: tuple[tuple[T_co, ...], ...]

    def __init__(self, rows: Iterable[Iterable[T_co]]):
        self._rows = tuple(tuple(el for el in row) for row in rows)

    @overload
    def __getitem__(self, index: int) -> Sequence[T_co]: ...
    @overload
    def __getitem__(self, index: slice) -> Sequence[Sequence[T_co]]: ...
    def __getitem__(self, index: int | slice) -> Sequence[T_co] | Sequence[Sequence[T_co]]:
        # `__getitem__` only receives a single index, so `matrix[row_id]` gives
        # a whole row; use `matrix[row_id][col_id]` to get a single element.
        return self._rows[index]

    def __len__(self) -> int:
        return len(self._rows)


class X: ...
class Y(X): ...
class Z(Y): ...

x: Matrix[Y] = Matrix([[Y(), Z()], [Z(), Y()]])
y: Matrix[X] = x  # VALID. Matrix is covariant
z: Matrix[Z] = x  # INVALID! Matrix isn't contravariant
```
In this case, our Matrix generic type is covariant in the element type, meaning that we can assign `Matrix[Y]` type
to `Matrix[X]` type, with `Y` being a subtype of `X`.
This works because the type-var is only used in covariant generics, in this case, with a `tuple`. If we stored the
internal state in an invariant type, like a `list`, marking our type-var as covariant would be unsafe. Some
type-checkers can detect and warn you if you do this, but many won't, so be cautious.
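To make the risk concrete, here is a minimal sketch of a deliberately unsafe generic. `UnsafeBox`, `Animal`, `Dog` and
`Cat` are hypothetical names made up for this illustration. The box claims to be covariant while also exposing a
setter, which is the kind of misuse type-checkers such as mypy and pyright typically flag on the `set` method.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Animal: ...


class Dog(Animal):
    def bark(self) -> str:
        return "woof"


class Cat(Animal): ...


class UnsafeBox(Generic[T_co]):
    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item

    # A covariant type-var used as a parameter type is what makes this unsafe
    def set(self, item: T_co) -> None:
        self._item = item


dog_box: UnsafeBox[Dog] = UnsafeBox(Dog())
animal_box: UnsafeBox[Animal] = dog_box  # allowed, since UnsafeBox claims to be covariant
animal_box.set(Cat())  # looks fine for a box of animals

try:
    dog_box.get().bark()  # statically a Dog, but it's really a Cat now
except AttributeError as exc:
    print(f"Broken at runtime: {exc}")
```

This is the same failure mode as the `list` example from the invariance section, just reproduced with a custom generic.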
### Making contravariant generics
Similarly to the above, the contravariant type var we create here is following a well established naming convention,
being called `T_contra`.
```python
from typing import TypeVar, Generic
import pickle
import requests

T_contra = TypeVar("T_contra", contravariant=True)


class Sender(Generic[T_contra]):
    def __init__(self, url: str) -> None:
        self.url = url

    def send_request(self, val: T_contra) -> str:
        s = pickle.dumps(val)
        response = requests.post(self.url, data={"object": s})
        return response.text


class X: ...
class Y(X): ...
class Z(Y): ...

a: Sender[Y] = Sender("https://test.com")
b: Sender[Z] = a  # VALID, Sender is contravariant
c: Sender[X] = a  # INVALID, Sender isn't covariant
```
In this case, our `Sender` generic type is contravariant in its value type, meaning that
if we have a `Sender[Y]` type and a `Sender[Z]` type, we can assign the `Sender[Y]` type
to the `Sender[Z]` type, which makes `Sender[Y]` its subtype.
To see why the other direction is unsafe, imagine a sender of `Car` type. If we were able to assign it to a sender of
`Vehicle` type, it would suddenly allow other vehicles, such as airplanes, to be passed to the `send_request`
function, even though this function only expects a `Car` (or its subtypes).
On the other hand, if we had this generic and we tried to assign it to a sender of `AudiCar`, that's fine, because now
all arguments passed to `send_request` function will be required to be of the `AudiCar` type, but that's a subtype of a
general `Car` and implements everything this general car would, so the function doesn't mind.
This works because the type variable is only used in a contravariant position, in this case, as the type of a method
argument. This means that the logic of determining subtypes for callables will be the same for our `Sender` generic.
Once again, be cautious about marking a type-var as contravariant, and make sure to only do it when it really is safe.
If the type-var is also used in any covariant or invariant structure, it needs to be changed to an invariant type-var.
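Since the `Sender` example needs a network connection to run, here is a smaller offline sketch of the same pattern.
The `Inspector` class and the vehicle classes are hypothetical, made up for this illustration; the only thing that
matters is that `T_contra` appears solely as a method argument.

```python
from typing import Generic, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)


class Vehicle: ...
class Car(Vehicle): ...
class AudiCar(Car): ...


class Inspector(Generic[T_contra]):
    """Hypothetical consumer: it only takes T_contra values in, it never hands them out."""

    def __init__(self) -> None:
        self.reports: list[str] = []

    def inspect(self, item: T_contra) -> None:
        self.reports.append(f"inspected {type(item).__name__}")


car_inspector: Inspector[Car] = Inspector()

# Valid: something that can inspect any Car can certainly inspect AudiCars
audi_inspector: Inspector[AudiCar] = car_inspector
audi_inspector.inspect(AudiCar())

# A type-checker would reject the opposite direction:
# vehicle_inspector: Inspector[Vehicle] = car_inspector

print(car_inspector.reports)  # ['inspected AudiCar']
```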
## Conclusion
Understanding generics and variance in Python's type system opens the door to writing more flexible, reusable, and
type-safe code. By learning the differences between covariance, contravariance, and invariance, you can design better
abstractions and APIs that handle various types in a safe manner. Covariant types are useful when you want to ensure
that your type hierarchy flows upwards, whereas contravariant types allow you to express type hierarchies in reverse
for certain use cases like function arguments. Invariance, meanwhile, helps maintain strict type safety in mutable
structures like lists.
These principles of variance are not unique to Python — they are foundational concepts in many statically-typed
languages such as Java or C#. Understanding them will not only deepen your grasp of Python's type system but
also make it easier to work with other languages that implement similar type-checking mechanisms.

@ -0,0 +1,449 @@
---
title: A guide to type checking in python
date: 2024-10-04
tags: [programming, python, typing]
sources:
- https://dev.to/decorator_factory/type-hints-in-python-tutorial-3pel
- https://docs.basedpyright.com/#/type-concepts
- https://mypy.readthedocs.io/en/stable/
- https://typing.readthedocs.io/en/latest/spec/special-types.html
---
Python is often known for its dynamic typing, which can be a drawback for those who prefer static typing due to its
benefits in catching bugs early and enhancing editor support. However, what many people don't know is that Python does
actually support specifying the types and it is even possible to enforce these types and work in a statically
type-checked Python environment. This article is an introduction to using Python in this way.
## Regular python
In regular python, you might end up writing a function like this:
```python
def add(x, y):
    return x + y
```
In this code, you have no idea what the type of `x` and `y` arguments should be. So, even though you may have intended
for this function to only work with numbers (ints), it's actually entirely possible to use it with something else. For
example, running `add("hello", "world")` will return `"helloworld"` because the `+` operator works on strings too.
The point is, there's nothing telling you what the type of these parameters should be, and that could lead to
misunderstandings. Even though in some cases, you can judge what the type of these variables should be just based on
the name of that function, in most cases, it's not that easy to figure out and often requires looking through docs, or
just going over the code of that function.
Annoyingly, python doesn't even prevent you from passing in types that are definitely incorrect, like: `add(1, "hi")`.
Running this would cause a `TypeError`, but unless you have unit-tests that actually run that code, you won't find out
about this bug until it actually causes an issue and at that point, it might already be too late, since your code has
crashed a production app.
Clearly then, this isn't ideal.
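The behavior described above is easy to reproduce. This short sketch just calls the untyped `add` function with a few
different argument types:

```python
def add(x, y):
    return x + y


print(add(1, 2))              # 3
print(add("hello", "world"))  # helloworld

try:
    add(1, "hi")
except TypeError as exc:
    # Only discovered once this exact line actually runs
    print(f"TypeError: {exc}")
```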
## Type-hints
While python doesn't require it, it does have support for specifying "hints" that indicate what type a given
variable should have. So, when we take a look at the function above, adding type-hints to it would look like this:
```python
def add(x: int, y: int) -> int:
    return x + y
```
We've now made the types very explicit to the programmer, which means they'll no longer need to spend a bunch of time
looking through the implementation of that function, or going through the documentation just to know how to use this
function. Instead, the type hints will just tell you.
This is incredibly useful, because most editors will be able to pick up these type hints, and show them to you while
calling the function, so you know what to pass right away, without even having to look at the function definition where
the type-hints are defined.
Not only that, specifying a type-hint will greatly improve the development experience in your editor / IDE, because
you'll get much better auto-completion. The thing is, if you have a parameter like `x`, but your editor doesn't know
what type it should have, it can't really help you if you start typing `x.remove`, looking for the `removeprefix`
function. However, if you tell your editor that `x` is a string (`x: str`), it will now be able to go through all of
the methods that strings have, and show you those that start with `remove` (being `removeprefix` and `removesuffix`).
This makes type-hints great at saving you time while developing, even though you have to do some additional work when
specifying them.
## Run-time behavior
Even though type-hints are a part of the Python language, the Python interpreter doesn't actually care about them. That
means that there aren't any optimizations or checks performed based on them when you're running your code, so even with
type hints specified, they will not be enforced! You can simply ignore them and call the function with incorrect types,
like `add(1, "hi")`, and Python will happily start executing it. In this case, you'd only find out once the `+` inside
the function raises a `TypeError`, exactly as it would without any hints.
Most editors are configured very loosely when it comes to type-hints. That means they will show you these hints when
you're working with the function, but they won't produce warnings. That's why they're called "type hints", they're only
hints that can help you out, but they aren't actually enforced.
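Here is a minimal sketch demonstrating this. The annotations are stored on the function object, but nothing checks
them when the function is called:

```python
def add(x: int, y: int) -> int:
    return x + y


# The hints are stored on the function, but never checked by the interpreter
print(add.__annotations__)

# These calls contradict the hints, yet they run without any complaint
print(add(1.5, 2))    # 3.5
print(add("a", "b"))  # ab
```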
## Static type checking tools
Even though python on its own indeed doesn't enforce the type-hints you specify, there are tools that can run static
checks against your code to check for type correctness.
{{< notice tip >}}
A static check is a check that works with your code in its textual form. It will read the contents of your python
files without actually running that file and analyze it purely based on that text content.
{{< /notice >}}
Using these tools will allow you to analyze your code for typing mistakes before you ever even run your program. That
means having a function call like `add(1, "hi")` anywhere in your code would be detected and reported as an issue. This
is very similar to running a linter like [`flake8`](https://flake8.pycqa.org/en/latest/) or
[`ruff`](https://docs.astral.sh/ruff/).
Since running the type-checker manually could be quite annoying, most of them have integrations with editors / IDEs,
which will allow you to see these errors immediately as you code. This makes it much easier to immediately notice any
type inconsistencies, which can help you catch or avoid a whole bunch of bugs.
### Most commonly used type checkers
- [**Pyright**](https://github.com/microsoft/pyright): Known for its speed and powerful features, it's written in
TypeScript and maintained by Microsoft.
- [**MyPy**](https://mypy.readthedocs.io/en/stable/): The most widely used type-checker, developed by the official
Python community. It's well integrated with most IDEs and tools, but it's known to be slow to adopt new features.
- [**PyType**](https://google.github.io/pytype/): Focuses on automatic type inference, making it suitable for codebases
with minimal type annotations.
- [**BasedPyright**](https://docs.basedpyright.com/): A fork of pyright with some additional features and enhancements,
my personal preference.
## When to use type hints?
Like you saw before with the `add` function, you can specify type-hints on functions, which allows you to describe what
types can be passed as parameters of that function, along with specifying a return-type:
```python
def add(x: int, y: int) -> int:
    ...
```
You can also add type-hints directly to variables:
```python
my_variable: str = "hello"
```
That said, doing this is usually not necessary, since most type-checkers can "infer" what the type of `my_variable`
should be, based on the value it's set to have. However, in some cases, it can be worth adding the annotation, as the
inference might not be sufficient. Let's consider the following example:
```python
my_list = []
```
In here, a type-checker can infer that this is a `list`, but it can't recognize what kind of elements this list will
contain. That makes it worth it to specify a more specific type:
```python
my_list: list[int] = []
```
Now the type-checker will recognize that the elements inside of this list will be integers.
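The same applies to other containers that start out empty. In this small sketch (the variable names are made up for
the example), there is nothing for a type-checker to infer the element types from, so we spell them out:

```python
# Empty containers give a type-checker nothing to infer the element types from
word_lengths: dict[str, int] = {}
seen_lengths: set[int] = set()

for word in ["type", "hint"]:
    word_lengths[word] = len(word)
    seen_lengths.add(len(word))

print(word_lengths)  # {'type': 4, 'hint': 4}
print(seen_lengths)  # {4}
```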
## Special types
While in most cases, it's fairly easy to annotate something with the usual types, like `int`, `str`, `list`, `set`, ...
in some cases, you might need some special types to represent certain types.
### None
This isn't very special at all, but it may be surprising for beginners at first. You've probably seen the `None` type
in python before, but what you may not realize is that if you don't add any return statements into your function, it
will automatically return a `None` value. That means if your function doesn't return anything, you should annotate it
as returning `None`:
```python
def my_func() -> None:
    print("I'm a simple function, I just print something, but I don't explicitly return anything")

x = my_func()
assert x is None
```
### Union
A union type is a way to specify that a type can be one of multiple specified types, allowing flexibility while still
enforcing type safety.
There are multiple ways to specify a Union type. In modern versions of python (3.10+), you can do it like so:
```python
x: int | str = "string"
```
If you need to support older python versions, you can also use `typing.Union`, like so:
```python
from typing import Union
x: Union[int, str] = "string"
```
As an example this function takes a value that can be of various types, and parses it into a bool:
```python
def parse_bool_setting(value: str | int | bool) -> bool:
    if isinstance(value, bool):
        return value

    if isinstance(value, int):
        if value == 0:
            return False
        if value == 1:
            return True
        raise ValueError(f"Value {value} can't be converted to boolean")

    # value can only be str now
    if value.lower() in {"yes", "1", "true"}:
        return True
    if value.lower() in {"no", "0", "false"}:
        return False
    raise ValueError(f"Value {value} can't be converted to boolean")
```
One cool thing to notice here is that after an `isinstance` check, the type-checker narrows the type: inside of the
block, it knows exactly what type `value` has. It also narrows outside of the block, removing variants from the union
once they've been fully handled. That's why we didn't need a final `isinstance` check at the end: the type-checker knew
the value was a string, because all the other options had already been handled.
### Any
In some cases, you might want to specify that your function can take in any type. This can be useful when annotating a
specific type could be way too complex / impossible, or you're working with something dynamic where you just don't care
about the typing information.
```python
from typing import Any

def foo(x: Any) -> None:
    # a type checker won't warn you about accessing unknown attributes on Any types,
    # it will just blindly allow anything
    print(x.foobar)
```
{{< notice warning >}}
Don't over-use `Any` though. In the vast majority of cases, it is not the right choice. I will touch more on it in the
section below, on using the `object` type.
{{< /notice >}}
The most appropriate use for the `Any` type is when you're returning some dynamic value from a function, where the
developer can confidently know what the type will be, but which is impossible for the type-checker to figure out,
because of the dynamic nature. For example:
```python
from typing import Any
global_state: dict[str, Any] = {}

def get_state_variable(name: str) -> Any:
    return global_state[name]

global_state["name"] = "Ian"
global_state["surname"] = "McKellen"
global_state["age"] = 85
###
# Notice that we specified the annotation here manually, so that the type-checker will know
# what type we're working with. But we only know this type because we know what we stored in
# our dynamic state, so the function itself can't know what type to give us
full_name: str = get_state_variable("name") + " " + get_state_variable("surname")
```
### object
In many cases where you don't care about what type is passed in, people mistakenly use `typing.Any` when they should
use `object` instead. `object` is the base class that every other class inherits from, which means every value is an
`object`.

The difference between `x: object` and `x: Any` is that with `Any`, the type-checker essentially skips all checks. You
can do whatever you want with such a variable, like access an attribute that might not exist (`y = x.foobar`), and
since the type-checker knows nothing about it, `y` will now also be considered `Any`. With `object`, you can still
assign any value to the variable, but the type-checker will only allow you to access attributes that are shared by all
objects in python. That way, you can make sure that you don't do something that not all types support, when your
function is expected to work with all types.
For example:
```python
def do_stuff(x: object) -> None:
    print(f"The do_stuff function is now working with: {x}")

    if isinstance(x, str):
        # We can still narrow the type down to a more specific type, now the type-checker
        # knows `x` is a string, and we can do some more things, that strings support, like:
        print(x.removeprefix("hello"))

    if x > 5:  # A type-checker will mark this as an error, because not all types support comparison against ints
        print("It's bigger than 5")
```
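To put the two side by side, here's a small sketch. The function names `length_any` and `length_object` are
hypothetical, chosen only for this illustration. Both accept any value, but only the `object` version forces us to
prove that the operation is supported before using it:

```python
from collections.abc import Sized
from typing import Any

def length_any(x: Any) -> int:
    # No complaint from the type-checker, even though not every value has a length
    return len(x)

def length_object(x: object) -> int:
    # Calling `len(x)` directly would be flagged here, so we narrow the type first
    if isinstance(x, Sized):
        return len(x)
    return 0
```

Calling `length_any(5)` passes type-checking but raises a `TypeError` at runtime, while `length_object(5)` simply takes
the fallback branch.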
### Collection types
Python also provides some types to represent various collections. We've already seen the built-in `list` collection
type before. Other such built-in collection types are `tuple`, `set`, `frozenset` and `dict`. All of these types are
what we call "generic", which means that we can specify an internal type, which in this case represents the items that
these collections can hold, like `list[int]`.
Here's a quick example of using these generic collection types:
```python
def print_items(lst: list[str]) -> None:
    for index, item in enumerate(lst):
        # The type-checker knows `item` variable is a string now
        print(f"-> Item #{index}: {item.strip()}")

print_items(["a", "b", "c"])
```
That said, in many cases, instead of using these specific collection types, you can use a less specific collection, so
that your function will work with multiple kinds of collections. Python has abstract classes for general collections
inside of the `collections.abc` module. One example would be the `Sequence` type:
```python
from collections.abc import Sequence

def print_items2(lst: Sequence[str]) -> None:
    for index, item in enumerate(lst):
        # The type-checker knows `item` variable is a string now
        print(f"Item #{index}: {item.strip()}")

print_items(["a", "b", "c"])  # fine
print_items(("a", "b", "c"))  # nope: a tuple is not a list

print_items2(["a", "b", "c"])  # works
print_items2(("a", "b", "c"))  # works
print_items2({"a", "b", "c"})  # nope: a set is not a Sequence (it has no indexing)
```
You may think that you could also just use a union like `list[str] | tuple[str, ...]`, however that still wouldn't
quite cover everything, since people can make their own custom classes that work like a sequence, yet don't inherit
from `list` or any of the other built-in types. By using the `collections.abc.Sequence` type-hint, custom classes that
subclass `Sequence` will work with your function too.
There are various other collections classes like these and it would take pretty long to explain them all here, so you
should do some research on them on your own to know what's available.
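As a starting point, here's a short sketch of two other commonly used abstract collection types, `Iterable` and
`Mapping`. The functions `total_length` and `lookup` are made up purely for illustration:

```python
from collections.abc import Iterable, Mapping

def total_length(words: Iterable[str]) -> int:
    # Iterable only promises that we can loop over the value once
    return sum(len(word) for word in words)

def lookup(settings: Mapping[str, int], key: str) -> int:
    # Mapping promises read-only, dict-like access
    return settings.get(key, 0)

total_length(["ab", "c"])  # a list works
total_length({"ab", "c"})  # so does a set
total_length(word for word in ("ab", "c"))  # and even a generator
lookup({"retries": 3}, "retries")  # a regular dict is a Mapping
```

The general rule of thumb is to accept the least specific type that still supports everything your function does.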
{{< notice warning >}}
It is important to note that the built-in collection types like `list` weren't subscriptable in earlier versions of
python (before 3.9). If you still need to maintain compatibility with such older python versions, you can instead use
`typing.List`, `typing.Tuple`, `typing.Set` and `typing.Dict`. These types will support being subscripted even in those
older versions.
Similarly, this also applies to the `collections.abc` abstract types, like `Sequence`, which also weren't subscriptable
in these python versions. These also have alternatives in the `typing` module: `typing.Sequence`, `typing.Mapping`,
`typing.MutableSequence`, `typing.Iterable`, ...
{{< /notice >}}
#### Tuple type
Python tuples are a bit more complicated than the other collection types, since we can specify which type is at which
position of the tuple. For example, `tuple[int, str, float]` represents a tuple like `(1, "hi", 5.3)`. The tricky
thing here is that `tuple[int]` does not mean a tuple of integers, it means a tuple with a single integer: `(1, )`. If
you do need to specify a tuple with any number of items of the same type, what you actually need is
`tuple[int, ...]`. This annotation will work for `(1, )` or `(1, 1, 1)` or `(1, 1, 1, 1, 1)`.
The reason for this is that we often use tuples to allow returning multiple values from a function. Yet these values
usually don't have the same type, so it's very useful to be able to specify these types individually:
```python
def some_func() -> tuple[int, str]:
    return 1, "hello"
```
That said, a tuple can also be useful as a sequence type, with the major difference between it and a list being that
tuples are immutable. This can make them more appropriate for storing certain sequences than lists.
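Both uses can appear in a single signature. In this sketch (the `min_max` function is a hypothetical example), the
parameter is a variable-length tuple used as an immutable sequence, while the return type is a fixed-shape pair:

```python
def min_max(values: tuple[int, ...]) -> tuple[int, int]:
    # Any number of ints comes in, exactly two ints go out
    return min(values), max(values)

low, high = min_max((3, 1, 4, 1, 5))
```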
## Type casts
Casting is a way to explicitly specify the type of a variable, overriding the type inferred by the type-checker.
This can be very useful, as sometimes, we programmers have more information than the type-checker does, especially when
it comes to some dynamic logic that is hard to statically evaluate. The type checker's inference may end up being too
broad or sometimes even incorrect.
For example:
```python
from typing import cast
my_list: list[str | int] = []
my_list.append("Foo")
my_list.append(10)
my_list.append("Bar")
# We know that the first item in the list is a string
# the type-checker would otherwise infer `x: str | int`
x = cast(str, my_list[0])
```
Another example:
```python
from typing import cast
def foo(obj: object, type_name: str) -> None:
    if type_name == "int":
        obj = cast(int, obj)
        ...  # some logic
    elif type_name == "str":
        obj = cast(str, obj)
        ...  # some logic
    else:
        raise ValueError(f"Unknown type name: {type_name}")
```
{{< notice warning >}}
It is important to mention that unlike the casts in languages like Java or C#, in Python, type casts do not perform any
runtime checks to ensure that the variable really is what we claim it to be. Casts are only used as a hint to the
type-checker, and at runtime, the `cast` function just returns the value back without any extra logic.
If you do wish to also perform a runtime check, you can use assertions to narrow the type:
```python
def foo(obj: object) -> None:
    print(obj + 1)  # can't add 'object' and 'int'
    assert isinstance(obj, int)
    print(obj + 1)  # works
```
Alternatively, you can just check with if statements:
```python
def foo(obj: object) -> None:
    print(obj + 1)  # can't add 'object' and 'int'
    if not isinstance(obj, int):
        raise TypeError("Expected int")
    print(obj + 1)  # works
```
{{< /notice >}}
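To make the warning above concrete, here's a tiny sketch showing that a wrong `cast` goes through silently. The value
used is deliberately nonsensical:

```python
from typing import cast

# We claim this is an int, the type-checker believes us, and nothing is checked at runtime
value = cast(int, "not a number")

# `cast` returned the original string unchanged
print(type(value).__name__)  # str
```

From this point on, the type-checker will happily allow integer operations on `value`, and they will only fail once
the code actually runs.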
## Closing notes
In summary, Python's type hints are a powerful tool for improving code clarity, reliability, and development
experience. By adding type annotations to your functions and variables, you provide valuable information to both your
IDE and fellow developers, helping to catch potential bugs early and facilitating easier code maintenance.
Type hints offer significant benefits:
- Enhanced Readability: Clearly specifies the expected types of function parameters and return values, making the code
more self-documenting.
- Improved Development Experience: Provides better auto-completion and in-editor type checking, helping you avoid
errors and speeding up development.
- Early Error Detection: Static type checkers can catch type-related issues before runtime, reducing the risk of bugs
making it into production.
For further exploration of Python's type hints and their applications, you can refer to additional resources such as:
- The [Type Hinting Cheat Sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html) from mypy for a quick
reference on various type hints and their usage.
- My other articles on more advanced typing topics like [TypeVars]({{< ref "posts/type-vars" >}}) and [Generics]({{< ref
"posts/generics-and-variance" >}}) for deeper insights into Python's typing system.
Embracing type hints can elevate your Python programming experience, making your code more robust and maintainable in
the long run.

268
content/posts/type-vars.md Normal file

@ -0,0 +1,268 @@
---
title: Type variables in python typing
date: 2024-10-04
tags: [programming, python, typing]
sources:
- https://mypy.readthedocs.io/en/stable/generics.html#generic-functions
- https://docs.basedpyright.com/#/type-concepts-advanced?id=value-constrained-type-variables
- https://dev.to/decorator_factory/typevars-explained-hmo
- https://peps.python.org/pep-0695/
- https://typing.readthedocs.io/en/latest/spec/generics.html
---
Python's type hinting system offers great flexibility, and a crucial part of that flexibility comes from **Type
Variables**. These allow us to define [generic types]({{< ref "posts/generics-and-variance" >}}), enabling us to write
functions and classes that work with different types while maintaining type safety. Let's dive into what type variables
are, how to use them effectively, and why they are useful.
_**Pre-requisites**: This article assumes that you already have a [basic knowledge of python typing]({{< ref
"posts/python-type-checking" >}})._
## What's a Type Variable
A type variable (or a `TypeVar`) represents a type that can vary. It acts as a placeholder for a specific type within
a function or a class. Instead of locking down a function to operate on a specific type, type variables allow it to
adapt to whatever type is provided.
For example:
```python
from typing import TypeVar

T = TypeVar("T")

def identity(item: T) -> T:
    """
    Return the same item that was passed in, without modifying it.
    The type of the returned item will be the same as the input type.
    """
    return item
```
In this example, `T` is a type variable, meaning that the function `identity` can take any type of argument and will return
an object of that same type. If you pass an integer, it returns an integer; if you pass a string, it returns a string.
The function adapts to the type of input while preserving the type in the output.
```python
identity(5) # Returns 5 (int)
identity("hello") # Returns "hello" (str)
identity([1, 2, 3]) # Returns [1, 2, 3] (list)
```
Whenever the function is called, the type-var gets "bound" to the type used in that call. This allows the type checker
to enforce type consistency across the function with this bound type.
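A type variable doesn't have to stand for the whole argument, it can also appear inside of another type. As a minimal
sketch (the `first` helper is a made-up example), here the type-var gets bound to the element type of the sequence
that was passed in:

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")

def first(items: Sequence[T]) -> T:
    # T is bound to the element type of whatever sequence we receive
    return items[0]

number = first([1, 2, 3])  # the type-checker infers int
word = first(("a", "b"))  # the type-checker infers str
```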
## Type Variables with Upper Bounds
You can also restrict a type variable to only types that are subtypes of a specific type by using the `bound` argument.
This is useful when you want to ensure that the type variable is always a subclass of a particular type.
```python
from typing import TypeVar
from collections.abc import Sequence

T = TypeVar("T", bound=Sequence)

def split_sequence(seq: T, chunks: int) -> list[T]:
    """
    Split a given sequence into n equally sized chunks of itself.

    If the sequence can't be evenly split, the last chunk will contain
    the additional elements.
    """
    new = []
    chunk_size = len(seq) // chunks
    for i in range(chunks):
        start = i * chunk_size
        end = i * chunk_size + chunk_size
        if i == chunks - 1:
            # On last chunk, include all remaining elements
            new.append(seq[start:])
        else:
            new.append(seq[start:end])
    return new
```
In this example, `T` is bounded by `Sequence`, so `split_sequence` can work with any type of sequence, such as lists or
tuples. The return type will be a list with elements being slices of the original sequence, so the list items will
match the type of the input sequence, preserving it.
If you pass a `list[int]`, you'll get a `list[list[int]]`, and if you pass a `tuple[str, ...]`, you'll get a
`list[tuple[str, ...]]`.
## Type Variables with Specific Type Restrictions
Type variables can also be restricted to specific types, which can be useful when you want to enforce that a type
variable can only be one of a predefined set of types.
One common example is `AnyStr`, which can be either `str` or `bytes`. In fact, this type is so common that the `typing`
module actually contains it directly (`typing.AnyStr`). Here is an example of how to define this type-var:
```python
from typing import TypeVar
AnyStr = TypeVar("AnyStr", str, bytes)

def concat(x: AnyStr, y: AnyStr) -> AnyStr:
    return x + y

concat("a", "b") # valid
concat(b"a", b"b") # valid
concat(1, 2) # error
```
**Why not just use `Union[str, bytes]`?**
You might wonder why we don't just use a union type, like this:
```python
from typing import Union

def concat(x: Union[str, bytes], y: Union[str, bytes]) -> Union[str, bytes]:
    return x + y
```
While this might seem similar, the key difference lies in type consistency. A Union would allow one of the variables to
be `str` while the other is `bytes`, however, combining these isn't possible, meaning this would break at runtime!
```python
concat(b"a", "b") # No type error, but implementation fails!
```
**How about `TypeVar("T", bound=Union[str, bytes])`?**
This actually results in the same issue. Type-checkers are fairly smart: if you call `concat(b"a", "b")`, they will
pick the narrowest type that fits both arguments within the `Union[str, bytes]` bound when binding the type-var. That
narrowest type ends up being the union itself, so the type-var essentially becomes the union type, leaving you with
the same issue.
For that reason, it can sometimes be useful to use specific type restrictions with a type-var, rather than just binding
it to some common top level type.
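One more difference worth knowing about shows up with subclasses. In the sketch below, `UserName`, `echo_bound` and
`echo_constrained` are hypothetical names used only for illustration. With a bound, the type-var keeps the exact
subclass, while with specific type restrictions, type-checkers resolve it to the matching restriction (`str` here):

```python
from typing import TypeVar

class UserName(str): ...

BoundT = TypeVar("BoundT", bound=str)
ConstrainedT = TypeVar("ConstrainedT", str, bytes)

def echo_bound(x: BoundT) -> BoundT:
    return x

def echo_constrained(x: ConstrainedT) -> ConstrainedT:
    return x

name = UserName("drike")
a = echo_bound(name)  # type-checkers infer UserName
b = echo_constrained(name)  # type-checkers infer str, the matching restriction
```

This only affects what the type-checker sees. At runtime, both calls return the very same `UserName` object.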
## New TypeVar syntax
In python 3.12, it's now possible to use a new, much more convenient syntax for generics, which looks like this:
```python
def identity[T](x: T) -> T:
    return x
```
To specify a bound for a type var like this, you can do:
```python
def car_identity[T: Car](car: T) -> T:
    return car
```
This syntax also works for generic classes:
```python
class Foo[T]:
    def __init__(self, x: T):
        self.x = x
```
## TypeVarTuple
A `TypeVarTuple` is defined similarly to a regular `TypeVar`, but it is used to represent a variable-length tuple. This
can be useful when you want to work with functions or classes that need to preserve a certain shape of tuples, or
modify it in a type-safe manner.
```python
from typing import TypeVar, TypeVarTuple, reveal_type

T = TypeVar("T")
Ts = TypeVarTuple("Ts")

def tuple_append(tup: tuple[*Ts], val: T) -> tuple[*Ts, T]:
    return (*tup, val)

x = (2, "hi", 0.8)
y = tuple_append(x, 10)
reveal_type(y)  # tuple[int, str, float, int]
```
Or with the new 3.12 syntax:
```python
from typing import cast, reveal_type

def tuple_sum[*Ts](*tuples: tuple[*Ts]) -> tuple[*Ts]:
    summed = tuple(sum(tup) for tup in zip(*tuples))
    reveal_type(summed)  # tuple[int, ...]

    # The type checker only knows that the sum function returns an int here, but this is way too dynamic
    # for it to understand that summed will end up being tuple[*Ts]. For that reason, we can use a cast
    return cast(tuple[*Ts], summed)

x = (10, 15, 20.0)
y = (5, 10, 15.0)
z = (1, 2, 3.2)
res = tuple_sum(x, y, z)
print(res)  # (16, 27, 38.2)
reveal_type(res)  # tuple[int, int, float]
```
## ParamSpec
In addition to `TypeVar`, Python 3.10 introduced `ParamSpec` for handling type variables related to function parameters.
Essentially, a `ParamSpec` is kind of like having multiple type-vars for all of your parameters, but stored in a single
place. It is mainly useful in function decorators:
```python
from typing import TypeVar, ParamSpec
from collections.abc import Callable

P = ParamSpec('P')
R = TypeVar('R')

def decorator(func: Callable[P, R]) -> Callable[P, R]:
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print("Before function call")
        result = func(*args, **kwargs)
        print("After function call")
        return result
    return wrapper

@decorator
def say_hello(name: str) -> str:
    return f"Hello, {name}!"

print(say_hello("Alice"))
print(say_hello(55))  # error: 'int' type can't be assigned to parameter name of type 'str'
```
In this example, the `ParamSpec` is able to fully preserve the input parameters of the decorated function, just like
the `TypeVar` here preserves the single return type parameter.
With the new 3.12 syntax, `ParamSpec` can also be specified like this:
```python
from collections.abc import Callable

def decorator[**P, R](func: Callable[P, R]) -> Callable[P, R]:
    ...
```
### Concatenate
In many cases, `ParamSpec` is used in combination with `typing.Concatenate`, which allows consuming or adding function
parameters. For example, by specifying `Callable[Concatenate[int, P], str]`, we limit our decorator to only accept
functions that take an int as the first argument and return a string. This also allows the decorator to remove that
argument after decoration, by specifying the return type as `Callable[P, str]`:
```python
from typing import ParamSpec, Concatenate
from collections.abc import Callable

P = ParamSpec('P')

def id_5(func: Callable[Concatenate[int, P], str]) -> Callable[P, str]:
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> str:
        return func(5, *args, **kwargs)
    return wrapper

@id_5
def log_event(id: int, event: str) -> str:
    return f"Got an event on {id=}: {event}!"
```


@ -1,670 +0,0 @@
---
title: Variance of typing generics (covariance, contravariance and invariance)
date: 2021-10-04
tags: [programming, python]
---
In many programming languages where typing matters, we often need to define certain properties for the types of generics
so that they can work properly. Specifically, when we use a generic type of some typevar `X`, we need to know when that
generic type with typevar `Y` should be treated as its subtype. I know this probably sounds pretty confusing, but don't
worry, I'll explain what that sentence means in quite a lot of detail here (that's why I wrote a whole article about
it). It's actually not that difficult to understand, it just needs a few examples to explain it.
As a very quick example of what I mean: when we use a sequence of certain types, say a sequence containing elements of
type Shirt that is a subtype of a Clothing type, can we assign this sequence as having a type of sequence of clothing
elements? If yes, then this sequence would be covariant in its elements type. What about a sequence of Clothing
elements? Can we assign this sequence as having a type of a sequence of Shirts? If yes, then this sequence generic
would be contravariant in its elements type. Or, if the answer to both of these was no, then the sequence is
invariant.
For simplicity, I'll be using python in the examples. Even though python isn't a strictly typed language, tools such
as pyright, mypy and many others give it optional support for typing that can be checked outside of run time. It's
basically like strictly typed languages that check this at compile time, except that in python, it's optional and
doesn't actually occur on compilation, so we say that it occurs at "typing time" or "linting time".
Do note that this post is a bit more advanced than the other ones I made and if you don't already feel comfortable with
basic typing concepts in python, it may not be very clear what's going on in here so I'd suggest learning something
about them before reading this.
## Pre-conceptions
This section explains certain concepts that I'll be using later in the article. If you already know what these are,
you can skip them, however if you don't, it is crucial that you read through this to understand the rest of this
article. I'll go through these concepts briefly, but it should be sufficient to follow along. If you do want to know
more though, I'd suggest looking at mypy documentation or python documentation.
### Type Variables
A type variable (or a TypeVar) is basically representing a variable type. What this means is that we can have a
function that takes a variable of type T (which is our TypeVar) and returns the type T. Something like this will mean
that we return an object of the same type as the object that was given to the function.
```python
from typing import TypeVar, Any
T = TypeVar("T")

def set_a(obj: T, a_value: Any) -> T:
    """
    Set the value of 'a' attribute for given `obj` of any type to given `a_value`
    Return the same object after this adjustment was made.
    """
    obj.a = a_value
    # Note that this function probably doesn't really need to return this
    # because `obj` is obviously mutable since we were able to set its attribute to something
    # that wasn't previously there
    return obj
```
If you've understood this example, you can move onto the next section, however if you want to know something extra
about these type variables or you didn't quite understand everything, I've included some more subsections about them
with more examples on some interesting things that you can do with them.
#### Type variables with value restriction
By default, a type variable can be replaced by any type. This is usually what we want, but sometimes it does make sense
to restrict a TypeVar to only certain types.
A commonly used variable with such restrictions is `typing.AnyStr`. This typevar can only have values `str` and
`bytes`.
```python
from typing import TypeVar
AnyStr = TypeVar("AnyStr", str, bytes)

def concat(x: AnyStr, y: AnyStr) -> AnyStr:
    return x + y

concat("a", "b")
concat(b"a", b"b")
concat(1, 2)  # Error!
```
This is very different from just using a simple `Union[str, bytes]`:
```python
from typing import Union
UnionAnyStr = Union[str, bytes]

def concat(x: UnionAnyStr, y: UnionAnyStr) -> UnionAnyStr:
    return x + y
```
Because in this case, if we pass in 2 strings, we don't know whether we will get a `str` object back, or a `bytes` one.
It would also allow us to use `concat("x", b"y")` however we don't know how to concatenate string object with bytes.
With a TypeVar, the type checker will reject something like this, but with a simple Union, this would be treated as
a valid function call and the argument types would be marked as correct, even though the implementation will fail.
#### Type variable with upper bounds
We can also restrict a type variable to having values that are a subtype of a specific type. This specific type is
called the upper bound of the type variable.
```python
from typing import TypeVar, Sequence

T = TypeVar("T", bound=Sequence)

# Signify that the return type of this function will be the list containing
# sequences of the same type sequence as the type we got from the argument
def split_sequence(seq: T, chunks: int) -> list[T]:
    """
    Split a given sequence into n equally sized chunks of itself.

    If the sequence can't be evenly split, the last chunk will contain
    the additional elements.
    """
    new = []
    chunk_size = len(seq) // chunks
    for i in range(chunks):
        start = i * chunk_size
        end = i * chunk_size + chunk_size
        if i == chunks - 1:
            # On last chunk, include all remaining elements
            new.append(seq[start:])
        else:
            new.append(seq[start:end])
    return new
```
Here, we know that this function will work for any type of sequence. However, just using `Sequence` as the input
argument type wouldn't be ideal, because it wouldn't preserve that type when returning a list of chunks of those
sequences. With that kind of approach, we'd lose the type of our sequence, going from for example `list[int]` to just
`Sequence[object]`.
For that reason, we can use a type-var, in which we can enforce that the type must be a sequence, but we still don't
know what kind of sequence it may be, so it can be any subtype that implements the necessary functions for a sequence.
This means if we pass in a list, we know we will get back a list of lists, if we pass a tuple, we'll get a list of
tuples, and if we pass a list of integers, we'll get a list of lists of integers. This means the original type won't be
lost even after going through a function.
### Generic Types
Essentially when a class is generic, it just defines that something inside of our generic type is of some other type. A
good example would be for example a list of integers: `list[int]` (or in older python versions: `typing.List[int]`).
We've specified that our list will be holding elements of `int` type.
Generics like this can be used for many things, for example with a dict, we actually provide 2 types, first is the type
of the keys and second is the type of the values: `dict[str, int]` would be a dict with `str` keys and `int` values.
Here's a list of some definable generic types that are currently present in python 3.9:
{{< table >}}
| Type | Description |
|-------------------|-----------------------------------------------------|
| list[str] | List of `str` objects |
| tuple[int, int] | Tuple of two `int` objects |
| tuple[int, ...] | Tuple of arbitrary number of `int` |
| dict[str, int] | Dictionary with `str` keys and `int` values |
| Iterable[int] | Iterable object containing ints |
| Sequence[bool] | Sequence of booleans (immutable) |
| Mapping[str, int] | Mapping from `str` keys to `int` values (immutable) |
{{< /table >}}
In python, we can even make up our own generics with the help of `typing.Generic`:
```python
from typing import TypeVar, Generic

T = TypeVar("T")

# If we specify a type-hint for our building like Building[Student]
# it will mean that the `inhabitants` variable will be of type: `tuple[Student, ...]`
class Building(Generic[T]):
    def __init__(self, *inhabitants: T):
        self.inhabitants = inhabitants

class Person: ...
class Student(Person): ...

people = [Person() for _ in range(10)]
my_building: Building[Person] = Building(*people)

students = [Student() for _ in range(10)]
my_dorm: Building[Student] = Building(*students)

# We now know that `my_building` will contain inhabitants of `Person` type,
# while `my_dorm` will only have `Student`(s) as its inhabitants.
```
I'll go deeper into creating our custom generics later, after we learn the differences between covariance,
contravariance and invariance. For now, this is just a very simple illustrative example.
## Variance
As I quickly explained at the start, the concept of variance tells us whether a generic of a certain type can be
assigned to a generic of another type. But I won't bother with trying to define variance more meaningfully, since the
definition would be convoluted and you probably wouldn't really get what it's about until you see examples of the
different types of variance. So for that reason, let's just take a look at those.
### Covariance
The first concept of generic variance is **covariance**, the definition of which looks like this:
> If a generic `G[T]` is covariant in `T` and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`. This means
> that every variable of `G[A]` type can be assigned as having the `G[B]` type.
As I've very quickly explained initially, covariance is a concept where if we have a generic of some type, we can
assign it to a generic type of some supertype of that type. This means that the actual generic type is a subtype of
this new generic which we've assigned it to.
I know that this definition can sound really complicated, but it's actually not that hard. As an example, I'll use a
`tuple`, which is an immutable sequence in python. If we have a tuple of `Car` type, `Car` being a subclass of
`Vehicle`, can we assign this tuple a type of tuple of Vehicles? The answer here is yes, because every `Car` is a
`Vehicle`, so a tuple of cars is a subtype of tuple of vehicles. The same goes for a tuple of objects: `object` is the
base class that everything in python inherits from, so both a tuple of cars and a tuple of vehicles are subtypes of a
tuple of objects, and we can assign those tuples to this tuple of objects.
```python
from typing import Tuple
class Vehicle: ...
class Boat(Vehicle): ...
class Car(Vehicle): ...
my_vehicle = Vehicle()
my_boat = Boat()
my_car_1 = Car()
my_car_2 = Car()
vehicles: Tuple[Vehicle, ...] = (my_vehicle, my_car_1, my_boat)
cars: Tuple[Car, ...] = (my_car_1, my_car_2)
# This line assigns a variable with the type of 'tuple of cars' to a 'tuple of vehicles' type
# this makes sense because a tuple of vehicles can hold cars
# since cars are vehicles
x: Tuple[Vehicle, ...] = cars
# This line however tries to assign a tuple of vehicles to a tuple of cars type
# this however doesn't make sense because not all vehicles are cars, a tuple of
# vehicles can also contain other non-car vehicles, such as boats. These may lack
# some of the functionalities of cars, so a type checker would complain here
x: Tuple[Car, ...] = vehicles
# In here, both of these assignments are valid because both cars and vehicles will
# implement all of the logic that a basic `object` class needs. This means this
# assignment is also valid for a generic that's covariant.
x: Tuple[object, ...] = cars
x: Tuple[object, ...] = vehicles
```
Another example of a covariant type would be the return value of a function. In python, the `typing.Callable` type is
written as `Callable[[argument_type1, argument_type2], return_type]`. A callable is covariant in its return type,
because a function that returns a more specific type (a subtype) can be used wherever a function returning the less
specific type is expected. We don't mind treating a value with more functionality as its supertype with less
functionality, since the value still has everything we need, i.e. it's fully compatible with the less specific type.
```python
import random
from typing import Callable


class Car: ...
class WolkswagenCar(Car): ...
class AudiCar(Car): ...


def get_car() -> Car:
    # The type of this function is Callable[[], Car]
    r = random.randint(1, 3)
    if r == 1:
        return Car()
    elif r == 2:
        return WolkswagenCar()
    else:
        return AudiCar()


def get_wolkswagen_car() -> WolkswagenCar:
    # The type of this function is Callable[[], WolkswagenCar]
    return WolkswagenCar()


# In the line below, we define a variable `x` which is expected to have a type of
# Callable[[], Car], meaning it's a function that returns a Car.
# Here, we don't mind that the actual function will be returning a more specific
# WolkswagenCar type, since that type is fully compatible with the less specific Car type.
x: Callable[[], Car] = get_wolkswagen_car

# However this wouldn't really make sense the other way around.
# We can't assign a function which returns any kind of Car to a variable which is expected to
# hold a function that's supposed to return a specific type of a car. This is because not
# every car is a WolkswagenCar, we may get an AudiCar from this function, and that may not
# support everything WolkswagenCar does. A type checker would complain here.
y: Callable[[], WolkswagenCar] = get_car
```
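To see why this is useful in practice, here is a minimal sketch of a function that accepts any car factory. The names
`build_fleet` and `make_wolkswagen` are hypothetical and only serve as an illustration. The point is that a callable
returning the more specific `WolkswagenCar` is accepted where a `Callable[[], Car]` is expected.

```python
from typing import Callable, List


class Car: ...
class WolkswagenCar(Car): ...


def build_fleet(factory: Callable[[], Car], size: int) -> List[Car]:
    """Call the factory `size` times and collect the produced cars."""
    return [factory() for _ in range(size)]


def make_wolkswagen() -> WolkswagenCar:
    return WolkswagenCar()


# Accepted by a type checker: Callable[[], WolkswagenCar] is a subtype
# of Callable[[], Car], because Callable is covariant in its return type.
fleet = build_fleet(make_wolkswagen, 3)
```

The `build_fleet` function only relies on what every `Car` offers, so it doesn't matter that the factory produces
something more specific.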
### Contravariance
Another concept is known as **contravariance**. It is essentially the complete opposite of **covariance**.
> If a generic `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`. This
> means that every variable of `G[B]` type can be assigned as having the `G[A]` type.
In this case, this means that a generic parametrized with some type can be assigned wherever the same generic
parametrized with a subtype of that type is expected. The generic we actually have is still a subtype of the generic
we assign it to, but the relation between their type parameters is reversed.
This explanation is probably even more confusing if you only look at the definition. But even when we think about it as
an opposite of covariance, there's a question that comes up: Why would we ever want to have something like this? When
is it actually useful? To answer this, let's look at the other portion of the `typing.Callable` type which contains the
arguments to a function.
```python
from typing import Callable


class Car:
    def start_engine(self) -> None: ...
    def drive(self) -> None: ...


class WolkswagenCar(Car):
    wolkswagen_id: str = "example-id"  # placeholder value so the example runs

    def login(self, wolkswagen_id: str) -> None: ...


class AudiCar(Car):
    license_plate_number: str = "EXAMPLE-1"  # placeholder value so the example runs

    def contact_audi(self, license_plate_number: str) -> None: ...


# The type of this function is Callable[[Car], None]
def drive_car(car: Car) -> None:
    car.start_engine()
    car.drive()
    print(f"Driving {car.__class__.__name__} car.")


# The type of this function is Callable[[WolkswagenCar], None]
def drive_wolkswagen_car(wolkswagen_car: WolkswagenCar) -> None:
    # We need to login to our wolkswagen account on the car first
    # with the wolkswagen ID, in order to be able to drive it.
    wolkswagen_car.login(wolkswagen_car.wolkswagen_id)
    drive_car(wolkswagen_car)


# The type of this function is Callable[[AudiCar], None]
def drive_audi_car(audi_car: AudiCar) -> None:
    # All audi cars need to report back with their license plate
    # to Audi servers before driving is enabled
    audi_car.contact_audi(audi_car.license_plate_number)
    drive_car(audi_car)


# In here, we try to assign a function that takes a wolkswagen car
# to a variable which is defined as a function/callable which takes any regular car.
# However this is a problem, because now we can use x with any car, including an
# AudiCar, but x is assigned to a function that only accepts wolkswagen cars. This
# may cause issues because not every car has the properties of a wolkswagen car,
# which this function may need to utilize. A type checker would complain here.
x: Callable[[Car], None] = drive_wolkswagen_car

# On the other hand, in this example, we're assigning a function that can
# take any car to a variable that is defined as a function/callable that only
# takes wolkswagen cars as arguments.
# This is fine, because y only allows us to pass in wolkswagen cars, and it is set
# to a function which accepts any kind of car, including wolkswagen cars.
y: Callable[[WolkswagenCar], None] = drive_car
```
So from this it's already clear that the `Callable` type can't be covariant in its arguments, and hopefully you can
now recognize what it means for something to be contravariant. But to reinforce this, here's one more, slightly
different example.
```python
import random
from typing import Callable


class Book:
    def read(self) -> None: ...
class FantasyBook(Book): ...
class DramaBook(Book): ...


class Library:
    def add(self, book: Book) -> None: ...
    def remove(self, book: Book) -> None: ...
    def submit_rating(self, rating: int) -> None: ...


def remove_while_used(
    func: Callable[[Library, FantasyBook], None],
) -> Callable[[Library, FantasyBook], None]:
    """This decorator removes a fantasy book from the library while `func` is running."""
    def wrapper(library: Library, book: FantasyBook) -> None:
        library.remove(book)
        func(library, book)
        library.add(book)
    return wrapper


# As we can see here, we can use the `remove_while_used` decorator with the
# `read_book` function below. The decorator expects a function of type
# Callable[[Library, FantasyBook], None], and we're passing it our function
# `read_book`, which has a type of Callable[[Library, Book], None].
#
# Obviously, there's no problem with Library, it's the same type. As for the book
# argument, the decorator will only ever call `func` with fantasy books, while our
# read_book function accepts any general Book. This is fine because a FantasyBook
# meets all of the necessary criteria for a general Book, it just includes some more
# special things, but read_book won't use those anyway.
#
# Since this assignment is possible, it means that Callable[[Library, Book], None]
# is a subtype of Callable[[Library, FantasyBook], None], not the other way around,
# even though Book isn't a subtype of FantasyBook, but rather its supertype.
#
# A decorator expecting Callable[[Library, Book], None] could not accept a function
# that only takes FantasyBook, since its wrapper could then be called with any Book.
@remove_while_used
def read_book(library: Library, book: Book) -> None:
    book.read()
    my_rating = random.randint(1, 10)
    # Rate the library
    library.submit_rating(my_rating)
```
This kind of behavior, where a generic parametrized with a less specific type (a supertype) can be used wherever the
same generic parametrized with a more specific type (a subtype) is expected, means that the generic is contravariant
in that type. So for callables, we can write that: `Callable[[T], None]` is contravariant in `T`.
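
As a practical sketch of the same idea, consider a function that applies a callback to a list of wolkswagen cars. The
`service_all` and `wash_car` names below are made up for this illustration. A callback that can handle any `Car` is
safely accepted where a callback for `WolkswagenCar` is expected:

```python
from typing import Callable, List


class Car:
    def __init__(self) -> None:
        self.clean = False


class WolkswagenCar(Car): ...


def wash_car(car: Car) -> None:
    # Works for any car, so it also works for every WolkswagenCar
    car.clean = True


def service_all(cars: List[WolkswagenCar], action: Callable[[WolkswagenCar], None]) -> None:
    # `action` is only ever called with WolkswagenCar instances
    for car in cars:
        action(car)


wolkswagens = [WolkswagenCar(), WolkswagenCar()]

# Accepted by a type checker: Callable[[Car], None] is a subtype
# of Callable[[WolkswagenCar], None].
service_all(wolkswagens, wash_car)
```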
### Invariance
The last type of variance is called **invariance**, and it's certainly the easiest of these to understand. By now you
may have already figured out what it means: a generic is invariant in a type when it's neither covariant nor
contravariant in it.
> If a generic `G[T]` is invariant in `T` and `A` is a subtype of `B`, then `G[A]` is neither a subtype nor a supertype
> of `G[B]`. This means that any variable of `G[A]` type can never be assigned as having the `G[B]` type, and
> vice-versa.
This means that two different parametrizations of the generic are never subtypes of one another, no matter how their
type arguments are related.
What can be a bit surprising is that the `list` datatype is actually invariant in its element type. While an
immutable sequence such as a `tuple` is covariant in the type of its elements, this isn't the case for mutable
sequences. This may seem weird, but there is a good reason for that.
```python
from typing import List


class Person:
    def eat(self) -> None: ...

class Adult(Person):
    def work(self) -> None: ...

class Child(Person):
    def study(self) -> None: ...


person1 = Person()
person2 = Person()
adult1 = Adult()
adult2 = Adult()
child1 = Child()
child2 = Child()

people: List[Person] = [person1, person2, adult2, child1]
adults: List[Adult] = [adult1, adult2]

# At first, it is important to establish that list isn't contravariant. This is perhaps
# quite intuitive, but it is important nevertheless. In here, we try to assign a list of
# people to `x` which has a type of list of children. This obviously can't work, because
# a list of people can include more types than just `Child`, and these types can lack
# some of the features that children have, meaning lists can't be contravariant.
x: List[Child] = people
```
Now that we've established that lists aren't contravariant in their element type, let's see why it would be a bad
idea to make them covariant (like tuples). The main difference is that a tuple is immutable and a list isn't. You can
add new elements to a list or alter it in place, but you can't do that with a tuple: if you want to add a new element,
you have to make a new tuple with those elements, so you're never altering an existing one.
Why does that matter? Well, let's see this in an actual example:
```python
from typing import List


class Person:
    def eat(self) -> None: ...
class Adult(Person):
    def work(self) -> None: ...
class Child(Person):
    def study(self) -> None: ...


def append_adult(adults: List[Person]) -> None:
    new_adult = Adult()
    adults.append(new_adult)


child1 = Child()
child2 = Child()
children: List[Child] = [child1, child2]

# This is where the covariant assignment happens, we assign a list of children
# to a list of people, `Child` being a subtype of `Person`. Which would imply that
# list is covariant in the type of its elements.
# This is the line on which a type-checker would complain. So let's see why allowing
# it is a bad idea.
people: List[Person] = children

# Since we know that `people` is a list of `Person` type elements, we can obviously
# pass it over to `append_adult` function, which takes a list of `Person` type elements.
# After we called this function, our list got altered. It now includes an adult, which
# is fine since this is a list of people, and `Adult` type is a subtype of `Person`.
# But what also happened is that the list in `children` variable got altered!
append_adult(people)

# This will work fine, all people can eat, that includes adults and children
children[0].eat()

# Only children can study, this will also work fine because the 0th element is a child,
# after all this is a list of children, right?
children[0].study()

# Uh oh! This will fail, we've appended an adult to our list of children.
# But since this is a list of `Child` type elements, we expect all elements in that list
# to have all properties required of the `Child` type. But there's an `Adult` type element
# in there which doesn't actually have all of the properties of a `Child`, they lack the
# `study` method, causing an AttributeError on this line.
children[-1].study()
```
As we can see from this example, lists can't be covariant because that would allow us to assign a list of some element
type to a list with elements of a supertype of those (a parent class of our actual element class). Even though every
element implements every feature that the supertype requires, allowing this kind of assignment could lead to mutations
of the list where elements that don't belong get added: while they may fit the supertype requirement, they might not
be of the original type.
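
The failure described above can also be observed at runtime. The following is a small self-contained sketch, where
the `cast` call only stands in for the assignment that a type checker would reject. It shows that both names refer to
the same list object, which is why the mutation leaks:

```python
from typing import List, cast


class Person:
    def eat(self) -> None: ...


class Adult(Person):
    def work(self) -> None: ...


class Child(Person):
    def study(self) -> None: ...


children: List[Child] = [Child(), Child()]

# `cast` silences the type checker here; at runtime it simply returns `children`,
# so `people` and `children` are two names for the very same list.
people = cast(List[Person], children)
people.append(Adult())

try:
    children[-1].study()
    failed = False
except AttributeError:
    # The last element is an Adult, which has no `study` method
    failed = True
```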
That said, if we copied the list, re-typing it to a supertype wouldn't be an issue:
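```python
class Game: ...
class BoardGame(Game): ...
class SportGame(Game): ...

tic_tac_toe, chess, monopoly = BoardGame(), BoardGame(), BoardGame()
volleyball = SportGame()

board_games: list[BoardGame] = [tic_tac_toe, chess, monopoly]
# `list(...)` builds a new list, so appending to it can't affect `board_games`.
# (`board_games.copy()` is still typed as list[BoardGame], so type checkers reject it here.)
games: list[Game] = list(board_games)
games.append(volleyball)
```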
This is why immutable sequences are covariant: they don't make it possible to edit the original, so if a change is
desired, a new object must be made. This is why `tuple` or other `Sequence` types don't need to be copied when doing
an assignment like this, but `MutableSequence` types do.
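
In practice, this means that a function which only reads from a collection can annotate its parameter as `Sequence`
instead of `list`, and callers won't need to copy anything. A minimal sketch, with a hypothetical `count_games`
helper:

```python
from typing import List, Sequence


class Game: ...
class BoardGame(Game): ...


def count_games(games: Sequence[Game]) -> int:
    # Sequence offers no methods to mutate the collection, so accepting
    # a list of any Game subtype here is safe.
    return len(games)


board_games: List[BoardGame] = [BoardGame(), BoardGame(), BoardGame()]

# Accepted without copying: Sequence is covariant, so List[BoardGame]
# can be passed where Sequence[Game] is expected.
total = count_games(board_games)
```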
### Recap
- if `G[T]` is covariant in `T`, and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`
- if `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`
- if `G[T]` is invariant in `T` (the default), and `A` is a subtype of `B`, then `G[A]` and `G[B]` don't have any
  subtype relation
## Creating Generics
Now that we know what it means for a generic to be covariant/contravariant/invariant in a type, we can explore how to
make use of this knowledge and actually create some generics with these concepts in mind.
**Making invariant generics:**
```python
from typing import Generic, Iterable, List, TypeVar

class Student: ...
class EngineeringStudent(Student): ...
class ComputerEngineeringStudent(EngineeringStudent): ...

# We don't need to specify covariant=False nor contravariant=False, these are the default
# values, I do this here only to explicitly show that this typevar is invariant
T = TypeVar("T", covariant=False, contravariant=False)

class University(Generic[T]):
    students: List[T]

    def __init__(self, students: Iterable[T]) -> None:
        self.students = [s for s in students]

    def add_student(self, student: T) -> None:
        self.students.append(student)

engineering_students = [EngineeringStudent(), EngineeringStudent()]

x: University[EngineeringStudent] = University(engineering_students)
y: University[Student] = x  # NOT VALID! University isn't covariant
z: University[ComputerEngineeringStudent] = x  # NOT VALID! University isn't contravariant
```
In this case, our University generic type is invariant in the student type, meaning that
if we have a `University[Student]` type and `University[EngineeringStudent]` type, neither
is a subtype of the other.
**Making covariant generics:**
Here, it is important to make one thing clear: a covariant typevar can't be used as the type of a method parameter,
because parameter positions are contravariant. This makes it impossible to write a covariant generic whose methods
take values of its type parameter as arguments. However, this rule does not extend to the initialization/constructor
of that generic, and this is very important. Without this exemption, it wouldn't really be possible to construct a
covariant generic, since the values of the original type must somehow be passed onto the instance itself, otherwise
we'd have nothing of that type to return in the actual logic. This is why using a covariant typevar in `__init__` is
allowed.
```python
from typing import Generic, Iterable, Sequence, TypeVar, Union, overload

T_co = TypeVar("T_co", covariant=True)


class Matrix(Sequence[Sequence[T_co]], Generic[T_co]):
    __slots__ = ("rows",)
    rows: tuple[tuple[T_co, ...], ...]

    def __init__(self, rows: Iterable[Iterable[T_co]]) -> None:
        self.rows = tuple(tuple(el for el in row) for row in rows)

    def __setattr__(self, attr: str, value: object) -> None:
        if hasattr(self, attr):
            raise AttributeError(f"Can't change {attr} (read-only)")
        return super().__setattr__(attr, value)

    # Indexing with an int returns a single row, so an element is `matrix[row_id][col_id]`
    @overload
    def __getitem__(self, index: int) -> Sequence[T_co]: ...
    @overload
    def __getitem__(self, index: slice) -> Sequence[Sequence[T_co]]: ...
    def __getitem__(
        self, index: Union[int, slice]
    ) -> Union[Sequence[T_co], Sequence[Sequence[T_co]]]:
        return self.rows[index]

    def __len__(self) -> int:
        return len(self.rows)


class X: ...
class Y(X): ...
class Z(Y): ...


a: Matrix[Y] = Matrix([[Y(), Z()], [Z(), Y()]])
b: Matrix[X] = a  # VALID. Matrix is covariant
c: Matrix[Z] = a  # INVALID! Matrix isn't contravariant
```
In this case, our `Matrix` generic type is covariant in the element type, meaning that if we have a `Matrix[Y]` type
and a `Matrix[X]` type, we can assign a `Matrix[Y]` to the `Matrix[X]` type, hence making it its subtype.
We can make this Matrix covariant because it is immutable (enforced by slots and custom setattr logic). This allows
this matrix class (just like any other immutable sequence class) to be covariant. Since it can't be altered, this
covariance is safe.
**Making contravariant generics:**
```python
import pickle
from typing import Generic, TypeVar

import requests

T_contra = TypeVar("T_contra", contravariant=True)


class Sender(Generic[T_contra]):
    def __init__(self, url: str) -> None:
        self.url = url

    def send_request(self, val: T_contra) -> str:
        s = pickle.dumps(val)
        response = requests.post(self.url, data={"object": s})
        # Return the body of the response as text
        return response.text


class X: ...
class Y(X): ...
class Z(Y): ...


a: Sender[Y] = Sender("https://test.com")
b: Sender[Z] = a  # VALID, Sender is contravariant
c: Sender[X] = a  # INVALID, Sender isn't covariant
```
In this case, our `Sender` generic type is contravariant in its value type, meaning that if we have a `Sender[Y]`
type and a `Sender[Z]` type, we can assign a `Sender[Y]` to the `Sender[Z]` type, hence making it its subtype.
This works because the type variable is only used in a contravariant position, in this case as the type of a method
argument. This means that the logic of determining subtypes is the same for our `Sender` generic as it is for the
arguments of a `Callable`.
For example, if we had a sender of `Car` type and we were able to assign it to a sender of `Vehicle` type, it would
suddenly allow other vehicles, such as airplanes, to be passed to the `send_request` function, but this function only
expects the `Car` type (or its subtypes).
On the other hand, if we tried to assign this sender of `Car` type to a sender of `AudiCar`, that's fine, because now
all arguments passed to the `send_request` function will be required to be of the `AudiCar` type, but that's a subtype
of a general `Car` and implements everything the general car would, so the function doesn't mind.
Note: This probably isn't the best example of a contravariant class, but because of my limited imagination and lack of
time, I wasn't able to think of anything better.
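
As an alternative sketch, here is one more contravariant generic that doesn't depend on any network access. The
`Describer` class and its `describe` method are hypothetical names made up for this illustration. The type variable
again only appears as the type of a method argument:

```python
from typing import Generic, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)


class Car: ...
class AudiCar(Car): ...


class Describer(Generic[T_contra]):
    """Something that consumes values of type T_contra and describes them."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def describe(self, value: T_contra) -> str:
        return f"{self.prefix}: {type(value).__name__}"


car_describer: Describer[Car] = Describer("car")

# VALID: a Describer[Car] can handle any Car, so it can also handle every AudiCar.
audi_describer: Describer[AudiCar] = car_describer
description = audi_describer.describe(AudiCar())
```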
**Some extra notes**
- Usually, most of your generics will be invariant. Sometimes, however, it can be very useful to mark your generic as
  covariant, since otherwise you'd need to cast your variable manually when assigning it to another type, or copy your
  whole generic, which would be very wasteful, just to satisfy type-checkers. Less commonly, you may also find it
  helpful to mark your generics as contravariant. This can come up when you're using protocols, but with fully
  standalone generics it's quite rare.
- Once you've made a typevar covariant or contravariant, you won't be able to use it anywhere outside of a generic
  class, since it doesn't make sense to use such a typevar as a standalone thing (for example in a plain function
  signature). Just use the `bound` feature of a type variable instead: it defines the upper bound of the type, and any
  subtypes of that bound will be usable.
- Generics that could be covariant or contravariant, but use a typevar that doesn't specify that, can lead to a
  warning from the type-checker saying that the generic is using a typevar which could be covariant, but isn't.
  However this is just that, a warning. You are by no means required to make your generic covariant even though it can
  be, you may still have a good reason not to. If that's the case, you should however specify `covariant=False` or
  `contravariant=False` for the typevar, since that will usually satisfy the type-checker and the warning will
  disappear, because you've explicitly stated that even though this generic could be using a covariant/contravariant
  typevar, it shouldn't be and that's desired.
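
To illustrate the second note, here is a minimal sketch of a standalone function using a bound type variable instead
of a covariant one. The `first_vehicle` function is a hypothetical helper made up for this example:

```python
from typing import Sequence, TypeVar


class Vehicle: ...
class Car(Vehicle): ...


# `bound=Vehicle` means T_vehicle can be Vehicle itself or any of its subtypes
T_vehicle = TypeVar("T_vehicle", bound=Vehicle)


def first_vehicle(vehicles: Sequence[T_vehicle]) -> T_vehicle:
    # The return type follows the argument: a sequence of cars gives back a Car
    return vehicles[0]


cars = [Car(), Car()]
car = first_vehicle(cars)
```

A type checker infers `car` to be a `Car` here, not just a `Vehicle`, which is usually what you want from a
standalone function.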
## Conclusion
This was probably a lot to process at once, and you may need to read some parts more than once to really grasp these
concepts. They are very important to understand though, not just in strictly typed languages, but, as I've
demonstrated, even in languages with optional typing such as python.
In most cases, you don't really need to know how to make your own generics which aren't invariant. There certainly are
some use-cases for them though, especially if you enjoy making libraries and generally working on back-end. And even
if you're just someone who works with these libraries, knowing this can be quite helpful: though you won't often be
the one writing those generics, you'll be able to easily recognize what you're working with, which immediately gives
you an idea of how that thing works and how it's expected to be used.