diff --git a/content/posts/generics-and-variance.md b/content/posts/generics-and-variance.md
new file mode 100644
index 0000000..0f913b3
--- /dev/null
+++ b/content/posts/generics-and-variance.md
@@ -0,0 +1,549 @@
+---
+title: Typing generics and variance
+date: 2021-10-04
+tags: [programming, python, typing]
+aliases:
+  - /posts/typing-variance-of-generics
+changelog:
+  2024-09-05:
+    - Complete overhaul of the article, rewrite most things
+    - Focus more on explaining generics as a concept too, not just their variance
+    - Rename the article from 'Variance of typing generics (covariance, contravariance and invariance)' to 'Typing generics and variance'
+    - Move section about type vars into its own [article]({{< ref "posts/type-vars" >}})
+---
+
+Generics and variance are advanced concepts in Python's type system that offer powerful ways to write code that is
+flexible and reusable across different types. By understanding these concepts, you can write more robust, maintainable
+and less repetitive code, making the most of Python's type hinting capabilities.
+
+Even if you don't work in Python, the basic concepts of generics and especially their variance carry over to all kinds
+of programming languages, making them useful to understand no matter what you're coding in.
+
+_**Pre-requisites**: This article assumes that you already have a
+[basic knowledge of python typing]({{< ref "posts/python-type-checking" >}}) and
+[type-vars]({{< ref "posts/type-vars" >}})._
+
+## What are Generics
+
+Generics allow you to define functions and classes that operate on types in a flexible yet type-safe manner. They are a
+way to specify that a function or class works with multiple types, without being restricted to a single type.
+
+### Basic generic classes
+
+Essentially, when a class is generic, it just declares that something inside of it is of some dynamic type.
+A good example is a list of integers: `list[int]` (or in older python versions: `typing.List[int]`).
+We've specified that our list will be holding elements of `int` type.
+
+Generics like this can be used for many things. With a dict, for example, we actually provide 2 types: the first is the
+type of the keys and the second is the type of the values, so `dict[str, int]` is a dict with `str` keys and `int`
+values.
+
+Here's a list of some generic types that are available in python 3.12:
+
+{{< table >}}
+| Type              | Description                                         |
+|-------------------|-----------------------------------------------------|
+| list[str]         | List of `str` objects                               |
+| tuple[int, int]   | Tuple of two `int` objects (immutable)              |
+| tuple[int, ...]   | Tuple of arbitrary number of `int` (immutable)      |
+| dict[str, int]    | Dictionary with `str` keys and `int` values         |
+| Iterable[int]     | Iterable object containing ints                     |
+| Sequence[bool]    | Sequence of booleans (immutable)                    |
+| Mapping[str, int] | Mapping from `str` keys to `int` values (immutable) |
+{{< /table >}}
+
+### Custom generics
+
+In python, we can even make up our own generics with the help of `typing.Generic`:
+
+```python
+from typing import TypeVar, Generic
+
+T = TypeVar("T")
+
+
+class Person: ...
+class Student(Person): ...
+
+# If we specify a type-hint for our building like Building[Student]
+# it will mean that the `inhabitants` variable will be of type: `tuple[Student, ...]`
+class Building(Generic[T]):
+    def __init__(self, *inhabitants: T):
+        self.inhabitants = inhabitants
+
+
+people = [Person() for _ in range(10)]
+my_building: Building[Person] = Building(*people)
+
+students = [Student() for _ in range(10)]
+my_dorm: Building[Student] = Building(*students)
+
+# We now know that `my_building` will contain inhabitants of `Person` type,
+# while `my_dorm` will only have `Student`(s) as its inhabitants.
+```
+
+I'll go deeper into creating our custom generics later, after we learn the differences between covariance,
+contravariance and invariance. For now, this is just a very simple illustrative example.
+
+## Variance
+
+The concept of variance tells us whether a generic of a certain type can be assigned to a generic of another type.
+For example, variance tackles a question like: Can a value of type `Building[Student]` be assigned to a variable of
+type `Building[Person]`? Let's see the different kinds of generic variance.
+
+### Covariance
+
+The first concept of generic variance is **covariance**, the definition of which looks like this:
+
+> If a generic `G[T]` is covariant in `T` and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`. This means
+> that every value of type `G[A]` can be assigned to a variable of type `G[B]`.
+
+So, in other words, if we have a covariant generic of some type, we can assign it to the same generic parametrized
+with a supertype of that type. The generic we're assigning is then a subtype of the generic we've assigned it to.
+
+I know that this definition can sound really complicated, but it's actually not that hard, it just needs some code
+examples.
+
+#### Tuple
+
+As an example, I'll use a `tuple`, which is an immutable sequence in python. If we have a tuple of `Car` type
+(`tuple[Car, ...]`), `Car` being a subclass of `Vehicle`, can we assign this tuple to a tuple of vehicles
+(`tuple[Vehicle, ...]`)? The answer here is yes, so a tuple of cars is a subtype of a tuple of vehicles.
+
+This indicates that the generic type parameter for tuple is covariant.
+
+Let's explore this further with a proper python code example:
+
+```python
+class Vehicle: ...
+class Boat(Vehicle): ...
+class Car(Vehicle): ...
+
+my_vehicle = Vehicle()
+my_boat = Boat()
+my_car_1 = Car()
+my_car_2 = Car()
+
+
+vehicles: tuple[Vehicle, ...] = (my_vehicle, my_car_1, my_boat)
+cars: tuple[Car, ...] = (my_car_1, my_car_2)
+
+# This line assigns a variable with the type of 'tuple of cars' to a 'tuple of vehicles' type
+# this makes sense because a tuple of vehicles can hold cars, cars are vehicles
+x: tuple[Vehicle, ...] = cars
+
+# This line however tries to assign a tuple of vehicles to a tuple of cars type.
+# That however doesn't make sense because not all vehicles are cars, a tuple of
+# vehicles can also contain other non-car vehicles, such as boats. These may lack
+# some of the functionalities of cars, so a type checker would complain here
+x: tuple[Car, ...] = vehicles
+
+# In here, both of these assignments are valid because both cars and vehicles will
+# implement all of the logic that a basic `object` class needs (everything in python
+# falls under the object type). This means that this assignment is also valid
+# for a generic that's covariant.
+x: tuple[object, ...] = cars
+x: tuple[object, ...] = vehicles
+```
+
+#### Return type
+
+Another example of a covariant type would be the return value of a function. In python, the `collections.abc.Callable`
+type (or `typing.Callable`) represents a type that supports being called, for example a function.
+
+If specified like `Callable[[A, B], R]`, it denotes a function that takes in 2 parameters, first with type `A`, second
+with type `B`, and returns a type `R` (`def func(x: A, y: B) -> R`).
+
+In this case, the return type for our function is also covariant, because we can return a more specific type (subtype)
+as a return type.
+
+Consider the following:
+
+```python
+import random
+from collections.abc import Callable
+
+class Car: ...
+class WolkswagenCar(Car): ...
+class AudiCar(Car): ...
+
+def get_car() -> Car:
+    # The type of this function is Callable[[], Car]
+    # yet we can return a more specific type (a subtype) of the Car type from this function
+    r = random.randint(1, 2)
+    if r == 1:
+        return WolkswagenCar()
+    return AudiCar()
+
+def get_wolkswagen_car() -> WolkswagenCar:
+    # The type of this function is Callable[[], WolkswagenCar]
+    return WolkswagenCar()
+
+
+# In the line below, we define a callable `x` which is expected to have a type of
+# Callable[[], Car], meaning it's a function that returns a Car.
+# Here, we don't mind that the actual function will be returning a more specific
+# WolkswagenCar type, since that type is fully compatible with the less specific Car type.
+x: Callable[[], Car] = get_wolkswagen_car
+
+# However this wouldn't really make sense the other way around.
+# We can't assign a function which returns any kind of Car to a variable which is expected to
+# hold a function that's supposed to return a specific type of a car. This is because not
+# every car is a WolkswagenCar, we may get an AudiCar from this function, and that may not
+# support everything WolkswagenCar does.
+x: Callable[[], WolkswagenCar] = get_car
+```
+
+All of this probably seemed fairly trivial. Covariance is very intuitive and it's what you would assume a generic
+parameter to be in most cases.
+
+### Contravariance
+
+Another concept is known as **contravariance**. It is essentially the complete opposite of **covariance**.
+
+> If a generic `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`. This
+> means that every value of type `G[B]` can be assigned to a variable of type `G[A]`.
+
+In this case, this means that if we have a generic of some type, we can assign it to the same generic of some subtype
+(e.g. a `G[Car]` value can be assigned to a `G[AudiCar]` variable).
+
+In all likelihood, this will feel very confusing, since it isn't at all obvious when a relation like this would make
+sense.
+To answer this, let's look at the other portion of the `Callable` type, which contains the arguments to a function.
+
+```python
+from collections.abc import Callable
+
+class Car: ...
+class WolkswagenCar(Car): ...
+class AudiCar(Car): ...
+
+# The type of this function is Callable[[Car], None]
+def drive_car(car: Car) -> None:
+    car.start_engine()
+    car.drive()
+    print(f"Driving {car.__class__.__name__} car.")
+
+# The type of this function is Callable[[WolkswagenCar], None]
+def drive_wolkswagen_car(wolkswagen_car: WolkswagenCar) -> None:
+    # We need to login to our wolkswagen account with the wolkswagen ID,
+    # in order to be able to drive it.
+    wolkswagen_car.login(wolkswagen_car.wolkswagen_id)
+    drive_car(wolkswagen_car)
+
+# The type of this function is Callable[[AudiCar], None]
+def drive_audi_car(audi_car: AudiCar) -> None:
+    # All audi cars need to report back with their license plate
+    # to Audi servers before driving is enabled
+    audi_car.contact_audi(audi_car.license_plate_number)
+    drive_car(audi_car)
+
+
+# In here, we try to assign a function that takes a wolkswagen car to a variable
+# which is declared as a callable taking any car. However this is a problem,
+# because now we can call x with any car, including an AudiCar, but x is assigned
+# to a function that only works with wolkswagen cars!
+#
+# So, G[WolkswagenCar] is not a subtype of G[Car], that means this type parameter
+# isn't covariant
+x: Callable[[Car], None] = drive_wolkswagen_car
+
+# On the other hand, in this example, we're assigning a function that can take any
+# car to a variable that is defined as a callable that only takes wolkswagen cars
+# as arguments. This is fine, because x only allows us to pass in wolkswagen cars,
+# and it is set to a function which accepts any kind of car, including wolkswagen cars.
+#
+# This means that G[Car] is a subtype of G[WolkswagenCar], so this type parameter
+# is actually contravariant
+x: Callable[[WolkswagenCar], None] = drive_car
+```
+
+So from this it should be clear that the type parameters for the arguments portion of the `Callable` type aren't
+covariant and you should have a basic idea what contravariance is.
+
+To solidify this understanding a bit more, let's see contravariance again, in a slightly different scenario:
+
+```python
+import random
+from collections.abc import Callable
+
+class Library: ...
+class Book: ...
+class FantasyBook(Book): ...
+class DramaBook(Book): ...
+
+def remove_while_used(
+    func: Callable[[Library, FantasyBook], None],
+) -> Callable[[Library, FantasyBook], None]:
+    """This decorator removes a fantasy book from the library while `func` is running."""
+    def wrapper(library: Library, book: FantasyBook) -> None:
+        library.remove(book)
+        func(library, book)
+        library.add(book)
+    return wrapper
+
+
+# As we can see here, we can use the `remove_while_used` decorator with the
+# `read_book` function below, even though this decorator expects a function
+# of type: Callable[[Library, FantasyBook], None] and we're passing it
+# our function `read_book`, which has a type of
+# Callable[[Library, Book], None].
+#
+# Obviously, there's no problem with Library, it's the same type, but as for
+# the type of the book argument, the decorator will only ever call `func` with
+# fantasy books, while our `read_book` func accepts a general Book type.
+# This is fine because a FantasyBook meets all of the necessary criteria for
+# a general Book, it just includes some more special things, but `read_book`
+# won't use those anyway.
+#
+# Since this assignment is valid, it means that Callable[[Library, Book], None]
+# is a subtype of Callable[[Library, FantasyBook], None]. Stripping the unnecessary parts,
+# it means that G[Book] is a subtype of G[FantasyBook], even though Book isn't a subtype
+# of FantasyBook, but rather its supertype.
+@remove_while_used
+def read_book(library: Library, book: Book) -> None:
+    book.read()
+    my_rating = random.randint(1, 10)
+    # Rate the library
+    library.submit_rating(my_rating)
+```
+
+Hopefully, this made the concept of contravariance pretty clear. An interesting thing is that contravariance doesn't
+really come up anywhere else other than in function arguments. Even though you may see generic types with contravariant
+type parameters, they are only contravariant because those parameters are being used as function arguments in that
+generic type internally.
+
+### Invariance
+
+The last type of variance is called **invariance**, and by now you may have already figured out what it means. Simply,
+a generic is invariant in a type parameter when it's neither covariant nor contravariant in it.
+
+> If a generic `G[T]` is invariant in `T` and `A` is a subtype of `B`, then `G[A]` is neither a subtype nor a supertype
+> of `G[B]`. This means that a value of type `G[A]` can never be assigned to a variable of type `G[B]`, and
+> vice-versa.
+
+This means that an invariant generic is only assignable to a generic with exactly the same type parameter. If the type
+parameter differs, regardless of whether it's a subtype or a supertype of the original, the generic is no longer
+assignable to the original.
+
+What can be a bit surprising is that the `list` datatype is actually invariant in its element type. While an
+immutable sequence such as a `tuple` is covariant in the type of its elements, this isn't the case for mutable
+sequences. This may seem weird, but there is a good reason for it.
Let's take a look: + +```python +class Person: + def eat() -> None: ... +class Adult(Person): + def work() -> None: ... +class Child(Person): + def study() -> None: ... + + +person1 = Person() +person2 = Person() +adult1 = Adult() +adult2 = Adult() +child1 = Child() +child2 = Child() + +people: list[Person] = [person1, person2, adult2, child1] +adults: list[Adult] = [adult1, adult2] + +# At first, it is important to establish that list isn't contravariant. This is perhaps quite intuitive, but it is +# important nevertheless. In here, we tried to assign a list of people to `x` which has a type of list of children. +# This obviously can't work, because a list of people can include other types than just `Child`, and these types +# can lack some of the features that children have, meaning lists can't be contravariant. +x: list[Child] = people +``` + +Now that we've established that list type's elements aren't contravariant, let's see why it would be a bad idea to make +them covariant (like tuples). Essentially, the main difference here is the fact that a tuple is immutable, list isn't. +This means that you can add new elements to lists and alter them, but you can't do that with tuples, if you want to add +a new element there, you'd have to make a new tuple with those elements, so you wouldn't be altering an existing one. + +Why does that matter? Well let's see this in an actual example + +```python +def append_adult(adults: list[Person]) -> None: + new_adult = Adult() + adults.append(adult) + +child1 = Child() +child2 = Child() +children: list[Child] = [child1, child2] + +# This is where the covariant assignment happens, we assign a list of children +# to a list of people, `Child` being a subtype of Person`. Which would imply that +# list is covariant in the type of it's elements. A type-checker should complain +# about this line, so let's see why allowing it is a bad idea. 
+people: list[Person] = children
+
+# Since we know that `people` is a list of `Person` type elements, we can obviously
+# pass it over to `append_adult` function, which takes a list of `Person` type elements.
+# After we called this function, our list got altered. It now includes an adult, which
+# should be fine assuming list really is covariant, since this is a list of people, and
+# `Adult` type is a subtype of `Person`.
+append_adult(people)
+
+# Let's go back to the `children` list now, let's loop over the elements and do some stuff with them
+for child in children:
+    # This will work fine, all people can eat, that includes adults and children
+    child.eat()
+
+    # Only children can study, but that's not an issue, because we're working with
+    # a list of children, right? Oh wait, but we appended an Adult into `people`, which
+    # also mutated `children` (it's the same list) and Adults can't study, uh-oh ...
+    child.study()  # AttributeError, 'Adult' class doesn't have 'study' attribute
+```
+
+As we can see from this example, the reason lists can't be covariant is that it would allow us to mutate them and add
+elements of completely unrelated types that break our original list.
+
+That said, if we copied the list, re-typing it to a supertype wouldn't be an issue:
+
+```python
+class Game: ...
+class BoardGame(Game): ...
+class SportGame(Game): ...
+
+tic_tac_toe, chess, monopoly = BoardGame(), BoardGame(), BoardGame()
+volleyball = SportGame()
+
+board_games: list[BoardGame] = [tic_tac_toe, chess, monopoly]
+games: list[Game] = board_games.copy()
+games.append(volleyball)
+```
+
+This is why immutable sequences are covariant: they don't make it possible to edit the original. Instead, if a change
+is desired, a new object must be made. This is why `tuple` or other `typing.Sequence` types can be covariant, but lists
+and `typing.MutableSequence` types need to be invariant.
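
In practice, this is the reason why it's often nicer to annotate function parameters with a read-only type like
`Sequence` instead of `list`. Below is a minimal sketch of that idea. The `Person` and `Child` classes here (with a
`name` attribute) and both function names are made up purely for this illustration:

```python
from collections.abc import Sequence


class Person:
    def __init__(self, name: str) -> None:
        self.name = name


class Child(Person): ...


def names_readonly(people: Sequence[Person]) -> list[str]:
    # Sequence offers no way to add elements, so this function can't
    # sneak a non-Child person into the caller's list.
    return [person.name for person in people]


def names_mutable(people: list[Person]) -> list[str]:
    # list is mutable, so a type checker has to assume that this function
    # might append any Person, even though it doesn't actually do so.
    return [person.name for person in people]


children: list[Child] = [Child("Ann"), Child("Bob")]

# Accepted: list[Child] is a Sequence[Child], and Sequence is covariant,
# so it's also a Sequence[Person].
names = names_readonly(children)

# Rejected by a type checker (even though it would run fine):
# list is invariant, so list[Child] isn't a list[Person].
# names_mutable(children)
```

So if a function only reads from a collection, annotating the parameter as `Sequence[Person]` keeps it usable with
lists of any `Person` subtype.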
+
+### Recap
+
+- If `G[T]` is covariant in `T`, and `A` (wolkswagen car) is a subtype of `B` (car), then `G[A]` is a subtype of `G[B]`
+- If `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`
+- If `G[T]` is invariant in `T`, and `A` is a subtype of `B`, then `G[A]` and `G[B]` don't have any subtype relation
+
+## Creating Generics
+
+Now that we know what it means for a generic to have a covariant/contravariant/invariant type, we can explore how to
+make use of this knowledge and actually create some generics with these concepts in mind.
+
+### Making invariant generics
+
+```python
+from typing import TypeVar, Generic
+from collections.abc import Iterable
+
+# We don't need to specify covariant=False nor contravariant=False, these are the default
+# values (meaning all type-vars are invariant by default), I specify these parameters
+# explicitly just to showcase them.
+T = TypeVar("T", covariant=False, contravariant=False)
+
+class University(Generic[T]):
+    students: list[T]
+
+    def __init__(self, students: Iterable[T]) -> None:
+        self.students = list(students)
+
+    def add_student(self, student: T) -> None:
+        self.students.append(student)
+
+class Student: ...
+class EngineeringStudent(Student): ...
+class ComputerEngineeringStudent(EngineeringStudent): ...
+
+engineering_students = [EngineeringStudent(), EngineeringStudent()]
+
+x: University[EngineeringStudent] = University(engineering_students)
+y: University[Student] = x  # NOT VALID! University isn't covariant
+z: University[ComputerEngineeringStudent] = x  # NOT VALID! University isn't contravariant
+```
+
+In this case, our University generic type is invariant in the student type, meaning that
+if we have a `University[Student]` type and `University[EngineeringStudent]` type, neither
+is a subtype of the other.
+
+### Making covariant generics
+
+In the example below, we create a covariant `TypeVar` called `T_co`, which we then use in our custom generic. This name
+for the type-var follows a common convention for covariant type-vars, so it's a good idea to stick to it if you can.
+
+```python
+from collections.abc import Iterable
+from typing import Generic, TypeVar
+
+T_co = TypeVar("T_co", covariant=True)
+
+class Matrix(Generic[T_co]):
+    _rows: tuple[tuple[T_co, ...], ...]
+
+    def __init__(self, rows: Iterable[Iterable[T_co]]) -> None:
+        self._rows = tuple(tuple(row) for row in rows)
+
+    def __getitem__(self, index: tuple[int, int]) -> T_co:
+        # Used as `matrix[row_id, col_id]`, python passes both ids as a single tuple
+        row_id, col_id = index
+        return self._rows[row_id][col_id]
+
+    def __len__(self) -> int:
+        return len(self._rows)
+
+class X: ...
+class Y(X): ...
+class Z(Y): ...
+
+x: Matrix[Y] = Matrix([[Y(), Z()], [Z(), Y()]])
+y: Matrix[X] = x  # VALID. Matrix is covariant
+z: Matrix[Z] = x  # INVALID! Matrix isn't contravariant
+```
+
+In this case, our Matrix generic type is covariant in the element type, meaning that we can assign `Matrix[Y]` type
+to `Matrix[X]` type, with `Y` being a subtype of `X`.
+
+This works because the type-var is only used in covariant generics, in this case, with a `tuple`. If we stored the
+internal state in an invariant type, like a `list`, marking our type-var as covariant would be unsafe. Some
+type-checkers can detect and warn you if you do this, but many won't, so be cautious.
+
+### Making contravariant generics
+
+Similarly to the above, the contravariant type var we create here follows a well established naming convention,
+being called `T_contra`.
+
+```python
+from typing import TypeVar, Generic
+import pickle
+import requests
+
+T_contra = TypeVar("T_contra", contravariant=True)
+
+class Sender(Generic[T_contra]):
+    def __init__(self, url: str) -> None:
+        self.url = url
+
+    def send_request(self, val: T_contra) -> None:
+        s = pickle.dumps(val)
+        requests.post(self.url, data={"object": s})
+
+class X: ...
+class Y(X): ...
+class Z(Y): ...
+
+a: Sender[Y] = Sender("https://test.com")
+b: Sender[Z] = a  # VALID, Sender is contravariant
+c: Sender[X] = a  # INVALID, Sender isn't covariant
+```
+
+In this case, our `Sender` generic type is contravariant in its value type, meaning that
+if we have a `Sender[Y]` type and a `Sender[Z]` type, we can assign a `Sender[Y]` value
+to a `Sender[Z]` variable, which makes `Sender[Y]` its subtype.
+
+To see why the other direction is unsafe: if we had a sender of `Car` type with a `send_request` function and we were
+able to assign it to a sender of `Vehicle` type, it would suddenly allow other vehicles, such as airplanes, to be
+passed to the `send_request` function, but this function only expects the `Car` type (or its subtypes).
+
+On the other hand, if we had this generic and we tried to assign it to a sender of `AudiCar`, that's fine, because now
+all arguments passed to `send_request` function will be required to be of the `AudiCar` type, but that's a subtype of a
+general `Car` and implements everything this general car would, so the function doesn't mind.
+
+This works because the type variable is only used in a contravariant position, in this case as the type of a method
+argument. This means that the logic of determining subtypes for callables will be the same for our Sender generic. Once
+again, be cautious about marking a type-var as contravariant, and make sure to only do it when it really is safe. If
+this type-var is also used in any covariant or invariant structure, it needs to be changed to an invariant type-var.
+
+## Conclusion
+
+Understanding generics and variance in Python's type system opens the door to writing more flexible, reusable, and
+type-safe code. By learning the differences between covariance, contravariance, and invariance, you can design better
+abstractions and APIs that handle various types in a safe manner.
+Covariant types are useful when you want to ensure that your type hierarchy flows upwards, whereas contravariant types
+allow you to express type hierarchies in reverse for certain use cases like function arguments. Invariance, meanwhile,
+helps maintain strict type safety in mutable structures like lists.
+
+These principles of variance are not unique to Python. They are foundational concepts in many statically-typed
+languages such as Java or C#. Understanding them will not only deepen your grasp of Python's type system but
+also make it easier to work with other languages that implement similar type-checking mechanisms.
diff --git a/content/posts/python-type-checking.md b/content/posts/python-type-checking.md
new file mode 100644
index 0000000..440b0a9
--- /dev/null
+++ b/content/posts/python-type-checking.md
@@ -0,0 +1,449 @@
+---
+title: A guide to type checking in python
+date: 2024-10-04
+tags: [programming, python, typing]
+sources:
+  - https://dev.to/decorator_factory/type-hints-in-python-tutorial-3pel
+  - https://docs.basedpyright.com/#/type-concepts
+  - https://mypy.readthedocs.io/en/stable/
+  - https://typing.readthedocs.io/en/latest/spec/special-types.html
+---
+
+Python is often known for its dynamic typing, which can be a drawback for those who prefer static typing due to its
+benefits in catching bugs early and enhancing editor support. However, what many people don't know is that Python does
+actually support specifying the types and it is even possible to enforce these types and work in a statically
+type-checked Python environment. This article is an introduction to using Python in this way.
+
+## Regular python
+
+In regular python, you might end up writing a function like this:
+
+```python
+def add(x, y):
+    return x + y
+```
+
+In this code, you have no idea what the type of `x` and `y` arguments should be.
+So, even though you may have intended for this function to only work with numbers (ints), it's actually entirely
+possible to use it with something else. For example, running `add("hello", "world")` will return `"helloworld"`
+because the `+` operator works on strings too.
+
+The point is, there's nothing telling you what the type of these parameters should be, and that could lead to
+misunderstandings. Even though in some cases, you can judge what the type of these variables should be just based on
+the name of that function, in most cases, it's not that easy to figure out and often requires looking through docs, or
+just going over the code of that function.
+
+Annoyingly, python doesn't even prevent you from passing in types that are definitely incorrect, like: `add(1, "hi")`.
+Running this would cause a `TypeError`, but unless you have unit-tests that actually run that code, you won't find out
+about this bug until it actually causes an issue and at that point, it might already be too late, since your code has
+crashed a production app.
+
+Clearly then, this isn't ideal.
+
+## Type-hints
+
+While python doesn't require it, it does have support for specifying "hints" that indicate what type a given
+variable should have. So, when we take a look at the function above, adding type-hints to it would look like this:
+
+```python
+def add(x: int, y: int) -> int:
+    return x + y
+```
+
+We've now made the types very explicit to the programmer, which means they'll no longer need to spend a bunch of time
+looking through the implementation of that function, or going through the documentation just to know how to use this
+function. Instead, the type hints will just tell you.
+
+This is incredibly useful, because most editors will be able to pick up these type hints, and show them to you while
+calling the function, so you know what to pass right away, without even having to look at the function definition where
+the type-hints are defined.
+
+Not only that, specifying a type-hint will greatly improve the development experience in your editor / IDE, because
+you'll get much better auto-completion. The thing is, if you have a parameter like `x`, but your editor doesn't know
+what type it should have, it can't really help you if you start typing `x.remove`, looking for the `removeprefix`
+method. However, if you tell your editor that `x` is a string (`x: str`), it will now be able to go through all of
+the methods that strings have, and show you those that start with `remove` (being `removeprefix` and `removesuffix`).
+
+This makes type-hints great at saving you time while developing, even though you have to do some additional work when
+specifying them.
+
+## Run-time behavior
+
+Even though type-hints are a part of the Python language, the Python interpreter doesn't actually care about them. That
+means that there aren't any optimizations or checks performed when you're running your code, so even with type hints
+specified, they will not be enforced! This means that you can actually just choose to ignore them, and call the
+function with incorrect types, like: `add(1, "hi")`, without the interpreter rejecting the call up-front.
+
+Most editors are configured very loosely when it comes to type-hints. That means they will show you these hints when
+you're working with the function, but they won't produce warnings. That's why they're called "type hints", they're only
+hints that can help you out, but they aren't actually enforced.
+
+## Static type checking tools
+
+Even though python on its own indeed doesn't enforce the type-hints you specify, there are tools that can run static
+checks against your code to check for type correctness.
+
+{{< notice tip >}}
+A static check is a check that works with your code in its textual form. It will read the contents of your python
+files without actually running them and analyze the code purely based on that text content.
+{{< /notice >}}
+
+Using these tools will allow you to analyze your code for typing mistakes before you ever even run your program. That
+means having a function call like `add(1, "hi")` anywhere in your code would be detected and reported as an issue. This
+is very similar to running a linter like [`flake8`](https://flake8.pycqa.org/en/latest/) or
+[`ruff`](https://docs.astral.sh/ruff/).
+
+Since running the type-checker manually could be quite annoying, most of them have integrations with editors / IDEs,
+which will allow you to see these errors immediately as you code. This makes it much easier to immediately notice any
+type inconsistencies, which can help you catch or avoid a whole bunch of bugs.
+
+### Most commonly used type checkers
+
+- [**Pyright**](https://github.com/microsoft/pyright): Known for its speed and powerful features, it's written in
+  TypeScript and maintained by Microsoft.
+- [**MyPy**](https://mypy.readthedocs.io/en/stable/): The most widely used type-checker, developed by the official
+  Python community. It's well integrated with most IDEs and tools, but it's known to be slow to adopt new features.
+- [**PyType**](https://google.github.io/pytype/): Focuses on automatic type inference, making it suitable for codebases
+  with minimal type annotations.
+- [**BasedPyright**](https://docs.basedpyright.com/): A fork of pyright with some additional features and enhancements,
+  my personal preference.
+
+## When to use type hints?
+
+Like you saw before with the `add` function, you can specify type-hints on functions, which allows you to describe what
+types can be passed as parameters of that function, along with specifying a return-type:
+
+```python
+def add(x: int, y: int) -> int:
+    ...
+```
+
+You can also add type-hints directly to variables:
+
+```python
+my_variable: str = "hello"
+```
+
+That said, doing this is usually not necessary, since most type-checkers can "infer" what the type of `my_variable`
+should be, based on the value it's set to have. However, in some cases, it can be worth adding the annotation, as the
+inference might not be sufficient. Let's consider the following example:
+
+```python
+my_list = []
+```
+
+In here, a type-checker can infer that this is a `list`, but it can't recognize what kind of elements this list will
+contain. That makes it worth it to specify a more specific type:
+
+```python
+my_list: list[int] = []
+```
+
+Now the type-checker will recognize that the elements inside of this list will be integers.
+
+## Special types
+
+While in most cases, it's fairly easy to annotate something with the usual types, like `int`, `str`, `list`, `set`, ...
+in some cases, you might need some special types to represent certain types.
+
+### None
+
+This isn't very special at all, but it may be surprising for beginners at first. You've probably seen the `None` value
+in python before, but what you may not realize is that if you don't add any return statements into your function, it
+will automatically return `None`. That means if your function doesn't return anything, you should annotate it
+as returning `None`:
+
+```python
+def my_func() -> None:
+    print("I'm a simple function, I just print something, but I don't explicitly return anything")
+
+
+x = my_func()
+assert x is None
+```
+
+### Union
+
+A union type is a way to specify that a type can be one of multiple specified types, allowing flexibility while still
+enforcing type safety.
+
+There are multiple ways to specify a Union type.
In modern versions of python (3.10+), you can do it like so: + +```python +x: int | str = "string" +``` + +If you need to support older python versions, you can also use `typing.Union`, like so: + +```python +from typing import Union + +x: Union[int, str] = "string" +``` + +As an example, this function takes a value that can be of various types, and parses it into a bool: + +```python +def parse_bool_setting(value: str | int | bool) -> bool: + if isinstance(value, bool): + return value + + if isinstance(value, int): + if value == 0: + return False + if value == 1: + return True + raise ValueError(f"Value {value} can't be converted to boolean") + + # value can only be str now + if value.lower() in {"yes", "1", "true"}: + return True + if value.lower() in {"no", "0", "false"}: + return False + raise ValueError(f"Value {value} can't be converted to boolean") +``` + +One cool thing to notice here is that after the `isinstance` check, the type-checker will narrow down the type, so that +when inside of the block, it knows what type `value` has, but also outside of the block, the type-checker can narrow +the entire union and remove one of the variants since it was already handled. That's why at the end, we didn't need the +last `isinstance` check: the type checker knew the value was a string, because all the other options were already +handled. + +### Any + +In some cases, you might want to specify that your function can take in any type. This can be useful when annotating a +specific type could be way too complex / impossible, or you're working with something dynamic where you just don't care +about the typing information. + +```python +from typing import Any + +def foo(x: Any) -> None: + # a type checker won't warn you about accessing unknown attributes on Any types, + # it will just blindly allow anything + print(x.foobar) +``` + +{{< notice warning >}} +Don't over-use `Any` though; in the vast majority of cases, it is not the right choice.
I will touch more on it in the +section below, on using the `object` type. +{{< /notice >}} + +The most appropriate use for the `Any` type is when you're returning some dynamic value from a function, where the +developer can confidently know what the type will be, but which is impossible for the type-checker to figure out, +because of the dynamic nature. For example: + +```python +from typing import Any + +global_state = {} + +def get_state_variable(name: str) -> Any: + return global_state[name] + + +global_state["name"] = "Ian" +global_state["surname"] = "McKellen" +global_state["age"] = 85 + + +### + + +# Notice that we specified the annotation here manually, so that the type-checker will know +# what type we're working with. But we only know this type because we know what we stored in +# our dynamic state, so the function itself can't know what type to give us +full_name: str = get_state_variable("name") + " " + get_state_variable("surname") +``` + +### object + +In many cases where you don't care about what type is passed in, people mistakenly use `typing.Any` when they should +use `object` instead. `object` is a class that every other class subclasses. That means every value is an `object`. + +The difference between doing `x: object` and `x: Any` is that with `Any`, the type-checker will essentially avoid +performing any checks whatsoever. That will mean that you can do whatever you want with such a variable, like access an +attribute that might not exist (`y = x.foobar`) and since the type-checker doesn't know about it, `y` will now also be +considered `Any`. With `object`, even though you can still assign any value to such a variable, the type checker +will now only allow you to access attributes that are shared by all objects in python. That way, you can make sure that +you don't do something that not all types support, when your function is expected to work with all types.
+ +For example: + +```python +def do_stuff(x: object) -> None: + print(f"The do_stuff function is now working with: {x}") + + if isinstance(x, str): + # We can still narrow the type down to a more specific type, now the type-checker + # knows `x` is a string, and we can do some more things, that strings support, like: + print(x.removeprefix("hello")) + + if x > 5: # A type-checker will mark this as an error, because not all types support comparison against ints + print("It's bigger than 5") +``` + +### Collection types + +Python also provides some types to represent various collections. We've already seen the built-in `list` collection +type before. Other built-in collection types are `tuple`, `set`, `frozenset` and `dict`. All of these types are +what we call "generic", which means that we can specify an internal type, which in this case represents the items that +these collections can hold, like `list[int]`. + +Here's a quick example of using these generic collection types: + +```python +def print_items(lst: list[str]) -> None: + for index, item in enumerate(lst): + # The type-checker knows `item` variable is a string now + print(f"-> Item #{index}: {item.strip()}") + +print_items(["a", "b", "c"]) +``` + +That said, in many cases, instead of using these specific collection types, you can use a less specific collection, so +that your function will work with multiple kinds of collections. Python has abstract classes for general collections +inside of the `collections.abc` module.
One example would be the `Sequence` type: + +```python +from collections.abc import Sequence + +def print_items2(lst: Sequence[str]) -> None: + for index, item in enumerate(lst): + # The type-checker knows `item` variable is a string now + print(f"Item #{index}: {item.strip()}") + +print_items(["a", "b", "c"]) # fine +print_items(("a", "b", "c")) # nope, a tuple is not a list + +print_items2(["a", "b", "c"]) # works +print_items2(("a", "b", "c")) # works +print_items2({"a", "b", "c"}) # nope, a set is not a Sequence (no ordering or indexing) +``` + +You may think that you could also just use a union like: `list[str] | tuple[str, ...]`, however that still +wouldn't quite cover everything, since people can actually make their own custom classes that implement the sequence +interface (for example by subclassing `collections.abc.Sequence`), yet don't inherit from `list` or any of the other built-in types. By specifying +the `collections.abc.Sequence` type-hint, even these custom sequence classes will work with your function. + +There are various other collections classes like these and it would take pretty long to explain them all here, so you +should do some research on them on your own to know what's available. + +{{< notice warning >}} +It is important to note that the built-in collection types like `list` weren't subscriptable in earlier versions of +python (before 3.9). If you still need to maintain compatibility with such older python versions, you can instead use +`typing.List`, `typing.Tuple`, `typing.Set` and `typing.Dict`. These types will support being subscripted even in those +older versions. + +Similarly, this also applies to the `collections.abc` abstract types, like `Sequence`, which also weren't subscriptable +in these python versions. These also have alternatives in the `typing` module: `typing.Sequence`, `typing.Mapping`, +`typing.MutableSequence`, `typing.Iterable`, ... +{{< /notice >}} + +#### Tuple type + +Python tuples are a bit more complicated than the other collection types, since we can specify which type is at which +position of the tuple.
For example: `tuple[int, str, float]` will represent a tuple like: `(1, "hi", 5.3)`. The tricky +thing here is that specifying `tuple[int]` will not mean a tuple of integers, it will mean a tuple with a single +integer: `(1, )`. If you do need to specify a tuple with any amount of items of the same type, what you actually need +to do is: `tuple[int, ...]`. This annotation will work for `(1, )` or `(1, 1, 1)` or `(1, 1, 1, 1, 1)`. + +The reason for this is that we often use tuples to allow returning multiple values from a function. Yet these values +usually don't have the same type, so it's very useful to be able to specify these types individually: + +```python +def some_func() -> tuple[int, str]: + return 1, "hello" +``` + +That said, a tuple can also be useful as a sequence type, with the major difference between it and a list being that +tuples are immutable. This can make them more appropriate for storing certain sequences than lists. + +## Type casts + +Casting is a way to explicitly specify the type of a variable, overriding the type inferred by the type-checker. + +This can be very useful, as sometimes, we programmers have more information than the type-checker does, especially when +it comes to some dynamic logic that is hard to statically evaluate. The type checker's inference may end up being too +broad or sometimes even incorrect. + +For example: + +```python +from typing import cast + +my_list: list[str | int] = [] +my_list.append("Foo") +my_list.append(10) +my_list.append("Bar") + +# We know that the first item in the list is a string +# the type-checker would otherwise infer `x: str | int` +x = cast(str, my_list[0]) +``` + +Another example: + +```python +from typing import cast + +def foo(obj: object, type_name: str) -> None: + if type_name == "int": + obj = cast(int, obj) + ... # some logic + elif type_name == "str": + obj = cast(str, obj) + ... 
# some logic + else: + raise ValueError(f"Unknown type name: {type_name}") +``` + +{{< notice warning >}} +It is important to mention that unlike the casts in languages like Java or C#, in Python, type casts do not perform any +runtime checks to ensure that the variable really is what we claim it to be. Casts are only used as a hint to the +type-checker, and at runtime, the `cast` function just returns the value back without any extra logic. + +If you do wish to also perform a runtime check, you can use assertions to narrow the type: + +```python +def foo(obj: object) -> None: + print(obj + 1) # can't add 'object' and 'int' + assert isinstance(obj, int) + print(obj + 1) # works +``` + +Alternatively, you can just check with if statements: + +```python +def foo(obj: object) -> None: + print(obj + 1) # can't add 'object' and 'int' + if not isinstance(obj, int): + raise TypeError("Expected int") + print(obj + 1) # works +``` + +{{< /notice >}} + +## Closing notes + +In summary, Python’s type hints are a powerful tool for improving code clarity, reliability, and development +experience. By adding type annotations to your functions and variables, you provide valuable information to both your +IDE and fellow developers, helping to catch potential bugs early and facilitating easier code maintenance. + +Type hints offer significant benefits: + +- Enhanced Readability: They clearly specify the expected types of function parameters and return values, making the code + more self-documenting. +- Improved Development Experience: They provide better auto-completion and in-editor type checking, helping you avoid + errors and speeding up development. +- Early Error Detection: Static type checkers can catch type-related issues before runtime, reducing the risk of bugs + making it into production.
+ +For further exploration of Python’s type hints and their applications, you can refer to additional resources such as: + +- The [Type Hinting Cheat Sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html) from mypy for a quick + reference on various type hints and their usage. +- My other articles on more advanced typing topics like [TypeVars]({{< ref "posts/type-vars" >}}) and [Generics]({{< ref + "posts/generics-and-variance" >}}) for deeper insights into Python's typing system. + +Embracing type hints can elevate your Python programming experience, making your code more robust and maintainable in +the long run. diff --git a/content/posts/type-vars.md b/content/posts/type-vars.md new file mode 100644 index 0000000..76b5a16 --- /dev/null +++ b/content/posts/type-vars.md @@ -0,0 +1,268 @@ +--- +title: Type variables in python typing +date: 2024-10-04 +tags: [programming, python, typing] +sources: + - https://mypy.readthedocs.io/en/stable/generics.html#generic-functions + - https://docs.basedpyright.com/#/type-concepts-advanced?id=value-constrained-type-variables + - https://dev.to/decorator_factory/typevars-explained-hmo + - https://peps.python.org/pep-0695/ + - https://typing.readthedocs.io/en/latest/spec/generics.html +--- + +Python’s type hinting system offers great flexibility, and a crucial part of that flexibility comes from **Type +Variables**. These allow us to define [generic types]({{< ref "posts/generics-and-variance" >}}), enabling us to write +functions and classes that work with different types while maintaining type safety. Let’s dive into what type variables +are, how to use them effectively, and why they are useful. + +_**Pre-requisites**: This article assumes that you already have a [basic knowledge of python typing]({{< ref +"posts/python-type-checking" >}})._ + +## What's a Type Variable + +A type variable (or a `TypeVar`) is basically representing a variable type. 
It essentially acts as a placeholder for a +specific type within a function or a class. Instead of locking down a function to operate on a specific type, type +variables allow it to adapt to whatever type is provided. + +For example: + +```python +from typing import TypeVar + +T = TypeVar("T") + +def identity(item: T) -> T: + """ + Return the same item that was passed in, without modifying it. + The type of the returned item will be the same as the input type. + """ + return item +``` + +In this example, `T` is a type variable, meaning that the function `identity` can take any type of argument and will return +an object of that same type. If you pass an integer, it returns an integer; if you pass a string, it returns a string. +The function adapts to the type of input while preserving the type in the output. + +```python +identity(5) # Returns 5 (int) +identity("hello") # Returns "hello" (str) +identity([1, 2, 3]) # Returns [1, 2, 3] (list[int]) +``` + +Whenever the function is called, the type-var gets "bound" to the type used in that call, which allows the type checker +to enforce type consistency across the function with this bound type. + +## Type Variables with Upper Bounds + +You can also restrict a type variable to only types that are subtypes of a specific type by using the `bound` argument. +This is useful when you want to ensure that the type variable is always a subclass of a particular type. + +```python +from typing import TypeVar +from collections.abc import Sequence + +T = TypeVar("T", bound=Sequence) + +def split_sequence(seq: T, chunks: int) -> list[T]: + """ Split a given sequence into n equally sized chunks of itself. + + If the sequence can't be evenly split, the last chunk will contain + the additional elements.
+ """ + new = [] + chunk_size = len(seq) // chunks + for i in range(chunks): + start = i * chunk_size + end = i * chunk_size + chunk_size + if i == chunks - 1: + # On last chunk, include all remaining elements + new.append(seq[start:]) + else: + new.append(seq[start:end]) + return new +``` + +In this example, `T` is bounded by `Sequence`, so `split_sequence` can work with any type of sequence, such as lists or +tuples. The return type will be a list with elements being slices of the original sequence, so the list items will +match the type of the input sequence, preserving it. + +If you pass a `list[int]`, you'll get a `list[list[int]]`, and if you pass a `tuple[str]`, you'll get a +`list[tuple[str]]`. + +## Type Variables with Specific Type Restrictions + +Type variables can also be restricted to specific types, which can be useful when you want to enforce that a type +variable can only be one of a predefined set of types. + +One common example is `AnyStr`, which can be either `str` or `bytes`. In fact, this type is so common that the `typing` +module actually contains it directly (`typing.AnyStr`). Here is an example of how to define this type-var: + +```python +from typing import TypeVar + +AnyStr = TypeVar("AnyStr", str, bytes) + + +def concat(x: AnyStr, y: AnyStr) -> AnyStr: + return x + y + +concat("a", "b") # valid +concat(b"a", b"b") # valid +concat(1, 2) # error +``` + +**Why not just use `Union[str, bytes]`?** + +You might wonder why we don’t just use a union type, like this: + +```python +from typing import Union + +def concat(x: Union[str, bytes], y: Union[str, bytes]) -> Union[str, bytes]: + return x + y +``` + +While this might seem similar, the key difference lies in type consistency. A Union would allow one of the variables to +be `str` while the other is `bytes`, however, combining these isn't possible, meaning this would break at runtime! + +```python +concat(b"a", "b") # No type error, but implementation fails! 
+``` + +**How about `TypeVar("T", bound=Union[str, bytes])`?** + +This actually results in the same issue. The thing is, type-checkers are fairly smart and if you call `concat(b"a", +"b")`, they will use the narrowest type from that top `Union[str, bytes]` type bound when binding the type var. This +narrowest type will end up being the union itself, so the type-var will essentially become the union type, leaving you +with the same issue. + +For that reason, it can sometimes be useful to use specific type restrictions with a type-var, rather than just binding +it to some common top level type. + +## New TypeVar syntax + +In python 3.12, it's now possible to use a new, much more convenient syntax for generics, which looks like this: + +```python +def identity[T](x: T) -> T: + return x +``` + +To specify a bound for a type var like this, you can do: + +```python +def car_identity[T: Car](car: T) -> T: + return car +``` + +This syntax also works for generic classes: + +```python +class Foo[T]: + def __init__(self, x: T): + self.x = x +``` + +## TypeVarTuple + +A `TypeVarTuple` is defined similarly to a regular `TypeVar`, but it is used to represent a variable-length tuple. This +can be useful when you want to work with functions or classes that need to preserve a certain shape of tuples, or +modify it in a type-safe manner. + +```python +from typing import TypeVar, TypeVarTuple, reveal_type + +T = TypeVar("T") +Ts = TypeVarTuple("Ts") + +def tuple_append(tup: tuple[*Ts], val: T) -> tuple[*Ts, T]: + return (*tup, val) + +x = (2, "hi", 0.8) +y = tuple_append(x, 10) +reveal_type(y) # tuple[int, str, float, int] +``` + +Or with the new 3.12 syntax: + +```python +from typing import cast, reveal_type + +def tuple_sum[*Ts](*tuples: tuple[*Ts]) -> tuple[*Ts]: + summed = tuple(sum(tup) for tup in zip(*tuples)) + reveal_type(summed) # tuple[int, ...]
+ # The type checker only knows that the sum function returns an int here, but this is way too dynamic + # for it to understand that summed will end up being tuple[*Ts]. For that reason, we can use a cast + return cast(tuple[*Ts], summed) + +x = (10, 15, 20.0) +y = (5, 10, 15.0) +z = (1, 2, 3.2) +res = tuple_sum(x, y, z) +print(res) # (16, 27, 38.2) +reveal_type(res) # tuple[int, int, float] +``` + +## ParamSpec + +In addition to `TypeVar`, Python 3.10 introduced `ParamSpec` for handling type variables related to function parameters. +Essentially, a `ParamSpec` is kind of like having multiple type-vars for all of your parameters, but stored in a single +place. It is mainly useful in function decorators: + +```python +from typing import TypeVar, ParamSpec +from collections.abc import Callable + +P = ParamSpec('P') +R = TypeVar('R') + +def decorator(func: Callable[P, R]) -> Callable[P, R]: + def wrapper(*args: P.args, **kwargs: P.kwargs) -> R: + print("Before function call") + result = func(*args, **kwargs) + print("After function call") + return result + return wrapper + +@decorator +def say_hello(name: str) -> str: + return f"Hello, {name}!" + +print(say_hello("Alice")) +print(say_hello(55)) # error: 'int' type can't be assigned to parameter name of type 'str' +``` + +In this example, the `ParamSpec` is able to fully preserve the input parameters of the decorated function, just like +the `TypeVar` here preserves the single return type parameter. + +With the new 3.12 syntax, `ParamSpec` can also be specified like this: + +```python +from collections.abc import Callable + +def decorator[**P, R](func: Callable[P, R]) -> Callable[P, R]: + ... 
+``` + +### Concatenate + +In many cases, `ParamSpec` is used in combination with `typing.Concatenate`, which can allow for consuming or adding +function parameters, for example by specifying: `Callable[Concatenate[int, P], str]` we limit our decorator to only +accept functions that take an int as the first argument and return a string. This also allows the decorator to remove +that argument after decoration, by specifying the return type as `Callable[P, str]`: + +```python +from typing import ParamSpec, Concatenate +from collections.abc import Callable + +P = ParamSpec('P') + +def id_5(func: Callable[Concatenate[int, P], str]) -> Callable[P, str]: + def wrapper(*args: P.args, **kwargs: P.kwargs) -> str: + return func(5, *args, **kwargs) + return wrapper + +@id_5 +def log_event(id: int, event: str) -> str: + return f"Got an event on {id=}: {event}!" +``` diff --git a/content/posts/typing-variance-of-generics.md b/content/posts/typing-variance-of-generics.md deleted file mode 100644 index 086e55d..0000000 --- a/content/posts/typing-variance-of-generics.md +++ /dev/null @@ -1,670 +0,0 @@ ---- -title: Variance of typing generics (covariance, contravariance and invariance) -date: 2021-10-04 -tags: [programming, python] ---- - -In many programming languages where typing matters we often need to define certain properties for the types of generics -so that they can work properly. Specifically, when we use a generic type of some typevar `X` we need to know when that -generic type with typevar `Y` should be treated as it's subtype. I know this probably sounds pretty confusing but don't -worry, I'll explain what that sentence means in quite a lot of detail here. (That's why I wrote a whole article about -it). It's actually not that difficult to understand, it just needs a few examples to explain it. 
- -As a very quick example of what I mean: When we use a sequence of certain types, say a sequence containing elements of -type Shirt that is a subtype of a Clothing type, can we assign this sequence as having a type of sequence of clothing -elements? If yes, than this sequence would be covariant in it's elements type. What about a sequence of Clothing -elements? Can we assign this sequence as having a type of a sequence of Shirts? If yes, then this sequence generic -would be contravariant in it's elements type. Or, if the answer to both of these was no, then the sequence is -invariant. - -For simplicity, I'll be using python in the examples. Even though python isn't a strictly typed language, because of -tools such as pyright, mypy or many others, python does have optional support for typing that can be checked for -outside of run time (it's basically like strictly typed languages that check this on compile time, except in python, -it's optional and doesn't actually occur on compilation, so we say that it occurs "on typing time" or "linting time"). - -Do note that this post is a bit more advanced than the other ones I made and if you don't already feel comfortable with -basic typing concepts in python, it may not be very clear what's going on in here so I'd suggest learning something -about them before reading this. - -## Pre-conceptions - -This section includes some explanation of certain concepts that I'll be using in later the article, if you already know -what these are, you can skip them, however if you don't it is crucial that you read through this to understand the rest -of this article. I'll go through these concepts briefly, but it should be sufficient to understand the rest of this -article. If you do want to know more though, I'd suggest looking at mypy documentation or python documentation. - -### Type Variables - -A type variable (or a TypeVar) is basically representing a variable type. 
What this means is that we can have a -function that takes a variable of type T (which is our TypeVar) and returns the type T. Something like this will mean -that we return an object of the same type as the object that was given to the function. - -```python -from typing import TypeVar, Any - -T = TypeVar("T") - - -def set_a(obj: T, a_value: Any) -> T: - """ - Set the value of 'a' attribute for given `obj` of any type to given `a_value` - Return the same object after this adjustment was made. - """ - obj.a = a_value - # Note that this function probably doesn't really need to return this - # because `obj` is obviously mutable since we were able to set the it's value to something - # that wasn't previously there - return obj -``` - -If you've understood this example, you can move onto the next section, however if you want to know something extra -about these type variables or you didn't quite understand everything, I've included some more subsections about them -with more examples on some interesting things that you can do with them. - -#### Type variables with value restriction - -By default, a type variable can be replaced by any type. This is usually what we want, but sometimes it does make sense -to restrict a TypeVar to only certain types. - -A commonly used variable with such restrictions is `typing.AnyStr`. This typevar can only have values `str` and -`bytes`. - -```python -from typing import TypeVar - -AnyStr = TypeVar("AnyStr", str, bytes) - - -def concat(x: AnyStr, y: AnyStr) -> AnyStr: - return x + y - -concat("a", "b") -concat(b"a", b"b) -concat(1, 2) # Error! -``` - -This is very different from just using a simple `Union[str, bytes]`: - -```python -from typing import Union - -UnionAnyStr = Union[str, bytes] - -def concat(x: UnionAnyStr, y: UnionAnyStr) -> UnionAnyStr: - return x + y -``` - -Because in this case, if we pass in 2 strings, we don't know whether we will get a `str` object back, or a `bytes` one. 
-It would also allow us to use `concat("x", b"y")` however we don't know how to concatenate string object with bytes. -With a TypeVar, the type checker will reject something like this, but with a simple Union, this would be treated as -a valid function call and the argument types would be marked as correct, even though the implementation will fail. - -#### Type variable with upper bounds - -We can also restrict a type variable to having values that are a subtype of a specific type. This specific type is -called the upper bound of the type variable. - -```python -from typing import TypeVar, Sequence - -T = TypeVar("T", bound=Sequence) - -# Signify that the return type of this function will be the list containing -# sequences of the same type sequence as the type we got from the argument -def split_sequence(seq: T, chunks: int) -> list[T]: - """ - Split a given sequence into n equally sized chunks of itself. - - If the sequence can't be evenly split, the last chunk will contain - the additional elements. - """ - new = [] - chunk_size = len(seq) // chunks - for i in range(chunks): - start = i * chunk_size - end = i * chunk_size + chunk_size - if i == chunks - 1: - # On last chunk, include all remaining elements - new.append(seq[start:]) - else: - new.append(seq[start:end]) - return new -``` - -In here, we know that this function function will work for any type of sequence, however just using input argument type -of sequence wouldn't be ideal, because it wouldn't preserve that type when returning a list of chunks of those -sequences. With that kind of approach, we'd lost the type definition of our sequence from for example `list[int]` only to -`Sequence[object]`. - -For that reason, we can use a type-var, in which we can enforce that the type must be a sequence, but we still don't -know what kind of sequence it may be, so it can be any subtype that implements the necessary functions for a sequence. 
-This means if we pass in a list, we know we will get back a list of lists, if we pass a tuple, we'll get a list of -tuples, and if we pass a list of integers, we'll get a list of lists of integers. This means the original type won't be -lost even after going through a function. - -### Generic Types - -Essentially when a class is generic, it just defines that something inside of our generic type is of some other type. A -good example would be for example a list of integers: `list[int]` (or in older python versions: `typing.List[int]`). -We've specified that our list will be holding elements of `int` type. - -Generics like this can be used for many things, for example with a dict, we actually provide 2 types, first is the type -of the keys and second is the type of the values: `dict[str, int]` would be a dict with `str` keys and `int` values. - -Here's a list of some definable generic types that are currently present in python 3.9: - -{{< table >}} -| Type | Description | -|-------------------|-----------------------------------------------------| -| list[str] | List of `str` objects | -| tuple[int, int] | Tuple of two `int` objects | -| tuple[int, ...] | Tuple of arbitrary number of `int` | -| dict[str, int] | Dictionary with `str` keys and `int` values | -| Iterable[int] | Iterable object containing ints | -| Sequence[bool] | Sequence of booleans (immutable) | -| Mapping[str, int] | Mapping from `str` keys to `int` values (immutable) | -{{< /table >}} - -In python, we can even make up our own generics with the help of `typing.Generic`: - -```python -from typing import TypeVar, Generic - -T = TypeVar("T") - -# If we specify a type-hint for our building like Building[Student] -# it will mean that the `inhabitants` variable will be a of type: `list[Student]` -class Building(Generic[T]): - def __init__(self, *inhabitants: T): - self.inhabitants = inhabitants - -class Person: ... -class Student(Person): ... 
- -people = [Person() for _ in range(10)] -my_building: Building[Person] = Building(*people) - -students = [Student() for _ in range(10)] -my_dorm = Building[Student] = Building(*students) - -# We now know that `my_building` will contain inhabitants of `Person` type, -# while `my_dorm` will only have `Student`(s) as it's inhabitants. -``` - -I'll go deeper into creating our custom generics later, after we learn the differences between covariance, -contravariance and invariance. For now, this is just a very simple illustrative example. - -## Variance - -As I've quickly explained in the start, the concept of variance tells us about whether a generic of certain type can be -assigned to a generic of another type. But I won't bother with trying to define variance more meaningfully since the -definition would be convoluted and you probably wouldn't really get what is it about until you'll see the examples of -different types of variances. So for that reason, let's just take a look at those. - -### Covariance - -The first concept of generic variance is **covariance**, the definition of which looks like this: - -> If a generic `G[T]` is covariant in `T` and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`. This means -> that every variable of `G[A]` type can be assigned as having the `G[B]` type. - -As I've very quickly explained initially, covariance is a concept where if we have a generic of some type, we can -assign it to a generic type of some supertype of that type. This means that the actual generic type is a subtype of -this new generic which we've assigned it to. - -I know that this definition can sound really complicated, but it's actually not that hard. As an example, I'll use a `tuple`, -which is an immutable sequence in python. If we have a tuple of `Car` type, `Car` being a subclass of `Vehicle`, can we -assign this tuple a type of tuple of Vehicles? 
The answer here is yes, because every `Car` is a `Vehicle`, so a -tuple of cars is a subtype of tuple of vehicles. So is a tuple of objects, `object` being the basic class that -pretty much everything has in python, so both tuple of cars, and tuple of vehicles is a subtype of tuple of objects, -and we can assign those tuples to a this tuple of objects. - -```python -from typing import Tuple - -class Vehicle: ... -class Boat(Vehicle): ... -class Car(Vehicle): ... - -my_vehicle = Vehicle() -my_boat = Boat() -my_car_1 = Car() -my_car_2 = Car() - - -vehicles: Tuple[Vehicle, ...] = (my_vehicle, my_car_1, my_boat) -cars: Tuple[Car, ...] = (my_car_1, my_car_1) - -# This line assigns a variable with the type of 'tuple of cars' to a 'tuple of vehicles' type -# this makes sense because a tuple of vehicles can hold cars -# since cars are vehicles -x: Tuple[Vehicle, ...] = cars - -# This line however tries to assign a tuple of vehicles to a tuple of cars type -# this however doesn't make sense because not all vehicles are cars, a tuple of -# vehicles can also contain other non-car vehicles, such as boats. These may lack -# some of the functionalities of cars, so a type checker would complain here -x: Tuple[Car, ...] = vehicles - -# In here, both of these assignments are valid because both cars and vehicles will -# implement all of the logic that a basic `object` class needs. This means this -# assignment is also valid for a generic that's covariant. -x: Tuple[object, ...] = cars -x: Tuple[object, ...] = vehicles -``` - -Another example of a covariant type would be the return value of a function. In python, the `typing.Callable` type is -initialized like `Callable[[argument_type1, argument_type2], return_type]`. In this case, the return type for our -function is also covariant, because we can return a more specific type (subtype) as a return type. 
-This is because we don't mind treating a type with more functionality as its supertype, which has less functionality,
-since the type still has all of the functionality we want, i.e. it's fully compatible with the less specific type.
-
-```python
-import random
-from typing import Callable
-
-class Car: ...
-class WolkswagenCar(Car): ...
-class AudiCar(Car): ...
-
-def get_car() -> Car:
-    # The type of this function is Callable[[], Car]
-    r = random.randint(1, 3)
-    if r == 1:
-        return Car()
-    elif r == 2:
-        return WolkswagenCar()
-    else:
-        return AudiCar()
-
-def get_wolkswagen_car() -> WolkswagenCar:
-    # The type of this function is Callable[[], WolkswagenCar]
-    return WolkswagenCar()
-
-
-# In the line below, we define a function `x` which is expected to have a type of
-# Callable[[], Car], meaning it's a function that returns a Car.
-# Here, we don't mind that the actual function will be returning a more specific
-# WolkswagenCar type, since that type is fully compatible with the less specific Car type.
-x: Callable[[], Car] = get_wolkswagen_car
-
-# However this wouldn't really make sense the other way around.
-# We can't assign a function which returns any kind of Car to a variable which is expected to
-# hold a function that's supposed to return a specific type of a car. This is because not
-# every car is a WolkswagenCar, we may get an AudiCar from this function, and that may not
-# support everything WolkswagenCar does.
-x: Callable[[], WolkswagenCar] = get_car
-```
-
-### Contravariance
-
-Another concept is known as **contravariance**. It is essentially the complete opposite of **covariance**.
-
-> If a generic `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`. This
-> means that every variable of `G[B]` type can be assigned as having the `G[A]` type.
-
-In this case, this means that if we have a generic of some type, we can assign it to a generic type of some subtype of
-that type.
-This means that the actual generic type is a subtype of this new generic which we've assigned it to.
-
-This explanation is probably even more confusing if you only look at the definition. But even when we think about it as
-the opposite of covariance, there's a question that comes up: Why would we ever want to have something like this? When
-is it actually useful? To answer this, let's look at the other portion of the `typing.Callable` type, which contains the
-arguments to a function.
-
-```python
-from typing import Callable
-
-class Car: ...
-class WolkswagenCar(Car): ...
-class AudiCar(Car): ...
-
-# The type of this function is Callable[[Car], None]
-def drive_car(car: Car) -> None:
-    car.start_engine()
-    car.drive()
-    print(f"Driving {car.__class__.__name__} car.")
-
-# The type of this function is Callable[[WolkswagenCar], None]
-def drive_wolkswagen_car(wolkswagen_car: WolkswagenCar) -> None:
-    # We need to login to our wolkswagen account on the car first
-    # with the wolkswagen ID, in order to be able to drive it.
-    wolkswagen_car.login(wolkswagen_car.wolkswagen_id)
-    drive_car(wolkswagen_car)
-
-# The type of this function is Callable[[AudiCar], None]
-def drive_audi_car(audi_car: AudiCar) -> None:
-    # All audi cars need to report back with their license plate
-    # to Audi servers before driving is enabled
-    audi_car.contact_audi(audi_car.license_plate_number)
-    drive_car(audi_car)
-
-
-# In here, we try to assign a function that takes a wolkswagen car
-# to a variable which is defined as a function/callable which takes any regular car.
-# However this is a problem, because now we can use x with any car, including an
-# AudiCar, but x is assigned to a function that only accepts wolkswagen cars, this
-# may cause issues because not every car has the properties of a wolkswagen car,
-# which this function may need to utilize.
-x: Callable[[Car], None] = drive_wolkswagen_car
-
-# On the other hand, in this example, we're assigning a function that can
-# take any car to a variable that is defined as a function/callable that only
-# takes wolkswagen cars as arguments.
-# This is fine, because x only allows us to pass in wolkswagen cars, and it is set
-# to a function which accepts any kind of car, including wolkswagen cars.
-x: Callable[[WolkswagenCar], None] = drive_car
-```
-
-So from this it's already clear that the `Callable` type can't be covariant in its arguments, and hopefully
-you can now recognize what it means for something to be contravariant. But to reinforce this, here's one more,
-slightly different example.
-
-```python
-import random
-from typing import Callable
-
-class Library: ...
-class Book: ...
-class FantasyBook(Book): ...
-class DramaBook(Book): ...
-
-def remove_while_used(
-    func: Callable[[Library, FantasyBook], None],
-) -> Callable[[Library, FantasyBook], None]:
-    """This decorator removes a fantasy book from the library while `func` is running."""
-    def wrapper(library: Library, book: FantasyBook) -> None:
-        library.remove(book)
-        func(library, book)
-        library.add(book)
-    return wrapper
-
-
-# As we can see here, we can use the `remove_while_used` decorator with the
-# `read_book` function below. The decorator expects a function of type
-# Callable[[Library, FantasyBook], None], and we're passing it our function
-# `read_book`, which has a type of Callable[[Library, Book], None].
-#
-# Obviously, there's no problem with Library, it's the same type, but as for
-# the type of the book argument, the decorator will only ever call `func` with
-# fantasy books, while our read_book func accepts any general Book. This is fine
-# because a FantasyBook meets all of the necessary criteria for a general Book,
-# it just includes some more special things, but read_book won't use those anyway.
-#
-# Since this assignment is possible, it means that Callable[[Library, Book], None]
-# is a subtype of Callable[[Library, FantasyBook], None], not the other way around.
-# Even though Book isn't a subtype of FantasyBook, but rather its supertype.
-@remove_while_used
-def read_book(library: Library, book: Book) -> None:
-    book.read()
-    my_rating = random.randint(1, 10)
-    # Rate the library
-    library.submit_rating(my_rating)
-```
-
-This kind of behavior, where we can pass generics of less specific types (supertypes) where generics of more
-specific types are expected, means that the generic is contravariant in that type. So for callables, we can write
-that `Callable[[T], None]` is contravariant in `T`.
-
-### Invariance
-
-The last type of variance is called **invariance**, and it's certainly the easiest of these types to understand. By
-now you may have already figured out what it means: a generic is invariant in a type when it's neither
-covariant nor contravariant in it.
-
-> If a generic `G[T]` is invariant in `T` and `A` is a subtype of `B`, then `G[A]` is neither a subtype nor a supertype
-> of `G[B]`. This means that any variable of `G[A]` type can never be assigned as having the `G[B]` type, and
-> vice-versa.
-
-This means that `G[A]` is only ever compatible with exactly `G[A]`, no matter how `A` relates to other types.
-
-What can be a bit surprising is that the `list` datatype is actually invariant in its element type. While an
-immutable sequence such as a `tuple` is covariant in the type of its elements, this isn't the case for mutable
-sequences. This may seem weird, but there is a good reason for that.
-
-```python
-class Person:
-    def eat(self) -> None: ...
-class Adult(Person):
-    def work(self) -> None: ...
-class Child(Person):
-    def study(self) -> None: ...
-
-
-person1 = Person()
-person2 = Person()
-adult1 = Adult()
-adult2 = Adult()
-child1 = Child()
-child2 = Child()
-
-people: list[Person] = [person1, person2, adult2, child1]
-adults: list[Adult] = [adult1, adult2]
-
-# At first, it is important to establish that list isn't contravariant. This is perhaps quite
-# intuitive, but it is important nevertheless. In here, we tried to assign a list of people to `x`
-# which has a type of list of children. This obviously can't work, because a list of people can
-# include more types than just `Child`, and these types can lack some of the features that
-# children have, meaning lists can't be contravariant.
-x: list[Child] = people
-```
-
-Now that we've established that lists aren't contravariant in their element type, let's see why it would be a bad
-idea to make them covariant (like tuples). Essentially, the main difference here is that a tuple is immutable and
-a list isn't. This means that you can add new elements to lists and alter them, but you can't do that with tuples.
-If you want to add a new element to a tuple, you have to make a new tuple with those elements, so you aren't
-altering an existing one.
-
-Why does that matter? Well, let's see this in an actual example:
-
-```python
-def append_adult(adults: list[Person]) -> None:
-    new_adult = Adult()
-    adults.append(new_adult)
-
-child1 = Child()
-child2 = Child()
-children: list[Child] = [child1, child2]
-
-# This is where the covariant assignment happens, we assign a list of children
-# to a list of people, `Child` being a subtype of `Person`. Which would imply that
-# list is covariant in the type of its elements.
-# This is the line on which a type-checker would complain. So let's see why allowing
-# it is a bad idea.
-people: list[Person] = children
-
-
-# Since we know that `people` is a list of `Person` type elements, we can obviously
-# pass it over to `append_adult` function, which takes a list of `Person` type elements.
-# After we called this function, our list got altered. It now includes an adult, which
-# is fine since this is a list of people, and `Adult` type is a subtype of `Person`.
-# But what also happened is that the list in `children` variable got altered!
-append_adult(people)
-
-# This will work fine, all people can eat, that includes adults and children
-children[0].eat()
-
-# Only children can study, this will also work fine because the 0th element is a child,
-# after all this is a list of children, right?
-children[0].study()
-# Uh oh! This will fail, we've appended an adult to our list of children.
-# But since this is a list of `Child` type elements, we expect all elements in that list
-# to have all properties required of the `Child` type. But there's an `Adult` type element
-# in there which doesn't actually have all of the properties of a `Child`, they lack the
-# `study` method, causing an error on this line.
-children[-1].study()
-```
-
-As we can see from this example, if lists were covariant, we would be able to assign a list of some element type
-to a list with elements of a supertype of those (a parent class of our actual element class). Even though the
-element type implements every feature that the supertype would, allowing this kind of assignment could lead to
-mutations of the list where elements that don't belong get added: while they may fit the supertype requirement,
-they might not be of the original type.
-
-That said, if we copied the list, re-typing it to a supertype wouldn't be an issue:
-
-```python
-class Game: ...
-class BoardGame(Game): ...
-class SportGame(Game): ...
-
-tic_tac_toe, chess, monopoly = BoardGame(), BoardGame(), BoardGame()
-volleyball = SportGame()
-
-board_games: list[BoardGame] = [tic_tac_toe, chess, monopoly]
-# `list(...)` builds a brand new list, which the type checker can infer as list[Game].
-# (`board_games.copy()` would also copy it, but its result is still typed as list[BoardGame].)
-games: list[Game] = list(board_games)
-games.append(volleyball)
-```
-
-This is why immutable sequences are covariant: they don't make it possible to edit the original. Instead, if a
-change is desired, a new object must be made.
-This is why `tuple` or other `Sequence` types don't need to be copied when doing an assignment like this, but
-`MutableSequence` types do.
-
-### Recap
-
-- if `G[T]` is covariant in `T`, and `A` is a subtype of `B`, then `G[A]` is a subtype of `G[B]`
-- if `G[T]` is contravariant in `T`, and `A` is a subtype of `B`, then `G[B]` is a subtype of `G[A]`
-- if `G[T]` is invariant in `T` (the default), and `A` is a subtype of `B`, then `G[A]` and `G[B]` don't have any
-  subtype relation
-
-## Creating Generics
-
-Now that we know what it means for a generic to have a covariant/contravariant/invariant type, we can explore how to
-make use of this knowledge and actually create some generics with these concepts in mind.
-
-**Making invariant generics:**
-
-```python
-from typing import TypeVar, Generic, List, Iterable
-
-class Student: ...
-class EngineeringStudent(Student): ...
-class ComputerEngineeringStudent(EngineeringStudent): ...
-
-# We don't need to specify covariant=False nor contravariant=False, these are the default
-# values, I do this here only to explicitly show that this typevar is invariant
-T = TypeVar("T", covariant=False, contravariant=False)
-
-class University(Generic[T]):
-    students: List[T]
-
-    def __init__(self, students: Iterable[T]) -> None:
-        self.students = [s for s in students]
-
-    def add_student(self, student: T) -> None:
-        self.students.append(student)
-
-engineering_students = [EngineeringStudent() for _ in range(10)]
-
-x: University[EngineeringStudent] = University(engineering_students)
-y: University[Student] = x  # NOT VALID! University isn't covariant
-z: University[ComputerEngineeringStudent] = x  # NOT VALID! University isn't contravariant
-```
-
-In this case, our University generic type is invariant in the student type, meaning that
-if we have a `University[Student]` type and `University[EngineeringStudent]` type, neither
-is a subtype of the other.
-
-**Making covariant generics:**
-
-In here, it is important to make one thing clear: a covariant typevar can't be used as the type of a method
-argument, because argument positions are contravariant. This makes it impossible to make a covariant generic whose
-methods take values of its type parameter as arguments somewhere.
-However this rule does not extend to the initialization/constructor of that generic, and this is very important.
-Without this exemption, it wouldn't really be possible to construct a covariant generic, since the original type must
-somehow be passed onto the instance itself, otherwise we wouldn't know what type to return in the actual logic. This is
-why using a covariant typevar in `__init__` is allowed.
-
-```python
-from typing import TypeVar, Generic, Sequence, Iterable
-
-T_co = TypeVar("T_co", covariant=True)
-
-class Matrix(Sequence[Sequence[T_co]], Generic[T_co]):
-    __slots__ = ("rows", )
-    rows: tuple[tuple[T_co, ...], ...]
-
-    def __init__(self, rows: Iterable[Iterable[T_co]]):
-        self.rows = tuple(tuple(el for el in row) for row in rows)
-
-    def __setattr__(self, attr: str, value: object) -> None:
-        if hasattr(self, attr):
-            raise AttributeError(f"Can't change {attr} (read-only)")
-        return super().__setattr__(attr, value)
-
-    # Indexing returns a whole (immutable) row, so elements are read as matrix[row_id][col_id]
-    def __getitem__(self, row_id: int) -> Sequence[T_co]:
-        return self.rows[row_id]
-
-    def __len__(self) -> int:
-        return len(self.rows)
-
-class X: ...
-class Y(X): ...
-class Z(Y): ...
-
-a: Matrix[Y] = Matrix([[Y(), Z()], [Z(), Y()]])
-b: Matrix[X] = a  # VALID. Matrix is covariant
-c: Matrix[Z] = a  # INVALID! Matrix isn't contravariant
-```
-
-In this case, our Matrix generic type is covariant in the element type, meaning that if we have a `Matrix[Y]` type
-and `Matrix[X]` type, we could assign the `Matrix[Y]` to the `Matrix[X]` type, hence making it its
-subtype.
-
-We can make this Matrix covariant because it is immutable (enforced by slots and custom setattr logic). This allows
-this matrix class (just like any other sequence class), to be covariant. Since it can't be altered, this covariance is
-safe.
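
To make the `__init__` exemption a bit more tangible, here is a much smaller sketch of the same idea. The `ReadOnlyBox` class and the `describe` function are names I made up just for this illustration: the covariant typevar is accepted as an argument only in `__init__`, and everywhere else it only shows up as a return type.

```python
from typing import Generic, TypeVar

T_co = TypeVar("T_co", covariant=True)


class Vehicle:
    def name(self) -> str:
        return "vehicle"


class Car(Vehicle):
    def name(self) -> str:
        return "car"


class ReadOnlyBox(Generic[T_co]):
    """Hypothetical immutable container, only meant as an illustration."""

    def __init__(self, item: T_co) -> None:
        # Taking T_co as an argument is allowed here, in the constructor.
        self._item = item

    def get(self) -> T_co:
        # In regular methods, T_co only appears in the return position.
        return self._item


def describe(box: ReadOnlyBox[Vehicle]) -> str:
    return f"box with a {box.get().name()}"


car_box: ReadOnlyBox[Car] = ReadOnlyBox(Car())
# A type checker accepts this call, since ReadOnlyBox[Car] is a subtype
# of ReadOnlyBox[Vehicle] thanks to the covariant typevar.
result = describe(car_box)
```

Keep in mind that none of this is enforced at runtime. The variance only affects what a type checker will let you assign or pass around.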
-
-**Making contravariant generics:**
-
-```python
-from typing import TypeVar, Generic
-import pickle
-import requests
-
-T_contra = TypeVar("T_contra", contravariant=True)
-
-class Sender(Generic[T_contra]):
-    def __init__(self, url: str) -> None:
-        self.url = url
-
-    def send_request(self, val: T_contra) -> str:
-        s = pickle.dumps(val)
-        response = requests.post(self.url, data={"object": s})
-        return response.text
-
-class X: ...
-class Y(X): ...
-class Z(Y): ...
-
-a: Sender[Y] = Sender("https://test.com")
-b: Sender[Z] = a  # VALID, sender is contravariant
-c: Sender[X] = a  # INVALID, sender isn't covariant
-```
-
-In this case, our `Sender` generic type is contravariant in its value type, meaning that
-if we have a `Sender[Y]` type and `Sender[Z]` type, we could assign the `Sender[Y]` type
-to the `Sender[Z]` type, hence making it its subtype.
-
-This works because the type variable is only used in a contravariant position, in this case as the argument of the
-`send_request` method. This means that the logic of determining subtypes for callables will be the same for our
-Sender generic.
-
-For example, if we had a sender of `Car` type and we were able to assign it to a sender of `Vehicle` type, it would
-suddenly allow other vehicles, such as airplanes, to be passed to the `send_request` function, but this function
-only expects the type `Car` (or its subtypes).
-
-On the other hand, if we had this generic and we tried to assign it to a sender of `AudiCar`, that's fine, because now
-all arguments passed to `send_request` function will be required to be of the `AudiCar` type, but that's a subtype of a
-general `Car` and implements everything this general car would, so the function doesn't mind.
-
-Note: This probably isn't the best example of a contravariant class, but because of my limited imagination and lack of
-time, I wasn't able to think of anything better.
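
Since the `Sender` example needs a network connection to actually do anything, here is a self-contained sketch of another contravariant generic that you can run locally. The `Recorder` class and the `record_cars` function are hypothetical names used only for illustration. The important part is that the typevar is only ever consumed (taken as a method argument), never returned.

```python
from typing import Generic, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)


class Vehicle: ...
class Car(Vehicle): ...


class Recorder(Generic[T_contra]):
    """Hypothetical consumer: it takes values of T_contra, but never hands them back."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def record(self, item: T_contra) -> None:
        # T_contra only appears in an argument position.
        self.log.append(type(item).__name__)


def record_cars(recorder: Recorder[Car], cars: list[Car]) -> None:
    for car in cars:
        recorder.record(car)


vehicle_recorder: Recorder[Vehicle] = Recorder()
# A type checker accepts this call: something that can record any Vehicle
# can certainly record cars, so Recorder[Vehicle] is a subtype of Recorder[Car].
record_cars(vehicle_recorder, [Car(), Car()])
```

Going the other way (passing a `Recorder[Car]` where a `Recorder[Vehicle]` is expected) is what a type checker would reject, for the same reason as with the `Sender` above.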
-
-**Some extra notes**
-
-- Usually, most of your generics will be invariant. However, sometimes it can be very useful to mark your generic as
-  covariant, since otherwise you'd need to recast your variable manually when defining another type, or copy your
-  whole generic, which would be very wasteful, just to satisfy type-checkers. Less commonly, you may also find it
-  helpful to mark your generics as contravariant, though this will usually not come up, except maybe if you're using
-  protocols; with full standalone generics, it's quite rarely used. Nevertheless, it's important to be aware of it.
-- Once you've made a typevar covariant or contravariant, you won't be able to use it anywhere outside of some
-  generic, since it doesn't make sense to use such a typevar as a standalone thing. Just use the `bound` feature of a
-  type variable instead, which defines its upper bound type, and any subtypes of that bound will be usable.
-- Generics that can be covariant or contravariant, but are used with a typevar that doesn't have that specified, can
-  lead to a warning from the type-checker that this generic is using a typevar which could be covariant, but
-  isn't. However this is just that, a warning. You are by no means required to make your generic covariant even though
-  it can be, you may still have a good reason not to. If that's the case, you should however specify `covariant=False`
-  or `contravariant=False` for the typevar, since that will usually satisfy the type-checker and the warning will
-  disappear, since you've explicitly stated that even though this generic could be using a covariant/contravariant
-  typevar, it shouldn't be and that's desired.
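
The second note mentions the `bound` feature, so here's a rough sketch of what that could look like in a standalone function. The `Person`, `Student` and `rename` names are made up for this example, they aren't taken from any library.

```python
from typing import TypeVar


class Person:
    def __init__(self, name: str) -> None:
        self.name = name


class Student(Person): ...


# A regular (invariant) typevar with an upper bound: it accepts Person
# and any of its subtypes, and remembers which one was actually passed in.
P = TypeVar("P", bound=Person)


def rename(person: P, new_name: str) -> P:
    person.name = new_name
    return person


# A type checker infers `student` as Student here, not just Person.
student = rename(Student("Ann"), "Anna")
```

This way the function stays usable with every subtype of `Person`, without the typevar having to be covariant or contravariant.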
-
-## Conclusion
-
-This was probably a lot to process at once, and you may need to read some parts more than once to really grasp
-these concepts. They are very important to understand though, not just in strictly typed languages, but, as I've
-demonstrated, even in languages that have optional typing, such as python.
-
-Even though in most cases you don't really need to know how to make your own typing generics which aren't invariant,
-there certainly are some use-cases for them, especially if you enjoy making libraries and generally working on
-back-end. But even if you're just someone who works with these libraries, knowing this can be quite helpful: even
-though you won't often be the one writing those generics, you'll be able to easily recognize what you're working
-with, immediately giving you an idea of how that thing works and how it's expected to be used.