---
title: A guide to type checking in python
date: 2024-10-04
tags: [programming, python, typing]
sources:
  - https://dev.to/decorator_factory/type-hints-in-python-tutorial-3pel
  - https://docs.basedpyright.com/#/type-concepts
  - https://mypy.readthedocs.io/en/stable/
  - https://typing.readthedocs.io/en/latest/spec/special-types.html
---

Python is often known for its dynamic typing, which can be a drawback for those who prefer static typing due to its benefits in catching bugs early and enhancing editor support. However, what many people don't know is that Python does actually support specifying types, and it is even possible to enforce these types and work in a statically type-checked Python environment. This article is an introduction to using Python in this way.

## Regular python

In regular python, you might end up writing a function like this:

```python
def add(x, y):
    return x + y
```

In this code, nothing tells you what the types of the `x` and `y` arguments should be. So, even though you may have intended for this function to only work with numbers (ints), it's actually entirely possible to use it with something else. For example, running `add("hello", "world")` will return `"helloworld"` because the `+` operator works on strings too.

The point is, there's nothing telling you what the type of these parameters should be, and that could lead to misunderstandings. Even though in some cases you can judge what the types should be just from the name of the function, in most cases it's not that easy to figure out, and it often requires looking through docs or going over the code of that function.

Annoyingly, python doesn't even prevent you from passing in types that are definitely incorrect, like: `add(1, "hi")`.
Running this would cause a `TypeError`, but unless you have unit-tests that actually run that code, you won't find out about this bug until it actually causes an issue, and at that point it might already be too late, since your code has crashed a production app.

Clearly then, this isn't ideal.

## Type-hints

While python doesn't require it, it does have support for specifying "hints" that indicate what type a given variable should have. So, when we take a look at the function above, adding type-hints to it would look like this:

```python
def add(x: int, y: int) -> int:
    return x + y
```

We've now made the types very explicit to the programmer, which means they'll no longer need to spend a bunch of time looking through the implementation of that function, or going through the documentation just to know how to use this function. Instead, the type hints will just tell you.

This is incredibly useful, because most editors will be able to pick up these type hints and show them to you while calling the function, so you know what to pass right away, without even having to look at the function definition where the type-hints are defined.

Not only that, specifying a type-hint will greatly improve the development experience in your editor / IDE, because you'll get much better auto-completion. The thing is, if you have a parameter like `x`, but your editor doesn't know what type it should have, it can't really help you if you start typing `x.remove`, looking for the `removeprefix` method. However, if you tell your editor that `x` is a string (`x: str`), it will now be able to go through all of the methods that strings have, and show you those that start with `remove` (being `removeprefix` and `removesuffix`).

This makes type-hints great at saving you time while developing, even though you have to do some additional work when specifying them.

## Run-time behavior

Even though type-hints are a part of the Python language, the Python interpreter doesn't actually care about them.
That means that there aren't any optimizations or checks performed when you're running your code, so even with type hints specified, they will not be enforced! This means that you can actually just choose to ignore them and call the function with incorrect types, like `add(1, "hi")`, and the interpreter won't reject the call because of the annotations. Any error you do get (here, the `TypeError` raised by `+` inside the function) is the same one you'd get without type hints.

Most editors are configured very loosely when it comes to type-hints. That means they will show you these hints when you're working with the function, but they won't produce warnings. That's why they're called "type hints": they're only hints that can help you out, but they aren't actually enforced.

## Static type checking tools

Even though python on its own indeed doesn't enforce the type-hints you specify, there are tools that can run static checks against your code to check for type correctness.

{{< notice tip >}}
A static check is a check that works with your code in its textual form. It will read the contents of your python files without actually running them, and analyze the code purely based on that text content.
{{< /notice >}}

Using these tools will allow you to analyze your code for typing mistakes before you ever even run your program. That means having a function call like `add(1, "hi")` anywhere in your code would be detected and reported as an issue. This is very similar to running a linter like [`flake8`](https://flake8.pycqa.org/en/latest/) or [`ruff`](https://docs.astral.sh/ruff/).

Since running the type-checker manually could be quite annoying, most of them have integrations with editors / IDEs, which will allow you to see these errors immediately as you code. This makes it much easier to immediately notice any type inconsistencies, which can help you catch or avoid a whole bunch of bugs.

### Most commonly used type checkers

- [**Pyright**](https://github.com/microsoft/pyright): Known for its speed and powerful features, it's written in TypeScript and maintained by Microsoft.
- [**MyPy**](https://mypy.readthedocs.io/en/stable/): The most widely used type-checker, developed by the official Python community. It's well integrated with most IDEs and tools, but it's known to be slow to adopt new features.
- [**PyType**](https://google.github.io/pytype/): Focuses on automatic type inference, making it suitable for codebases with minimal type annotations.
- [**BasedPyright**](https://docs.basedpyright.com/): A fork of pyright with some additional features and enhancements, my personal preference.

## When to use type hints?

Like you saw before with the `add` function, you can specify type-hints on functions, which allows you to describe what types can be passed as parameters of that function, along with specifying a return type:

```python
def add(x: int, y: int) -> int:
    ...
```

You can also add type-hints directly to variables:

```python
my_variable: str = "hello"
```

That said, doing this is usually not necessary, since most type-checkers can "infer" what the type of `my_variable` should be, based on the value it's set to have. However, in some cases, it can be worth adding the annotation, as the inference might not be sufficient. Let's consider the following example:

```python
my_list = []
```

Here, a type-checker can infer that this is a `list`, but it can't tell what kind of elements this list will contain. That makes it worth specifying a more precise type:

```python
my_list: list[int] = []
```

Now the type-checker will recognize that the elements inside of this list will be integers.

## Special types

While in most cases it's fairly easy to annotate something with the usual types, like `int`, `str`, `list`, `set`, ..., in some cases you might need some special types to express what you mean.

### None

This isn't very special at all, but it may be surprising for beginners at first.
You've probably seen `None` in python before, but what you may not realize is that if you don't add any return statements into your function, it will automatically return `None`. That means if your function doesn't return anything, you should annotate it as returning `None`:

```python
def my_func() -> None:
    print("I'm a simple function, I just print something, but I don't explicitly return anything")

x = my_func()
assert x is None
```

### Union

A union type is a way to specify that a value can be one of multiple specified types, allowing flexibility while still enforcing type safety.

There are multiple ways to specify a Union type. In modern versions of python (3.10+), you can do it like so:

```python
x: int | str = "string"
```

If you need to support older python versions, you can also use `typing.Union`, like so:

```python
from typing import Union

x: Union[int, str] = "string"
```

As an example, this function takes a value that can be of various types, and parses it into a bool:

```python
def parse_bool_setting(value: str | int | bool) -> bool:
    if isinstance(value, bool):
        return value

    if isinstance(value, int):
        if value == 0:
            return False
        if value == 1:
            return True
        raise ValueError(f"Value {value} can't be converted to boolean")

    # value can only be str now
    if value.lower() in {"yes", "1", "true"}:
        return True
    if value.lower() in {"no", "0", "false"}:
        return False
    raise ValueError(f"Value {value} can't be converted to boolean")
```

One cool thing to notice here is that after the `isinstance` check, the type-checker will narrow down the type, so that inside of the block, it knows what type `value` has. But also outside of the block, the type-checker can narrow the entire union and remove one of the variants, since it was already handled (every path through the earlier `isinstance` blocks ends in a `return` or a `raise`).

That's why at the end, we didn't need the last `isinstance` check: the type checker knew the value was a string, because all the other options were already handled.
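
The same narrowing applies when one of the union members is `None`, which type checkers narrow on an `is None` check. Here is a minimal sketch of that idea; `describe_port` is a hypothetical helper made up for illustration (it isn't from any library), and it uses `typing.Union` only so that it also runs on python versions older than 3.10:

```python
from typing import Union


def describe_port(port: Union[int, str, None]) -> str:
    """Hypothetical helper (illustration only) that narrows a three-way union."""
    if port is None:
        # Inside this block the type is None; after it, only int | str remains.
        return "no port configured"

    if isinstance(port, int):
        # Narrowed to int here, so numeric comparisons type-check.
        if not 0 < port < 65536:
            raise ValueError(f"Port {port} is out of range")
        return f"port {port}"

    # Both earlier branches returned or raised, so only str is left here and
    # string methods are allowed without another isinstance check.
    return f"named port {port.strip().lower()}"
```

Calling `describe_port(None)`, `describe_port(8080)` and `describe_port("  HTTP ")` goes through the three branches in turn, and a type checker should accept each of them without any casts.
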
### Any

In some cases, you might want to specify that your function can take in any type. This can be useful when annotating a specific type would be way too complex or impossible, or when you're working with something dynamic where you just don't care about the typing information.

```python
from typing import Any

def foo(x: Any) -> None:
    # a type checker won't warn you about accessing unknown attributes on Any types,
    # it will just blindly allow anything
    print(x.foobar)
```

{{< notice warning >}}
Don't over-use `Any` though, in the vast majority of cases, it is not the right choice. I will touch more on it in the section below, on using the `object` type.
{{< /notice >}}

The most appropriate use for the `Any` type is when you're returning some dynamic value from a function, where the developer can confidently know what the type will be, but which is impossible for the type-checker to figure out, because of the dynamic nature. For example:

```python
from typing import Any

global_state = {}

def get_state_variable(name: str) -> Any:
    return global_state[name]

global_state["name"] = "Ian"
global_state["surname"] = "McKellen"
global_state["age"] = 85

###

# Notice that we specified the annotation here manually, so that the type-checker will know
# what type we're working with. But we only know this type because we know what we stored in
# our dynamic state, so the function itself can't know what type to give us
full_name: str = get_state_variable("name") + " " + get_state_variable("surname")
```

### object

In many cases where you don't care about what type is passed in, people mistakenly use `typing.Any` when they should use `object` instead. `object` is a class that every other class subclasses. That means every value is an `object`.

The difference between doing `x: object` and `x: Any` is that with `Any`, the type-checker will essentially avoid performing any checks whatsoever.
That means that you can do whatever you want with such a variable, like access an attribute that might not exist (`y = x.foobar`), and since the type-checker doesn't know about it, `y` will now also be considered `Any`.

With `object`, even though you can still assign any value to such a variable, the type checker will now only allow you to access attributes that are shared by all objects in python. That way, you can make sure that you don't do something that not all types support, when your function is expected to work with all types. For example:

```python
def do_stuff(x: object) -> None:
    print(f"The do_stuff function is now working with: {x}")

    if isinstance(x, str):
        # We can still narrow the type down to a more specific type, now the type-checker
        # knows `x` is a string, and we can do some more things, that strings support, like:
        print(x.removeprefix("hello"))

    if x > 5:  # A type-checker will mark this as an error, because not all types support comparison against ints
        print("It's bigger than 5")
```

### Collection types

Python also provides some types to represent various collections. We've already seen the built-in `list` collection type before. Other such built-in collection types are `tuple`, `set`, `frozenset` and `dict`. All of these types are what we call "generic", which means that we can specify an internal type, which in this case represents the items that these collections can hold, like `list[int]`.

Here's a quick example of using these generic collection types:

```python
def print_items(lst: list[str]) -> None:
    for index, item in enumerate(lst):
        # The type-checker knows `item` variable is a string now
        print(f"-> Item #{index}: {item.strip()}")

print_items(["a", "b", "c"])
```

That said, in many cases, instead of using these specific collection types, you can use a less specific collection, so that your function will work with multiple kinds of collections. Python has abstract classes for general collections inside of the `collections.abc` module.
One example would be the `Sequence` type:

```python
from collections.abc import Sequence

def print_items2(lst: Sequence[str]) -> None:
    for index, item in enumerate(lst):
        # The type-checker knows `item` variable is a string now
        print(f"Item #{index}: {item.strip()}")

print_items(["a", "b", "c"])  # fine
print_items(("a", "b", "c"))  # nope, a tuple is not a list

print_items2(["a", "b", "c"])  # works
print_items2(("a", "b", "c"))  # works
print_items2({"a", "b", "c"})  # nope, a set is not a Sequence (it has no ordering or indexing)
```

You may think that you could also just use a union like: `list[str] | tuple[str, ...]`, however that still wouldn't quite cover everything, since people can actually make their own custom sequence classes that subclass `collections.abc.Sequence` (implementing `__getitem__` and `__len__`), yet don't inherit from `list` or any of the other built-in types. By specifying the `collections.abc.Sequence` type-hint, even these custom sequence classes will work with your function.

There are various other collection classes like these and it would take pretty long to explain them all here, so you should do some research on your own to know what's available.

{{< notice warning >}}
It is important to note that the built-in collection types like `list` weren't subscriptable in earlier versions of python (before 3.9).

If you still need to maintain compatibility with such older python versions, you can instead use `typing.List`, `typing.Tuple`, `typing.Set` and `typing.Dict`. These types will support being subscripted even in those older versions.

Similarly, this also applies to the `collections.abc` abstract types, like `Sequence`, which also weren't subscriptable in these python versions. These also have alternatives in the `typing` module: `typing.Sequence`, `typing.Mapping`, `typing.MutableSequence`, `typing.Iterable`, ...
{{< /notice >}}

#### Tuple type

Python tuples are a bit more complicated than the other collection types, since we can specify which type is at which position of the tuple.
For example: `tuple[int, str, float]` will represent a tuple like: `(1, "hi", 5.3)`.

The tricky thing here is that specifying `tuple[int]` will not mean a tuple of integers, it will mean a tuple with a single integer: `(1,)`. If you do need to specify a tuple with any number of items of the same type, what you actually need to do is: `tuple[int, ...]`. This annotation will work for `(1,)` or `(1, 1, 1)` or `(1, 1, 1, 1, 1)`.

The reason for this is that we often use tuples to allow returning multiple values from a function. Yet these values usually don't have the same type, so it's very useful to be able to specify these types individually:

```python
def some_func() -> tuple[int, str]:
    return 1, "hello"
```

That said, a tuple can also be useful as a sequence type, with the major difference between it and a list being that tuples are immutable. This can make them more appropriate for storing certain sequences than lists.

## Type casts

Casting is a way to explicitly specify the type of a variable, overriding the type inferred by the type-checker.

This can be very useful, as sometimes, we programmers have more information than the type-checker does, especially when it comes to some dynamic logic that is hard to statically evaluate. The type checker's inference may end up being too broad or sometimes even incorrect.

For example:

```python
from typing import cast

my_list: list[str | int] = []
my_list.append("Foo")
my_list.append(10)
my_list.append("Bar")

# We know that the first item in the list is a string
# the type-checker would otherwise infer `x: str | int`
x = cast(str, my_list[0])
```

Another example:

```python
from typing import cast

def foo(obj: object, type_name: str) -> None:
    if type_name == "int":
        obj = cast(int, obj)
        ...  # some logic
    elif type_name == "str":
        obj = cast(str, obj)
        ...  # some logic
    else:
        raise ValueError(f"Unknown type name: {type_name}")
```

{{< notice warning >}}
It is important to mention that unlike the casts in languages like Java or C#, in Python, type casts do not perform any runtime checks to ensure that the variable really is what we claim it to be. Casts are only used as a hint to the type-checker, and at runtime, the `cast` function just returns the value back without any extra logic.

If you do wish to also perform a runtime check, you can use assertions to narrow the type:

```python
def foo(obj: object) -> None:
    print(obj + 1)  # can't add 'object' and 'int'
    assert isinstance(obj, int)
    print(obj + 1)  # works
```

Alternatively, you can just check with if statements, which, unlike assertions, are not skipped when python is run with the `-O` flag:

```python
def foo(obj: object) -> None:
    print(obj + 1)  # can't add 'object' and 'int'
    if not isinstance(obj, int):
        raise TypeError("Expected int")
    print(obj + 1)  # works
```
{{< /notice >}}

## Closing notes

In summary, Python’s type hints are a powerful tool for improving code clarity, reliability, and development experience. By adding type annotations to your functions and variables, you provide valuable information to both your IDE and fellow developers, helping to catch potential bugs early and facilitating easier code maintenance.

Type hints offer significant benefits:

- Enhanced Readability: They clearly specify the expected types of function parameters and return values, making the code more self-documenting.
- Improved Development Experience: They provide better auto-completion and in-editor type checking, helping you avoid errors and speeding up development.
- Early Error Detection: Static type checkers can catch type-related issues before runtime, reducing the risk of bugs making it into production.


For further exploration of Python’s type hints and their applications, you can refer to additional resources such as:

- The [Type Hinting Cheat Sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html) from mypy for a quick reference on various type hints and their usage.
- My other articles on more advanced typing topics like [TypeVars]({{< ref "posts/type-vars" >}}) and [Generics]({{< ref "posts/generics-and-variance" >}}) for deeper insights into Python's typing system.

Embracing type hints can elevate your Python programming experience, making your code more robust and maintainable in the long run.