itsdrike.com/content/posts/software-licenses.md

Title: Software Licenses
Date: 2022-01-23
Tags: programming, licensing
Changelog (2023-12-10):

  • Rewrite many long and hard-to-read sections
  • Make use of the new notice boxes
  • Add licensing tag to keep posts about licensing together
  • Move sections about CLA and DCO into a [standalone post]({{< ref "posts/licensing-documents" >}})
  • Remove sections about changing license, in favor of a [standalone post]({{< ref "posts/changing-license" >}})
  • Create new related [post about multi-licensing]({{< ref "posts/multi-licensing" >}}) (not previously mentioned at all)

I've recently been coming across more and more developers who have no idea about the importance of software licensing, how it works, and what options there even are. What's worse, this lack of knowledge means people often end up not choosing a license at all, thinking it means anyone can just use that code without limitations, when in fact no one can.

I've always been very interested in software licenses, and have accumulated a lot of knowledge about them over time, so I've decided to share this knowledge and explain the basics of how licenses work, what enforces them, what different types of licenses we have, and perhaps most importantly, why you need to use a license if you want to maintain an open-source project. Along with just explanations, I've also included some of my personal thoughts on what I like to use.

Why even have a license?

Many people assume that unlicensed code is free, and anyone can use it anywhere they want, because there is no license prohibiting that use. This is a big misconception which has led to many lawsuits against small developers, who just didn't know any better.

The reason you can't use unlicensed code is simple: whenever you publish essentially anything, not just source code, you automatically hold the copyright over that published work. This copyright then imposes some pretty big legal restrictions on what others can do with this published content, including preventing them from simply using it in their own code.

Generally, this means that when publishing your code without a license you are the sole copyright holder and no one else has any rights to that code. We often call public code like this "source-available", instead of "open-source", because while people can look at the code, inspect it, perhaps even find and report issues about it, they can't use this code in almost any other way. This means that an unlicensed project can't have any (legal) contributors, because even the action of making a fork will be considered illegal, since the contributor will be copying your code without permission.

{{< notice note >}} The exact restrictions will vary from country to country, because each country can have its own copyright law, and there are even some countries which don't respect any form of international copyright law (but that's quite rare).

Even though in most of the cases the copyright law is pretty similar with only minor differences in punishments and enforcement, it does not mean that there can't be any differences. Because of that, the info above may not necessarily be true for every country, and as always, you should do your own research (or delegate it to a lawyer you trust). {{< /notice >}}

Unlicensed repositories therefore lose many potential contributors, who would like to improve the project but decide not to, because doing so would put them in danger of getting sued for copyright infringement. That's why whenever I see a project without a license, unless it was intentional, I open an issue, or try to otherwise contact the author to inform them about this.

{{< notice warning >}} The worst case scenario that you can get into with an unlicensed code-base is that people do in fact decide to contribute, and you accept that contribution.

Not only did the contributor just break copyright law by modifying your code without legal permission, but now that their unlicensed code is in your project, you can't legally modify it either. So it's not just the contributor who may get in trouble: the contributor can now also take legal action against you for infringing the copyright on their code.

You as the author can't even go and add a license now, since there is code from others that you don't own, and so you can't simply relicense it. Solving this could then be very difficult; see my [relicensing post]({{< ref "posts/changing-license#unintentionally-proprietary" >}}) for more details.

This is why you should steer far away from any unlicensed projects, and make absolutely certain that your projects always have a license. If a project should be proprietary, make sure to explicitly mention that in your README file, and don't accept any contributions. {{< /notice >}}

How can a license make code open-sourced

In order for your source code to qualify as "open-source", it needs to be made freely available for anyone to copy and modify, without the contributors worrying that they might get into legal trouble just for trying to improve the project.

To give someone these rights, you can use a license which specifies that you, as the owner of the copyright to your code, freely give out certain rights to everyone. This license is here to essentially "weaken" your copyright in that you reduce the restrictions it generally imposes on others, because they were freely given out through the terms of your software license.

These licenses can then also specify certain conditions, which limit when the code can be used by others. Picking a good license for your projects then means understanding what freedoms the license gives out, and under what conditions.

How are these licenses enforced

It is important to understand how exactly the software licenses really work. How does simply having a text file with some legal gibberish allow others to use that project and contribute to it? After all, they didn't need to legally sign anything, so why should they be forced to follow that license's terms?

Well, the answer to that is simple: you don't need to follow the terms. The only issue is that if you decide not to, you lose the only thing giving you the extended permissions to use that code. So while you don't have to follow the license, if you don't, you won't be able to use the code. You can think of it as the copyright holder extending the right to use that code to anyone who meets certain conditions. If you're in that category, you can use the code; if you're not, you don't have those rights.

However, in some cases, you might have a different license given to you by the copyright holder, or maybe you are the copyright owner (copyright ownership can even be transferred, and it's often sold), in which case, you already have these rights even without following the license, so you can actually completely ignore it.

It's actually very common for projects to be licensed under a stricter license for general usage, one that doesn't allow using the code in closed-sourced projects, with the authors selling a more permissive license for profit to corporations which do need to use this project in their proprietary code bases. In many cases, this is the main way these open-sourced projects make money. I explain this a bit further in my [post about multi-licensing]({{< ref "posts/multi-licensing" >}}).

So essentially, licenses aren't enforced, nobody will force you to follow them, you just need to do so if you want to gain the rights that license grants you.

Contributing to a licensed project

A question that you might have now is how your code will be licensed if you contribute to a project. The answer to this may be pretty obvious, but it is important to explain nevertheless.

Basically, if you pick a repository under, say, a GPL v3 license, fork it, change some things and make a pull request back to the original repo, you're offering this code to that project under the same (GPL v3) license. This is because the LICENSE file in that project generally applies to the entire project. By adding code to a project with this LICENSE file, just as if you had added the file to your own project, you're offering your contribution under that license.

However this isn't the only way to do it. There may be cases when you want to explicitly state that the code you've contributed is actually under some other license, like MIT for example. To do that, projects generally use another file, like LICENSE-THIRD-PARTY, where all of the differently licensed code chunks are stated, along with the copyright holder (usually a name and an email) and the full-text of this differing license.

However, you can do this in different ways. All that's necessary is to clearly mark that the code you've added is under a different license; the way to do so is more or less just a convention (although some licenses do have specific requirements on where they need to be mentioned, so be careful with those). Another pretty common way to do it is to add a comment at the start or end of the source code file containing the full text of the license, or a reference to the license.
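As an illustration of the comment-based approach, one widely used form of "a reference to the license" is the short `SPDX-License-Identifier` tag from the SPDX specification. The Python sketch below shows such a header and a small helper that reads it back. The helper name `find_spdx_identifier`, the sample file contents and the author details are made up for this example, and the matching is only a rough approximation that looks at the first few lines of a file.

```python
import re
from typing import Optional

# Matches a comment line such as "# SPDX-License-Identifier: MIT".
# The tag name comes from the SPDX specification; the regex is only an
# approximation that is good enough for this example.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(\S.*?)\s*$")


def find_spdx_identifier(source: str, max_lines: int = 10) -> Optional[str]:
    """Return the license expression declared near the top of a file, if any."""
    for line in source.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


# Hypothetical MIT-licensed file contributed to an otherwise GPL v3 project.
contributed_file = """\
# SPDX-License-Identifier: MIT
# Copyright (c) 2023 Jane Doe <jane@example.com>

def helper():
    return 42
"""

# A file without any marker falls under the project's LICENSE file.
project_file = "def main():\n    pass\n"

print(find_spdx_identifier(contributed_file))  # MIT
print(find_spdx_identifier(project_file))      # None
```

A file without a marker simply returns `None` here, which matches the convention described above: unless stated otherwise, the project-wide LICENSE file applies.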

There are a lot of reasons why you may want to contribute to a project under a different license. For example, it might be that the code you're contributing isn't yours, but is instead from another project, with a license that requires you to do something specific. However, it could also just be personal preference: maybe you like another license, and you're only willing to give out your code under that license, not under the one the project uses everywhere else.

Available licenses

There are countless licenses, each with a different set of conditions and different rights they give out. As an author of your project, it is up to you to pick a license which best suits your needs, or if you can't find a license you like, perhaps even write your own.

But know that picking some random license, which could've just been written by some individual without any legal knowledge may be dangerous, because of the potential for some legal loop-holes which it may contain. For that reason, if you're making your own license, or picking one which isn't commonly used, you should always consult a lawyer. Usually, you will likely just want to stick to an already well-established license, verified by thousands, if not millions of users.

Open-Source software licenses

I've mentioned before that the distinction between "source-available" and "open-sourced" code is having a license that allows the user to do some additional things (like modifying the code and using it in different projects). However, it's not as simple as having any license that does this. It's technically up to everyone to make up their own mind about what they'd consider open-sourced, but it's probably fair to say that if your license discriminates against some group of people, most people probably wouldn't consider it an open-source license.

To address this, an organization called the "Open Source Initiative" (OSI) came up with their list of approved licenses, which were fact-checked by their lawyers and are commonly used by tons of people already. You can find this curated list of OSI-approved licenses on the OSI website. All of these are generally considered to be open-sourced by almost everyone.

Free software licenses

“Free software” means software that respects users' freedom and community. Roughly, it means that the users have the freedom to run, copy, distribute, study, change and improve the software. Thus, “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.” We sometimes call it “libre software,” borrowing the French or Spanish word for “free” as in freedom, to show we do not mean the software is gratis.

-- Explanation of "free software" by the Free Software Foundation

As seen in the quote above, free software is essentially about software that has a license which guarantees you certain freedoms. Unlike the simple "open-source" software, which is merely about a project having a license which allows others to fearlessly contribute to it, "free" software extends that and also guarantees contributors the freedoms mentioned above.

Really, all you need in an open-sourced license is the ability for the contributor to change the code and pull-request it back. The rights to distribute the software on their own, use this code in a different project, or do a bunch of other things aren't all that necessary for pure open-source projects. So, to signify that these freedoms are being respected, we often call such projects "free and open-sourced", instead of just "open-sourced".

To easily recognize which licenses do follow this ideology, the Free Software Foundation (FSF) also has their own list of licenses that they confirm do meet these guarantees.

License types

Because there are thousands of very different open source licenses, we split them into several categories, which describe the main goal of the license. While the individual licenses in the same category can still be quite different, these differences are mostly minor details.

This categorization allows us to only remember what category a license is in, and already know with fairly good confidence what the general guidelines of that license will be. Knowing what license category you prefer can save you a lot of time in finding a license you like, since you'll just need to look for licenses in your preferred category.

Copy-Left licenses

The main point of copy-left licenses is to ensure that the code licensed under them will never go closed-sourced, and will always remain free and accessible to anyone.

To achieve this, copy-left licenses require everyone who would want to use the code licensed under them to publish their code under the same (or a compatible) license. This basically means that if some other project would like to use a part of copy-left licensed code, that other project would need to itself become entirely copy-left. We often call this "propagating the copy-left".

Usually, if your project is compilable into a single executable file (or multiple executables), copy-left licenses will also require shipping the license and the entire source code that was used to compile it (or at least a link to a website where this source code and the license's full text are accessible, which is often done because the source code might be pretty big). This ensures that anyone wanting to use this binary will always have access to the source code it was made from, and is free to change this code and then use their own modified version if they want to (this is often called creating "derivatives"). For example, they might make a version without any intrusive telemetry.

There can be many little caveats to these licenses, and you should always make sure you understand what a license allows and denies before you decide to use it.

The most popular copy-left license is the GNU General Public License version 3 (GPL v3), but version 2 is also still quite popular.

Permissive licenses

Similarly to copy-left licenses, this type of license allows usage of the original code in almost any way (derivatives, using small chunks, using the code as a library, ...). However, unlike copy-left, permissive licenses allow other projects using the code under them to be licensed under a completely different license. We call this allowing "sublicensing".

However, even though these licenses are generally very "weak", in that they allow the code to be used very easily and without many restrictions, it doesn't mean there can't be any extra conditions to gain these rights. One very common requirement in permissive licenses is to mention the original copyright holder(s). Others include a clause dealing with software patents, a requirement to list all changes made to the original code, or various other things.

This means that permissive licenses give others a lot more freedom, because they allow anyone to use your code under any license they like, usually even a proprietary one. This could then mean that someone might simply take your entire project, and make it closed-sourced. Perhaps with some added features, this version can then be sold for profit.

However, the proponents of these licenses like this fact. Even though it may mean their software will be used in closed-sourced repositories, it at least means that there is a big chunk of open code in these projects, promoting the open-source idea, rather than having the companies write their own internal tooling, resulting in even more proprietary code. A big advantage here is also that companies using this code often end up contributing to these open repositories, helping them grow.

The most commonly used permissive licenses are the MIT License and the Apache 2.0 License. Another really popular set of licenses are the BSD licenses, most notably the BSD 2-Clause license.

Public domain licenses

There are also the so called "public domain licenses", which are actually a subcategory of permissive licenses, however when we talk about permissive licenses, we generally don't mean the public domain ones, even though they do technically meet the definition of a permissive license. This is why I separated this category as it's really quite different from what people would expect from usual permissive licenses.

Public domain licenses essentially strip the copyright away completely. They don't impose any extra restrictions (such as a requirement to mention the author) at all. This means giving everyone the right to do absolutely anything with that code, and anyone using it can essentially consider themselves the owner of that work.

Depending on the country you're in, these licenses may act a bit differently as not all countries allow "removing" the copyright from your work, instead you can "assign" the copyright to the public domain, or give all rights to everyone without imposing any restrictions.

The most notable public domain licenses are the Unlicense, Creative Commons CC0 1.0 Universal, and WTFPL.

Strong and weak copy-left

Coming back to copy-left licenses, they're actually split into 2 sub-categories, which further define how strict they are. The main problem with copy-left is that people who like permissive licenses will often simply refuse to use copy-left projects, because they don't want to be forced to also license all of the code in their project under a copy-left license. For that reason, copy-left licenses needed a bit of a change, adding another category.

This new weaker form of copy-left is essentially the same as normal copy-left, but with some exceptions that allow using the code as if it was permissively licensed. The most common case for this is an exception for using the copy-left licensed software as a library. This means that code licensed like this can be used in any other projects under a weaker set of rules (essentially under permissive rules), when the project is used as a library dependency. This would then even allow use in closed-sourced projects. However this weaker ruleset only applies if the project is being used as a library. If you wanted to make an actual fork of the project, or use some part of the source-code directly in your code-base, this exception doesn't apply and the copy-left still needs to be propagated. The more exceptions to this there are, the "weaker" that copy-left license is considered.

With this change, to distinguish between these licenses, original copy-left was classified as "strong copy-left", and this new weaker form of copy-left licenses was called "weak copy-left".

{{< notice warning >}} Weak copy-left licenses could end up causing some legal "gray zones" when the license isn't clear about when the copy-left should propagate and when it can be permissive. It is also important to say that these weak copy-left licenses haven't yet been tested in court.

Nevertheless, many people and companies do use them and they're generally trusted to be quite safe (that is, if you pick one that has been checked by a lawyer and is used commonly, ideally one of those listed in the OSI or FSF's approved list of licenses). {{< /notice >}}
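To make the strong/weak distinction a bit more concrete, here is a minimal Python sketch of the rules of thumb described above. The `Category` enum and the `can_use` function are names invented for this example. The model is deliberately conservative and works on license categories only, while real compatibility depends on the exact licenses and versions involved, so a `False` here really means "not without checking the specific licenses".

```python
from enum import Enum


class Category(Enum):
    PUBLIC_DOMAIN = "public domain"
    PERMISSIVE = "permissive"
    WEAK_COPYLEFT = "weak copy-left"
    STRONG_COPYLEFT = "strong copy-left"
    PROPRIETARY = "proprietary"


def can_use(dependency: Category, project: Category, as_library: bool) -> bool:
    """Rough rule of thumb: may `project` use code licensed as `dependency`?

    Simplifying assumption: copy-left code is only treated as usable by a
    project in the same copy-left category. Real licenses are more nuanced.
    """
    if dependency is Category.PROPRIETARY:
        return False  # no license, so no rights are granted at all
    if dependency in (Category.PUBLIC_DOMAIN, Category.PERMISSIVE):
        return True  # conditions such as attribution may still apply
    if dependency is Category.WEAK_COPYLEFT and as_library:
        return True  # the library exception applies
    # Otherwise the copy-left has to propagate to the project.
    return project is dependency


# A closed-source application depending on libraries of each kind.
print(can_use(Category.STRONG_COPYLEFT, Category.PROPRIETARY, as_library=True))   # False
print(can_use(Category.WEAK_COPYLEFT, Category.PROPRIETARY, as_library=True))     # True
print(can_use(Category.WEAK_COPYLEFT, Category.PROPRIETARY, as_library=False))    # False
print(can_use(Category.PERMISSIVE, Category.PROPRIETARY, as_library=False))       # True
```

The third call is the important one: copying weak copy-left source directly into the code-base falls outside the library exception, so the copy-left still propagates.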

Most commonly used strong copyleft licenses

  • GNU General Public License (GPL): The license that defined copy-left.
  • Affero General Public License (AGPL): Extension of the GPL (an even stronger copy-left) that enforces source code publishing even for a "service" use case (I won't get into details about that here, look it up if you're interested).
  • Sybase Open Watcom Public License: One of the strongest copy-left licenses, which prevents the "private usage" loophole of the GPL (which allows source code modification when you "deploy" the software for private use only, even for testing while developing a project covered by this license) and requires source code publishing in any use case. However, this can be way too limiting and makes it really hard to even develop the software covered by it, because each time an executable is built, the source code has to be made available, which led to the FSF not accepting this license as a "free software license".
  • Design Science License: The interesting thing about this license is that it can apply to any work, not just software/documentation, but also literature, artworks, music, ..., however it became irrelevant after the creation of "creative commons" licenses.

Most commonly used weak copyleft licenses

  • Lesser General Public License (LGPL): Libraries are treated permissively, major derivatives and direct source usage forces inheriting copy-left.
  • Adaptive Public License (APL): An incredibly detailed license which provides a template for users to slightly modify it to their exact needs.
  • Mozilla Public License (MPL): Uses files as the boundaries between MPL-licensed, and proprietary/otherwise licensed parts. If a file contains MPL code, copy-left is propagated to that file only.
  • Eclipse Public License (EPL): This license is made to be very similar to GPL, but with the intention of being more business-friendly, by allowing code under it to be linked into proprietary applications, and allowing binaries to be licensed under a proprietary license, as long as the source code is available under EPL.
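The file-level boundary that the MPL uses can be sketched in a few lines of Python. The `SourceFile` class, its fields and the file names are hypothetical, and the sketch only encodes the simplified rule from the list above (a file containing MPL code is covered, a file that merely imports it is not), not the full license text.

```python
from dataclasses import dataclass, field


@dataclass
class SourceFile:
    name: str
    # True if MPL-licensed code was copied into or modified in this file.
    contains_mpl_code: bool = False
    # Other files this one imports or links against.
    imports: list[str] = field(default_factory=list)


def files_that_must_stay_mpl(files: list[SourceFile]) -> set[str]:
    """File-level copy-left: only files containing MPL code are affected.

    Importing or linking against an MPL file does not pull the importing
    file under the MPL, so `imports` is intentionally ignored here.
    """
    return {f.name for f in files if f.contains_mpl_code}


project = [
    SourceFile("vendor/parser.py", contains_mpl_code=True),
    SourceFile("app/main.py", imports=["vendor/parser.py"]),
    SourceFile("app/billing.py"),
]

print(sorted(files_that_must_stay_mpl(project)))  # ['vendor/parser.py']
```

Note that `app/main.py` stays out of the result even though it imports the MPL-covered file, which is exactly what makes this form of copy-left "weak".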

Picking your license

Whenever you make a new project, you should also figure out how you want to license it category-wise. Should it be permissive? Should it be copy-left? Is it a library? If so, should it be weak copy-left? So on and so forth. After you know that, you should start considering the possible licenses in the category of your choosing. For example, if you settled on strong copy-left, should it be GPL-3, GPL-2, AGPL, or maybe something else entirely?

{{< notice tip >}} If you need some help on picking the license, check out some of these pages, which are made to simplify this process a bit:

  • Heather Meeker's license picker asks you a bunch of questions about your project, and shows you the license you'll want based on your answers.
  • TLDRLegal allows you to quickly see some key points about a given license, without the need to read the entire license's full-text.
  • Choosealicense is a webpage made by GitHub, which can show you points about some of the most commonly used licenses helping you pick the right one. {{< /notice >}}

{{< notice warning >}} If you want to add a license to an existing project, which didn't have any license before, or if you want to change the license, depending on many factors, you may face some difficulties. To understand how to go about this, check out [my other post]({{< ref "posts/changing-license" >}}) that talks specifically about changing licenses, or adding licenses to previously unlicensed code. {{< /notice >}}

Limitations

Before picking your license, you should be aware of the limitations it may create for your project. While you can obviously pick any license for your projects whatsoever (including no license at all), if you pick a permissive license, you immediately lose the right to use any copy-left licensed code (unless it's a weak copy-left and you're in the exception, or unless you [relicense]({{< ref "posts/changing-license" >}})). It is therefore very important to know the licenses of your dependencies and of the code you're going to be using in the codebase directly.

In the vast majority of cases, if you're using a packaging system (such as pip for Python, npm for Node, cargo for Rust, ...), it will create dependency listings. These will often fulfill the license's mention requirement, since the dependency's source URL will be mentioned by the package manager, and the original source is often included somewhere along with the license, fulfilling the include-license requirement.

In many cases, it is therefore enough to just add these libraries via the package manager without any worries, especially since most utility libraries are generally going to be licensed under permissive licenses or a weak copy-left license that allows library usage. However, this may not always be the case, and when that happens, you may need to do something extra to comply with that license (like making sure you're not using any software patents, not using the project's trademark, etc.). For that reason, you should ALWAYS make sure that you are in fact following the licenses of your dependencies.
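As a starting point for such a check in a Python project, the standard library's `importlib.metadata` can list what license each installed package declares. This is only a sketch: the function names `declared_license` and `installed_licenses` are made up here, the result is only as good as the metadata the package authors filled in, and it doesn't replace actually reading the licenses.

```python
from importlib import metadata


def declared_license(meta) -> str:
    """Best-effort lookup of the license a package's metadata declares."""
    # Newer packages may carry an SPDX expression; older ones use the
    # free-form License field or only "License ::" trove classifiers.
    declared = meta.get("License-Expression") or meta.get("License")
    if not declared:
        classifiers = meta.get_all("Classifier") or []
        names = [
            c.split("::")[-1].strip()
            for c in classifiers
            if c.startswith("License ::")
        ]
        declared = ", ".join(names)
    return declared or "UNKNOWN"


def installed_licenses() -> dict[str, str]:
    """Map every distribution installed in this environment to its license."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "<unnamed>"
        result[name] = declared_license(dist.metadata)
    return result
```

Calling `installed_licenses()` inside your project's virtual environment gives you a name-to-license mapping you can scan for anything marked `UNKNOWN` or for copy-left licenses you didn't expect.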

My personal preference

In most cases, I prefer strong copy-left. Specifically, I usually stick to the simple GPL v3 license, as it meets all of the things I'd want from a license. However, when I'm making libraries which I know many people may end up using, I don't want to discourage that usage by forcing them to propagate my license, as many people simply don't like licensing their code under a copy-left license. Weak copy-left works better for me there, and I choose LGPL v3 for cases like that.

Although to truly support open-source, it can sometimes make sense to use strong copy-left even on libraries, as it forces everyone using it to also keep their project open, and therefore this could really significantly help grow the open-source community, by forcing even companies to make their code open, because otherwise they just won't be able to use your library. Depending on how complex this library is, they might end up reimplementing it themselves, bypassing you entirely though, but if doing that is not so easy, it might simply not be worth it, and the company will instead decide to just go open-source, following your license.

Another benefit with using full GPL on libraries is that you get to utilize code from other GPLed code-bases, while with LGPL, this actually isn't possible, because GPL isn't compatible with it (you can't relicense GPL code into LGPL).

I'm generally against fully permissive licenses, as in my opinion, they simply give out too many permissions, and while in the ideal world, this would be fine, that's just not the world we're in. I get the people who license their code under these licenses, since they at least mean that the bigger companies will have a lot of open code in their code-bases, but I'm just fundamentally against my code being used in a proprietary project, at least without being compensated appropriately for it.

When I publish my code, I do it so that everyone can use it freely, to support the open-source community. I don't do it to support big companies, only to have my code used in their closed-sourced projects. Even worse would be having my entire project copied and relicensed, with a bunch of added telemetry and other user-intrusive features, to the point where it's basically spyware, with them then happily releasing it under their banner, selling and advertising it to everyone, without me having any say in it, since my license allowed it.

Instead, I just use a license which allows anyone to use my code, as long as they give out the same guarantees as I did with my project, allowing anyone to modify it, etc. And if some company would want to use my code in a different way, they can always contact me and we could discuss me giving them a different license over it (see [post about multi-licensing]({{< ref "posts/multi-licensing" >}})), for some compensation. I don't want to write code for companies which I'm not gonna be paid for. But I'm absolutely for writing code for people like me, who just want to use it in their own open-sourced projects, and for giving back to the community as a whole.

{{< notice tip >}} Another great license to consider is the Mozilla Public License, which uses a very interesting file-based weak copy-left propagation. It ensures that all of your code always stays open, even if it lives in an otherwise proprietary codebase, by requiring the file with MPLed code to be MPLed in its entirety, while the rest of the files, even if they then use the code (via imports/linking), won't be affected.

I still personally prefer strong copy-left, because it has this "give-back" property, but for people who just want their code to stay open, without caring as much about it being in a project that abuses it, this may be an amazing choice instead of going full permissive. {{< /notice >}}

Then again, this is purely my stance, and you should make up your own mind and decide what works best for you. I'm not here to force my opinion on anyone: it's your code, and I firmly believe that you should be able to do anything you want with it. I did, however, want to express why I feel this way about permissive licenses and what I like using, in the hope that it reaches someone who hasn't considered these arguments against permissive licenses. But again, it's your work and you can license it in any way you want.