I have been following the whole GitHub Copilot issue for a while, and I was not happy about Microsoft using open-source code hosted on GitHub without consent for training Copilot.
Consent is only one part of a bigger issue though. As everybody knows, Microsoft has used open-source code on GitHub to train Copilot, disregarding the licenses. This means many repositories containing GPL-licensed code have been used as well.
Using GPL-licensed code in this manner is problematic on two fronts. The first is legal. The GPL family of licenses doesn't allow source code to be closed. If you lift a function from a GPL-licensed codebase and integrate it into your code, you have to share your code under the same GPL license when you distribute it. The same is true for derivations of the code lifted from a GPL-licensed codebase. However, Copilot disregards licenses on both ends while deriving code from its training set. It can start from a GPL function or mix GPL-licensed code into a function licensed any other way. Since Copilot is available to anyone, there's a serious risk of injecting GPL-licensed code into any codebase, regardless of license compatibility.
This is a clear breach of the GPL license and is inexcusable.
The second issue is ethical. Microsoft is not transparently communicating how they trained Copilot. They give vague answers and try to downplay the issue while denying all responsibility. See the second paragraph of this document, which is quoted below:
You are responsible for ensuring the security and quality of your code. We recommend you take the same precautions when using code generated by GitHub Copilot that you would when using any code you didn't write yourself. These precautions include rigorous testing, IP scanning, and tracking for security vulnerabilities. GitHub provides a number of features to help you monitor and improve code quality, such as GitHub Actions, Dependabot, CodeQL and code scanning. All these features are free to use in public repositories. For more information, see "Understanding GitHub Actions" and "GitHub security features."
However, the issue is too serious to be downplayed. Copilot is a paid product, marketed to all developers writing all sorts of software. Using open-source code regardless of its license and giving it to anyone to do anything with is exploiting the developers who explicitly stated that they want their code to stay open and be developed in a certain way.
Moreover, GitHub contains all sorts of code, from leaked source code to codebases with wildly incompatible licenses. Not all of these codebases can be used while developing your product.
This is embrace, extend, extinguish at its finest, with a minefield attached to it.
In light of these events, I decided to stop using GitHub and move to Source Hut. Source Hut is the brainchild of Drew DeVault, a Free Software advocate. It's built with different priorities in mind, and deliberately without a social component. The software can be self-hosted if desired; however, I decided to pay for the service and support its development further.
While I have started my move to Source Hut, there won't be much activity visible at first. My private research repositories (which will eventually be opened once they are mature enough) are going first, then I'll archive my public archives and other public repositories, step by step.
This slow migration will allow me to ponder the plans for each project and make sure that I move them with utmost care and deliberation.
Meanwhile, many people will inevitably wonder about their code's visibility and discoverability if they move from the most crowded open-source hub to somewhere that is relatively obscure and has no social components.
I thought about this and decided that software freedom, ethics, self-respect, and doing the things you praise are much more important than being discoverable on a particular website. My homepage has a link to my Source Hut page, and if people click that, it's great.
Hope to see you over here.