by Jilayne Lovejoy, Legal team Co-Chair
On December 28th, 2017, version 3.0 of the SPDX License List was released. This marks the achievement of several significant milestones for the project and represents over a year’s worth of work by members of both the legal and tech teams. Below is a summary of some of the key changes that are the foundation for improved maintainability and usability of the SPDX License List into the future.
A new master format for the SPDX License List
Since its inception, the master format for the SPDX License List had been a spreadsheet plus text files. Other consumable formats, such as the web pages, were generated from these master files. While this format was straightforward, it was mostly maintained by one person, did not lend itself to collaboration, and thus did not scale with the growing size and use of the SPDX License List.
Various proposals were discussed to both move away from the spreadsheet format and improve the implementation of the matching guidelines in the master files. The format that was developed uses an XML-style template for each license, with tags for the various fields and matching guidelines.
Converting from the old format to the new was no small task. This required an initial automated conversion followed by human checking of all 396 licenses and exceptions, some clean-up after that, and a lot of discussion to finalize the XML format along the way. The result is a new format for the SPDX License List master files as of version 3.0.
Better guidance for matching
The SPDX License List has long included matching guidelines to ensure consistent identification of licenses and use of SPDX identifiers. While the matching guidelines are human-readable/understandable, they pose challenges for full implementation in tools. This means tool-makers have been left to sort out the implementation, which can result in inconsistent matching across tools.
The new XML format captures items like bullets, copyright notices, licenses, and optional text in XML elements. This additional information allows tools to provide better matching as per the matching guidelines.
Since release 1.2 of the SPDX specification, SPDX has supported a template format which includes variable (or alternate) text and optional text. Elements such as bullets, titles and copyrights are translated into templates including the appropriate optional and variable text. With the introduction of the XML format and the significant work provided by the SPDX legal team, the number of variable or optional tags has increased from 85 in the 2.6 version of the license list to over 6,400 in the new 3.0 license list.
While this format will initially be used only internally by the SPDX legal team, that is, to generate the other consumable formats, once hardened it will enable more consistent and precise implementation of the matching guidelines by license matching tools and scanners.
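To make the idea of optional and variable text more concrete, here is a minimal sketch of how a matching tool might treat a template. It is an illustration only: the segment list, the `TEMPLATE` contents, and the `normalize`, `template_to_regex`, and `matches` helpers are all hypothetical and do not reflect the actual SPDX XML schema or any existing SPDX tool. The sketch assumes that matching ignores differences in whitespace and letter case.

```python
import re

# Hypothetical, simplified template: a list of (kind, value) segments.
#   "text"     -> required literal text
#   "optional" -> literal text that may be omitted entirely
#   "var"      -> replaceable text, described by a regular expression
TEMPLATE = [
    ("optional", "The Example License"),
    ("var", r"Copyright \(c\) .+?"),
    ("text", "Permission is hereby granted to use this software."),
]


def normalize(text):
    """Collapse every run of whitespace into a single space."""
    return " ".join(text.split())


def template_to_regex(template):
    """Build one regular expression from the template segments."""
    parts = []
    for kind, value in template:
        if kind == "text":
            parts.append(re.escape(normalize(value)))
        elif kind == "optional":
            parts.append("(?:" + re.escape(normalize(value)) + ")?")
        elif kind == "var":
            parts.append("(?:" + value + ")")
        else:
            raise ValueError("unknown segment kind: " + kind)
    # Segments may be separated by any amount of whitespace, or none.
    return re.compile(r"\s*".join(parts), re.IGNORECASE)


def matches(template, candidate):
    """Return True if the whole candidate text fits the template."""
    pattern = template_to_regex(template)
    return pattern.fullmatch(normalize(candidate)) is not None


with_title = (
    "The Example License\n"
    "Copyright (c) 2017 Jane Doe\n"
    "Permission is hereby granted to use this software."
)
without_title = (
    "Copyright (c) 2018 ACME\n\n"
    "Permission is hereby   granted to use this software."
)
altered = (
    "Copyright (c) 2018 ACME\n"
    "Permission is hereby granted to sell this software."
)

print(matches(TEMPLATE, with_title))
print(matches(TEMPLATE, without_title))
print(matches(TEMPLATE, altered))
```

In this sketch the first two candidates match, since the title is optional and the copyright line is variable, while the third does not, because its required text differs. A real implementation would also need to handle the remaining matching guidelines, such as punctuation and bullet variations.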
Clarified identifiers for GNU licenses
In a collaborative effort between the Free Software Foundation (FSF) and the SPDX working group to help facilitate clarity and better license identification practices, we have updated the short identifiers for the GNU family of licenses to support more precise and consistent usage.
SPDX has always had a way to identify the “this version only” and “any later version” options, for example via GPL-2.0 and GPL-2.0+ respectively. In practice, however, GPL-2.0 was not always used to mean “only version 2” as defined in the SPDX License List. It was often used by default to refer to the GPL version 2 text as drafted to include this version or any later version. The FSF was concerned about the potential confusion this could cause. Richard Stallman has posted an article explaining the background for this aspect of the GNU licenses and the root concern about copyright holders not identifying or being unclear regarding which option is intended.
By providing identifiers that are explicit as to “this version only” and “any later version”, we can be sure that SPDX users are reminded of the difference and that the right information is communicated.
As such, version 3.0 of the SPDX License List reflects the changes to the GNU family of licenses, following this pattern:
For each GNU license, the SPDX License List now contains a specific item and identifier for the two variations. The + operator is retained and can still be used with other licenses.
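For projects that need to update existing identifiers, the change can be sketched as a small mapping function. This is an illustration under stated assumptions, not an official SPDX tool: it assumes the `-only` and `-or-later` suffixes used for the new GNU identifiers in the SPDX License List, the `GNU_BASE_IDS` set is only an illustrative subset of the GNU family, and the `modernize` helper is hypothetical.

```python
# Illustrative subset of GNU license identifiers as they appeared
# before version 3.0 of the license list (not a complete list).
GNU_BASE_IDS = {"GPL-2.0", "GPL-3.0", "LGPL-2.1", "LGPL-3.0", "AGPL-3.0"}


def modernize(identifier):
    """Map a pre-3.0 GNU identifier to its explicit 3.0 form.

    Identifiers outside GNU_BASE_IDS are returned unchanged, so the
    "+" operator on other licenses is left alone.
    """
    has_plus = identifier.endswith("+")
    base = identifier[:-1] if has_plus else identifier
    if base not in GNU_BASE_IDS:
        return identifier
    return base + ("-or-later" if has_plus else "-only")


for old in ["GPL-2.0", "GPL-2.0+", "LGPL-2.1+", "MIT", "Apache-2.0+"]:
    print(old, "->", modernize(old))
```

Note that a mechanical rewrite like this only preserves what the old identifier said. Where `GPL-2.0` was used loosely, as described above, someone still has to check which option the copyright holder actually intended.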
Why 3.0 and not 2.7?
A major version number change was used for this release as a clear signal to users of the license list that this release was not business as usual. Given the significance of the changes to how licenses are represented internally, new tooling is now possible to improve the fidelity of license recognition. In addition, the changes to the short identifiers made to address the FSF's concerns about how the GPL licenses are represented, together with the deprecation of the prior GNU license identifiers, will impact some projects.
We want to thank everyone who made this release possible, as it was truly a team effort!