Overview

Getting started with SPDX is easier than you think. There are three different ways in which you can engage with SPDX. They are independent of one another, meaning you do not have to do one to do another. That said, each one is valuable in its own way for helping with license identification and ultimately compliance.

In addition to the resources here you are encouraged to attend the General Monthly Meeting. Frequently we have a guest speaker from business or the community who presents on their use of SPDX. It’s a great way to see what others are doing and to share or ask questions.

The System Package Data Exchange (SPDX®) specification is an open standard designed to facilitate the communication of Bill of Materials (BOM) information across diverse domains, including software, artificial intelligence (AI), datasets, and system components. SPDX enables organizations to document, share, and manage metadata critical to understanding and maintaining software supply chains, ensuring transparency, compliance, and security.

What is SPDX?

SPDX is a collaborative effort driven by the Linux Foundation and supported by a global community of developers, organizations, and industry experts. By adopting SPDX, you can contribute to building a more transparent, secure, and efficient software ecosystem.

SPDX provides a standardized framework for creating and exchanging detailed metadata about system components, their relationships, and associated information. It defines an underlying data model and supports multiple serialization formats, enabling interoperability across tools, platforms, and industries. Originally focused on software licensing, security, and composition, SPDX has expanded with version 3.0 to cover broader areas such as AI models, datasets, and system lifecycle information such as build information. SPDX 3.0 is a major revision of SPDX 2.2.1, which is also freely available as ISO/IEC 5962:2021 (SPDX® Specification V2.2.1).

Key Features of SPDX 3.0

SPDX 3.0 introduces significant enhancements to support the evolving needs of modern software ecosystems and related domains. Key features include:

Software Composition:

Metadata for collections of software (Packages), individual Files, and portions of files (Snippets).

Detailed information about dependencies, bundled components, and optional elements.

Software Build Information:

Documentation of build processes, tools, and configurations used to create software artifacts.

Artificial Intelligence (AI) Models:

Support for describing AI models, including their training datasets, provenance, and associated metadata.

Datasets:

Metadata for datasets used in software development, AI training, or other applications, including licensing, provenance, and integrity.

Creator, Supplier, and Distributor Identity:

Information about the entities involved in creating, supplying, and distributing software or system components.

Provenance and Integrity:

Tracking the origin and history of components, including checksums and cryptographic hashes to ensure integrity.

Licenses and Copyrights:

Comprehensive licensing information, including:

A curated list of SPDX license identifiers and exceptions.

License expressions for multi-license scenarios.

Copyright notices and statements.

Security Vulnerabilities and Quality Data:

Integration of security vulnerability data, defect reports, and other quality-related information to support risk assessment and mitigation.

Relationships Between System Elements:

Explicit relationships between components, such as dependencies, inclusion, or exclusion, enabling detailed modeling of complex systems.

Software Usage and Lifecycle:

Metadata describing how software is used, maintained, and retired, including lifecycle stages and usage policies.

Annotations and Linking:

Mechanisms to annotate SPDX elements with additional information and link between multiple SPDX documents for distributed systems.
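As an illustration of the integrity feature listed above, the following minimal Python sketch computes a SHA-256 digest for a file's contents and later checks the contents against it. The record layout and the function names (describe_file, verify) are assumptions made for this example; they are not the SPDX data model or any SPDX tool's API.

```python
import hashlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def describe_file(name: str, data: bytes) -> dict:
    """Build an illustrative record pairing a file name with its digest.

    The keys used here are hypothetical and chosen for readability only.
    """
    return {
        "name": name,
        "algorithm": "sha256",
        "hashValue": sha256_of_bytes(data),
    }


def verify(record: dict, data: bytes) -> bool:
    """Check that the data still matches the digest stored in the record."""
    return record["hashValue"] == sha256_of_bytes(data)


if __name__ == "__main__":
    record = describe_file("hello.txt", b"hello world\n")
    print(record)
    print(verify(record, b"hello world\n"))
    print(verify(record, b"tampered\n"))
```

A consumer that receives both the component and its recorded digest can recompute the hash and detect any change to the content.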

Why Use SPDX?

SPDX provides a robust solution for managing BOMs and related metadata, offering the following benefits:

  • Transparency: Gain visibility into the composition, provenance, and licensing of software and system components.
  • Compliance: Simplify the process of managing licensing obligations, copyright notices, and security vulnerabilities.
  • Interoperability: Standardize communication across tools, organizations, and industries, reducing friction in the software supply chain.
  • Automation: Enable tools to generate, validate, and consume SPDX documents for faster and more accurate workflows.
  • Scalability: Support complex systems with diverse components, including software, AI models, and datasets.
  • Integration: Connect graphs of bill of materials information with other graph-structured data for analysis.

SPDX Formats

SPDX documents can be serialized in multiple formats to suit different use cases:

  • JSON-LD: A lightweight, JSON-based format for representing Linked Data, making it easy to integrate structured data into web applications while being both human-readable and machine-processable.
  • Turtle (Terse RDF Triple Language): A user-friendly, compact format for writing RDF data, using a simple syntax to represent subjects, predicates, and objects in a way that’s easy to read and edit.
  • N-Triples: A simple, line-based format for representing RDF data as individual statements, where each line consists of a subject, predicate, and object, making it easy to read and process.
  • RDF/XML: An XML-based format for representing RDF data, allowing structured information about resources and their relationships to be stored and shared in a machine-readable way.
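To make the subject, predicate, object structure of N-Triples concrete, here is a small Python sketch that writes one statement per line. The IRIs under https://example.org/ are placeholders invented for this example and are not SPDX vocabulary terms; the Triple type and to_ntriples function are likewise hypothetical, and only plain string literals and IRIs are handled.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    """One RDF statement. The object is an IRI when obj_is_iri is True,
    otherwise it is treated as a plain string literal."""
    subject: str
    predicate: str
    obj: str
    obj_is_iri: bool = False


def escape_literal(value: str) -> str:
    """Escape the characters that must not appear raw in a quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples as N-Triples: one '<s> <p> o .' statement per line."""
    lines = []
    for t in triples:
        if t.obj_is_iri:
            obj_term = f"<{t.obj}>"
        else:
            obj_term = f'"{escape_literal(t.obj)}"'
        lines.append(f"<{t.subject}> <{t.predicate}> {obj_term} .")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Triple("https://example.org/pkg/demo", "https://example.org/vocab/name", "demo"),
        Triple(
            "https://example.org/pkg/demo",
            "https://example.org/vocab/dependsOn",
            "https://example.org/pkg/lib",
            obj_is_iri=True,
        ),
    ]
    print(to_ntriples(demo))
```

Because each line is a complete statement, such output can be processed with line-oriented tools, which is the main practical appeal of the format.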

SPDX and Industry Standards

SPDX 3.0 aligns with global standards for BOMs and software supply chain security, including:

  • NIST Guidelines: Supporting SBOM requirements outlined by the National Institute of Standards and Technology.
  • ISO/IEC Standards: Ensuring compatibility with international standards for software and system documentation.
  • OpenSSF Initiatives: Enhancing security and transparency in open source software supply chains.

The specification is also endorsed by the Object Management Group (OMG), ensuring broad industry adoption and interoperability.

Learn More

To explore SPDX 3.0 in detail, visit the following resources:

The easiest ways to start are to use the SPDX License List or to put the license identifiers in your code. Just start using them wherever it is appropriate or makes sense for you to do so. The most complete way to engage with SPDX is producing and/or consuming SPDX artifacts. Let’s look at each of these in turn.

License List

The SPDX License List is a list of commonly found licenses and exceptions used for open source and other collaborative software. The purpose of the SPDX License List is to enable easy and efficient identification of such licenses and exceptions in an SPDX document or elsewhere.

Some relevant points about the list:

  • More than 690 licenses and exceptions as of June 2025
  • Available on SPDX website – URLs won’t change
  • Simple expression language for expressing conjunctive and disjunctive licensing
  • Short license identifiers for easy reference
  • Exact text of licenses
  • License Matching Guidelines – for matching licenses against those included on the SPDX License List
  • License Templates denote license text which is optional or replaceable per the license matching guidelines
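The expression language mentioned above combines short identifiers with AND (conjunctive: all licenses apply) and OR (disjunctive: a choice of license). The Python sketch below parses a deliberately small subset of it, namely identifiers, AND, OR, and parentheses, with AND binding more tightly than OR. It does not handle WITH exceptions or validate identifiers against the License List, and the function names parse and satisfied are hypothetical, not part of any SPDX tool.

```python
import re

# Parentheses, or a run of characters that can appear in a short identifier.
# Characters outside this set (other than whitespace) are silently skipped,
# which a real validator should not do.
TOKEN = re.compile(r"\(|\)|[A-Za-z0-9.\-+]+")


def parse(expression: str):
    """Parse a simplified license expression into nested tuples.

    A bare identifier parses to a string; operators parse to
    ("AND", left, right) or ("OR", left, right).
    """
    tokens = TOKEN.findall(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_atom()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_atom())
        return node

    def parse_atom():
        tok = peek()
        if tok is None or tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {tok!r}")
        take()
        if tok == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return node
        return tok

    tree = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token: {peek()!r}")
    return tree


def satisfied(tree, accepted: set) -> bool:
    """Return True if the accepted identifiers are enough to meet the expression."""
    if isinstance(tree, str):
        return tree in accepted
    op, left, right = tree
    if op == "AND":
        return satisfied(left, accepted) and satisfied(right, accepted)
    return satisfied(left, accepted) or satisfied(right, accepted)


if __name__ == "__main__":
    tree = parse("MIT AND (GPL-2.0-only OR BSD-3-Clause)")
    print(tree)
    print(satisfied(tree, {"MIT", "BSD-3-Clause"}))
```

A policy check like satisfied is one way an organization might compare a component's declared licensing with an internal allow-list; real tooling would also need to handle exceptions and the remaining operators.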

Where to use it

There are many possible uses of the SPDX License List and its collateral. Here are just a few key ones:

  • Use the standardized short identifier anywhere you would display or exchange the identity of an open source license. You can even link back to the SPDX License List for the text of the license. The license list links are immutable and will not change. Note: Only use the identifier if the license text you are matching agrees with the text in the SPDX License List per the list matching guidelines. The power of the short identifier is that when you see it, you know exactly what the license text is!
  • Use the License List itself for internal reference or processes.
  • Use the matching guidelines and templates provided by the SPDX License List to help determine whether the license text you see matches a license on the list.
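Linking a short identifier back to the License List, as suggested above, can be sketched in a few lines of Python. The sketch assumes the https://spdx.org/licenses/<identifier>.html page pattern used by the list's web pages; the function name license_page_url is hypothetical, and the character check is a simple sanity filter, not a lookup against the actual list.

```python
import re

# Assumption: License List pages follow this pattern.
LICENSE_PAGE_BASE = "https://spdx.org/licenses/"

# Short identifiers use letters, digits, '.', '-' and '+'.
IDENTIFIER = re.compile(r"^[A-Za-z0-9.\-+]+$")


def license_page_url(identifier: str) -> str:
    """Return the License List page URL for a short identifier.

    This only checks that the identifier looks well formed; it does not
    confirm that the identifier is actually on the list.
    """
    if not IDENTIFIER.match(identifier):
        raise ValueError(f"not a plausible short identifier: {identifier!r}")
    return f"{LICENSE_PAGE_BASE}{identifier}.html"


if __name__ == "__main__":
    print(license_page_url("MIT"))
    print(license_page_url("Apache-2.0"))
```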

Prerequisites

Familiarity with the SPDX License List and where and how you will use it.

Examples

The following list is not meant to be exhaustive but rather to give you an idea of what some people and organizations are doing:

  • GitHub – GitHub uses the short identifiers from the SPDX License List in their Licenses API to represent the licensing of a project
  • Yotta – Yotta projects use the identifiers from the license list and link back to the license list text to represent the licensing of their project

Further information

The following links point to references and resources you may need when working with the license list.

SPDX License Identifiers in Source

The need to identify the license for open source software is critical for both reporting purposes and license compliance.

However, determining the license can be difficult due to a lack of information or ambiguous information. Even when licensing information is present, the lack of a consistent notation for it can make automated license detection very difficult, requiring vast amounts of human effort. The SPDX workgroup proposes using SPDX license identifiers to indicate the license at the file level. The advantages of doing this are numerous:

  • It is precise; there is no ambiguity due to variations in license header text
  • It is language neutral
  • It is easy to machine process
  • It is concise
  • The license travels with the file (sometimes only part of a project is used, or license files are removed)
  • It is simple and can be used without much cost in interpreted environments like JavaScript, etc.
  • An SPDX license identifier is immutable
  • It provides simple guidance for developers who want to make sure the license for their code is respected
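A file-level identifier is typically written as a comment near the top of the file, for example `// SPDX-License-Identifier: MIT`. The Python sketch below shows why this is easy to machine process: one regular expression finds the tag regardless of the comment style around it. The function name find_license_expression and the 20-line search window are assumptions made for this example, not requirements of SPDX.

```python
import re
from typing import Optional

# The tag text is fixed; whatever follows on the line is the license expression.
TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")

# Comment terminators that may trail the expression on the same line.
COMMENT_CLOSERS = ("*/", "-->")


def find_license_expression(source_text: str, max_lines: int = 20) -> Optional[str]:
    """Return the expression from the first SPDX tag in the leading lines.

    Only the first max_lines lines are searched (an arbitrary choice here).
    Returns None when no tag is found.
    """
    for line in source_text.splitlines()[:max_lines]:
        match = TAG.search(line)
        if match:
            expression = match.group(1).strip()
            for closer in COMMENT_CLOSERS:
                if expression.endswith(closer):
                    expression = expression[: -len(closer)].strip()
            return expression
    return None


if __name__ == "__main__":
    c_file = "// SPDX-License-Identifier: MIT\nint main(void) { return 0; }\n"
    css_file = "/* SPDX-License-Identifier: GPL-2.0-only OR MIT */\nbody {}\n"
    print(find_license_expression(c_file))
    print(find_license_expression(css_file))
    print(find_license_expression("no tag here\n"))
```

The same function works for any language whose comments fit on one line with the tag, which reflects the language-neutral property listed above.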

For more details, see https://spdx.dev/ids