SECADVENT DAY 7

A Quick Introduction to Software Bill of Materials and CycloneDX

Knowing what software you’re running sounds like a simple challenge.

The reality is very different. Modern software is often composed of a large number of independent services. Those services are themselves often built from a large number of open source libraries. Ecosystems vary, but it’s not unusual for a single application to use thousands of individual open source libraries.

Wikipedia describes a bill of materials (or BOM) as:

A list of the raw materials, sub-assemblies, intermediate assemblies, sub-components, parts, 
and the quantities of each needed to manufacture an end product.

 

BOMs are widely used across industries, especially where complex supply chains come together in the manufacturing of goods from individual components. A software bill of materials (or SBOM) is simply the application of this idea to software systems. (A good place to start reading about this is this SecAdvent post on Security Fractals.)

There are lots of use cases for SBOM, including but not limited to:

  • Software license compliance
  • Identifying out of date libraries
  • Finding vulnerable dependencies
  • Export restrictions and other compliance matters
  • Managing aggregate risk across a company software stack

Languages and standards

If you’re familiar with a specific language you might be thinking “ah, like package.json” or “like pom.xml?”. There is a wide range of package management tools, with just as wide a range of formats. These differ from an SBOM in important ways. Package manifests are generally about giving developers a good interface for declaring an application’s main dependencies. The package management tool then uses that information, along with the state of the world (for instance which versions of that software are currently available), to determine what to install.

For instance in Python you might create a requirements.txt file containing a single line, like so:

structlog

When you install this with pip or another suitable Python tool, it will:

  1. Determine a specific version of the package
  2. Determine whether the package has any dependencies
  3. Determine versions for those dependent packages
  4. Install all of them

So on their own, package manifests don’t work well as a bill of materials.
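To make that gap concrete, here’s a rough sketch of a check that flags requirement lines which don’t pin an exact version. The helper name unpinned_requirements is made up for this post, and the parsing is deliberately naive (it ignores extras, environment markers and URL requirements), so treat it as an illustration rather than a replacement for a real requirements parser.

```python
import re

# Matches a simple "name==version" requirement line. This is an assumption
# made for illustration; real requirement syntax is much richer.
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9._!+-]+$")


def unpinned_requirements(text):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


print(unpinned_requirements("structlog\n"))          # ['structlog']
print(unpinned_requirements("structlog==19.2.0\n"))  # []
```

The one-line manifest above is flagged because nothing in it says which version of structlog, or of its dependencies, actually ends up installed.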

Several package management tools can also save this more detailed information, recording the exact versions, and sometimes file hashes, of the required packages and their dependencies. You’ll find files like package-lock.json or Gemfile.lock. Back to our Python example, you might have something like this:

six==1.13.0 \
--hash=sha256:1f1b7d42e254082a9db6279deae68afb421ceba6158efa6131de7b3003ee93fd \
--hash=sha256:30f610279e8b2578cab6db20741130331735c781b56053c59c4076da27f06b66
structlog==19.2.0 \
--hash=sha256:6640e6690fc31d5949bc614c1a630464d3aaa625284aeb7c6e486c3010d73e12 \
--hash=sha256:4287058cf4ce1a59bc5dea290d6386d37f29a37529c9a51cdf7387e51710152
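As a minimal sketch of why those hashes are useful, the snippet below parses a hash-pinned requirements file like the one above and checks whether some downloaded bytes match one of the listed digests. The function names parse_pinned_hashes and matches_pin, the package examplepkg and the payload are all invented for illustration. pip does its own checking when hashes are present, so this only shows the idea.

```python
import hashlib
import re


def parse_pinned_hashes(text):
    """Map each 'name==version' entry to the set of sha256 digests listed for it."""
    # Join backslash-continued lines into one logical line per requirement.
    logical = re.sub(r"\\[ \t]*\n", " ", text)
    pins = {}
    for line in logical.splitlines():
        parts = line.split()
        if not parts:
            continue
        requirement = parts[0]
        digests = {
            part.split(":", 1)[1]
            for part in parts[1:]
            if part.startswith("--hash=sha256:")
        }
        pins[requirement] = digests
    return pins


def matches_pin(data, digests):
    """Return True if the sha256 of data is one of the allowed digests."""
    return hashlib.sha256(data).hexdigest() in digests


# Hypothetical artifact and lock entry, built here so the example is self-contained.
payload = b"example artifact bytes"
lock_text = (
    "examplepkg==1.0 \\\n"
    "--hash=sha256:" + hashlib.sha256(payload).hexdigest() + "\n"
)

pins = parse_pinned_hashes(lock_text)
print(matches_pin(payload, pins["examplepkg==1.0"]))      # True
print(matches_pin(b"tampered", pins["examplepkg==1.0"]))  # False
```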

While in some cases this can work as a general bill of materials, lots of software applications or systems are composed of components written in different languages, using different package management tools. Some of the software might also be unpackaged, or consist of dependent services run by others. You might have some Java code and a front end in JavaScript, all packaged up as a container with a specific JDK and supporting operating system software. Lock files and similar formats are intended as implementation details of the relevant package managers, more than as a general purpose bill of materials.

What we really need is something that covers all of the various languages and domains that make up a typical application, and something that’s focused on being consumed by other tools, more than just used within that package manager or ecosystem. What we need are standards.


What is CycloneDX?

CycloneDX is one project that’s been working on solving this problem for a while. Originally designed as part of work on OWASP Dependency-Track, the project now operates independently, with an active group of maintainers evolving the specification as well as supporting tools.

CycloneDX provides schemas for both XML and for JSON, defining a format for describing simple and complex compositions of software components. Here’s a quick sample.

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.2",
  "serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79",
  "version": 1,
  "components": [
    {
      "type": "library",
      "publisher": "Apache",
      "group": "org.apache.tomcat",
      "name": "tomcat-catalina",
      "version": "9.0.14",
      "hashes": [
        {
          "alg": "MD5",
          "content": "3942447fac867ae5cdb3229b658f4d48"
        },
        {
          "alg": "SHA-1",
          "content": "e6b1000b94e835ffd37f4c6dcbdad43f4b48a02a"
        },
        {
          "alg": "SHA-256",
          "content": "f498a8ff2dd007e29c2074f5e4b01a9a01775c3ff3aeaf6906ea503bc5791b7b"
        },
        {
          "alg": "SHA-512",
          "content": "e8f33e424f3f4ed6db76a482fde1a5298970e442c531729119e37991884bdffab4f9426b7ee11fccd074eeda0634d71697d6f88a460dce0ac8d627a29f7d1282"
        }
      ],
      "licenses": [
        {
          "license": {
            "id": "Apache-2.0"
          }
        }
      ],
      "purl": "pkg:maven/org.apache.tomcat/tomcat-catalina@9.0.14"
    }
  ]
}
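Because the format is plain JSON, consuming it needs nothing more than a JSON parser. Here’s a small sketch that pulls the package URL, version and license identifiers out of each component, using a trimmed copy of the sample above. The helper summarize_components is a name invented for this post, and it only looks at the handful of fields shown in the sample, not the full schema.

```python
import json

# Trimmed copy of the sample BOM above (hashes omitted for brevity).
SAMPLE_BOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.2",
  "version": 1,
  "components": [
    {
      "type": "library",
      "group": "org.apache.tomcat",
      "name": "tomcat-catalina",
      "version": "9.0.14",
      "licenses": [{"license": {"id": "Apache-2.0"}}],
      "purl": "pkg:maven/org.apache.tomcat/tomcat-catalina@9.0.14"
    }
  ]
}
"""


def summarize_components(bom):
    """Return (purl, version, [license ids]) for each component in a BOM dict."""
    rows = []
    for component in bom.get("components", []):
        license_ids = [
            entry["license"]["id"]
            for entry in component.get("licenses", [])
            if "id" in entry.get("license", {})
        ]
        rows.append((component.get("purl"), component.get("version"), license_ids))
    return rows


for row in summarize_components(json.loads(SAMPLE_BOM)):
    print(row)
```

A license compliance check or an out-of-date library report could start from a loop like this one.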

CycloneDX also has a wide range of tools, written for a variety of languages, to help with generating SBOMs from existing package management manifests or other inputs. For instance, let’s look at the CycloneDX Python module.

We can use that like so:

$ pip install cyclonedx-bom
$ pip freeze > requirements.txt
$ cyclonedx-py -j

cyclonedx-py reads a requirements.txt file (it needs explicit version numbers for each dependency, which pip freeze provides) and generates a bom.json file like the following:

{
    "bomFormat": "CycloneDX",
    "components": [
        {
            "description": "AWS Lambda Context class for type checking and testing",
            "hashes": [
                {
                    "alg": "MD5",
                    "content": "1876882e134cb2f95e45948994248b17"
                },
                {
                    "alg": "SHA-256",
                    "content": "d03b16aaf8abac30b71bc5d66ed8edadd8805e0d581f73f1e9b4b171635c817d"
                }
            ],
            "licenses": [],
            "modified": false,
            "name": "aws-lambda-context",
            "publisher": "New10 B.V.",
            "purl": "pkg:pypi/aws-lambda-context@1.1.0",
            "type": "library",
            "version": "1.1.0"
        },
        {
            "description": "Python 2 and 3 compatibility utilities",
            "hashes": [
                {
...

You could run this as part of your CI system to ensure SBOMs are kept up-to-date, and even store individual SBOMs for each commit or each released version of your software. Similar tools exist, both from the CycloneDX project and from the wider open source community, for Ruby, JavaScript, Maven, Gradle, Go, Rust and lots more.
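Once you’re storing an SBOM per commit or release, one simple thing to do with them is compare two and report what changed. The sketch below assumes each component carries a purl field, as in the samples above, and treats the purl as the component’s identity. The function names and the two tiny BOMs are invented for illustration.

```python
def component_purls(bom):
    """Collect the package URLs of all components that have one."""
    return {c["purl"] for c in bom.get("components", []) if "purl" in c}


def diff_boms(old, new):
    """Return (added, removed) purls between two BOM dicts, each sorted."""
    old_purls = component_purls(old)
    new_purls = component_purls(new)
    return sorted(new_purls - old_purls), sorted(old_purls - new_purls)


# Two made-up BOMs standing in for the SBOMs of consecutive releases.
previous = {"components": [
    {"purl": "pkg:pypi/aws-lambda-context@1.1.0"},
    {"purl": "pkg:pypi/six@1.13.0"},
]}
current = {"components": [
    {"purl": "pkg:pypi/aws-lambda-context@1.1.0"},
    {"purl": "pkg:pypi/six@1.14.0"},
]}

added, removed = diff_boms(previous, current)
print("added:", added)
print("removed:", removed)
```

Because the purl includes the version, an upgrade shows up as one removal plus one addition, which is usually what you want to review in a release.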

CycloneDX also supports various signing approaches, including XML Signatures, JSON Signature Format (JSF) and more, so you can ensure the integrity of SBOMs where non-repudiation is important.

Alternative formats exist too, for instance SPDX (Software Package Data Exchange) and Software Identification (SWID) Tags, and I wouldn’t be surprised to see others emerge as well. But CycloneDX’s active community and widespread tooling support make it worth further investigation.

What happens next?

While a software bill of materials as an idea isn’t new, there does appear to be increasing interest in the area, with several industry working groups active at present. Examples include the work under the NTIA (National Telecommunications and Information Administration, part of the US Department of Commerce) on Software Component Transparency, and the work under CISQ (Consortium for Information and Software Quality) on Software Bill of Materials exchange.

I think in 2021 we might just see wider adoption of some of these ideas and see standards drive some of that adoption into the software ecosystem. Widespread adoption here will mean not just producing lots of SBOMs but, most importantly for success, new and interesting tools that consume SBOMs and allow consumers to solve real world software supply chain problems.
