Key Technologies for Software Supply Chain Security—Techniques for Generating and Using the List of Software Compositions (Part 1)

February 13, 2023 | NSFOCUS

The list of software compositions and the software bill of materials (SBOM) differ in the granularity required for the “minimum elements” of the software, but not substantially in technical ideas or implementation steps. Because SBOM generation tools and techniques are relatively mature, this document focuses on key SBOM techniques to explain how a list of software compositions is generated and used.

The SBOM that is being promoted internationally can be regarded as a standard for producing the list of software compositions. According to NTIA guidance, the SBOM that software companies are required to provide is a formal, machine-readable list that supports automated identification and management of the software supply chain. Ideally, each link in the supply chain requests an SBOM from its upstream link and provides an SBOM to its downstream link. An SBOM is expected to support multi-level component information (such as the operating system, installer, package, and file) and to be updated as components change (through updates, patches, etc.). Currently, there are three leading SBOM formats: SPDX [1], SWID [2], and CycloneDX [3].

The Software Package Data Exchange (SPDX) project, led by the Linux Foundation, aims to reduce ambiguity about software by defining standards for reporting information. SPDX reduces redundant work by giving companies and communities a common format for sharing important data. The SPDX specification is recognized as an international open standard (ISO/IEC 5962:2021) for communicating security, license compliance, and other software supply chain artifacts. SPDX supports numerous file formats (.xls, .spdx, .rdf, .json, .yml, and .xml) and describes components, licenses, and copyrights under an open standard.

NIST Software Identification (SWID) tags are available only in the .xml format. A tag is a structured set of data elements that identify the software product, its version, the organizations and individuals involved in producing and distributing it, component information, relationships between software products, and other descriptive metadata.

OWASP CycloneDX is a lightweight SBOM standard that supports .json and .xml.
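To make the idea of a machine-readable list concrete, the following is a minimal sketch of a CycloneDX-style JSON document built with the Python standard library. The component name, version, and content bytes are hypothetical, the spec version shown is only one of several published versions, and a real SBOM would normally be produced by a dedicated generator and carry many more fields.

```python
import hashlib
import json


def make_component(name: str, version: str, payload: bytes) -> dict:
    """Describe one component, including a SHA-256 digest of its content."""
    return {
        "type": "library",
        "name": name,
        "version": version,
        "hashes": [
            {"alg": "SHA-256", "content": hashlib.sha256(payload).hexdigest()}
        ],
    }


def make_bom(components: list) -> dict:
    """Wrap the components in a minimal CycloneDX-style JSON structure."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",  # assumption: any published spec version could be used
        "version": 1,
        "components": components,
    }


# Hypothetical component used purely for illustration.
bom = make_bom([make_component("example-lib", "1.2.3", b"example bytes")])
bom_json = json.dumps(bom, indent=2)
print(bom_json)
```

Because the result is plain JSON, a downstream link can parse it with any JSON library and compare it with its own inventory.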

Irrespective of the selected standard and format, it is necessary to establish a mechanism for generating and using the list of software compositions. Such a mechanism not only helps counter the growing number of software supply chain attacks, but also enables the middle and downstream links of the supply chain to understand the intentions of the upstream links, helps resolve conflicts in the internal environment, and gives enterprises a better understanding of software structure and software risk.

Generation Techniques for the List of Software Compositions

Two circumstances are worth considering when a list of software compositions is generated. The first is active: the software supplier provides the list of software compositions itself. In this case, the tool that generates the list can be integrated into the software version control system, so that composition updates can be tracked and historical versions traced to some extent. The second is passive: downstream links have to analyze the software components by themselves.

  • Suppliers actively provide compositions

From the perspective of DevOps, the software life cycle includes the stages of planning, development, building, testing, release, deployment, operation and maintenance, and monitoring. For a supplier in the middle of the software supply chain, procurement can also be treated as a stage separate from development.

If a software supplier generates the list of software compositions only after development is finished, it can rely only on existing design documents and test reports to extract information, and the software compositions have to be reconstructed manually. This is inefficient and makes it easy to miss the compositions of updated software. Worse still, the situation may become almost the same as the passive analysis mode, which increases the difficulty of analysis and compromises traceability.

How, then, should a supplier produce a list of software compositions? NTIA provides an “SBOM assembly line”, as shown in the figure below. In this model, software developers integrate SBOM generation into the entire DevOps process, so that the SBOM is built up at every DevOps step in a standardized, efficient, and streamlined manner.

NTIA – Survey of Existing SBOM Formats and Standards 2021[4]

The first stage is planning. In this stage, the design and plan of the software are clarified and added to the SBOM document. As mentioned above, procurement can be treated as an independent stage. The purchased software may be third-party tools, plug-ins, software packages, resource libraries, and other items necessary for development. If the third party can provide the corresponding SBOM documents, their statements can be referenced in the supplier's own SBOM. If the third party cannot provide SBOM documents, the supplier has to either find an alternative or accept the software as it is. The latter case leads to the second type of SBOM generation, the “passive” method, which is explained later.

The development stage includes both the initial programming and the later preparation of software patches. Development requires management systems such as SCM (Software Configuration Management) and VCS (Version Control System). For more advanced DevSecOps, this stage often also applies analysis technologies such as SCA (Software Composition Analysis) and SAST (Static Application Security Testing). At this point, the source code, generated files, and patch information should be entered into the SBOM.

In the building stage, once the build is completed, the build information should be written into the SBOM document. This concludes the initial development of the software, and the testing stage begins.

The testing process evaluates software performance, stability, availability, and other measurement criteria. DevSecOps also conducts security tests such as black-box testing, internal and external penetration testing, and interactive application security testing (IAST). After passing the test, the software is signed and certified, with all the standards and certificate information used written into the SBOM. The development cycle in the software life cycle is over for the time being.

In the software distribution stage, the SBOM should be completed so that it includes, but is not limited to, the minimum information required by NTIA (author information, supplier, product name, version number, hash value of component information, and identification) as well as a digital signature. Open-source software should also declare its license in the SBOM.
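A release pipeline could check these minimum elements before an SBOM is distributed. The sketch below is one possible way to do so; the field names (`author`, `supplier`, `name`, `version`, `hash`, `identifier`) are hypothetical labels for the items listed above and are not taken from any particular SBOM format.

```python
# Hypothetical field names standing in for the minimum elements listed above.
REQUIRED_FIELDS = ("author", "supplier", "name", "version", "hash", "identifier")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in one SBOM record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Illustrative records; all values are made up.
complete = {
    "author": "Example Author",
    "supplier": "Example Supplier",
    "name": "example-lib",
    "version": "1.2.3",
    "hash": "0" * 64,
    "identifier": "pkg:generic/example-lib@1.2.3",
}
incomplete = {"name": "example-lib", "version": "1.2.3"}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

A check like this can fail the release job when the returned list is not empty, so that incomplete SBOMs never reach downstream links.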

In the deployment stage, clauses, plug-ins, and configuration information can be attached to the SBOM.

Finally, in the maintenance/monitoring stage, it is best to record known security vulnerability information in a separate document, such as a VEX document. The VEX (Vulnerability Exploitability eXchange) concept and format were developed by the US National Telecommunications and Information Administration (NTIA)[4]. A VEX document lists the vulnerabilities in a given version of a software product or component, so that users can obtain the status of each vulnerability and evaluate its exploitability. The status tells users whether the vulnerability affects the software (not affected, fixed, under investigation, or affected) and, if the software is affected, whether any remediation actions are recommended.
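The four statuses described above can be modeled as a small data structure that a consumer uses to decide which vulnerabilities need follow-up. This is only a sketch: the class names, field names, and vulnerability identifiers are hypothetical and do not reproduce any published VEX schema.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NOT_AFFECTED = "not_affected"
    AFFECTED = "affected"
    FIXED = "fixed"
    UNDER_INVESTIGATION = "under_investigation"


@dataclass(frozen=True)
class VexStatement:
    vulnerability: str  # hypothetical identifier, not a real advisory
    product: str
    status: Status
    action: str = ""  # recommended remediation, if any


def needs_attention(statements: list) -> list:
    """Keep statements where the product is affected or the analysis is still open."""
    open_states = (Status.AFFECTED, Status.UNDER_INVESTIGATION)
    return [s for s in statements if s.status in open_states]


statements = [
    VexStatement("VULN-0001", "example-app 2.0", Status.NOT_AFFECTED),
    VexStatement("VULN-0002", "example-app 2.0", Status.AFFECTED, "upgrade to 2.1"),
    VexStatement("VULN-0003", "example-app 2.0", Status.FIXED),
    VexStatement("VULN-0004", "example-app 2.0", Status.UNDER_INVESTIGATION),
]

for s in needs_attention(statements):
    print(s.vulnerability, s.status.value, s.action)
```

Separating the status from the component list lets the vulnerability assessment change over time without regenerating the SBOM itself.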

It is worth mentioning that, of the three existing SBOM standards, SWID supports few languages, whereas SPDX and CycloneDX provide license libraries and SBOM generation plugins for Java, Python, JavaScript, and Go, and for build tools such as Maven.

  • Self-analysis of software compositions

If the supplier's first-hand list of software compositions or related information cannot be obtained, the recipient has to analyze the software itself and, if necessary, generate a list of software compositions. This method can also be used to verify the composition information of products and software provided by a supplier.

  • When the object is a closed-source program, software companies or end users need to use code compilation information and configuration files, binary analysis tools, reverse engineering, and other methods. The analysis can be improved by using similarity-based artificial intelligence algorithms to find the composition analysis tables of homologous or closely related programs.
  • When the object is an open-source program, software companies or end users cannot directly obtain a list of software compositions or update services, and have to maintain this information themselves. At present, enterprise users widely adopt SCA (Software Composition Analysis) for open-source software, a specialized approach to analyzing and managing the composition of open-source software, in order to manage the security risks of the open-source software they introduce. An automated SCA tool scans the source code of the application, including modules, frameworks, libraries, containers, registries, and other artifacts, to identify and inventory all open-source components and dependencies, and to detect known security vulnerabilities or potential licensing issues. Such scanning serves both risk investigation before an application system is put into production and diagnostic analysis while the system is in operation. In addition to providing visibility into open-source components, some SCA tools prioritize the remediation of open-source vulnerabilities or suggest workarounds based on the risk level of each vulnerability.
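The core of the SCA workflow described above, inventorying dependencies and matching them against known issues, can be sketched as follows. The sketch assumes dependencies are declared as pinned `name==version` lines and uses a hypothetical in-memory advisory table. Real SCA tools read many manifest formats, resolve transitive dependencies, and match version ranges rather than exact versions.

```python
def parse_pinned(lines: list) -> dict:
    """Build a {name: version} inventory from 'name==version' lines."""
    deps = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, version = line.split("==", 1)
        deps[name.strip().lower()] = version.strip()
    return deps


# Hypothetical advisory data keyed by (component, exact version).
ADVISORIES = {
    ("example-lib", "1.2.3"): ["VULN-0001"],
}


def find_known_issues(deps: dict, advisories: dict) -> dict:
    """Return {component: [advisory ids]} for components with known issues."""
    findings = {}
    for name, version in deps.items():
        ids = advisories.get((name, version))
        if ids:
            findings[name] = ids
    return findings


manifest = [
    "example-lib==1.2.3",
    "# a comment line",
    "other-lib==2.0.0  # pinned",
    "unpinned-lib",
]
deps = parse_pinned(manifest)
print(deps)
print(find_known_issues(deps, ADVISORIES))
```

The inventory produced by the first step is itself the raw material for a list of software compositions; the matching step is what turns it into a risk report.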

In the software supply chain, both upstream and downstream enterprises are recommended to establish their own lists of software compositions. They can either retrieve a list of software compositions from a database and compare the software against it to determine whether it has been tampered with, or use existing techniques to build a sound mechanism for generating, refining, and storing such lists. Both approaches play a positive role in software security.
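The comparison step can be illustrated with a minimal sketch that checks the SHA-256 digests recorded in a list of software compositions against the digests of the files actually received. The file names and contents are hypothetical, and the sketch assumes the recorded list itself is trustworthy (for example, because it is signed).

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest of a file's content, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def compare(recorded: dict, observed: dict) -> dict:
    """Compare recorded {file: digest} entries with the digests actually observed."""
    return {
        "modified": sorted(
            n for n in recorded if n in observed and recorded[n] != observed[n]
        ),
        "missing": sorted(n for n in recorded if n not in observed),
        "unexpected": sorted(n for n in observed if n not in recorded),
    }


# Hypothetical recorded list and delivered files.
recorded = {"app.bin": digest(b"v1"), "lib.so": digest(b"lib")}
observed = {"app.bin": digest(b"v1-tampered"), "extra.dll": digest(b"x")}

report = compare(recorded, observed)
print(report)
```

Any non-empty entry in the report indicates a difference between the delivered software and the recorded list that should be investigated.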

References

[1] International Open Standard (ISO/IEC 5962:2021) – Software Package Data Exchange (SPDX)

[2] NVD – SWID (nist.gov)

[3] OWASP CycloneDX Software Bill of Materials (SBOM) Standard

[4] NTIA SBOM formats and standards whitepaper. [Online]. Available: https://www.ntia.gov/files/ntia/publications/ntia_sbom_formats_and_standards_whitepaper_-_version_20191025.pdf