Analysis Tools for the List of Software Compositions
According to the Linux Foundation's classification [1], SBOM tools fall into three categories: produce, consume, and transform. Each category covers three functions. The produce category covers software composition analysis, automatic creation of SBOM documents, and manual editing; the consume category covers browsing (a human-readable view), SBOM file comparison, and SBOM file import; the transform category covers translation, merging, and tool support. A given analysis tool can belong to one of the three categories while also offering functions from the others.
Classification of SBOM Tools
Risk Management of Open Source License Authorization
Software developers, when using open source software in the software supply chain, face not only technical security risks, but also legal risks of violating license authorization.
An open source license is the set of terms that legally defines and constrains activities such as distributing, modifying, and reusing the source code and binary files of an open source project. It stipulates the rights and obligations of software developers and software users. An enterprise that discloses part of its own product's source code should choose an appropriate open source license to preserve its rights. Conversely, modifying or reusing code in violation of its open source license creates legal risk. Analyzing the licenses of the open source software that a product incorporates is therefore a prerequisite for using that software legally, and verifying license information is particularly important for large software systems.
In the open source field, the well-known international organizations for open source license certification include the Open Source Initiative (OSI) [2] and the Free Software Foundation (FSF) [3]. The OSI currently endorses more than 110 open source software licenses, and the FSF lists more than 90. The sheer number and variety of licenses make license analysis difficult and give rise to compatibility conflicts when components under different licenses are combined. How to choose license combinations for commercial software so as to reduce legal exposure is also a common research topic.
Among license compliance tools, the well-known FOSSology and the Open Source License Checker (OSLC) can inspect the open source licenses found in the source code of software projects.
With the introduction of the SBOM concept, Kapitsaki et al. presented work on automating license compatibility checks by proposing a process that examines the structure of Software Package Data Exchange (SPDX) documents, SPDX being a standard format for describing software components [4] [5]. This suggests that specifying open source license requirements in the list of software compositions lets end users evaluate and resolve license risks more easily and effectively. Securing the software supply chain therefore also means accounting for software license risks, which requires the ability to identify and manage licenses. The major premise of managing license risks remains unchanged, however: the components of the software and their dependencies must be clearly listed.
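To make the idea of checking licenses against a component list concrete, here is a minimal sketch in Python. It is not the process described in [4] [5]. The component names are hypothetical, and the rule table is deliberately tiny: it holds a single pair, reflecting the FSF's published position that Apache-2.0 code cannot be combined into a GPL-2.0-only work. Real compatibility depends on direction and on how components are combined, which this sketch ignores.

```python
from itertools import combinations

# Hypothetical, illustrative rule set: each frozenset is a pair of SPDX
# license identifiers that this sketch treats as conflicting.
INCOMPATIBLE_PAIRS = {
    frozenset({"Apache-2.0", "GPL-2.0-only"}),
}


def find_license_conflicts(components):
    """Return (name_a, name_b, license_a, license_b) for each conflicting pair."""
    conflicts = []
    for a, b in combinations(components, 2):
        # A pair of identical licenses yields a one-element set, which
        # never matches a two-element rule.
        if frozenset({a["license"], b["license"]}) in INCOMPATIBLE_PAIRS:
            conflicts.append((a["name"], b["name"], a["license"], b["license"]))
    return conflicts


# Hypothetical component list, as it might be extracted from an SBOM.
components = [
    {"name": "lib-a", "license": "Apache-2.0"},
    {"name": "lib-b", "license": "MIT"},
    {"name": "lib-c", "license": "GPL-2.0-only"},
]

conflicts = find_license_conflicts(components)
for name_a, name_b, lic_a, lic_b in conflicts:
    print(f"{name_a} ({lic_a}) conflicts with {name_b} ({lic_b})")
```

The check only works if every component and its license are listed, which is the premise stated above.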
Open Source Software Composition Management
The security risks caused by transitive dependencies are particularly prominent in open source projects. One example is the Apache Log4j2 vulnerability (CVE-2021-44228) disclosed in December 2021. Log4j2 is a foundational logging library, used almost as widely as the standard library, on which countless open source Java components depend directly or indirectly. According to the analysis of NSFOCUS researchers [32], the Log4j2 vulnerability deserves special attention across all scenarios and fields because of its wide impact and serious harm. Knowledge graph technology can be applied to analyze this large and complex chain of relationships.
Knowledge graph technology has been widely used in industry in recent years, with successful applications in vertical fields such as network security, medical care, law, and finance. In essence, a knowledge graph is a large semantic network that describes the concepts, entities, and events of the real world and the relationships among them. With entities and concepts as nodes and relationships as edges, the graph provides a way to see the world from the perspective of relationships.
CycloneDX provides a standardized description of open source components and their relationships. Based on it, the composition of open source components can be described as a relationship graph, as shown in the following figure.
Graph Relational Model Based on CycloneDX BOM
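As a rough illustration of how a CycloneDX BOM maps onto a graph, the sketch below reads a small CycloneDX JSON document and turns it into nodes and labeled edges. The field names (`components`, `bom-ref`, `licenses`, `dependencies`, `ref`, `dependsOn`) come from the CycloneDX JSON format. The sample data, the `app-lib` component, and the edge labels `DEPENDS_ON` and `LICENSED_UNDER` are assumptions made for this example, not the model in the figure.

```python
import json

# Hypothetical sample BOM; "app-lib" is an invented component.
BOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "components": [
    {"type": "library", "bom-ref": "app-lib@1.0", "name": "app-lib",
     "version": "1.0", "licenses": [{"license": {"id": "MIT"}}]},
    {"type": "library", "bom-ref": "log4j-core@2.3", "name": "log4j-core",
     "version": "2.3", "licenses": [{"license": {"id": "Apache-2.0"}}]}
  ],
  "dependencies": [
    {"ref": "app-lib@1.0", "dependsOn": ["log4j-core@2.3"]},
    {"ref": "log4j-core@2.3", "dependsOn": []}
  ]
}
"""


def bom_to_graph(bom):
    """Convert a parsed CycloneDX BOM into (nodes, edges)."""
    nodes = {}
    edges = []
    for comp in bom.get("components", []):
        ref = comp["bom-ref"]
        nodes[ref] = {"name": comp["name"], "version": comp.get("version")}
        # One edge per license that carries an SPDX identifier.
        for entry in comp.get("licenses", []):
            license_id = entry.get("license", {}).get("id")
            if license_id:
                edges.append((ref, "LICENSED_UNDER", license_id))
    for dep in bom.get("dependencies", []):
        for target in dep.get("dependsOn", []):
            edges.append((dep["ref"], "DEPENDS_ON", target))
    return nodes, edges


nodes, edges = bom_to_graph(json.loads(BOM_JSON))
for source, label, target in edges:
    print(f"{source} -[{label}]-> {target}")
```

Keeping edges as plain triples makes it easy to load them into a graph database later or to feed them to an in-memory graph algorithm.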
The knowledge graph can address the unclear representation of complex dependencies in open source software, making the composition of open source software semantic, visual, machine-readable, and open to reasoning:
The graph representation of the knowledge graph captures the deep relationships among the compositions of open source software (such as dependencies, references, and licenses);
A vulnerability knowledge base for open source software is linked to the graph, and a hierarchical propagation path search over the knowledge graph analyzes how far a vulnerability reaches, which is used to measure the severity of project vulnerabilities;
Graph algorithms are applied to the knowledge graph (dependency graph) to identify the riskiest key dependencies and to recommend effective remediation.
Open Source Software Knowledge Graph Ontology Model
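The propagation search and the ranking of key dependencies described above can be sketched as follows. This is only one plausible reading, under stated assumptions: the search is a plain breadth-first traversal over reversed dependency edges, and "riskiest" is approximated by the number of components that transitively depend on a node. All component names except log4j-core are hypothetical, and a production system would likely weigh vulnerability severity as well.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges, written as (dependent, dependency).
EDGES = [
    ("web-app", "service-lib"),
    ("web-app", "json-lib"),
    ("service-lib", "log4j-core"),
    ("batch-job", "service-lib"),
]


def reverse_adjacency(edges):
    """Map each dependency to the set of components that depend on it."""
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)
    return dependents


def affected_by(vulnerable, edges):
    """Breadth-first search upward from a vulnerable component.

    Returns {component: number of hops from the vulnerable component}.
    """
    dependents = reverse_adjacency(edges)
    distance = {vulnerable: 0}
    queue = deque([vulnerable])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in distance:
                distance[parent] = distance[current] + 1
                queue.append(parent)
    del distance[vulnerable]  # report only the other affected components
    return distance


def rank_by_reach(edges):
    """Rank components by how many others transitively depend on them."""
    all_nodes = {node for edge in edges for node in edge}
    reach = {node: len(affected_by(node, edges)) for node in all_nodes}
    return sorted(reach.items(), key=lambda item: (-item[1], item[0]))


affected = affected_by("log4j-core", EDGES)
ranking = rank_by_reach(EDGES)
print(affected)
print(ranking)
```

The hop count gives a simple layering of the impact (direct dependents first), and the ranking points at the components whose compromise would touch the most of the project.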
In security analysis, the knowledge graph of open source software helps security personnel assess the risks of open source software comprehensively:
1. Based on the attributes of an open source component (such as the hash values of its source code, packages, and other files), it can determine whether the component has been repackaged for fraudulent use or tampered with;
2. Using the dependencies between open source components together with component version information, other affected components can be identified quickly during emergency response to a high-risk vulnerability, and graph optimization techniques such as graph aggregation and subgraph splitting allow risk analysis results to be produced efficiently and continuously.
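The hash-based check in item 1 can be illustrated with a short sketch. It assumes the component record carries a `hashes` list with `alg` and `content` fields, as in the CycloneDX JSON format. The component name, the artifact bytes, and the three result labels are made up for the example.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_component(artifact_bytes, sbom_component):
    """Compare an artifact's SHA-256 with the hash recorded in the SBOM.

    Returns "match", "mismatch", or "unknown" (no SHA-256 recorded).
    """
    recorded = {
        entry["alg"]: entry["content"].lower()
        for entry in sbom_component.get("hashes", [])
    }
    expected = recorded.get("SHA-256")
    if expected is None:
        return "unknown"
    return "match" if sha256_of(artifact_bytes) == expected else "mismatch"


# Hypothetical artifact and SBOM entry; the recorded hash is computed here
# only so that the example is self-contained.
original = b"example artifact bytes"
component = {
    "name": "example-lib",
    "hashes": [{"alg": "SHA-256", "content": sha256_of(original)}],
}

print(verify_component(original, component))
print(verify_component(original + b" tampered", component))
print(verify_component(original, {"name": "no-hash-lib"}))
```

A mismatch does not say what changed, only that the artifact differs from the one the SBOM describes, so it is a trigger for further investigation rather than a verdict.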
The figure below shows the relationships among multiple high-risk vulnerabilities in the log4j-core@2.3 component. If software references an open source component with a high-risk vulnerability, that component poses a potential threat to the software product. The same holds for the software products delivered by a software supplier.
log4j Component Dependency Graph
References
[1] https://linuxfoundation.org/wp-content/uploads/LF-Live-Generating-SBOMs.pdf
[2] https://opensource.org/
[3] The GNU Operating System and the Free Software Movement
[4] Kapitsaki G M, Kramer F. Open Source License Violation Check for SPDX Files[C]// International Conference on Software Reuse. Springer, Cham, 2015: 90-105.
[5] Kapitsaki G M, Kramer F, Tselikas N D. Automating the license compatibility process in open source software with SPDX[J]. Journal of Systems & Software, 2016.
Previous posts on software supply chain security:
- Software Supply Chain Security: Overview
- Threats against Software Supply Chain Security
- The Increasing Trend of Software Supply Chain Attacks
- The Increasingly Complex and Varied Vectors to Attack Software Supply Chain
- Security Concept for Software Supply Chain (Part 1) — Transparency of Software Supply Chain Compositions
- Security Concept for Software Supply Chain (Part 2) — Assessable Capabilities of Software Supply Chain Compositions
- Security Concept for Software Supply Chain (Part 3) – Building Trusted Software Supply Chain
- Relationship Between Security Concept and Security Assessment for Software Supply Chain
- Technical Framework of Software Supply Chain Security
- Key Technologies for Software Supply Chain Security—Techniques for Generating and Using the List of Software Compositions (Part 1)