New Research Shows Open Source Software Supports 96% of Modern Applications, But Concerns Remain
A new study, "Census III of Free and Open Source Software: Application Libraries", has been released by Harvard Business School, Harvard's Laboratory for Innovation Science, the Linux Foundation, and the Open Source Security Foundation (OpenSSF). Building on two earlier censuses, it examines the application-level components that form the foundation of modern software.
The study analyzed over 12 million data points on open source software usage from 10,000 companies. The research team collaborated with the industry to collect anonymous data from multiple platforms, including automated scans of production codebases and comprehensive manual reviews of software compositions, to gain a deeper understanding of open source software usage and its indirect dependencies within software supply chains.
Key findings of the study include:
- Open source components are present in 96% of codebases.
- Use of proprietary (i.e., non-open source) packages tied to cloud services is rising rapidly.
- The industry's continued reliance on outdated Python 2 poses security risks.
- Since the second survey, the adoption of Rust has surged by 500%, marking a shift towards memory-safe programming.
- The lack of standardized naming for software components increases security risks.
- A small group of contributors drives major open source software, raising concerns about sustainability.
The full report can be downloaded for free from the Linux Foundation's official website: https://www.linuxfoundation.org/research/census-iii?hsLang=en
Exposure of Risk in Single-Maintainer Projects:
The report points out that in 40% of top projects, just 1 to 2 developers contribute over 80% of the code. Such a heavy concentration of work among so few maintainers is a potential security risk.
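The concentration figure above can be illustrated with a little arithmetic. The sketch below is only an illustration, not the report's methodology: it assumes commit counts are an acceptable proxy for "share of the code", and the function name `top_contributor_share` and the sample author list are hypothetical.

```python
from collections import Counter


def top_contributor_share(commit_authors, top_n=2):
    """Return the fraction of commits made by the top_n most active authors."""
    if not commit_authors:
        return 0.0
    counts = Counter(commit_authors)
    # Sum the commit counts of the top_n most frequent authors.
    top_commits = sum(n for _, n in counts.most_common(top_n))
    return top_commits / len(commit_authors)


# Hypothetical commit log: one entry per commit, holding the author's name.
authors = ["alice"] * 70 + ["bob"] * 15 + ["carol"] * 10 + ["dave"] * 5

share = top_contributor_share(authors)
# Flag the project if one or two people account for more than 80% of commits.
is_concentrated = share > 0.80
```

In this made-up log the two most active authors account for 85 of 100 commits, so the project would fall into the category the report describes.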
An example is the 2024 supply chain attack on XZ Utils, in which an attacker gained the main maintainer's trust by contributing frequently to the project. After becoming a maintainer, the attacker began injecting malicious code into the project, affecting a large number of downstream projects.
OpenSSF is working to address these challenges by ensuring that the source code that was reviewed is what people actually run. A major advantage of open source software is that it can be reviewed extensively to find vulnerabilities, whether intentional or unintentional.
However, if the reviewed code is not what is used to build the final product, the review becomes meaningless. OpenSSF's efforts therefore include strengthening the build and distribution processes so that the code actually running in production is the code that was reviewed.
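One simple way to picture "what runs is what was reviewed" is a digest check on a distributed artifact. The sketch below is an assumption-laden illustration, not OpenSSF's actual tooling: it assumes a trusted SHA-256 digest was published for the reviewed release, and the helper names `sha256_of_file` and `matches_reviewed_digest` are hypothetical. Real build-provenance systems involve considerably more than a hash comparison.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reviewed_digest(path, expected_hex):
    """Return True if the file's digest equals the digest of the reviewed release.

    expected_hex is assumed to come from a trusted, separately published source.
    """
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A check like this only shows that a downloaded file matches a published digest. It says nothing about whether that digest was itself produced from the reviewed source, which is the harder problem that hardening the build and distribution process is meant to solve.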
The Perennial Issue of Python 2:
Python 2 was released in 2000 and the Python 3 series in 2008. Python 2 reached end of life in January 2020, and the language now evolves only in the Python 3.x series.
The trouble is that Python 2 usage still stands at 20% to 30% in some industries. Running an outdated version of Python carries security risks, but the industry has yet to find a better way to move to newer versions.
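As a rough illustration of how lingering Python 2 code might be spotted, the sketch below asks the Python 3 interpreter to compile a piece of source text and reports whether it parses. This is a heuristic only, and the function name `parses_as_python3` is hypothetical: code that parses under Python 3 can still depend on Python 2 behavior, so a passing result proves little.

```python
def parses_as_python3(source):
    """Return True if the source text compiles under the running Python 3 interpreter."""
    try:
        # compile() only parses and byte-compiles the text; it does not run it.
        compile(source, "<candidate>", "exec")
    except SyntaxError:
        return False
    return True


# The Python 2 print statement is a syntax error in Python 3.
legacy_ok = parses_as_python3('print "hello"\n')
modern_ok = parses_as_python3('print("hello")\n')
```

Here `legacy_ok` is False and `modern_ok` is True, because the print statement without parentheses was removed in Python 3.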
OpenSSF believes that making version upgrades extremely easy would help drive adoption of new versions of open source software. In almost all cases, a new version should be fully backward compatible with older ones, especially the immediately preceding version, even though this requires extra effort from developers. This approach is seen as the right way forward.