An engineer on Apple’s static security tools team has publicly announced a prototype tool that applies security hardening across entire C++ codebases. The plan is ultimately to open-source this static-analysis-based tool and upstream it into LLVM.
The tool was mentioned in an LLVM Request For Comments (RFC) posted by Apple’s Jan Korous, which proposes a scalable static analysis framework. The focus is on carrying out static analysis and source code rewriting across the codebases of large software projects.
Here are some key excerpts from the RFC:
“We have a prototype of a source code rewriting tool that uses static analysis methods to apply security hardening across whole C++ codebases. We want to complete and upstream the tool. We are starting to work on other tools that need to reason about source code across large C, Objective-C and C++ projects. We also have a long-standing goal of enhancing the Clang Static Analyzer with analyses across translation units to improve its accuracy and precision, thereby reducing false positive rates. While there is an existing effort for cross-translation-unit analysis in Clang based on ASTImporter, we don’t think it models the software build with the accuracy we need, and it won’t be able to support the scale of the projects we target. This motivates us to create a summary-based cross-translation unit static analysis framework that would cover common needs of tools we want to create.
We plan to use it immediately to develop the tools mentioned above and other tools in the future. We intend to design and implement the framework incrementally in parallel with the initial client tools to make sure we are building the right thing.
This RFC is intended to kick off the development; it presents the high level ideas. We will post follow-up RFCs for each tool as well as specific proposals for parts of the design.
Problem
We intend to develop a number of tools based on static analysis methods that will need to infer, represent, and analyze facts about program entities of large software projects composed of numerous separately built targets. The tools will need to accurately represent relations within all the input source code of such software. They will need an efficient way to infer, represent, process, aggregate, and consume metadata about large amounts of code in order to implement the specific analyses.
Proposal
We will create a new framework that static analysis tools and source code rewriting tools can use to reason about the source code of large software projects. The framework will consist of new APIs, new data formats, new tools, and possibly new features in clang.”
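The excerpt does not spell out the design, so the following is only a rough sketch of what "summary-based" cross-translation-unit analysis generally means: each translation unit is reduced to a small summary on its own, and the summaries are then merged to answer questions that span files. Every name here (`TUSummary`, `unsafe_defs`, `transitively_unsafe`, the file and function names) is hypothetical and is not taken from the RFC or from Clang.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TUSummary:
    """Facts recorded for one translation unit, computed without reading any other."""
    name: str
    # Hypothetical fact: functions defined here that do an unchecked buffer access.
    unsafe_defs: set[str] = field(default_factory=set)
    # Call edges seen in this translation unit: caller -> set of callees.
    calls: dict[str, set[str]] = field(default_factory=dict)


def merge(summaries: list[TUSummary]) -> tuple[set[str], dict[str, set[str]]]:
    """Combine per-file summaries into one whole-program view."""
    unsafe: set[str] = set()
    calls: dict[str, set[str]] = {}
    for summary in summaries:
        unsafe |= summary.unsafe_defs
        for caller, callees in summary.calls.items():
            calls.setdefault(caller, set()).update(callees)
    return unsafe, calls


def transitively_unsafe(summaries: list[TUSummary]) -> set[str]:
    """Functions that are unsafe themselves or reach an unsafe function through calls."""
    unsafe, calls = merge(summaries)
    result = set(unsafe)
    changed = True
    while changed:  # iterate to a fixed point over the merged call graph
        changed = False
        for caller, callees in calls.items():
            if caller not in result and callees & result:
                result.add(caller)
                changed = True
    return result


# Two made-up translation units: the unsafe function lives in b.cpp,
# but its callers live in a.cpp.
a = TUSummary("a.cpp", calls={"main": {"parse", "log"}, "parse": {"read_raw"}})
b = TUSummary("b.cpp", unsafe_defs={"read_raw"}, calls={"log": set()})
print(sorted(transitively_unsafe([a, b])))
```

The point of the sketch is that no step needs the full source of more than one file at a time; only the small summaries are combined, which is the property that lets this style of analysis scale to large projects.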
Those interested in the scalable static analysis framework itself can find the RFC and further details on the LLVM Discourse. It will be interesting to see where this leads and, ultimately, how Apple’s tool for security-hardening large C++ codebases turns out.
