Stop Signing the Container, Start Signing the Content
2026-01-19
Jason Smith

In my current work with the OpenSSF SBOM Working Group, I am leading a group to create a best practices guide for SBOM signing. The #1 hill I’m prepared to die on? Canonicalization is a MUST HAVE.
Today, most tools treat an SBOM like a generic “binary blob”. If an IDE changes 2-space indents to 4, the signature breaks. If you change between a “pretty-print” version and a “minified” version, the signature breaks. That’s because these tools are signing the formatting (the container), not the data (the content).
The JCS Standard (RFC 8785)
For JSON-based SBOMs, the industry standard for fixing this is RFC 8785: JSON Canonicalization Scheme (JCS). The very first sentence of the abstract says it all:
Cryptographic operations like hashing and signing need the data to be expressed in an invariant format so that the operations are reliably repeatable. One way to address this is to create a canonical representation of the data. Canonicalization also permits data to be exchanged in its original form on the “wire” while cryptographic operations performed on the canonicalized counterpart of the data in the producer and consumer endpoints generate consistent results.
By using JCS, we ensure that the JSON is normalized (deterministic property ordering, no insignificant whitespace, etc.) before the hash is calculated, so signing and verification produce consistent results.
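To make that concrete, here is a minimal Python sketch of the canonicalize-then-hash step, using only the standard library. It approximates JCS (deterministic key ordering, no insignificant whitespace, literal UTF-8); full RFC 8785 also pins down number serialization, so a dedicated JCS library is the right tool for production signing.

```python
import hashlib
import json

def canonical_hash(sbom: dict) -> str:
    """Hash the data of a JSON SBOM, not its formatting.

    Standard-library approximation of RFC 8785 (JCS). Real JCS also mandates
    ECMAScript-style number serialization, so use a dedicated library when
    actually signing.
    """
    canonical = json.dumps(
        sbom,
        sort_keys=True,         # deterministic property ordering
        separators=(",", ":"),  # strip all insignificant whitespace
        ensure_ascii=False,     # emit literal UTF-8, not \uXXXX escapes
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```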
The “Verification Guesswork” Problem
Without a canonical “invariant format,” verification becomes a game of guesswork. In my own SBOM verification library, I’ve had to implement “multi-pass” logic just to handle the lack of industry consistency.
My current “guesswork” workflow (a rough code sketch follows the list):
- The Standard Path: Canonicalize the JSON, normalize the data, remove the “noise”, compute the hash, and verify.
- The Fallback (The “Blob” Guess): If the standard path fails, attempt a raw binary blob verification on the file as-is.
- The Failure: If both methods fail, I conclude the signature is invalid.
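For illustration, here is a hedged Python sketch of that multi-pass logic, not the actual code from my library. The expected_digest parameter stands in for whatever value the signature envelope protects; real verification checks a signature, not a bare digest.

```python
import hashlib
import json

def verify_multipass(raw_bytes: bytes, expected_digest: str) -> bool:
    """Illustrative multi-pass verification mirroring the workflow above."""
    # Pass 1 - the standard path: canonicalize the data, then hash it.
    try:
        data = json.loads(raw_bytes)
        canonical = json.dumps(
            data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
        ).encode("utf-8")
        if hashlib.sha256(canonical).hexdigest() == expected_digest:
            return True
    except ValueError:
        pass  # not parseable JSON; fall through to the blob guess

    # Pass 2 - the fallback "blob" guess: hash the file exactly as delivered.
    if hashlib.sha256(raw_bytes).hexdigest() == expected_digest:
        return True

    # Pass 3 - both paths failed: conclude the signature is invalid.
    return False
```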
But even this is a simplified view. In reality, the guesswork goes deeper. If verification fails, I find myself trying to “fix” the file to find the original state. Was the SBOM minified prior to signing but delivered as a pretty-print version? Were the JSON properties re-ordered by a middle-man tool? Did some other non-standard normalization occur?
This “verification tax” is exactly what we are trying to solve. In a mature ecosystem, there should be one path to trust. A consumer shouldn’t have to play detective to determine whether a producer signed the data or the file.
By adopting RFC 8785 (JCS), we ensure that as long as the facts remain the same, the signature remains valid with no guesswork required.
The core problems we need to solve are:
- Repeatability: Identical data always yields the same hash.
- Interoperability: A producer in one environment and a consumer in another will reach the same result.
- Resiliency: The signature survives a trip across different filesystems and tools.
Repeatability
This is the mathematical “ground truth.” It ensures that regardless of how many times a file is processed, as long as the underlying information is the same, the resulting hash is identical. Without an invariant format like JCS, the signature becomes “one-time-use,” failing the moment the file is re-saved or slightly altered. Repeatability transforms signing from a fragile snapshot into a reliable identity.
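A small demonstration of that idea, reusing the standard-library approximation of JCS from earlier: hashing the raw bytes of a pretty-printed and a minified copy of the same data produces two different values, while hashing the canonical form produces one.

```python
import hashlib
import json

pretty = b'{\n  "bomFormat": "CycloneDX",\n  "specVersion": "1.6"\n}'
minified = b'{"specVersion":"1.6","bomFormat":"CycloneDX"}'

def canonical_hash(raw: bytes) -> str:
    data = json.loads(raw)
    canonical = json.dumps(
        data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# Signing the container: the "same" SBOM yields two different hashes.
assert hashlib.sha256(pretty).hexdigest() != hashlib.sha256(minified).hexdigest()

# Signing the content: both serializations yield one repeatable hash.
assert canonical_hash(pretty) == canonical_hash(minified)
```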
Interoperability
This is about breaking “tool-lock.” True interoperability means that a producer using a Go-based tool on a Linux server and a consumer using a Python-based verifier on a Windows machine will always reach the same conclusion. By performing cryptographic operations on a standardized canonical counterpart, we remove “language bias” and ensure that the software supply chain isn’t dependent on a single vendor’s implementation.
Resiliency
This is the ability of a signature to survive the “real world.” In a modern CI/CD pipeline, SBOMs are moved across different filesystems, uploaded to cloud storage, and opened by various security tools - all of which might change line endings, property orderings, or indentation. Crucially, a resilient signature survives the transition between a ‘minified’ version used for machine efficiency and a ‘pretty-printed’ version used for human review. A resilient signature ignores these “atmospheric” changes to the file and remains valid as long as the core data remains untouched, allowing the chain of trust to stay intact from build to production.
Why CycloneDX is the Gold Standard for This
CycloneDX doesn’t just “support” signing; it is built for data-aware integrity. While SPDX treats signing as an afterthought or an external “wrapper”, CycloneDX uses two built-in architectural features that make it uniquely resilient: embedded signatures and property exclusion. Because these features are baked into the formal CycloneDX specifications, supporting them is not optional; it is a mandate for any tool claiming spec-compliance.
Embedded Signatures
In CycloneDX, the signature may live inside the SBOM. This makes a traditional “blob” verification mathematically impossible. To verify the file, a tool must:
- “Reach in” and locate the signature block.
- Extract that block before performing the verification math.
The Catch: Once you remove the signature block, you are left with a “hole” in the file. Do you remove the trailing comma? The extra newline? The surrounding whitespace?
Without JCS, you are back to guesswork. JCS solves this by providing a deterministic way to re-normalize the remaining data, ensuring the hash matches the producer’s original intent regardless of how the file was edited to remove the signature.
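Here is a rough sketch of that flow. It assumes the signature block sits under a top-level “signature” property, as in CycloneDX’s JSON Signature Format, and uses a digest comparison as a stand-in for the real public-key verification of that block.

```python
import hashlib
import json

def verify_embedded(sbom_bytes: bytes, expected_digest: str) -> bool:
    """Verify an SBOM whose signature block is embedded in the document."""
    doc = json.loads(sbom_bytes)

    # "Reach in" and extract the signature block before doing any math.
    signature_block = doc.pop("signature", None)
    if signature_block is None:
        return False  # nothing embedded to verify

    # No trailing-comma or whitespace "hole" to guess about: JCS re-normalizes
    # whatever remains into one deterministic byte sequence.
    canonical = json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest() == expected_digest
```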
Property Exclusion
The CycloneDX signature specification explicitly allows for property exclusion. This means a producer can choose to exclude specific fields or objects from the cryptographic hash while still maintaining a valid signature for the rest of the document.
The debate over whether or why a file should be re-signed after a change is moot here. The reality is that because the specification supports exclusion, the signing and verification process must be built to handle it.
Why this necessitates JCS: When a property is excluded during the signing or verification process, you aren’t just “skipping” a line of text. You are computationally removing a piece of a data structure. This creates a fundamental problem for “blob” signing:
- How do you handle the remaining whitespace or commas?
- How do you ensure the remaining data is structured exactly as it was when the hash was first calculated?
If you treat the SBOM as a “blob,” any exclusion instantly breaks the byte-for-byte match. JCS is the only way to fulfill the CycloneDX specification. It allows the verifier to strip the excluded properties and then re-normalize the remaining data into a “canonical” state.
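A hedged sketch of that exclusion step follows. The dotted-path exclusion list is purely illustrative and is not the exact exclusion syntax defined by the CycloneDX signature specification.

```python
import hashlib
import json

def hash_with_exclusions(sbom: dict, excludes: list[str]) -> str:
    """Hash an SBOM after computationally removing excluded properties."""
    doc = json.loads(json.dumps(sbom))  # work on a deep copy
    for path in excludes:
        node = doc
        *parents, leaf = path.split(".")
        for key in parents:
            node = node.get(key, {})
        node.pop(leaf, None)  # remove a piece of the data structure, not a line of text

    # Re-normalize what remains; no stray commas or whitespace to guess about.
    canonical = json.dumps(
        doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

sbom = {
    "bomFormat": "CycloneDX",
    "metadata": {"timestamp": "2026-01-19T00:00:00Z", "component": {"name": "app"}},
}
# If producer and consumer both exclude the timestamp, they reach the same hash.
digest = hash_with_exclusions(sbom, ["metadata.timestamp"])
```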
Without JCS, a verifier cannot support one of the core features of the CycloneDX integrity model. To be spec-compliant, you must be data-aware.
The Road Ahead
While I’ve focused heavily on CycloneDX and JSON, these principles of data-aware integrity are universal. To achieve true global scale, the same logic must be applied across all formats:
- SPDX: Because SPDX currently lacks an internal signing specification, it relies on external tools (like Cosign) to handle signing and verification. For the ecosystem to scale and remain interoperable, these tools MUST adopt these same data-aware best practices. If external tools only sign the “blob,” they inherit all the fragility discussed here.
- XML: The logic remains identical. For XML-based SBOMs, we achieve repeatability and resiliency by using Canonical XML Version 1.1 (c14n11) to normalize the data before hashing; a rough sketch follows this list.
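As a rough illustration of the same principle for XML, Python's standard library exposes Canonical XML via xml.etree.ElementTree.canonicalize. Note that the stdlib implements Canonical XML 2.0 rather than 1.1, so treat this as a demonstration of the approach, not of c14n11 itself.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

def canonical_xml_hash(xml_text: str) -> str:
    """Hash an XML SBOM over its canonical form rather than its raw bytes."""
    canonical = canonicalize(xml_text, strip_text=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Indentation differences no longer change the hash.
spaced = '<bom xmlns="http://cyclonedx.org/schema/bom/1.6">\n  <metadata/>\n</bom>'
packed = '<bom xmlns="http://cyclonedx.org/schema/bom/1.6"><metadata/></bom>'
assert canonical_xml_hash(spaced) == canonical_xml_hash(packed)
```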
The Bottom Line: If our signatures aren’t repeatable and resilient to formatting changes, the “Chain of Trust” is an illusion. It will break the moment it hits a different developer’s machine, a different cloud provider, or a different CI/CD tool. We need to move toward a future where we sign the facts, not the formatting.
What’s your take? Should a signature survive a “Pretty Print” command, or is a bit-for-bit file match the only way? I’d love to hear your opinion, especially if you disagree.