AWS CloudHSM key attestations: trust but verify


(Or, scraping attestations from half-baked AWS utilities)

Verifying key provenance

Special-purpose cryptographic hardware such as an HSM is one of the best options for managing high-value cryptographic secrets, such as private keys controlling blockchain assets. When significant time is spent implementing such a heavyweight solution, it is often useful to be able to demonstrate this to third parties. For example, the company may want to convince customers, auditors or even regulators that critical key material exists only on fancy cryptographic hardware, and not on a USB drive in the CEO’s pocket or some engineer’s commodity laptop running Windows XP. This is where key attestations come in handy. An attestation is effectively a signed assertion from the hardware that a specific cryptographic key exists on that device.

At first that may not sound particularly reassuring. While that copy of the key is protected by expensive, fancy hardware, what about other copies and backups lying around on that poorly guarded USB drive? These concerns are commonly addressed by design constraints in HSMs which guarantee keys are generated on-board the hardware and can never be extracted out in the clear. This first part guarantees no copies of the key existed outside the trusted hardware boundary before it was generated, while the second part guarantees no other copies can exist after generation. This notion of being “non-extractable” means it is not possible to observe raw bits of the key, save them to a file, write them on a Post-It note, turn it into a QR code, upload it to Pastebin or any of the dozens of other creative ways ops personnel have compromised key security over the years. (To the extent backups are possible, it involves cloning the key to another unit from the same manufacturer with the same guarantees. Conveniently that creates lock-in to one particular model in the name of security— or what vendors prefer to call “customer loyalty.” 🤷🏽)

CloudHSM, take #2

Different platforms handle attestation in different ways. For example, in the context of Trusted Platform Modules, the operations are standardized by the TPM2 specification. This blog post looks at AWS CloudHSM, which is based on the Marvell Nitrox HSMs, previously named Cavium. Specifically, this is the second version of Amazon's hosted HSM offering. The first iteration (now deprecated) was built on Thales née Gemalto née Safenet hardware. (While the technology inside an HSM advances slowly due to FIPS certification requirements, the nameplate on the outside can change frequently with mergers & acquisitions of manufacturers.)

Attestations only make sense for asymmetric keys, since it is difficult to convey useful information about a symmetric key without actually leaking the key itself. For asymmetric cryptography, there is a natural way to uniquely identify private keys: the corresponding public-key. It is sufficient for the hardware then to output a signed statement to the effect “the private key corresponding to public key K is resident on this device with serial number #123.” When the authenticity of that statement can be verified, the purpose of attestation is served. Ideally that verification involves a chain of trust going all the way back to the hardware manufacturer who is always part of the TCB. Attestations are signed with a key unique to each particular unit. But how can one be confident that unit is supposed to come with that key? Only the manufacturer can vouch for that, typically by signing another statement to the effect “device with serial #123 has attestation-signing key A.” Accordingly every attestation can be verified given a root key associated with the hardware manufacturer.

If this sounds a lot like the hierarchical X509 certificate model, that is no coincidence. The manufacturer vouches for a specific unit of hardware it built, and that unit in turn vouches for the pedigree of a specific user-owned key. X509 certificates seem like a natural fit. But not all attestation models historically follow the standard. For example, the TPM2 specification defines its own (non-ASN1) binary format for attestations. It also diverges from the X509 format, relying on a complex interactive protocol to improve privacy, by having a separate, static endorsement key (itself validated by a manufacturer-issued X509 certificate, confusingly enough) and any number of attestation keys that sign the actual attestations. Luckily Marvell has hewed closely to the X509 model, with the exception of the attestations themselves, where another home-brew (again, non-ASN1) binary format is introduced.

Trust & don’t bother to verify?

There is scarcely any public documentation from AWS on this proprietary format. In fact given the vast quantity of guidance on CloudHSM usage, there is surprisingly no mention of proving key provenance. There is one section on verifying the HSM itself— neither necessary nor sufficient for our objective. That step only covers verifying the X509 certificate associated with the HSM, proving at best that there is some Marvell unit lurking somewhere in the AWS cloud. But that is a long ways from proving that the particular blackbox we are interacting with, identified by a private IP address within the VPC, is one of those devices. (An obvious question is whether TLS could have solved that problem. In fact the transport protocol does use certificates to authenticate both sides of the connection but in an unexpected twist, CloudHSM requires the customer to issue that certificate to the HSM. If there was a preexisting certificate already provisioned in the HSM that chains up to a Marvell CA, it would indeed have proven the device at the other end of the connection is a real HSM.)

Neither the CloudHSM documentation nor the latest version of the CloudHSM client SDK (V5) has much to say on obtaining attestations for a specific key generated on the HSM. There are references to attestations in certain subcommands of key_mgmt_util, specifically for key generation. For example the documentation for genRSAKeyPair states:


-attest

Runs an integrity check that verifies that the firmware on which the cluster runs has not been tampered with.

This is at best an unorthodox definition of key attestation. While missing from the V5 SDK documentation, there are also references in the “previous” V3 SDK (makes you wonder what happened to V4?) to the same optional flag being available when querying key attributes with “getAttribute” subcommand. That code path will prove useful for understanding attestations: each key is only generated once, but one can query attributes any number of times to retrieve attestations.

Focusing on the V3 SDK, which is no longer available for download, one immediately runs into problems with ancient dependencies: it is linked against OpenSSL 1.x, which prevents installation out-of-the-box on modern Linux distributions.

But even after jumping through the necessary hoops to make it work, the result is underwhelming: while the utility claims to retrieve and verify Marvell attestations, it does not expose the attestation to the user. In effect these utilities are asserting: “Take our word for it, this key lives on the HSM.” That defeats the whole point of generating attestations, namely being able to convince third-parties that keys are being managed according to certain standards. (It also raises the question of whether Amazon itself understands the threat model of a service for which it is charging customers a pretty penny.)

Step #1: Recovering the attestation

When existing AWS utilities will not do the job, the logical next step is writing code from scratch to replicate their functionality while saving the attestation, instead of throwing it away after verification. But that requires knowledge of the undocumented APIs offered by Marvell. While CloudHSM is compliant with the standard PKCS#11 API for accessing cryptographic hardware, PKCS#11 itself does not have a concept of attestations. Whatever this Amazon utility is doing to retrieve attestations involves proprietary APIs, or at least proprietary extensions to APIs such as a new object attribute, which neither Marvell nor Amazon has documented publicly. (Marvell has a support portal behind authentication, which may have an SDK or header files accessible to registered customers.)

Luckily recovering the raw attestation from the AWS utility is straightforward. An unexpected assist comes from the presence of debugging symbols, making it much easier to reverse engineer this otherwise blackbox binary. Looking at function names with the word “attest”, one stands out prominently:

[ec2-user@ip-10-9-1-139 1]$ objdump -t /opt/cloudhsm/bin/key_mgmt_util  | grep -i attest
000000000042e98b l     F .text 000000000000023b              appendAttestation
000000000040516d g     F .text 0000000000000196              verifyAttestation

We can set a breakpoint on verifyAttestation with GDB:

(gdb) info functions verifyAttestation
All functions matching regular expression "verifyAttestation":
File Cfm3Util.c:
Uint8 verifyAttestation(Uint32, Uint8 *, Uint32);
(gdb) break verifyAttestation
Breakpoint 1 at 0x40518b: file Cfm3Util.c, line 351.
(gdb) cont
Continuing.

Next generate an RSA key pair and request an attestation with key_mgmt_util:

Command:  genRSAKeyPair -sess -m 2048 -e 65537 -l verifiable -attest
Cfm3GenerateKeyPair returned: 0x00 : HSM Return: SUCCESS
Cfm3GenerateKeyPair:    public key handle: 1835018    private key handle: 1835019

The breakpoint is hit at this point, after key generation has already completed and key handles for public/private halves returned. (This makes sense; an attestation is only available after key generation has completed successfully.)

Breakpoint 1, verifyAttestation (session_handle=16809986, response=0x1da86e0 "", response_len=952) at Cfm3Util.c:351
351 Cfm3Util.c: No such file or directory.
(gdb) bt
#0  verifyAttestation (session_handle=16809986, response=0x1da86e0 "", response_len=952) at Cfm3Util.c:351
#1  0x0000000000410604 in genRSAKeyPair (argc=10, argv=0x697a80 <vector>) at Cfm3Util.c:4555
#2  0x00000000004218f5 in CfmUtil_main (argc=10, argv=0x697a80 <vector>) at Cfm3Util.c:11360
#3  0x0000000000406c86 in main (argc=1, argv=0x7ffdc2bb67f8) at Cfm3Util.c:1039

Owing to the presence of debugging symbols, we also know which function argument contains the pointer to the attestation in memory (“response”) and its size (“response_len”.) GDB can save that memory region to file for future review:

(gdb) dump memory /tmp/sample_attestation response response+response_len

Side note before moving on to the second problem, namely making sense of the attestation: While this example showed interactive use of GDB, in practice the whole setup would be automated. GDB allows defining automatic commands to execute after a breakpoint, and also allows launching a binary with a debugging “script.” Combining these capabilities:

  • Create a debugger script to set a breakpoint on verifyAttestation. The breakpoint will have an associated command to write the memory region to file and continue execution. In that sense the breakpoint is not quite “breaking” program flow but taking a slight detour to capture memory along the way.
  • Invoke GDB to load this script before executing the AWS utility.
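
The two steps above can be sketched as a small Python helper that writes the GDB command file and builds the command line. The breakpoint name and the `response` / `response_len` arguments come from the debugging session shown earlier; the output directory, the file naming and the `$hit` counter are illustrative choices, and the behavior of the interactive utility under batch-mode GDB is an assumption that has not been verified here.

```python
from pathlib import Path

# Breakpoint and argument names are taken from the GDB session above.
# The {out_dir} location and the attestation_N.bin naming are hypothetical.
GDB_SCRIPT = """\
set pagination off
set $hit = 0
break verifyAttestation
commands
  silent
  eval "dump memory {out_dir}/attestation_%d.bin response response+response_len", $hit
  set $hit = $hit + 1
  continue
end
run
"""


def write_gdb_script(script_path, out_dir="/tmp"):
    """Write a GDB command file that dumps each attestation, then resumes."""
    text = GDB_SCRIPT.format(out_dir=out_dir)
    Path(script_path).write_text(text)
    return text


def gdb_command(script_path, binary="/opt/cloudhsm/bin/key_mgmt_util"):
    """Build the argv that runs the AWS utility under GDB in batch mode."""
    return ["gdb", "-q", "-batch", "-x", str(script_path), "--args", binary]
```

Calling `write_gdb_script("/tmp/capture.gdb")` and then passing `gdb_command("/tmp/capture.gdb")` to `subprocess.run` would launch the utility as usual, with one numbered dump file per breakpoint hit. Numbering the files matters because a single command may trigger the verification more than once.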

Step #2: Verifying the signature

Given attestations in raw binary format, the next step is parsing and verifying the contents, mirroring what the AWS utility is doing in the “verifyAttestation” function. Here we specifically focus on attestations returned when querying key attributes because that presents a more general scenario: key generation takes place only once, while attributes of an existing key can be queried anytime.

By “attributes” we are referring to PKCS#11 attributes associated with a cryptographic object present on the HSM. Some examples:

  • CKA_CLASS: Type of object (secret key, public key, private key…)
  • CKA_KEY_TYPE: Algorithm associated with a key (eg AES, RSA, EC…)
  • CKA_PRIVATE: Does accessing the object require authentication?
  • CKA_EXTRACTABLE: Can the raw key material be exported out of the HSM? (PKCS#11 has an interesting rule that this attribute can only be changed from true→false; it cannot go in the other direction.)
  • CKA_NEVER_EXTRACTABLE: True only if the CKA_EXTRACTABLE attribute has never been set to true. (This is important when establishing whether an object is truly HSM-bound. Otherwise one can generate an initially extractable key, make a copy out of the HSM and later flip the attribute.)
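
To make the "truly HSM-bound" condition concrete, here is a minimal sketch of a policy check over already-decoded attributes. The attribute names are standard PKCS#11; the dictionary-of-booleans input shape is an assumption of this sketch, and CKA_LOCAL (true only for keys generated on the token) is an addition not discussed above.

```python
# Expected values for a key that has only ever lived inside the HSM.
# The input format (name -> Python bool) is a hypothetical convenience.
REQUIRED = {
    "CKA_EXTRACTABLE": False,       # key cannot be exported today
    "CKA_NEVER_EXTRACTABLE": True,  # ...and was never exportable in the past
    "CKA_LOCAL": True,              # key was generated on the device itself
}


def hsm_bound_violations(attributes):
    """Return names of attributes that are missing or not the expected bool."""
    return [
        name
        for name, expected in REQUIRED.items()
        if attributes.get(name) is not expected
    ]


def is_hsm_bound(attributes):
    """True when every required attribute is present with the expected value."""
    return not hsm_bound_violations(attributes)
```

Checking CKA_EXTRACTABLE alone would accept a key that was exported and then locked down afterwards, which is exactly the loophole CKA_NEVER_EXTRACTABLE closes.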

Experiments show the exact same breakpoint for verifying attestations is triggered through this alternative code path when “-attest” flag is present:

Command:  getAttribute -o 524304 -a 512 -out /tmp/attributes -attest
Attestation Check : [PASS]
Verifying attestation for value
Attestation Check : [PASS]
Attribute size: 941, count: 27
Written to: /tmp/attributes file
Cfm3GetAttribute returned: 0x00 : HSM Return: SUCCESS

The utility writes a text file containing all attributes of the RSA key. Once again the attestation itself is verified and promptly discarded by the utility under normal execution. But the debugger tricks described earlier help capture a copy of the original binary block returned. There is no public documentation from AWS or Marvell on the internal structure of these attestations. Until recently there was a public article on the Marvell website (no longer resolves) which linked to two Python scripts that are still accessible as of this writing.

These scripts are unable to parse attestations from step #1, possibly because they are associated with a different product line or perhaps different version of the HSM firmware. But they offer important clues about the format, including the signature format: it turns out to be the last 256 bytes of the attestation, carrying a 2048-bit RSA signature. In fact one of the scripts can successfully verify the signature on a CloudHSM attestation, when given the partition certificate from the HSM:

[ec2-user@ip-10-9-1-139 clownhsm]$ python3 verify_attest.py partition_cert.pem sample_attestation.bin 
*************************************************************************
Usage: ./verify_attest.py <partition.cert> <attestation.dat>
*************************************************************************
verify_attest.py:29: DeprecationWarning: verify() is deprecated. Use the equivalent APIs in cryptography.
  crypto.verify(cert_obj, signature, blob, 'sha256')
Verify3 failed, trying with V2
RSA signature with raw padding verified
Signature verification passed!
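
For readers who want to see what such a check involves without the deprecated library call, here is a minimal standard-library sketch. The 256-byte trailing signature and the use of SHA-256 come from the text and script output above. The exact padding scheme is not known, so this sketch only assumes the recovered block ends with the digest of the signed body, which holds for both PKCS#1 v1.5 and a zero-padded bare digest. The modulus and exponent are assumed to have been extracted from the partition certificate beforehand.

```python
import hashlib

SIGNATURE_LEN = 256  # 2048-bit RSA signature at the tail of the attestation


def split_attestation(blob):
    """Split a raw attestation into (signed body, trailing signature)."""
    if len(blob) <= SIGNATURE_LEN:
        raise ValueError("attestation too short to contain a signature")
    return blob[:-SIGNATURE_LEN], blob[-SIGNATURE_LEN:]


def rsa_public_op(signature, modulus, exponent):
    """Apply the raw RSA public-key operation and return the recovered block."""
    value = int.from_bytes(signature, "big")
    if value >= modulus:
        raise ValueError("signature is not smaller than the modulus")
    return pow(value, exponent, modulus).to_bytes(len(signature), "big")


def signature_covers_body(blob, modulus, exponent):
    """Check that the recovered block ends with SHA-256 of the signed body.

    Assumption: the padding bytes in front of the digest are not validated,
    so this is a diagnostic aid rather than a strict signature verifier.
    """
    body, signature = split_attestation(blob)
    block = rsa_public_op(signature, modulus, exponent)
    return block.endswith(hashlib.sha256(body).digest())
```

Printing the hex of `rsa_public_op(...)` for a real attestation is also a quick way to see which padding the HSM actually uses. A production verifier should validate the full padding with a maintained cryptography library instead.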

Step #3: Parsing fields in the attestation

Looking at the remaining two scripts we can glean how PKCS#11 attributes are encoded in general. Marvell has adopted the familiar tag-length-value model from ASN1, yet the encoding is inexplicably not ASN1. Instead each attribute is represented as a concatenation of:

  • Tag containing the PKCS#11 attribute identifier, as a 32-bit integer in big-endian format
  • Length of the attribute value in bytes, also a 32-bit integer in the same format
  • Variable-length byte array containing the value of the attribute

One exception to this pattern is the first 32 bytes of an attestation. That appears to be a fixed-size header containing metadata, which does not conform to this TLV pattern. Disregarding that section, here is a sample Python script for parsing attributes and outputting them using friendly PKCS#11 names and appropriate formatting where possible. (For example CKA_LABEL as string, CKA_SENSITIVE as boolean and CKA_MODULUS_BITS as plain integer.)
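
A minimal sketch of a parser along these lines, assuming the layout described above: a 32-byte header, then TLV records with 32-bit big-endian tag and length, then the 256-byte trailing signature. The attribute identifiers are the standard PKCS#11 values. How the HSM encodes booleans and integers inside each value is an assumption here (any non-zero byte means true, integers are big-endian).

```python
import struct

HEADER_LEN = 32      # fixed-size metadata header, not TLV-encoded
SIGNATURE_LEN = 256  # trailing RSA signature, not TLV-encoded

# A handful of standard PKCS#11 attribute identifiers.
CKA_NAMES = {
    0x0000: "CKA_CLASS", 0x0001: "CKA_TOKEN", 0x0002: "CKA_PRIVATE",
    0x0003: "CKA_LABEL", 0x0100: "CKA_KEY_TYPE", 0x0103: "CKA_SENSITIVE",
    0x0121: "CKA_MODULUS_BITS", 0x0162: "CKA_EXTRACTABLE",
    0x0163: "CKA_LOCAL", 0x0164: "CKA_NEVER_EXTRACTABLE",
}
BOOLEANS = {"CKA_TOKEN", "CKA_PRIVATE", "CKA_SENSITIVE", "CKA_EXTRACTABLE",
            "CKA_LOCAL", "CKA_NEVER_EXTRACTABLE"}
INTEGERS = {"CKA_CLASS", "CKA_KEY_TYPE", "CKA_MODULUS_BITS"}


def parse_tlv(data):
    """Yield (tag, value) pairs from concatenated big-endian TLV records."""
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated TLV header")
        tag, length = struct.unpack_from(">II", data, offset)
        offset += 8
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        offset += length
        yield tag, value


def decode(name, value):
    """Format a raw value by attribute name (the encodings are assumptions)."""
    if name == "CKA_LABEL":
        return value.rstrip(b"\x00").decode("utf-8", errors="replace")
    if name in BOOLEANS:
        return any(value)
    if name in INTEGERS:
        return int.from_bytes(value, "big")
    return value.hex()


def parse_attestation(blob):
    """Return {friendly name: decoded value} for the TLV section of a blob."""
    attributes = {}
    for tag, value in parse_tlv(blob[HEADER_LEN:-SIGNATURE_LEN]):
        name = CKA_NAMES.get(tag, f"0x{tag:08x}")
        attributes[name] = decode(name, value)
    return attributes
```

Unknown tags are kept under their hex identifier with a hex-encoded value, so vendor-specific attributes are surfaced rather than silently dropped.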

CP
