Introduction to Code Signing


Code signing is used to identify the publisher of software programs and to verify that the software has not been modified after publishing. It offers authentication (who signed the code), integrity (the code cannot be changed undetected) and non-repudiation (the original signer cannot deny having signed the file), assuming that the signing process is performed in a secure and correct way.


Objective

The objective of code signing is to provide evidence that software does not pose security threats to the systems executing it:

  • Using the information about the publisher’s identity, users can make educated decisions about whether to install and execute a specific program, and system administrators can define security policies that will be enforced, allowing only trusted programs to gain elevated rights.
  • Cryptographic signatures can guarantee that no alteration, such as a virus infection, has taken place after the publisher signed the software.


Application

Code signing uses public key cryptography for signing and verification, in a way very similar to e-mail and PDF file signatures. Publishers are identified using certificates they purchase from trusted certificate authorities, much like HTTPS certificates. Code signing certificates usually identify legal entities such as businesses, educational institutions and government agencies, but they can also identify individual persons.

Today, operating systems, web browsers, development platforms, add-in systems, app stores, anti-malware utilities and enterprise management software already take code signatures into consideration. For instance, when Windows checks a signature, it also looks at the code signing certificate and its reputation. Based on this reputation, there are several warning levels:


  • The program is not signed, or the signature is invalid: the user is warned not to start the program.
  • The program has a valid signature, but the certificate has little or no reputation: the name of the software publisher is displayed, and the user is prompted to proceed or abort.
  • The program is signed, and the certificate has reputation: the program is executed or installed.
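
As an illustration only, the decision logic behind these warning levels can be sketched as a small function. The actual SmartScreen heuristics are more nuanced and not publicly specified; the function and its parameters below are hypothetical.

```python
from enum import Enum, auto

class Action(Enum):
    WARN_NOT_TO_START = auto()   # unsigned, or the signature is invalid
    PROMPT_USER = auto()         # valid signature, but little or no reputation
    RUN = auto()                 # valid signature and established reputation

def warning_level(is_signed: bool, signature_valid: bool, has_reputation: bool) -> Action:
    """Hypothetical sketch of the warning levels above, not SmartScreen itself."""
    if not is_signed or not signature_valid:
        return Action.WARN_NOT_TO_START
    if not has_reputation:
        return Action.PROMPT_USER
    return Action.RUN
```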


There are two ways to gain reputation:

  • A certificate is encountered several times in the wild, and no malicious usage is reported. (The data is collected from Windows users by Microsoft’s SmartScreen program.)
  • Extended Validation (EV) certificates have full reputation right from the beginning.


What are EV certificates?

Certificate authorities are required to take greater care when issuing Extended Validation certificates. The identity vetting process is more involved, and therefore EV certificates are more expensive. Software publishers using EV certificates are required to store their private keys on dedicated hardware, so they cannot be copied by attackers. For normal code signing certificates, this is only a recommendation. (Note, however, that hardware keys can still be physically stolen, especially when stored on inexpensive USB devices. Additionally, even if the key is not stolen, it can be abused by an attacker who gains access to the signing infrastructure.)


Differences from HTTPS certificates

HTTPS certificates always use Domain Validation (DV) and are sometimes enriched with Organization Validation (OV). Code signing certificates, in contrast, always use Organization Validation. Both types can use Extended Validation (EV). For HTTPS certificates, browsers usually reward EV validation by displaying the organization’s legal name in a green box next to the URL field.


Elements of code signing

All common types of code signing are based on public-key cryptography.

    • Software publishers use a secret private key to sign their code
    • Certificate authorities validate publishers’ identities and issue X.509 certificates
    • Certificates connect the signature to identities

Private keys

The basic concept of public-key cryptography is that keys are always generated in pairs: a public key and a private key. The private key is only known to its owner, while the public key is, well, public. Signing works like this: The signer, let’s call her Alice, uses her secret private key to sign a file. The receiver, Bob, knows Alice’s public key. When he receives a file signed by Alice, he can use this public key to verify that the file was signed by Alice, and that it has not been modified since. Private and public keys are therefore created together, but only the public key is exposed via the certificate and the private key is kept in a secure location, for instance a key store on SignPath.io.
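
As a minimal sketch of this sign-and-verify flow, the following example uses the Python cryptography package with an RSA key pair, PKCS#1 v1.5 padding and SHA-256. Real code signing wraps this primitive in additional structures (certificates, signature formats, time stamps), which are discussed below.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Alice generates a key pair; the private key never leaves her control.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
public_key = private_key.public_key()

data = b"contents of the file to be signed"

# Alice signs the data with her private key.
signature = private_key.sign(data, padding.PKCS1v15(), hashes.SHA256())

# Bob verifies with Alice's public key; verify() raises InvalidSignature
# if the data or the signature has been tampered with.
public_key.verify(signature, data, padding.PKCS1v15(), hashes.SHA256())
print("signature is valid")
```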

Hardware security modules (HSMs)

The private key of a certificate must be properly protected. Theft of private keys is the main attack vector for code signing, thereby compromising both users and publishers. Since it’s not possible to effectively protect private keys in files or certificate stores managed by Windows, it is widely recommended that hardware security modules (HSMs) are used for code signing. For Extended Validation certificates, it is even required that keys are managed in HSMs meeting the requirements of FIPS 140-2 level 2.

An HSM is a device that stores secret keys and performs cryptographic operations using these keys. When used properly, the HSM will generate the key itself and never expose it to any user or any other device. So when you use an HSM for signing, the HSM will not give the key to the signing software. Rather, the signing software sends the data (the digest) to the HSM and asks for a signature.
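
This flow can be sketched as follows. The hsm_client object and its sign_digest method are hypothetical placeholders; real HSMs are typically accessed via PKCS#11, a vendor SDK or a signing service.

```python
import hashlib

def sign_file_with_hsm(path: str, hsm_client, key_id: str) -> bytes:
    """hsm_client and sign_digest() are hypothetical stand-ins for an HSM interface."""
    # The (potentially large) file is hashed locally ...
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    # ... and only the small digest is sent to the HSM, which signs it with a
    # private key that never leaves the device.
    return hsm_client.sign_digest(key_id=key_id, digest=digest.digest(), algorithm="sha256")
```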

Certificate Authorities (CAs)

An issuer of certificates is called a Certificate Authority (CA). They own and distribute root certificates that are then used to verify the certificates they issue. There are two common types of CAs:

  • Commercial CAs: dedicated companies that verify identities and issue certificates for a fee. Commercial CAs are usually audited according to WebTrust criteria, and have their root certificates distributed with major operating systems and browsers. Their main business is issuing SSL certificates for HTTPS, but they also issue certificates for code signing, e-mail and document signing.
  • In-house CAs: operated by organizations for internal use. These certificates are distributed within the organization’s network to their PCs and servers.

When a CA issues a certificate, it uses its own root certificate (and the associated private key) to sign the issued certificate. Therefore, every computer that trusts the issuing CA will also trust the issued certificate.

Self-signed certificates

For testing purposes, certificates are often created ad hoc, without the use of a CA. These certificates are self-signed, i.e. they are signed with their own private key. Self-signed certificates must be trusted explicitly by the user’s system, or they will not be accepted.
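
For illustration, this is roughly how a self-signed test certificate can be created with the Python cryptography package (tools such as OpenSSL or PowerShell's New-SelfSignedCertificate offer equivalent functionality). The subject name and validity period are arbitrary example values.

```python
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID, ExtendedKeyUsageOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Test Code Signing")])

cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)                         # issuer == subject: self-signed
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
    .add_extension(                            # mark the certificate for code signing
        x509.ExtendedKeyUsage([ExtendedKeyUsageOID.CODE_SIGNING]), critical=False)
    .sign(key, hashes.SHA256())                # signed with its own private key
)
```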

Certificate Chains

A typical certificate is issued by an intermediate certificate. The intermediate certificate is in turn issued by the root certificate. The sum of all these certificates is called a certificate chain. A typical certificate chain looks like this:

[Figure: certificate chain sample]

Let’s examine this in detail:

  • The CA root certificate is self-signed. It is installed on the client and therefore trusted.
  • The CA intermediate certificate is issued by the CA root certificate:
    • Its issuer attribute is set to CA root certificate.
    • It is signed using the private key corresponding to the CA root certificate.
  • Some company’s certificate is issued by the CA intermediate certificate:
    • Its issuer attribute is set to CA intermediate certificate.
    • It is signed using the private key corresponding to the CA intermediate certificate.

For instance, the actual certificate chain of Mozilla’s Firefox (firefox.exe) looks like this:

[Figure: concrete certificate chain sample]

In order to verify the legitimacy of a signature, a client needs to know the entire certificate chain. Therefore, certificate files usually contain not only the certificate, but also every certificate in its chain of parents.
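
A simplified sketch of such a chain check with the Python cryptography package might look like the following; it assumes RSA certificates and only verifies the issuer links and signatures. Real chain validation additionally checks validity periods, revocation, name and policy constraints, and the client's trust store.

```python
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

def check_chain(chain: list) -> None:
    """chain: list of x509.Certificate, ordered leaf first, root last."""
    for cert, issuer in zip(chain, chain[1:]):
        # The issuer attribute must point to the next certificate in the chain ...
        assert cert.issuer == issuer.subject, "issuer attribute does not match"
        # ... and the signature must verify under the issuer's public key
        # (raises InvalidSignature otherwise). PKCS#1 v1.5 assumes RSA certificates.
        issuer.public_key().verify(
            cert.signature,
            cert.tbs_certificate_bytes,
            padding.PKCS1v15(),
            cert.signature_hash_algorithm,
        )
    # The last certificate is expected to be a self-signed root that the client
    # already trusts, e.g. one from the operating system's certificate store.
```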

Technical Background

Root certificates cannot be revoked; if there is a security problem with any of them, they must be removed from every computer. Therefore, private keys for root certificates must not be stored in systems connected to networks. Issuing an intermediate certificate is a process that is rarely performed and requires physical access to the system that stores and protects the root certificate’s private key. On the other hand, common certificates are usually issued online. This only requires access to the intermediate certificate’s private key, a far less critical resource.

Certificate Revocation

Sometimes certificates are issued in error. And sometimes rightfully issued certificates are either abused, or their private keys are compromised. If a certificate authority learns of such an incident, they are required to revoke this certificate.

Each revocation has an effective date, which is often back-dated. For instance, if a publisher finds out that a certificate’s private key has been stolen two months ago, it will inform the CA, which in turn will issue a revocation effective two months ago. Signatures that were applied before this date will still be considered valid if the signature is time-stamped (see below).

Certificate revocation is an essential part of the certificate validation process. When a client encounters an unknown certificate, it must contact the certificate authority and check whether this certificate has been revoked. If a certificate has been revoked, the client will not accept it.


Technical Background

The certificate contains the URL for this check, and depending on the mechanisms provided, the client can either download a Certificate Revocation List (CRL) or check validity through the OCSP protocol.
Note that while implementations may differ, certificate revocation for code signing is usually more reliable than revocation for HTTPS certificates. The main reasons are that a) an attacker who is able to mount an HTTPS attack is often in a position to intercept revocation traffic, and b) web browsers are often more lenient when they cannot reach the certificate authority’s servers.
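
As a sketch, the revocation endpoints embedded in a certificate can be read with the Python cryptography package as shown below; actually downloading the CRL or querying the OCSP responder is omitted here.

```python
from cryptography import x509
from cryptography.x509.oid import ExtensionOID, AuthorityInformationAccessOID

def revocation_endpoints(cert: x509.Certificate) -> tuple:
    """Returns ([CRL download URLs], [OCSP responder URLs]) found in the certificate."""
    crl_urls, ocsp_urls = [], []
    try:
        cdp = cert.extensions.get_extension_for_oid(ExtensionOID.CRL_DISTRIBUTION_POINTS)
        for point in cdp.value:
            for name in point.full_name or []:
                crl_urls.append(name.value)
    except x509.ExtensionNotFound:
        pass
    try:
        aia = cert.extensions.get_extension_for_oid(ExtensionOID.AUTHORITY_INFORMATION_ACCESS)
        for description in aia.value:
            if description.access_method == AuthorityInformationAccessOID.OCSP:
                ocsp_urls.append(description.access_location.value)
    except x509.ExtensionNotFound:
        pass
    return crl_urls, ocsp_urls
```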

Time Stamping

After signing a software artifact, the signature should be counter-signed by a time stamp authority (TSA). A time stamp provides proof that the signing has taken place at a certain date and time.

Each code signing certificate has a validity period of usually one to three years. Without timestamps, all signatures would be invalid after this period. Also, signatures using certificates that were revoked later would be invalid, no matter when the signing took place. (While the latter occurs less often, it would indirectly create a security problem: Having a large number of legitimately signed binaries without time stamps would strongly discourage revocation of compromised certificates.)


Technical Background

Technically, a time stamp is just a counter-signature, i.e. the primary code signing signature is itself signed by the TSA. The TSA is a service provided by most certificate authorities; it provides time stamp signatures to anybody and proves only the date and time of the original signature. (As far as the TSA can tell, the file might have been signed earlier, but that would not matter for the purposes presented here.) A time stamp is just a signature using a TSA certificate, which in turn has a certificate chain that terminates at a trusted root certificate.
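
The following sketch illustrates the idea only: a stand-in "TSA" signs the digest of the primary signature together with the current time. Real time stamp authorities implement the RFC 3161 protocol; the tsa_key parameter and the token format shown here are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def timestamp_signature(primary_signature: bytes, tsa_key) -> dict:
    """tsa_key: the TSA's private RSA key, a hypothetical stand-in here."""
    # The TSA never sees the signed file, only the primary signature (here, its digest).
    token = {
        "signature_digest": hashlib.sha256(primary_signature).hexdigest(),
        "time": datetime.now(timezone.utc).isoformat(),
    }
    # The counter-signature binds the primary signature to this point in time.
    counter_signature = tsa_key.sign(
        json.dumps(token, sort_keys=True).encode(),
        padding.PKCS1v15(),
        hashes.SHA256(),
    )
    return {"token": token, "counter_signature": counter_signature}
```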

Signatures

Before publishing software, vendors can sign a software artifact by creating a digital signature, consisting of two parts:

  • The cryptographic signature (the file’s hash code, encrypted with the private key)
  • The certificate that matches the private key (and its entire certificate chain)
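
Common code signing formats such as Authenticode are based on PKCS#7/CMS structures that bundle exactly these two parts. The following sketch produces a generic PKCS#7 signature with an embedded certificate using the Python cryptography package; it is not the actual Authenticode format, and cert/key are assumed to be an X.509 certificate and its private key, for example the ones from the self-signed certificate sketch above.

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.serialization import pkcs7

def build_signature(artifact: bytes, cert, key) -> bytes:
    """Returns a DER-encoded, detached PKCS#7 signature for the artifact."""
    return (
        pkcs7.PKCS7SignatureBuilder()
        .set_data(artifact)                       # the content whose digest gets signed
        .add_signer(cert, key, hashes.SHA256())   # embeds the signer's certificate
        .sign(serialization.Encoding.DER, [pkcs7.PKCS7Options.DetachedSignature])
    )
```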

Signature Formats

Many file formats for programs and installation packages support embedded signatures (the signature becomes a part of the file). This includes formats from Microsoft and Apple as well as Java and Android packages. Signatures can also be stored separately from the signed files for various reasons:

  • The format of the signed file does not support embedded signatures
  • More than one file must be signed
  • Signatures should be distributed separately from the signed files

Examples of separate signatures are Windows catalog files (.cat) and detached signature files used on Linux (.sig).

Signature validation

A client that wants to verify a signature needs to perform several steps; all of them must succeed for the signature to be considered valid.

  • The hash digest for the signed artifact is calculated
  • All signatures and counter-signatures are validated cryptographically
  • Only trusted cryptographic algorithms may be used
  • The certificate must be valid
    • It must have the key usage attributes necessary for the intended purpose (for example, code signing)
    • The validity period must cover the current date, or, if a time stamp is present, the time stamp date
    • The certificate must not have been revoked; if a time stamp is present, it must not have been revoked before the time stamp date
    • The certificate must be trusted: it is either trusted by the client or the certificate chain reliably leads to a trusted root certificate
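
The certificate-related checks from this list can be sketched as follows with the Python cryptography package. Chain building, the cryptographic checks and the revocation lookup itself are assumed to happen elsewhere (see the earlier sketches); the revocation effective date is simply passed in.

```python
from datetime import datetime
from typing import Optional
from cryptography import x509
from cryptography.x509.oid import ExtensionOID, ExtendedKeyUsageOID

def certificate_acceptable(cert: x509.Certificate,
                           timestamp: Optional[datetime],
                           now: datetime,
                           revoked_effective: Optional[datetime]) -> bool:
    """All datetimes are naive UTC, matching cert.not_valid_before/not_valid_after."""
    # Key usage: the certificate must be issued for code signing.
    try:
        eku = cert.extensions.get_extension_for_oid(ExtensionOID.EXTENDED_KEY_USAGE).value
    except x509.ExtensionNotFound:
        return False
    if ExtendedKeyUsageOID.CODE_SIGNING not in eku:
        return False
    # Validity period: judged at the time-stamp date if a time stamp is present.
    reference = timestamp if timestamp is not None else now
    if not (cert.not_valid_before <= reference <= cert.not_valid_after):
        return False
    # Revocation: only a revocation effective before the reference date invalidates the signature.
    if revoked_effective is not None and revoked_effective <= reference:
        return False
    return True
```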

Hash calculation

A signature is supposed to be authoritative for entire files, but the actual signing algorithm usually only has a small digest as input. This has practical reasons: The actual signing is supposed to take place within a secure system that owns (and protects) the private key, such as a hardware security module (HSM). Also, time stamp authorities (TSAs) must be called over the internet. Submitting large files to an HSM or TSA would not be a good use of resources.

The first step in all signing operations is therefore the calculation of a cryptographic hash digest. A hash algorithm generates a single number, typically from a larger binary file. This number is called a digest. Any modification to the original file, no matter how small, is supposed to result in a different digest. However, even for large digests (using many bits), there is a chance that different files result in the same digest, which is called a hash collision. The chance of an accidental collision for a large digest is so small that it can usually be neglected. However, an attacker might deliberately forge a file that results in the same digest as a signed one, making it possible to copy the signature of an existing file. Choosing the right cryptographic hash algorithm is supposed to make this infeasible with current hardware. Hash algorithms currently considered secure include the SHA-2 and SHA-3 families. SHA-1 and MD5 were once popular, but collisions can now be constructed for them, so they should no longer be used in any cryptographic context.
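
A quick illustration of these properties, using SHA-256 from Python's standard library: the digest has a fixed size regardless of the input, and changing a single byte produces a completely different digest.

```python
import hashlib

original = b"example program bytes"
tampered = b"example program bytez"          # a single byte changed

print(hashlib.sha256(original).hexdigest())  # 64 hex characters (256 bits)
print(hashlib.sha256(tampered).hexdigest())  # bears no resemblance to the first digest
```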