Licensing is a good way to prevent misuse of your software, to enable temporary use of it, or to enable the creation of a complex pricing model. The best approach to secure and reliable licensing is through cryptography. In this article, I will present two options for implementing a software licensing system: one option uses symmetric (secret) key cryptography, and the other option uses asymmetric (public) key cryptography. I will describe the algorithms and provide a sample implementation based on OpenSSL. In closing, I will compare the two solutions and evaluate their strengths and weaknesses.
The Licensing System
In this article, the term licensing system refers to a system for authenticating access to a software product through the use of a key string supplied by the vendor and input by the user. The license key is typically input at installation, but the same technique could apply to updates or upgrades. Even casual computer users are familiar with the prompt from an installation wizard to enter a string printed somewhere on the package. In this article, I'll discuss where that license string comes from, how it works, and how to implement your own licensing system using OpenSSL.The licensing system consists of a licensing authority, which issues the license keys, and an application, which consumes them. The inputs to the licensing authority are:
- a unique identifier of the specific instance of the licensed software
- a list of licensed features
- an expiration time
In many ways, a license key is the electronic equivalent of a physical key that "unlocks" the software. As such, it should be easy to issue and use licenses but very difficult to forge them. The license key should be linked to a specific instance of the software and should not be transferable or shared between different instances. The license should also include a time limitation feature, which will enable things like temporary licenses or a complex pricing model using yearly licenses.
Once the application authenticates the license and extracts the encoded information, the application can then take any actions necessary to ensure that the software protected by the license is properly configured. The application, in this case, is typically an Install application used to set up and configure the licensed software. Note, however, that the license key encodes a list of licensed features, so this solution lends itself to situations in which a single CD holds all the necessary software for various licensing models and a user upgrades to a higher license level by obtaining a new key from the vendor.
Licensing and Cryptography
Cryptography has many objectives, two of which are most applicable to licensing: Data Integrity and Data Origin Authentication. Data Integrity addresses the unauthorized alteration of data. Data Origin Authentication addresses the identification of the entity that created the data.The cryptographic MAC mentioned in the previous section ensures both the Data Integrity and the Data Origin Authentication. A valid MAC means that the data provided in the license has not been modified by an unauthorized party and that the licensing authority known to the application has issued the license.
It is not impossible to guess a valid MAC, but it is difficult. The level of difficulty (which is known as the level of security) is often given in terms of the number of operations required to defeat the protection. Typically, the level of security is defined by an upper bound on the number of operations necessary in order to break it.
For example, if a system is "40-bit secure" it means that breaking into it will take up to 2^40=1,099,511,627,776 operations. In our case, if every license check takes 10 msec, then guessing a license will take up to 348 years, which seems quite safe to me.
MAC Calculation
There are several known ways to generate a MAC. I will examine two options: one option uses a symmetric key, and the other option uses an asymmetric key. In symmetric key cryptography, encryption and decryption are performed using the same key. In the asymmetric case, there is a pair of two related keys, private and public. Encryption and signature verification is done using the public key, while decryption and signing is done using the private key (signing is the operation that binds information to an entity). Both implementations are based on well-known and well-tested cryptographic algorithms. It is important not to modify or invent new cryptographic algorithms. The topic is complex, and it is easy to make naïve mistakes.Symmetric Key MAC
My symmetric key implementation is based on RFC 2104 [1] and FIPS-198a [2]. Both documents describe a mechanism for message authentication using cryptographic hash functions. This mechanism is called HMAC. To make it easier to describe the HMAC algorithm, let's define the following:- H -- a cryptographic hash function
- B -- the block size processed in each iteration of H
- L -- the size of the hash output
- K -- a secret key shared between the originator and the receiver. The key size should be greater than or equal to L/2
- M -- the message to be processed
- IPAD -- the byte 0x36 repeated B times
- MPAD -- the byte 0x5c repeated B times
- (X, Y) -- Y appended to X
The HMAC algorithm takes M and K as input and generates a MAC as follows:
1. K' = K padded with zeros to create a byte string of length B
2. K1 = (K' xor IPAD)
3. K2 = (K' xor OPAD)
4. MAC = H(K2, H(K1, M))
SHA-1 output is 160 bit and is considered to be 80-bit secure. This is due to the so-called birthday attack. Any hash function, when supplied with a random input, returns one of k equally likely values. By repeatedly hashing different inputs, we expect to obtain the same output after about k^0.5 trials.
It is a common practice to truncate the output of the MAC algorithm and use only the t leftmost bits as the MAC. You should remember that the probability of guessing a correct MAC is (1/2)t. Therefore, it is recommended to keep t >= L/2 and never use a t that is less than 32 bits. In our case, since each guess takes a while (the user has to somehow type it in and wait for the application response), I think t=48 will do.
Listing 1 provides an implementation of hmac_sha1(). OpenSSL includes an implementation of hmac_sha1, see hmac_openssl_sha1() in the downloadable code archive. Note that the MAC calculation and MAC verification use the exact same algorithm and the same key.
Asymmetric Key MAC
The RSA signature schema as described in RFC 2437 [4] and FIPS-186-2 [5] can be used to authenticate the content of a message. As opposed to the scheme described above, this scheme employs an RSA key pair instead of a single secret key. The data is signed using a private key and the signature is verified using a matching public key.To make it easier to describe the RSA signature generation and verification algorithms, let's define the following:
- H -- a cryptographic hash function
- Encode -- encoding function (i.e. PKCS#1-V1.5)
- PrivK -- RSA private key
- PubK -- RSA public key
- M -- the message to be signed
- S -- the resulting signature
1. Mhash = H(M)
2. EM = Encode(Mhash)
3. S= RSA-Sign(EM, PrivK)
The RSA signature verification algorithm takes M, S, and PubK and generates a binary result (valid/invalid) as follows:
1. Mhash = H(M)
2. EM' = Encode(Mhash)
3. EM'' = RSA-Verify(S, PubK)
4. if EM' == EM'' then the signature is valid
For the same reasons described in the symmetric key discussion, the recommended hash function is SHA-1 or one of its siblings SHA-256, 384 and 512.
The RSA signature is as secure as RSA encryption assuming you use a good hashing function (like SHA). Using an RSA private key of 1024 bits gives 80 bits of security. The signature size is 128 bytes (the same as the key size).
It is impossible to truncate the RSA signature. In the symmetric case, since both sides go through the exact same steps and use the same key, we could easily truncate the MAC and compare only the t leftmost bits. In this case, the verification algorithm requires the complete signature.
Listing 2 provides an implementation of rsa_sign() and rsa_verify().
HMAC Licensing System
The licensing authority (see Listing 3) associates a set of features and an expiration date with a specific instance of the licensed software (target system). The set of features is represented as a bit-mask and allows any combination of up to 16 features. The expiration date is represented as days (not seconds) since 1/1/1970. This way it only takes two bytes to represent the date and more room remains for the MAC. The target system is assumed to have a 64-bit unique identifier. You can embed one in your software or create one on the fly based on the system components (such as, CPU ID, MAC, IP address -- somewhat like WinXP [6]).The licensing authority assembles all this information, calculates the MAC, and outputs the license string. The license string is a user-friendly representation of the following information:
- Feature set -- two bytes
- Expiration time -- two bytes
- Leftmost bits of the MAC -- six bytes
While processing the licensing string, the application extracts the feature set and the expiration date from the license string. It then concatenates the system-unique id and calculates the MAC. The leftmost bits of the MAC are compared to the bits provided in the license string. If they are identical, the license is valid. Note that the secret key used by the licensing authority must be identical to the key embedded in your application.
Sample Input and Output: # gen_hmac_license 016653623F 20040101 featue1 feature2 License = 5F431-67DE5-78FF3-22AB3 # ver_hmac_license 5F431-67DE5-78FF3-22AB3 License expires on 2004-01-01, Features = feature1 feature2I would like to discuss the pros and cons of the above system, but first I will describe the other system which uses asymmetric keys.
RSA Licensing System
The licensing authority in this case (Listing 4) does a similar job to the one described above, but with a few minor twists. The license string is too short to contain the RSA signature (128 bytes). In the previous case, I used the 48 leftmost bits of the HMAC result, and since the process is symmetric, it was good enough. Since the RSA method is asymmetric, the application must receive the complete signature to verify it.To solve this problem, I will distribute a part of the signature with the application and the rest will come from the license string. The application will combine the two to form a complete signature that can be verified. This can be considered a very primitive form of "secret sharing" (more about this topic in [7]).
Given a target machine, every combination of licensed features and expiration date has a unique signature. Since part of every signature will be shipped with the application, it is necessary to limit the number of combinations.
Therefore, in the RSA based licensing system, the set of features is represented by a smaller bit-mask and allows any combination of up to six features. The expiration date has only 4 possible values, 1 week since activation, 3 months, 1 year, or lifetime. As before, the target system is assumed to have a 64-bit unique identifier.
The licensing authority assembles all this information, signs it using the private key, and outputs a license string. The license string is a user-friendly representation of the following information:
- Feature set - six bits
- Expiration code - two bits
- Most significant bits of the signature - nine bytes
The above ten bytes are converted into a 20-character string.
While processing the license string, the application recreates the original data that was signed by extracting the feature set and the expiration code from the license string and concatenating the system unique id. Then it iterates through all of the partial signatures it has, reconstructs the complete signature by appending them to the signature bits extracted from the license, and tries to verify the message using each one of them (Listing 5). If any of them is valid, the license is valid. Note that the private key used by the licensing authority must be the key pair of the public key embedded in your application.
Cryptographic Considerations
The symmetric-key-generated license described here is less than 48-bit secure. Adding additional characters to the license string can increase it up to 80-bit. But the main security problem lies in the fact that the symmetric key is stored somewhere in the application; if found, it can be used to forge licenses.The asymmetric-key-generated license described here is less secure. If the license contained the complete signature, it would have been 80-bit secure. Since I suggest embedding most of the signature in the application and including only some of it in the license, the solution cannot be considered to be more than a few bits secure (evaluating the strength is outside the scope of this article). Nevertheless, it is good enough to make a brute-force attack very difficult, and more importantly, breaking one system does not enable the attacker to break the others (as opposed to the symmetric case).
Cryptographic strength aside, you should remember that the license check provides procedural security -- in almost every case there is a procedural way to bypass the cryptographic protection. For example, in many cases the license check is a single if statement in the code. If that check is found and removed, the need for a license is gone.
I ignored the topics of key generation and key management. Good recommendations can be found in FIPS 140-2 [8].
Implementation Considerations
Not all platforms are capable of performing RSA calculations in a reasonable time. Asymmetric key cryptography is hard to implement on systems with limited resources (i.e., 8-bit CPU with no hardware support for RSA). Some platforms are more susceptible to attacks (such as desktops). Using symmetric key solution in such an environment is probably not a prudent idea. In most embedded systems, attacks are harder. This means using symmetric keys is relatively safe. Using different symmetric keys for each version, or even for each distribution batch, will reduce the exposure to attacks on the key.Regarding the reduced security of the RSA-based licensing solution, if you are not limited by the license format; it is much better (and easier) to distribute the complete signature as part of the license. The license data can contain more information, and it would be impossible to forge a license.
If your platform can support RSA calculations and you can easily distribute the complete RSA signature with the license, you are better off using the asymmetric licensing solution.
References
[1] H. Krawczyk, M. Bellare, and R. Canetti, HMAC: Keyed-Hashing for Message Authentication, Internet Engineering Task Force, Request for Comments (RFC) 2104, and February 1997.[2] FIPS-198a The Keyed-Hash Message Authentication Code (HMAC) http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf FIPS -- Federal Information Processing Standards http://csrc.nist.gov/publications/fips/index.html
[3] FIPS-180-2 Secure Hash Standards http://csrc.nist.gov/publications/ fips/fips180-2/fips180-2.pdf
[4] B. Kaliski, J. Staddon, PKCS #1: RSA Cryptography Specifications Version 2.0, Network Working Group, Request for Comments (RFC) 2437, October 1998.
[5] FIPS-186-2 Digital Signature Standard http://csrc.nist.gov/ publications/fips/fips186-2/fips186-2-change1.pdf
[6] The WinXP licensing schema is more complex and involves an online component.
[7] Handbook of Applied Cryptography, by A. Menezes, P. van Oorschot, and S. Vanstone, CRC Press, 1996, pp 524-528, 538-540
[8] FIPS-140-2 Security requirements for Cryptographic Modules http://csrc.nist.gov/cryptval/140-2.htm
Additional Reading
[1] FIPS-196 Entity Authentication Using Public Key Cryptography http://csrc.nist.gov/publications/fips/fips196/fips196.pdf[2] Another way to calculate a MAC described here http://www.itl.nist.gov/fipspubs/fip113.htm
About the Author
Ilan Shamir has more than 20 years of programming experience. He spent many years with Intel Corp., was a visiting scientist at MIT, and is now working for Decru Inc. that develops wire-speed encryption appliances. He welcomes comments at [email protected]