Transparent Encryption of Properties in .NET, Part 1: The Foundation
This is part 1 in a series of posts that will cover securely encrypting properties in POCO types.
One of my favorite features of SQL Server is its ability to easily encrypt my sensitive data using its built-in column-level encryption features. I like that the cryptographic implementation is done correctly, using strong algorithms and following the correct encryption protocols. Often, when developers have to implement their own encryption logic, they make mistakes such as failing to generate a fresh nonce for the initialization vector every time they encrypt data, using weak encryption keys, or using the algorithms incorrectly. Another practice that leads to vulnerabilities is not correctly managing the encryption key or, worse, hardcoding encryption keys in the application or configuration files. I especially like that the data encryption functionality in SQL Server makes it so easy to properly encrypt and decrypt data.
I do have times that I need to securely store sensitive information but am not able to work directly with SQL Server. I may not even be using a relational database to store my data. This is why I decided to write a set of good encryption templates that can be used with a number of backend data stores. I also wanted to make the encryption easy to use with previously written code and as low friction as possible for a developer to introduce.
Since I am also a big proponent of reducing boilerplate code by using Aspect-Oriented Programming techniques, I decided to implement the data encryption functionality in a set of aspects for PostSharp. The free Express version of PostSharp should be enough to get started with these aspects. Now all a developer has to do to get strongly encrypted properties and fields is decorate their existing POCO types with a few attributes and let PostSharp add the common encryption logic automatically at compile time.
In this blog post I am going to cover the details of the property (and field) encryption logic. Other posts in this series will cover additional items, such as providing the ability to perform lookups on encrypted data, specific key management features as well as the details of an implementation specifically for RavenDB.
All of the code I am discussing is released under an open source license, and the master repository is on GitHub. You can also easily bring this functionality into your own projects by installing the packages from NuGet.
Good Encryption
Good data encryption is not an easy thing to get right. There are a number of pitfalls that even security focused developers can stumble into. With that in mind, I have attempted to make my data encryption as correct as possible. I am providing the code and packages with absolutely no guarantees that they are free from security defects. This code is for demonstration purposes and you should not use it blindly. Make sure that it is appropriate for your needs and please do let me know if you find any flaws or bugs in it and I will work to correct them.
With the disclaimer out of the way, let's dig into the core functionality: encrypting and decrypting data.
Our encryption method attempts to ensure that all data is encrypted securely. There is also an option to add integrity verification to the encryption process.
The integrity function calculates a Hash-based Message Authentication Code (HMAC) of the cleartext data before it is encrypted. We then concatenate the cleartext data and the HMAC and encrypt the combined value.
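To make the integrity step concrete, here is a minimal sketch of appending an HMAC to the cleartext bytes. It assumes HMAC-SHA256 and a dedicated HMAC key; the post does not say which hash function or key the real code uses, and the class and method names are hypothetical rather than taken from the repository.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not the implementation from the repository.
public static class IntegritySketch
{
    // HMAC-SHA256 always produces a 32-byte tag (assumed algorithm).
    public const int TagLength = 32;

    // Returns the cleartext bytes followed by the HMAC of those bytes.
    public static byte[] AppendHmac(string cleartext, byte[] hmacKey)
    {
        // Strings are represented as Unicode (UTF-16) bytes.
        byte[] data = Encoding.Unicode.GetBytes(cleartext);

        using (var hmac = new HMACSHA256(hmacKey))
        {
            byte[] tag = hmac.ComputeHash(data);

            // Layout: [cleartext bytes][32-byte tag]
            byte[] combined = new byte[data.Length + tag.Length];
            Buffer.BlockCopy(data, 0, combined, 0, data.Length);
            Buffer.BlockCopy(tag, 0, combined, data.Length, tag.Length);
            return combined;
        }
    }
}
```

Because the tag has a fixed length, the decrypting side can split it off the end of the payload without needing a separator.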
Once we have calculated the HMAC it is time to start encrypting the data. All the .NET cryptographic functions work on byte arrays, so I have standardized on representing all strings in Unicode. Because every payload is assumed to be a Unicode-encoded string, I can always decode data that was encrypted by my functions. The first step in encrypting the data is to convert the string to a byte array.
Next, before we encrypt the data we need to generate a unique value that will be used as the Initialization Vector (IV). This is a nonce (a number that will only be used once) and each time we encrypt data we need to generate a new nonce. To do this I have created a helper extension method, FillWithEntropy, that will take a byte array and fill it with a cryptographically random set of values. Note that the standard System.Random functions are not sufficiently random for anything to do with cryptography and should never be used.
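A helper along these lines could look like the sketch below. Only the name FillWithEntropy comes from the post; the signature and return value are my guess, and the real extension method may differ.

```csharp
using System.Security.Cryptography;

// A guess at the shape of the helper; only the method name comes from the post.
public static class EntropyExtensions
{
    // Fills the array with cryptographically strong random bytes
    // and returns the same array so calls can be chained.
    public static byte[] FillWithEntropy(this byte[] buffer)
    {
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }
        return buffer;
    }
}
```

A fresh IV for a cipher with a 128-bit block size would then be `new byte[16].FillWithEntropy()`.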
Finally, we need the encryption key itself. In my code I have taken pains to ensure that any encryption keys are stored separately from the encrypted data as well as not being directly accessible to my program. This enforces a good separation of concerns and increases data security. I have abstracted this separation by creating an IKeyServer interface that will be used to return an implementation of a key server. That key server provides the GetKey method which will return the symmetric encryption key base value from the key store.
Now that we have our base encryption key value I run it through the Rfc2898DeriveBytes function to return a hashed byte array that is then used as the actual encryption key for the data.
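As an illustration of the derivation step, here is a minimal sketch using Rfc2898DeriveBytes (PBKDF2). The salt source, iteration count, hash algorithm and key size are all assumptions on my part, since the post does not state them, and the class and method names are hypothetical.

```csharp
using System.Security.Cryptography;

// Hypothetical helper; parameter choices are assumptions, not the repository's values.
public static class KeyDerivationSketch
{
    // Derives a fixed-length key from the base key value returned by the key server.
    // The salt must be at least 8 bytes long.
    public static byte[] DeriveKey(string baseKey, byte[] salt,
                                   int iterations = 100000, int keyBytes = 32)
    {
        using (var kdf = new Rfc2898DeriveBytes(
            baseKey, salt, iterations, HashAlgorithmName.SHA256))
        {
            // 32 bytes gives a 256-bit key.
            return kdf.GetBytes(keyBytes);
        }
    }
}
```

The same base key, salt and iteration count always produce the same derived key, which is what allows the data to be decrypted later.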
With a unique IV and a non-obvious encryption key in hand, I make sure the code uses a strong configuration of a standard encryption algorithm. This means using the Cipher Block Chaining (CBC) mode, in which each block of ciphertext depends on the block before it, so identical blocks of cleartext do not produce identical blocks of ciphertext. CBC on its own does not detect tampering; that is the job of the optional HMAC.
Then, to make it easier to deal with encrypted data throughout the rest of the application I encode both it and the IV as Base64 strings. Finally, I concatenate both the IV and encrypted data together into a single string to make it easier to keep them together for future decryption.
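The encryption and packaging steps can be sketched roughly as follows. This assumes AES as the algorithm, PKCS7 padding, and a ':' separator between the two Base64 strings; the post names none of these, so treat them (and the class and method names) as placeholders rather than the repository's actual choices.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical sketch; algorithm, padding and separator are assumptions.
public static class EncryptSketch
{
    // Encrypts the payload and returns "Base64(IV):Base64(ciphertext)".
    public static string Encrypt(byte[] payload, byte[] key)
    {
        // A new random IV for every encryption operation.
        byte[] iv = new byte[16];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(iv);
        }

        using (var aes = Aes.Create())
        {
            aes.Mode = CipherMode.CBC;
            aes.Padding = PaddingMode.PKCS7;
            aes.Key = key;
            aes.IV = iv;

            using (var encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(payload, 0, payload.Length);

                // ':' never appears in Base64 output, so it is a safe separator.
                return Convert.ToBase64String(iv) + ":" + Convert.ToBase64String(cipher);
            }
        }
    }
}
```

Encrypting the same payload twice yields different strings, because the IV changes on every call.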
The decryption process is a reversal of the encryption, where I split off the IV from the payload, decrypt the data using the same algorithm implementation and key and optionally verify any HMAC that is encrypted with the data. If all the decryption logic works I return the cleartext encoded as a Unicode string to the caller.
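A minimal sketch of the reverse path is shown below. It assumes the payload was packaged as "Base64(IV):Base64(ciphertext)" with AES in CBC mode, and that the optional integrity tag is a 32-byte HMAC-SHA256 appended to the cleartext bytes; all of those details, along with the names used, are my assumptions rather than facts from the repository.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch; format and algorithm choices are assumptions.
public static class DecryptSketch
{
    private const int TagLength = 32; // HMAC-SHA256 output size

    // Pass hmacKey only if an HMAC was appended before encryption.
    public static string Decrypt(string packaged, byte[] key, byte[] hmacKey = null)
    {
        string[] parts = packaged.Split(':');
        if (parts.Length != 2)
            throw new FormatException("Expected IV:ciphertext.");

        byte[] iv = Convert.FromBase64String(parts[0]);
        byte[] cipher = Convert.FromBase64String(parts[1]);

        byte[] plain;
        using (var aes = Aes.Create())
        {
            aes.Mode = CipherMode.CBC;
            aes.Padding = PaddingMode.PKCS7;
            aes.Key = key;
            aes.IV = iv;
            using (var decryptor = aes.CreateDecryptor())
            {
                plain = decryptor.TransformFinalBlock(cipher, 0, cipher.Length);
            }
        }

        int dataLength = plain.Length;
        if (hmacKey != null)
        {
            dataLength = plain.Length - TagLength;
            if (dataLength < 0)
                throw new CryptographicException("Payload too short to contain an HMAC.");

            using (var hmac = new HMACSHA256(hmacKey))
            {
                byte[] expected = hmac.ComputeHash(plain, 0, dataLength);

                // Compare every byte so timing does not reveal where a mismatch occurs.
                int diff = 0;
                for (int i = 0; i < TagLength; i++)
                    diff |= expected[i] ^ plain[dataLength + i];
                if (diff != 0)
                    throw new CryptographicException("HMAC verification failed.");
            }
        }

        return Encoding.Unicode.GetString(plain, 0, dataLength);
    }
}
```

Throwing on an HMAC mismatch, rather than returning partial data, keeps a tampered payload from ever reaching the caller as if it were valid cleartext.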
Key Management
While implementing a fully featured key server is the subject of a future blog post, I will review some of the basics of the key management functionality of my solution. All key values are stored separately from the encrypted data, which gives attackers a higher bar to clear before they can extract any useful information from my encrypted data.
I talk to the key server through the IKeyServer interface, which provides access to the actual encryption keys without tying my code to any particular concrete implementation of a key server. This gives me the flexibility to use a number of different storage technologies and security layers to protect my encryption keys.
In order to decrypt the data I need to be able to retrieve the correct encryption key. To enable this I need to store some sort of pointer alongside the encrypted data that denotes which key was used to encrypt it. I accomplish this with the EncryptionKeys dictionary in the type definition. This dictionary stores the identifier of the key that was used to encrypt each property. With my implementation we can have multiple properties, each encrypted with a different encryption key. By doing this we increase the strength of our encryption by minimizing the reuse of encryption keys. This can also form the basis of a strong data visibility restriction: by withholding specific encryption key values from users, we make it impossible for unauthorized users to access sensitive data, since the cleartext of that data is never on their systems to start with.
The primary method I need to encrypt and decrypt data is the GetKey method. All I need to do is pass in the identifier of the key that was used to encrypt the property and it will return the encryption key.
Another useful method on the interface is Map, which returns a dictionary of the property names and key identifiers that the server knows about. This is useful for administratively specifying which keys encrypt which properties without needing to define key identifiers in the source code. A Keys method can also be implemented to return a list of the keys that are currently available from the server.
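To give a feel for the shape of such an interface, here is a rough sketch with a throwaway in-memory implementation. The method names GetKey, Map and Keys come from the description above, but the parameter and return types are my guesses, and the interface is deliberately named IKeyServerSketch so it is not mistaken for the real IKeyServer. The InMemoryKeyServer class is hypothetical and only suitable for tests, since it keeps keys in process memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Guessed shape only; the real IKeyServer signatures may differ.
public interface IKeyServerSketch
{
    // Returns the base key value for the given key identifier.
    string GetKey(string keyId);

    // Returns property name -> key identifier assignments known to the server.
    IDictionary<string, string> Map();

    // Returns the identifiers of the keys currently available.
    IList<string> Keys();
}

// Hypothetical test double; a real key server would keep keys out of the application.
public sealed class InMemoryKeyServer : IKeyServerSketch
{
    private readonly Dictionary<string, string> _keysById;
    private readonly Dictionary<string, string> _keyIdByProperty;

    public InMemoryKeyServer(IDictionary<string, string> keysById,
                             IDictionary<string, string> keyIdByProperty)
    {
        // Copy the inputs so later changes by the caller do not leak in.
        _keysById = new Dictionary<string, string>(keysById);
        _keyIdByProperty = new Dictionary<string, string>(keyIdByProperty);
    }

    public string GetKey(string keyId)
    {
        string key;
        if (!_keysById.TryGetValue(keyId, out key))
            throw new KeyNotFoundException("Unknown key identifier: " + keyId);
        return key;
    }

    public IDictionary<string, string> Map()
    {
        return new Dictionary<string, string>(_keyIdByProperty);
    }

    public IList<string> Keys()
    {
        return _keysById.Keys.ToList();
    }
}
```

Coding against the interface means a test double like this can later be swapped for a server backed by a proper key store without touching the encryption code.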
Automatic Implementation
Rather than manually implement all the required code in base classes and expect other developers to correctly implement the functionality, I am taking advantage of Aspect-Oriented Programming to perform the boilerplate implementation of my encryption logic. This ensures that the correct code is injected into the compiled assembly. Now all any developer has to do is decorate their types and properties with the given attributes and they will automatically receive this functionality in their program.
To perform all the work necessary to store encrypted data a developer just has to decorate their type definition with the [EncryptedType] attribute. For any fields or properties that need to be secured the developer just has to decorate them with the [EncryptedValue] attribute.
By using PostSharp I also gain the ability to transparently override the field and property getters and setters to make sure that the cleartext data is never persisted into the object. This is useful from a local security perspective, as the object's state in the process's memory only ever holds the encrypted value, and it is vital in situations where the object is serialized for storage or transmission. By overriding both the setter and getter I ensure that any serialization of the object will not return cleartext data, as the data will always be encrypted and the field or property will always return the encrypted value.
As with all encryption, at some point you need to get back the cleartext value of the data. To support this in as easy a way as possible for the consuming developer, the IEncryptedType interface provides the ClearText method. Calling the ClearText method with the property name will return the unencrypted value of the field or property (provided the encryption key is both available and correct).
Since I am not a fan of having magic strings in my codebase, and I really like to have IntelliSense, I have also created an extension method, AsClear, that extracts the name of the property from a given expression and then returns the cleartext value of that property.
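The property-name extraction that such an extension method relies on can be sketched as below. This shows only the expression-tree technique, not the actual AsClear implementation; the PropertyNameSketch class and the Patient type are hypothetical.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type used only for illustration.
public class Patient
{
    public string SSN { get; set; }
    public int Age { get; set; }
}

// Illustrates the technique only; not the AsClear method from the repository.
public static class PropertyNameSketch
{
    // Returns the member name from a lambda such as p => p.SSN.
    public static string NameOf<T>(Expression<Func<T, object>> selector)
    {
        Expression body = selector.Body;

        // Value-type members are wrapped in a Convert node when boxed to object.
        if (body is UnaryExpression unary && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        if (body is MemberExpression member)
            return member.Member.Name;

        throw new ArgumentException(
            "Expected a property or field access.", nameof(selector));
    }
}
```

With this in place, `PropertyNameSketch.NameOf<Patient>(p => p.SSN)` yields the string "SSN", so a rename of the property is caught by the compiler instead of silently breaking a string literal.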
Usage
Finally, it is time to put all of this work into practice. While I will be writing future blog posts detailing a full implementation of all the features, the code below from my (sparse) tests demonstrates how to use the encryption.
I first define a type that will have an encrypted property and decorate the SSN property as the one I want to encrypt. Next I store a social security number in the SSN property and verify that the value I read back through the standard getter is encrypted. Finally I test that the decryption works correctly by retrieving the cleartext version of the property data by casting to IEncryptedType and calling the AsClear extension method on the SSN property.
Get the bits
All of the code required to implement transparent encryption of fields and properties is available under an Apache open source license. I plan to continue working on this code and improving it. If you would like to see the code or contribute to it I have it in a GitHub repository. Please feel free to let me know your feedback. I would love to see this project grow and be useful to people.
For easy use in your own solution you can install a prebuilt version of the code from NuGet. At a minimum you will need both the IEncryptedType and the EncryptedType packages.
You can also get a RavenDB specific implementation that ensures that only a minimal set of serialized properties actually required to support the encryption/decryption will be persisted to the storage engine.
In Conclusion (for now)
This is the first post in my series. Future installments will dive deeper into the specifics of the PostSharp implementation, enhance the logic to allow for effective and secure seeking through encrypted data, build a fully functional key server and create storage-specific implementations of the code.
I hope you get some good use out of this code. I am always open to hearing about specific use cases, feature requests or bug reports. You can email me, hit me up on Twitter or find me at a community event.
Feel free to take the code out for a test drive by grabbing it from NuGet or just check out the codebase and see how it works. As always do try to stay safe out there.