Enterprises across industry segments are moving IT workloads and functions to the cloud, frequently ahead of any strategy or consistent capability to secure sensitive data. The advantages of cloud migration, such as scale, agility, and consumption-based pricing, are compelling and seem to outweigh the risks in the short term. Most enterprise IT today is hybrid, with some workloads in the cloud and some hosted within the enterprise datacenter. Many enterprises are adopting a “cloud-first” or “cloud-only” approach for all new IT functions and business. Due to a combination of decentralized IT functions, frequent mergers and acquisitions, and shadow IT, most enterprises are multi-cloud, leveraging more than one cloud service provider (CSP). Data security is rarely the first consideration in the selection of a CSP.
The emergence of strict new data privacy regulations, such as GDPR and CCPA, is driving the need for CISOs to address data protection and data governance more effectively in complex and geographically diverse hybrid IT ecosystems. The terms pseudonymization and anonymization are now common in the context of these privacy regulations. Pseudonymized data still allows for some form of re-identification (even if indirect and remote), whereas anonymized data cannot be re-identified.
CISOs look to the CSPs for data security solutions to address these privacy requirements but struggle with the confusing array of security models and services they offer. CSPs offer native key management, encryption, and Hardware Security Module (HSM) services. These security services have typically been added as a layer on top of the CSPs’ existing stacks, as afterthoughts following a late recognition of their customers’ increasing data security concerns, and are not enterprise-grade.
As most enterprises are also multi-cloud, the challenges inherent in CSP security offerings include deficiencies in uniformity, homogeneity, coverage, customer control and ownership, functionality, scalability, performance, visibility, and more. On top of these, there are broader challenges with key management and vendor lock-in.
In this article, we describe the various data-centric security offerings of the “Big Three” CSPs: Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. We attempt to reflect objectively on what their data security services entail, based on the documentation published by the CSPs. We also outline what enterprises should be aware of before consuming these services, given the CSPs’ belated yet increasing capabilities in the data-centric security space. Note that the information captured here is a point-in-time assessment and is subject to change as the CSPs continue to enhance and expand their services. This article is intended for anyone who deals with data security and cloud security, including executives, hands-on IT and security professionals, and anyone with a passion for or interest in cybersecurity in general, across enterprises and service providers.
As enterprises transition from being just compliant to being secure, they must focus on data-centric security services that keep their sensitive data protected persistently (at rest, in transit, and in use) rather than on server-side or transparent encryption services across storage and databases, which offer very little actual security. The good news is that the CSPs themselves have recognized the increasing need for data-centric security and have started to offer new capabilities in this space. CSPs offer two kinds of cloud cryptographic (crypto) services to enable the implementation of data-centric security: key broker and key management services, and cloud HSM services.
This whitepaper also touches upon the Google Cloud DLP service, which is a composite service that detects sensitive data and applies policies on the detected data.
Key broker and key management services (KMS) typically expose an API for managing keys and secrets. Key management or brokerage across all of the Big Three CSPs is premised on the Master Key and Working Key model. The Master Key, usually referred to as the Customer Master Key (CMK), never leaves the KMS application and is not used to protect sensitive data in bulk. It is typically used to generate Working Keys and/or to encrypt Working Keys or other secrets, and thus serves as a Key Encryption Key (KEK). Working Keys are Data Encryption Keys (DEKs) and are used by applications to encrypt and decrypt the actual sensitive data. AWS and GCP use symmetric (AES-256) CMKs, but Azure uses only asymmetric (RSA-2048, -3072, -4096) key pairs, storing the private keys in its KMS. CMKs may either be software-managed or stored inside a FIPS-compliant HSM controlled by the CSP. There are different models of Master Key management in terms of customer control and visibility:
Cloud HSM is a service through which keys are generated by, and stored within, FIPS 140-2-compliant HSMs that are hosted and managed by the CSP. This model allows higher throughput than the KMS-based model of encryption. These HSMs offer a subset of the PKCS#11 standard API specifications, which are exposed either directly or through the KMS interface to take advantage of the KMS's existing integrations with other cloud services.
An important caveat is that these CSP crypto services are available only in specific physical locations, referred to as Regions. Even where these services are available, cross-region integrations and availability of keys across CSP regions are not guaranteed. Some CSPs do not specify their level of FIPS 140-2 compliance.
While these KMSs are used to generate, store, protect, and retrieve encryption keys, it is important to understand the mechanism of application-level data encryption implemented and supported at these CSPs. CSPs implement envelope encryption, which is the practice of encrypting plaintext data with a working key (a DEK), and then encrypting the DEK with a master key (the CMK). CSPs typically offer software development kits (SDKs) that are used by the application to perform envelope encryption.
The encryption process works like this:
Fig. 1: AWS Envelope Encryption
The decryption process works like this:
Fig. 2: AWS Envelope Decryption
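The envelope pattern described above can be illustrated with a minimal Python sketch. It is not any CSP's SDK: `FakeKMS`, `generate_data_key`, and `decrypt_data_key` are hypothetical names for an in-process stand-in, and `toy_encrypt`/`toy_decrypt` are a standard-library substitute (HMAC-SHA256 keystream plus an HMAC tag) for a real cipher such as AES-256, used only so the example runs without third-party packages. It should not be used to protect real data.

```python
import hashlib
import hmac
import secrets


def _subkeys(key):
    # Derive separate encryption and authentication keys from one key.
    return (hmac.new(key, b"enc", hashlib.sha256).digest(),
            hmac.new(key, b"mac", hashlib.sha256).digest())


def _keystream(key, nonce, length):
    # HMAC-SHA256 in counter mode: a toy stand-in for a real cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    enc_key, mac_key = _subkeys(key)
    stream = _keystream(enc_key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_decrypt(key, blob):
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    enc_key, mac_key = _subkeys(key)
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class FakeKMS:
    """Hypothetical in-process KMS; the master key never leaves this object."""

    def __init__(self):
        self._cmk = secrets.token_bytes(32)  # master key (KEK)

    def generate_data_key(self):
        dek = secrets.token_bytes(32)  # working key (DEK)
        return dek, toy_encrypt(self._cmk, dek)

    def decrypt_data_key(self, wrapped_dek):
        return toy_decrypt(self._cmk, wrapped_dek)


def envelope_encrypt(kms, plaintext):
    dek, wrapped_dek = kms.generate_data_key()
    ciphertext = toy_encrypt(dek, plaintext)
    # Only the wrapped DEK is stored alongside the ciphertext.
    return {"wrapped_dek": wrapped_dek, "ciphertext": ciphertext}


def envelope_decrypt(kms, envelope):
    dek = kms.decrypt_data_key(envelope["wrapped_dek"])
    return toy_decrypt(dek, envelope["ciphertext"])


kms = FakeKMS()
envelope = envelope_encrypt(kms, b"sensitive record")
assert envelope_decrypt(kms, envelope) == b"sensitive record"
```

Note that the application handles the plaintext DEK directly in this model, while the master key is only ever used inside the KMS to wrap and unwrap DEKs.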
CSPs also allow customers to import their own key material. This “Bring Your Own Key” (BYOK) model lets customers generate the keys themselves (typically using on-premises HSMs) and upload them to the CSP KMS. Customers are usually required to download a certificate from the CSP, along with an import token. The symmetric keys generated by the customer are encrypted using the public key that is bound to the downloaded certificate. The encrypted symmetric key(s), plus the CSP token or a hash of the key material, are then uploaded to the CSP KMS. The tokens and hashes are used for authentication and integrity checks. In some BYOK implementations, the CSP requires padding and Base64-encoding of the encrypted key(s) prior to upload.
Fig. 3: Bring Your Own Key Sequence
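As a rough illustration of the final packaging step in the BYOK sequence, the sketch below Base64-encodes an already wrapped key and attaches a hash of the key material together with the import token. The field names and both functions are hypothetical and do not correspond to any CSP's import API. The public-key wrapping itself is assumed to have happened elsewhere (for example, in an on-premises HSM), so the demo uses placeholder bytes in its place.

```python
import base64
import hashlib
import hmac
import secrets


def build_import_request(wrapped_key, key_material, import_token):
    """Package an already wrapped key for upload (hypothetical field names)."""
    key_hash = hashlib.sha256(key_material).digest()
    return {
        "encrypted_key_material": base64.b64encode(wrapped_key).decode("ascii"),
        "key_material_sha256": base64.b64encode(key_hash).decode("ascii"),
        "import_token": base64.b64encode(import_token).decode("ascii"),
    }


def key_hash_matches(unwrapped_key, request):
    """Integrity check a receiver could run after unwrapping the key."""
    expected = base64.b64decode(request["key_material_sha256"])
    actual = hashlib.sha256(unwrapped_key).digest()
    return hmac.compare_digest(actual, expected)


# Demo with placeholder values: the 256 zero bytes stand in for the output of
# the public-key wrapping step, and the token is a dummy value.
key_material = secrets.token_bytes(32)
request = build_import_request(b"\x00" * 256, key_material, b"dummy-import-token")
assert key_hash_matches(key_material, request)
```

Whether a given CSP expects the token, the hash, or both, and in what encoding, varies by provider and should be taken from that provider's documentation.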
Google Cloud Data Loss Prevention (DLP) provides APIs for sensitive data inspection, classification, and de-identification. It includes a number of built-in information type detectors, and allows definition of custom detectors. It offers de-identification techniques including redaction, masking, format-preserving encryption, and date-shifting as optional actions to be taken on detected sensitive data within streams of data, structured text, files in storage repositories such as Google Cloud Storage and BigQuery, and even within images. The keys for data redaction are either:
Fig. 4: Google Cloud DLP
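To make the inspect-then-de-identify flow more concrete, here is a small standard-library Python sketch of detection, redaction, masking, and keyed date-shifting. It does not call the Google Cloud DLP API. The detector names, regular expressions, and function names are illustrative assumptions, and the patterns are far simpler than production detectors would be.

```python
import hashlib
import hmac
import re
from datetime import date, timedelta

# Hypothetical detectors, loosely analogous to built-in or custom info types.
DETECTORS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN_LIKE": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def inspect(text):
    """Return (info_type, start, end) findings sorted by position."""
    findings = []
    for info_type, pattern in DETECTORS.items():
        for match in pattern.finditer(text):
            findings.append((info_type, match.start(), match.end()))
    return sorted(findings, key=lambda finding: finding[1])


def redact(text):
    """Replace each finding with its info type label."""
    out, pos = [], 0
    for info_type, start, end in inspect(text):
        if start < pos:  # skip findings that overlap an earlier one
            continue
        out.append(text[pos:start])
        out.append("[" + info_type + "]")
        pos = end
    out.append(text[pos:])
    return "".join(out)


def mask(value, keep_last=4, mask_char="*"):
    """Mask letters and digits except the last keep_last characters."""
    cut = max(len(value) - keep_last, 0)
    head = "".join(mask_char if ch.isalnum() else ch for ch in value[:cut])
    return head + value[cut:]


def shift_date(d, key, record_id, max_days=30):
    """Shift a date by a keyed offset that is constant for a given record."""
    digest = hmac.new(key, record_id.encode("utf-8"), hashlib.sha256).digest()
    offset = int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days
    return d + timedelta(days=offset)


sample = "Contact jane.doe@example.com, SSN 123-45-6789."
print(redact(sample))       # Contact [EMAIL_ADDRESS], SSN [US_SSN_LIKE].
print(mask("123-45-6789"))  # ***-**-6789
print(shift_date(date(2020, 3, 1), b"demo-key", "patient-17"))
```

Because the date offset is derived from a key and a record identifier, all dates belonging to one record move by the same number of days, which preserves the intervals between them.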
A number of issues and challenges around scale, availability, portability, performance, and security of these CSP crypto services should be considered.