MongoDB sponsored this post.
While data privacy laws are not new, it was the European Union’s (EU) General Data Protection Regulation (GDPR), which took effect in May 2018, that raised the bar for personal data protection. The GDPR not only made the protection and privacy of individuals a legal obligation for organizations that collect and process personal data, but also entrenched data privacy as a fundamental human right of all individuals in the EU. Since the GDPR’s introduction, other state and federal governments around the world have started to enact similar regulations, including the new California Consumer Privacy Act (CCPA).
The CCPA, which took effect Jan. 1, 2020, borrows many core concepts from the GDPR. It will also likely serve as a catalyst globally, given California’s standing as the world’s fifth-largest economy and as home to some of the largest collectors of personal data.
While the CCPA and the GDPR differ in places, both define requirements and controls that govern how organizations collect, store, process, retain and share the personal data of individuals. Because both are founded on the principle of giving individuals greater protection and a better understanding of how their data is collected and used, a database such as MongoDB can become a strategic tool in complying with these regulations.
Mapping Privacy Controls to Required Database Capabilities
As with any data security regulation, enabling controls in a database storing personal data is just one step towards compliance — people and processes also are essential. There are, however, specific requirements stated in the GDPR text and similar regulations that define a set of controls organizations need to implement across their data management landscape. We can group these requirements into three areas:
- Discover: scope data subjects (individuals) and types of personal data that are subject to the regulation.
- Defend: implement measures to protect discovered personal data.
- Detect: identify a breach that impacts that personal data and remediate security and process gaps.
Before implementing security controls, an organization first needs to identify:
- Types of personal data stored in its databases.
- Who the data subjects are.
- The purposes of processing that data.
- Who the data is shared with and/or where it is transferred.
- For how long the organization will retain that data.
It is therefore important to have access to tools that enable the data controller to quickly and conveniently review database content and — as part of an ongoing discovery process — to inspect what additional data will be captured as new services develop. The controller needs to record where all of the data is stored to meet the record of processing obligations of GDPR Article 30 as well as other obligations such as responding to a data subject request within the timeframes defined by GDPR and CCPA.
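As a rough illustration of what such a discovery pass might look like, the sketch below walks a handful of sample documents and flags fields that look like personal data. The field-name hints, the email pattern and the sample documents are all assumptions made up for this example; a real inventory would use a much richer rule set and would sample documents from the actual collections.

```python
import re
from collections import defaultdict

# Hypothetical hints: substrings in a field name that suggest personal data.
NAME_HINTS = ("email", "phone", "name", "address", "birth", "ssn")
# Deliberately simple pattern for values that look like an email address.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def walk(doc, prefix=""):
    """Yield (dotted_path, value) for every leaf field of a nested document."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from walk(value, path)
        else:
            # Lists are treated as leaves here to keep the sketch short.
            yield path, value


def find_personal_fields(documents):
    """Return {dotted_path: set_of_reasons} for fields that look personal."""
    findings = defaultdict(set)
    for doc in documents:
        for path, value in walk(doc):
            leaf = path.rsplit(".", 1)[-1].lower()
            if any(hint in leaf for hint in NAME_HINTS):
                findings[path].add("field name")
            if isinstance(value, str) and EMAIL_RE.match(value):
                findings[path].add("email-like value")
    return dict(findings)


# Made-up sample documents standing in for a sampled collection.
sample = [
    {"_id": 1, "contact": {"email": "ada@example.com", "phone": "555-0100"}, "plan": "pro"},
    {"_id": 2, "contact": {"email": "bob@example.com"}, "alt": "bob@example.com"},
]
report = find_personal_fields(sample)
for path in sorted(report):
    print(path, sorted(report[path]))
```

Checking both field names and values matters because personal data often ends up in fields whose names give no hint, such as the `alt` field in the sample.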
Once the organization has completed the Discover phase, it needs to implement the security controls required to protect personal data. Encrypting personal data is an industry-standard safeguard, often paired with security tooling that actively monitors network activity.
Encrypting databases isn’t new, but there have been limitations on where and when data is encrypted. Traditionally, data is encrypted on the “server side,” meaning anyone with full access credentials, such as a database administrator (DBA) or site reliability engineer, could decrypt and access all of the data. That setup leaves you exposed to unintended risk.
To address that risk, MongoDB 4.2 introduced client-side field-level encryption: the capability to selectively encrypt and decrypt individual document fields in an application before data is sent to the database. This not only provides further protection for sensitive personal data, it can also make it easier to comply with obligations to erase personal data under GDPR and CCPA. Upon receipt of a “request to be forgotten” from a data subject, or upon expiration of a documented retention period for personal data, you can destroy the encryption key associated with the encrypted personal data, and the data is rendered unreadable. As a result, the personal data becomes unrecoverable everywhere it is stored, including logs and backups, without having to locate and delete each copy.
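The key-destruction idea can be sketched in a few lines. The example below is not MongoDB's implementation and does not use its drivers: `SubjectKeyVault` is a hypothetical in-memory key store with one key per data subject, and the cipher is a toy keystream built from HMAC-SHA256 so that the example runs with the standard library alone. A real application would rely on the driver's client-side field-level encryption and an external key management service instead.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


class SubjectKeyVault:
    """Hypothetical key store holding one data key per data subject."""

    def __init__(self):
        self._keys = {}

    def encrypt_field(self, subject_id: str, plaintext: str) -> dict:
        # Create the subject's key on first use.
        if subject_id not in self._keys:
            self._keys[subject_id] = secrets.token_bytes(32)
        nonce = secrets.token_bytes(16)
        data = plaintext.encode("utf-8")
        stream = _keystream(self._keys[subject_id], nonce, len(data))
        return {"nonce": nonce, "ciphertext": _xor(data, stream)}

    def decrypt_field(self, subject_id: str, blob: dict) -> str:
        key = self._keys[subject_id]  # raises KeyError once the key is destroyed
        stream = _keystream(key, blob["nonce"], len(blob["ciphertext"]))
        return _xor(blob["ciphertext"], stream).decode("utf-8")

    def forget(self, subject_id: str) -> None:
        """Destroy the subject's key; existing ciphertext becomes unreadable."""
        self._keys.pop(subject_id, None)


vault = SubjectKeyVault()
# Only the encrypted form of the field would be sent to the database.
stored_doc = {"_id": 42, "email": vault.encrypt_field("subject-42", "ada@example.com")}
recovered = vault.decrypt_field("subject-42", stored_doc["email"])

# Erasure request: the stored document is untouched, only the key is destroyed.
vault.forget("subject-42")
```

After `forget`, every copy of `stored_doc`, including copies in backups, still exists but can no longer be decrypted. This only works if the key itself is never backed up alongside the data.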
GDPR and CCPA emphasize the importance of ensuring that only authorized users can access personal data. Across the industry, this has traditionally been at odds with how databases are operated, because a database administrator (DBA) or operator would have access to personal data. As explained above, client-side field-level encryption in MongoDB 4.2 can assist with this.
Within the database, it should be possible to enforce authentication controls so that only clients (e.g., users, applications, administrators) authorized by the data controller can access personal data. The database should also allow data controllers to define the specific roles, responsibilities and duties each client can perform against the personal data. For example, some clients may be permitted to read all of the source data collected on a data subject, while others may only have permission to access aggregated data that contains no reference back to personal identifiers. This approach permits fine-grained segregation of duties and privileges for each data processor and is a way to practice defense in depth.
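One way to picture this segregation is as two read-only roles, one scoped to the raw data and one to an aggregated collection. The sketch below only builds the documents for MongoDB's `createRole` command and does not connect to a server. The role, database and collection names are hypothetical, and the helper `read_only_role` is invented for this example.

```python
def read_only_role(role_name: str, db_name: str, collection: str) -> dict:
    """Build a createRole command that grants only `find` on one collection."""
    return {
        "createRole": role_name,  # the command name must be the first key
        "privileges": [
            {
                "resource": {"db": db_name, "collection": collection},
                "actions": ["find"],
            }
        ],
        "roles": [],  # no inherited roles
    }


# Hypothetical names: raw records vs. a pre-aggregated collection
# that carries no personal identifiers.
raw_reader = read_only_role("rawCustomerReader", "crm", "customers")
stats_reader = read_only_role("aggregateReader", "crm", "customer_stats")

print(raw_reader)
print(stats_reader)
```

With a driver, each document would be passed to the database's command runner by a suitably privileged administrator, and users would then be granted one role or the other. Keeping the aggregated data in its own collection is a design choice that lets the privilege boundary follow the collection boundary.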
Future of Data Sovereignty in the Cloud
One problem with legacy infrastructure is that it often limits you to storing data in a single location, which can be an obstacle to achieving data sovereignty, not to mention low-latency performance. Modern distributed database systems in the cloud, however, provide more granular controls that let you choose the region, and in some cases the data center, where your application’s data is stored.
While it can be challenging for businesses to comply with data privacy regulations on their own, organizations can choose their cloud provider and keep their data in a chosen storage location by using a feature called Atlas Global Clusters in their database layer. After you select the location, MongoDB moves the data automatically in the background without impacting your application.
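The general idea behind location-based placement can be sketched as a lookup from a document's location to a storage zone. This is only an illustration of the concept, not how Atlas implements it. It assumes each document carries a `location` field holding an ISO 3166-1 alpha-2 country code, and the zone names and mapping are made up.

```python
# Hypothetical mapping from country code to storage zone.
ZONE_BY_COUNTRY = {
    "DE": "eu-zone",
    "FR": "eu-zone",
    "US": "us-zone",
    "CA": "us-zone",
}


def zone_for(document: dict) -> str:
    """Return the zone in which a document should be stored."""
    code = document["location"].upper()
    try:
        return ZONE_BY_COUNTRY[code]
    except KeyError:
        # Refuse to guess: an unmapped location should not silently
        # land in a default region.
        raise ValueError(f"no zone configured for location {code!r}") from None


print(zone_for({"_id": 1, "location": "de"}))
print(zone_for({"_id": 2, "location": "US"}))
```

Raising an error for an unmapped location, rather than falling back to a default zone, is a deliberate choice in this sketch: for data sovereignty, storing a record in the wrong region is usually worse than rejecting the write.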
While these regulations can appear onerous, forward-thinking companies are using them to spur innovation in how they manage customer data and transform their interactions with customers. For example, identifying all personal data helps them build a single-pane-of-glass view of their customers. Complying with these regulations isn’t just a necessity; it can help you better serve your customers.