In the big data era, data receives more and more attention. The deep integration of big data and artificial intelligence (AI) has had a profound and widespread impact on all walks of life, including government, finance, carriers, electric power, and the Internet. In addition, the circulation of data and the release of its value have further promoted economic development and productivity. However, the opportunities brought by data are accompanied by security challenges. Recent years have witnessed increasingly severe security issues, such as frequent large-scale data breaches, “big data-driven price differentiation”, data discrimination, illegal collection of personal information, and privacy theft. These issues have harmed society.
In response to these challenges, a wave of legislation on data security and privacy has swept the world, and regulatory supervision has been continuously strengthened. The European Union implemented the General Data Protection Regulation (GDPR) in 2018[i], the California Consumer Privacy Act (CCPA) took effect in the USA in 2020[ii], and Japan passed the revised Personal Information Protection Law in June 2020. In July and October 2020, China successively released two draft laws, i.e. the Data Security Law (Draft) and the Personal Information Protection Law (Draft). Compliance has become an important driving force for enterprise data security governance and development.
To facilitate the understanding of data security evolution, this paper divides data security into two stages, i.e. data security development without compliance requirements and data security development with compliance requirements, which are also called Data Security 1.0 and 2.0.
Data Security 1.0
Once enterprises pay enough attention to the value of data, they begin to engage in data security development. At this stage, data security development is proactive, asset-driven, and low-cost. The objects of protection are the sensitive data and important assets defined by the enterprises themselves. For instance, enterprises apply traditional cybersecurity controls such as encryption, access control, auditing, and database firewalls to protect the data in various asset databases. Therefore, data security at this stage is usually treated as a branch of cybersecurity. In addition, enterprises tend to use transparent encryption and data loss prevention (DLP) products to protect sensitive documents and core technical materials, in a bid to prevent internal personnel from illegally copying them and hackers from stealing them.
Figure 1 Data Security 1.0: data security protection from the perspective of critical enterprise assets
Data Security 2.0
As relevant regulations are gradually adopted and improved, data security is no longer just an internal issue that enterprises handle privately. It is also a regulatory issue that involves national regulatory authorities. Data security at the current stage (Data Security 2.0) can be considered an upgraded version of Data Security 1.0. While satisfying their internal data security requirements, enterprises must also comply with the relevant clauses in laws and regulations. The process is mainly driven by passive compliance and supplemented by proactive data security development. Additionally, it involves high costs. Compared with Data Security 1.0, Data Security 2.0 has gone through the following radical changes:
- Data Security 2.0 protects a wider range of data objects. At the Data Security 1.0 stage, enterprises value data security protection of critical assets and focus on protecting sensitive enterprise data and a small quantity of personal private data, such as a user’s login password. At the Data Security 2.0 stage, enterprises need to protect three types of sensitive data: sensitive enterprise data, personal private data, and sensitive national data, as shown in the following figure. Personal private data and sensitive national data are protected and managed in accordance with laws and regulations. Both GDPR and CCPA make clear that the personal private data of citizens shall be protected. Note that the personal data/personal information defined by regulations includes not only basic personal information in the traditional sense, such as the ID card number, mobile phone number, and address, but also the device’s IP address, MAC address, and cookie information. This broad scope poses huge challenges to enterprises in the identification and protection of sensitive data.
Figure 2 Objects and categories of sensitive data covered by data security
- Data Security 2.0 covers a greater variety of application scenarios. In the Data Security 1.0 era, enterprises orient data protection towards critical assets such as databases and confidential documents. In the Data Security 2.0 era, regulations apply not only to database and document data but also to personal private data and important data, as defined by regulations, wherever they reside: on big data platforms, clouds, and terminals, and in new environments such as evolving 5G networks and blockchains, as shown in the following figure.
Figure 3 Various environments covered by data security
- Data Security 2.0 spans the lifecycle of data. In the Data Security 1.0 era, data security is primarily oriented to data transmission and storage. In the Data Security 2.0 era, data security spans the entire data lifecycle, including data collection, transmission, storage, processing, exchange, and destruction, as shown in the following figure. For instance, data collection must follow the compliance principles of collecting only the minimum data necessary and obtaining users’ consent.
Figure 4 Lifecycle of data security
Driven by compliance, business growth, and increasing data scale, the scope of data security has expanded greatly from Data Security 1.0 to Data Security 2.0, and single-point data security development has become systematic. According to Gartner, systematic data security development is a process of data security governance[iii] in that it extends across the organization from top to bottom, from decision-making to technology, and from management systems to tool support. All levels within an organization need to reach a consensus on the goals of data security governance and ensure that reasonable and appropriate measures are taken to protect digital assets in the most effective ways.
As data security regulations are continuously strengthened worldwide, enterprises have to take compliance into consideration in their data security efforts. It’s safe to say that compliance has become an important driving force for enterprise data security work and governance. However, laws and regulations impose more comprehensive and restrictive data security requirements on enterprises, which brings unprecedented challenges to traditional data security technologies and products.
Based on the application scenarios and data distribution of enterprises, data security compliance scenarios can be divided into three types: security and compliance of user privacy data, data security governance within enterprises, and data sharing and computing between enterprises.
- Security and compliance of user privacy data: In scenarios where enterprises interact with users, enterprises must guarantee data security and privacy compliance. Specifically, this includes multiple subscenarios, such as privacy protection in data collection, governance and visualization of personal information, and responses to users’ data rights requests (right of access, right to erasure, right to restriction of processing, etc.).
- Data security governance within enterprises: In the internal network environment of enterprises, it is necessary to protect and monitor sensitive and important data in storage and in use. This includes multiple subscenarios, such as identification and classification of sensitive data, assessment of residual risks in masked data, and detection of abnormal data operations.
- Data sharing and computing between enterprises: When two or more enterprises carry out data sharing and computing tasks together, they should ensure data and privacy security while meeting business needs. To be specific, this includes multiple subscenarios, such as release and sharing of personal data, secure storage and computing of data on clouds, secure sharing and computing of multiparty data, and secure joint AI modeling on multiparty data.
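As a rough illustration of the "identification and classification of sensitive data" subscenario listed above, the following minimal sketch scans free text for two of the identifier types that regulations treat as personal data (IP and MAC addresses). The `PATTERNS` table and the `find_sensitive` helper are hypothetical names introduced here for illustration only. A real product would need many more categories, locale-specific rules, and validation beyond pattern matching.

```python
import re

# Hypothetical pattern table for illustration only. All groups are
# non-capturing so that findall() returns the full matched string.
PATTERNS = {
    "ipv4_address": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "mac_address": re.compile(
        r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b"
    ),
}


def find_sensitive(text):
    """Return {category: [matches]} for every category found in text."""
    hits = {}
    for category, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[category] = found
    return hits


if __name__ == "__main__":
    sample = "login from 192.168.1.10 on device 00:1A:2B:3C:4D:5E"
    print(find_sensitive(sample))
```

Pattern matching of this kind only flags candidates. Deciding whether a match really is personal data, and how it should be classified, still depends on context and on the applicable regulation.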
Each subscenario of the preceding three types of scenarios not only has its own internal security requirements, but is also subject to strict compliance requirements, which can be mapped to the compliance provisions of GDPR. To cope with the security and compliance challenges posed by these scenarios, enterprises can adopt 10 cutting-edge technologies, such as differential privacy, homomorphic encryption, and secure multiparty computation, as shown in the following figure.
Figure 5 Go beyond compliance: mapping of data security scenarios to cutting-edge technologies
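To give a feel for one of the technologies named above, here is a minimal sketch of differential privacy using the standard Laplace mechanism on a counting query. It is not taken from the figure or from any specific product. The function names `laplace_noise` and `private_count` and the sample data are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Return a noisy count of the records that satisfy predicate.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 52, 61, 38]  # hypothetical user data
    noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0)
    print(f"noisy count of users aged 40+: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and therefore gives stronger privacy at the cost of accuracy. Answering several queries on the same data consumes privacy budget cumulatively, which this sketch does not track.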
We will elaborate on how to use cutting-edge technologies in each of the three scenarios to help enterprises embrace and go beyond compliance. Stay tuned!
[i] General Data Protection Regulation (GDPR). https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2016.119.01.0001.01.ENG&toc=OJ:L:2016:119:TOC
[ii] California Consumer Privacy Act. https://cal-privacy.com/.
[iii] Gartner Summit 2019. Outlook for Data Security 2019