With the new General Data Protection Regulation (GDPR) taking effect in May 2018, companies based in Europe, or holding personal data of people residing in Europe, are struggling to locate the most valuable asset in their organization: their sensitive data.
The new regulation requires organizations to protect personally identifiable information (PII) against data breaches and to delete an individual's data when that individual requests it. After removing the PII, the company will need to prove to that person and to the authorities that it has been thoroughly removed.
Most companies today understand their obligation to demonstrate accountability and compliance, and have consequently started preparing for the new regulation.
There is so much information available about ways to protect sensitive data that one can easily be overwhelmed and start aiming in different directions, hoping to hit the target. If you plan your data governance ahead, however, you can still meet the deadline and avoid penalties.
Some organizations, mostly banks, insurance companies and manufacturers, possess an enormous amount of data, as they produce it at an accelerating pace by changing, saving and sharing files, creating terabytes and even petabytes of data. The difficulty for these kinds of firms is finding their sensitive data among millions of files, in structured and unstructured data, which in most cases is unfortunately close to an impossible mission.
The following personal identification data is classified as PII under the definition used by the National Institute of Standards and Technology (NIST):
o Full name
o Home address
o Email address
o National identification number
o Passport number
o IP address (when linked, but not PII by itself in the US)
o Vehicle registration plate number
o Driver’s license number
o Face, fingerprints, or handwriting
o Credit card numbers
o Digital identity
o Date of birth
o Genetic information
o Telephone number
o Login name, screen name, nickname, or handle
Most organizations that possess PII of European citizens are required to detect and protect against PII data breaches, and to delete PII from the company's data on request (often referred to as the right to be forgotten). Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016, published in the Official Journal of the European Union, states:
“The supervisory authorities should monitor the application of the provisions pursuant to this Regulation and contribute to its consistent application throughout the Union, in order to protect natural persons in relation to the processing of their personal data and to facilitate the free flow of personal data within the internal market.”
For companies that possess PII of European citizens to facilitate a free flow of PII within the European market, they need to be able to identify their data and categorize it according to the sensitivity levels defined in their organizational policy.
The regulation describes the flow of data and the market's challenges as follows:
“Rapid technological developments and globalisation have brought new challenges for the protection of personal data. The scale of the collection and sharing of personal data has increased significantly. Technology allows both private companies and public authorities to make use of personal data on an unprecedented scale in order to pursue their activities. Natural persons increasingly make personal information available publicly and globally. Technology has transformed both the economy and social life, and should further facilitate the free flow of personal data within the Union and the transfer to third countries and international organisations, while ensuring a high level of the protection of personal data.”
Phase 1 – Data Detection
So, the first step is creating a data lineage that shows where PII is spread across the organization and helps decision makers detect specific types of data. The EU recommends obtaining an automated technology that can handle large amounts of data by scanning it automatically. No matter how large your team is, this is not a project that can be handled manually when facing millions of files of different types hidden in various areas: in the cloud, in storage systems and on on-premises desktops.
The main concern for these types of organizations is that if they are not able to prevent data breaches, they will not be compliant with the new EU GDPR and may face heavy penalties.
They need to appoint specific employees who will be responsible for the entire process, such as a Data Protection Officer (DPO), who mainly handles the technological solutions; a Chief Information Governance Officer (CIGO), usually a lawyer, who is responsible for compliance; and/or a Compliance Risk Officer (CRO). This person needs to be able to control the entire process from end to end and to provide management and the authorities with complete transparency.
“The controller should give particular consideration to the nature of the personal data, the purpose and duration of the proposed processing operation or operations, as well as the situation in the country of origin, the third country and the country of final destination, and should provide suitable safeguards to protect fundamental rights and freedoms of natural persons with regard to the processing of their personal data.”
PII can be found in all types of files: not only in PDFs and text documents, but also in image documents (for example, a scanned check), in a CAD/CAM file, which can contain the intellectual property (IP) of a product, in a secret sketch, in code or binary files, and so on. Common technologies today can extract data from files, which makes data hidden in text easy to find, but in some organizations, such as manufacturers, most of the sensitive data may sit in image files. These types of files cannot be scanned accurately by text-based tools, and without technology that can detect PII in file formats other than text, one can easily miss this important information and cause the organization substantial damage.
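For the text-based part of detection, a minimal sketch of an automated scan might look like the following. It is an illustration under stated assumptions, not a production detector: the pattern table `PII_PATTERNS`, the helper `luhn_valid` and the function `find_pii` are hypothetical names, the regular expressions are deliberately simplified, and image or binary files would first need a separate extraction step (such as OCR) that is not shown here.

```python
import re

# Simplified, illustrative patterns (assumption): real detection needs
# locale-specific rules, validation and far broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_pii(text: str) -> dict[str, list[str]]:
    """Return the PII-like strings found in `text`, grouped by type."""
    hits: dict[str, list[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = [m.group().strip() for m in pattern.finditer(text)]
        if label == "credit_card":
            # Drop digit runs that merely look like card numbers.
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            hits[label] = matches
    return hits
```

Running `find_pii` over the extracted text of each file gives a per-file list of suspected PII types. The Luhn check is there to cut false positives on long digit runs such as order numbers.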
Phase 2 – Data Categorization
This stage consists of data mining performed behind the scenes by an automated system. The DPO/controller or the information security decision maker needs to decide whether to track certain data, block it, or send alerts about a data breach. To perform these actions, they need to view the data in separate categories.
Categorizing structured and unstructured data requires complete identification of the data while maintaining scalability: effectively scanning all databases without “boiling the ocean”.
The DPO is also required to maintain data visibility across multiple sources, and to quickly present all files related to a certain person according to specific entities such as name, date of birth, credit card number, social security number, telephone number, email address, etc.
In case of a data breach, the DPO shall report directly to the highest management level of the controller or the processor, or to the information security officer, who will be responsible for reporting the breach to the relevant authorities.
Article 33 of the EU GDPR requires reporting the breach to the supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware of it.
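The 72-hour window is simple arithmetic, but it is easy to get wrong when timestamps lack a time zone. The sketch below shows one way to compute the deadline. The function names `notification_deadline` and `hours_remaining` are hypothetical, and the code assumes the clock starts when the organization becomes aware of the breach.

```python
from datetime import datetime, timedelta, timezone

# 72-hour reporting window, counted from awareness of the breach (assumption).
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Return the latest time by which the authority should be notified."""
    if became_aware_at.tzinfo is None:
        # Naive timestamps are ambiguous across offices and time zones.
        raise ValueError("use a timezone-aware timestamp")
    return became_aware_at + NOTIFICATION_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Return the hours left before the deadline (negative if overdue)."""
    remaining = notification_deadline(became_aware_at) - now
    return remaining.total_seconds() / 3600


aware = datetime(2018, 5, 25, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())
```

A negative value from `hours_remaining` means the window has already passed, which an alerting system could treat as an escalation trigger.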
Once the DPO identifies the data, the next step should be labeling/tagging the files according to the sensitivity level defined by the organization.
As part of meeting regulatory compliance, the organization's files need to be accurately tagged so that they can be tracked on premises and even when shared outside the organization.
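As a rough illustration of tagging by sensitivity level, the sketch below maps the PII types found in each file to a label. Everything in it is an assumption made for the example: the four levels, the `POLICY` table and the names `classify` and `tag_files` are hypothetical, and a real organization would substitute its own policy.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity levels; higher means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical organizational policy: PII type -> minimum sensitivity level.
POLICY = {
    "email": Sensitivity.INTERNAL,
    "telephone": Sensitivity.INTERNAL,
    "date_of_birth": Sensitivity.CONFIDENTIAL,
    "passport_number": Sensitivity.RESTRICTED,
    "credit_card": Sensitivity.RESTRICTED,
}


def classify(pii_types: set[str]) -> Sensitivity:
    """Return the highest level required by any PII type found in a file."""
    # Unknown PII types default to CONFIDENTIAL, a deliberately cautious choice.
    levels = (POLICY.get(t, Sensitivity.CONFIDENTIAL) for t in pii_types)
    return max(levels, default=Sensitivity.PUBLIC)


def tag_files(findings: dict[str, set[str]]) -> dict[str, str]:
    """Map each file path to the name of its sensitivity tag."""
    return {path: classify(types).name for path, types in findings.items()}
```

Taking the maximum level means that a single highly sensitive item, such as a credit card number, determines the tag for the whole file.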
Phase 3 – Knowledge
Once the data is tagged, you can map personal information across networks and systems, both structured and unstructured, and track it easily. This allows organizations to protect their sensitive data and lets their end users use and share files safely, thereby enhancing data loss prevention.
Another aspect that needs to be considered is protecting sensitive information from insider threats: employees who try to steal sensitive data such as credit card numbers, contact lists, etc., or manipulate the data to gain some advantage. These types of actions are hard to detect in time without automated tracking.
These time-consuming tasks apply to most organizations, prompting them to search for efficient ways to gain insights from their enterprise data on which they can base their decisions.
The ability to analyze inherent data patterns helps organizations get a better view of their enterprise data and pinpoint specific threats.
Integrating an encryption technology enables the controller to track and monitor data effectively. By implementing an internal physical segregation system, the controller can create data geo-fencing based on personal data segregation definitions across geographies and domains, and report a sharing violation as soon as a rule is broken. Using this combination of technologies, the controller can allow employees to send messages securely across the organization, between the right departments and outside the organization, without being overly restricted.
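The geo-fencing rule check can be pictured with a small sketch. It assumes files already carry a sensitivity tag and that the organization has defined which regions each tag may be shared with. The `ShareEvent` type, the `ALLOWED_REGIONS` table and `find_violations` are hypothetical names, and the region codes are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareEvent:
    """A single file-sharing action (hypothetical record layout)."""
    file_path: str
    recipient_region: str


# Hypothetical segregation rules: tag -> regions the file may be shared with.
# Tags that are absent from this table are treated as unrestricted.
ALLOWED_REGIONS = {
    "RESTRICTED": {"EU"},
    "CONFIDENTIAL": {"EU", "UK"},
}


def find_violations(events: list[ShareEvent],
                    file_tags: dict[str, str]) -> list[ShareEvent]:
    """Return the share events that break a geo-fencing rule."""
    violations = []
    for event in events:
        allowed = ALLOWED_REGIONS.get(file_tags.get(event.file_path, ""))
        if allowed is not None and event.recipient_region not in allowed:
            violations.append(event)
    return violations
```

Each returned event is a candidate for the sharing-violation report, while shares that stay inside the permitted regions pass through without blocking the user.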
Phase 4 – Artificial Intelligence (AI)
After scanning, tagging and tracking the data, a higher value for the organization is the ability to automatically detect outlier behavior around sensitive data and trigger protection measures to prevent these events from evolving into a data breach incident. This advanced technology is known as “Artificial Intelligence” (AI). Here the AI function usually comprises a strong pattern recognition component and a learning mechanism that allow the machine to make these decisions, or at least to recommend a preferred course of action to the data protection officer. This intelligence is measured by its ability to get wiser with every scan, user input or change in the data cartography. Eventually, the AI function builds the organization's digital footprint, which becomes the essential layer between the raw data and the business flows around data protection, compliance and data management.
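To make "outlier behavior" concrete, here is a deliberately simple statistical stand-in rather than a learning system. It flags users whose access count to sensitive files is far above the group median, using a modified z-score based on the median absolute deviation. The function name `flag_outliers`, the sample data and the threshold of 3.5 (a common rule of thumb) are assumptions for illustration.

```python
from statistics import median


def flag_outliers(counts: dict[str, int], threshold: float = 3.5) -> list[str]:
    """Return users whose access count is unusually high for the group."""
    values = list(counts.values())
    if len(values) < 3:
        return []  # too little data to establish a baseline
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, so no score can be computed
    # Modified z-score; only the high side is flagged (unusually many accesses).
    return sorted(
        user for user, v in counts.items()
        if 0.6745 * (v - med) / mad > threshold
    )


daily_access = {"alice": 12, "bob": 15, "carol": 11, "dave": 14, "eve": 400}
print(flag_outliers(daily_access))  # ['eve']
```

A production system would learn baselines per user and over time. This version only shows the kind of signal that could lead to an alert or a recommendation to the data protection officer.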