Data risk management in the age of explosive data growth

Data risk management is now more important than ever

Unstructured data represents approximately 80% of all enterprise data. It’s the fastest-growing form of business data, growing at a rate of 50% each year, and a fast-growing challenge for today’s enterprise.

It’s nearly impossible for organizations to gain a true understanding of what they have and where it lives. The reality is that the unstructured data within corporate files can potentially contain substantial compliance, privacy, and legal risks. From corporate secrets to personally identifiable information (PII) of customers and employees, risk resides deep within the files stored in many places throughout every enterprise.

Organizations need a new approach to data risk management – one that provides the ability to identify and protect unstructured data at scale. This guide provides a deeper understanding of data risk, how it is assessed, how organizations can continuously manage data risk, and what this new era of data management means for IT teams and the business.

Download this info as a whitepaper


Redefining data risk for today’s hyper-awareness of data protection

Data risk is traditionally defined as the exposure to loss of value or reputation caused by issues or limitations to an organization’s ability to acquire, store, transform, move, and use its data assets. With the rapid evolution of data protection regulations like the GDPR and CCPA, the data risk definition needs to evolve accordingly. A more current definition of data risk is the exposure to significant financial damage due to data loss, business downtime, regulation non-compliance, or loss of reputation due to poor data security and/or data privacy practices. Fundamentally, data risk is financial risk.

The current estimate for the amount of data created is 1,145 petabytes per day, with the number expected to grow to 463,000 petabytes (or 463 exabytes) per day by 2025. Enterprises are becoming increasingly overwhelmed by the amount of data being produced. Understanding this data is difficult not only because of the extreme volume, but also because of environment complexity. That data exists on employee hard drives, in enterprise productivity solutions, in cloud storage, in on-premises storage, or with a host of partners and third-party providers such as SaaS solutions. It is impossible for IT teams to properly identify, categorize, and take appropriate actions to safeguard data at this scale without automation.

Why data is getting harder to manage

What is the financial risk of data risk?

Not having a comprehensive understanding of enterprise data brings a host of negative outcomes that comprise data risk. If you don’t have full control over your data, then you aren’t fully compliant with data privacy regulations. This also means that critical data may not be secured properly, which brings the added risk of damage to the business from data theft, corruption, or loss.

Here are the most common data risks caused by a lack of data management, along with their associated financial risks:

Data risk: Not knowing where personally identifiable information (PII) exists across systems
Financial risk: Costs money in the staff effort needed to find an individual’s data when it is requested under regulations such as the California Consumer Privacy Act (CCPA). Additionally, this increases the risk of having to pay fines for non-compliance. Even unintentional violations could cost organizations $2,500 per infraction.

Data risk: Having important known or unknown data not secured properly
Financial risk: If this data is stolen, corrupted, or encrypted, it can cause a host of financial damage, including business downtime, regulatory fines, loss of reputation and revenue, and the cost of repairing the breach.

Data risk: Not being able to provide a clear and comprehensive picture of data protection to auditors
Financial risk: The latest Data Protection Laws of the World handbook contains 923 pages of data security and privacy regulations active today, and that number is growing. Enterprises need to be able to confidently supply the proof auditors need to show compliance, or face significant fines. Under the GDPR alone, an estimated 1,000 companies have been fined in the past 2 years, with cumulative fines of $1.25 billion.

Data risk: Having too much dark, unused data
Financial risk: There is an inherent belief that data has value. This is why we stopped deleting emails and why organizations archive data instead of deleting it. The risk here is twofold. First, with the explosion of data production, storing this data costs companies money, either on-premises or in the cloud. Second, it adds to the risk of data breach and regulatory non-compliance. Just because you aren’t using the data doesn’t mean it can’t hurt your business if it is exposed, particularly if it contains PII.

Data risk: Data used by employees is not properly permissioned or secured
Financial risk: The biggest source of data risk is not malicious actors; it is your employees. IT sets permissioning rules, but employees find workarounds in order to get their jobs done. They don’t mean to cause harm by sharing a payroll list with a third party, but the end result is a privacy violation that can cause reputation damage and regulatory fines.

What organizations need from data risk management

Traditional enterprise data management solutions have been around for decades and most businesses have solutions that help them construct and maintain a framework for ingesting, storing, mining, and archiving the data integral to their business. Unfortunately, there are two key components missing from these legacy solutions—unstructured data and risk. Companies need better ways to identify unstructured data and get a holistic picture of data risk that allows them to take action to mitigate that risk. A new type of solution is required for data risk management that encompasses the needs of today’s complex data environments.

Key capabilities that enterprises need to manage data risk

1. The ability to see into their unstructured data – in any system

Organizations have a lot of data and not all of it is known, let alone visible for risk analysis. Dark data typically refers to all of the unused and/or unknown data that is generated by an enterprise. This accounts for everything from server log files to unstructured data created by apps. Unstructured data is defined as information that is not arranged according to a pre-set data model. Examples of unstructured data include texts, email, video, photos, webpages, audio files, and many business documents. Basically, 80 to 90 percent of the data businesses generate and collect is unstructured.

Unstructured data is difficult for IT professionals to find. It is also difficult to identify what the data is and whether it contains PII, let alone to assess the risk of keeping that data where it is currently located. If enterprises are going to stay compliant with data privacy regulations, they need better visibility and risk assessment of their data, including dark and unstructured data.

The first aspect of doing this is having the ability to discover unstructured data in any storage system, whether in the cloud, on-premises, or on an employee desktop. It is critical that the data risk management solution is fully integrated across your environment for this purpose. This helps eliminate disparate data management tools and gives you one source of truth.

The second aspect of gaining data visibility is artificial intelligence (AI) that brings data science to risk management. The AI can look into dark or unstructured data, accurately identify what it is, and classify its risk. The key feature to look for here is an AI technology that is pre-trained, meaning that it already knows how to identify standard data and document types like invoices or resumes. This makes the solution far more powerful and provides a faster time to value than a solution that has to be trained in your environment before you can use it.

2. Help to stay current with data protection regulations

Data privacy regulations are being introduced around the world at a rapid pace, while decisions in lawsuits like Schrems II are resulting in new interpretations of existing regulations. There are industry-specific regulations as well as country-specific regulations that may be applicable depending upon where you do business. Staying current with these hundreds of regulations is very difficult, and keeping data systems compliant is even harder without the help of technology.

Enterprises need an unstructured data management solution that stays current with all of these regulations and identifies data risk against them. Furthermore, it is important for prioritization that the solution also calculates financial risk. If companies have two different areas of data risk, they need to know which holds the greatest potential cost. This also means that instead of using a data management tool for occasional auditing, they need a solution that continuously monitors for changes or newly created data and alerts on risk in real time.

3. Better controls to prevent human error and automate protection

Data risk management walks a fine line between data protection and business productivity. If you lock a system down too tightly with limited permissions, then employees can’t get their jobs done. Too loose and you’re non-compliant. Another challenge is enforcing guidelines if users have complete control over data access. IT teams need a data management solution that helps them to better control data access in an automated way, so IT can focus on keeping employees productive without worrying about regulatory compliance.

Beyond automating data access controls, IT also wants a solution that can manage data transfer to appropriate storage locations based on classification of the data. For instance, critical data with lots of IP or personal information needs to be transferred to a highly secure cloud object storage bucket. Having this action automated once the data risk is assessed is highly valuable to IT teams and to mitigating financial risk in real-time.

4. Reduce data storage costs

As enterprise content volume grows, so does the cost of storage, both on-premises and in the cloud. This is becoming a concerning line item for IT teams, and any capability that allows them to reduce data risk while cutting costs is a big win. An unstructured data management solution and its AI capabilities can identify and classify data that is truly not useful to the organization. IT can then confidently treat that data as cold and archive it, or delete it altogether. Both reduce risk and storage costs.
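As a rough illustration of the cold-data idea above, the following Python sketch flags files that have not been accessed for a configurable period and estimates the storage spend they represent. It swaps in a simple age threshold for the AI-based classification described in this guide. The one-year default, the tuple layout, and the price per gigabyte are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def find_cold_files(files, now, max_idle_days=365):
    """Return the files whose last access is older than max_idle_days.

    `files` is an iterable of (path, size_bytes, last_accessed) tuples.
    The age threshold is an assumed stand-in for real classification.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f[2] < cutoff]


def monthly_savings(cold_files, price_per_gb_month):
    """Estimate monthly storage spend freed by removing the cold files."""
    total_gb = sum(size for _path, size, _ts in cold_files) / 1e9
    return total_gb * price_per_gb_month


# Example inventory with made-up paths, sizes, and dates.
inventory = [
    ("/share/old_export.csv", 2_000_000_000, datetime(2022, 1, 1)),
    ("/share/q4_plan.docx", 1_000_000_000, datetime(2023, 12, 1)),
]
cold = find_cold_files(inventory, now=datetime(2024, 1, 1))
print([path for path, _size, _ts in cold])
print(monthly_savings(cold, price_per_gb_month=0.02))  # hypothetical price
```

In practice the decision to archive or delete would also depend on the data's classification and any retention obligations, not on age alone.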

5. Handle reorganization events

In large enterprises, reorganizations are common and require data risk management, but nothing is more disruptive than a merger or acquisition. Enterprises want the ability to use their unstructured data management solution to evaluate the data from the organization being merged into theirs. They also want the ability to automate the data migration, including retagging, labeling, and categorization of the data to fit into the data model in use. This is critical when acquiring a company whose data is less secure or less compliant, to ensure you don’t take on additional risk. It is even more critical for a carve-out, where you have to be able to identify the enterprise content you need in order to separate it from the rest of the business. Having this capability makes the data management solution truly holistic, meeting an enterprise’s need to manage its ever-changing data.


Managing data risk in today’s enterprise involves using an intelligent data management solution to gain a better understanding of the data the enterprise has and help automate action to properly eliminate or secure that data in accordance with data privacy and security regulations.

The 3 key functions of data risk management explained

1. Unstructured data discovery

In order to better understand an organization’s data, enterprise data management solutions will automatically discover unstructured data properties such as:

  • File type (like pdf or jpg)
  • File size
  • File location (storage system and file hierarchy)
  • Permissions

In order to do this discovery, the solution must be integrated across all enterprise content platforms and storage systems. This provides an overview of what kind of data you have, where it lives and who has access.
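The discovery step described above can be sketched in a few lines of Python for a single local or mounted file system. This is a minimal illustration, not how any particular product works: the `FileRecord` type and `discover` function are hypothetical names, and a real solution would also need connectors for cloud and SaaS storage platforms.

```python
import os
import stat
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileRecord:
    path: str          # file location (full path under the scanned root)
    file_type: str     # lowercased extension such as "pdf" or "jpg"
    size_bytes: int    # file size in bytes
    permissions: str   # mode string such as "-rw-r--r--"


def discover(root: str) -> list[FileRecord]:
    """Walk a directory tree and record basic properties of every file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            try:
                info = full.stat()
            except OSError:
                continue  # skip unreadable files and broken links
            records.append(
                FileRecord(
                    path=str(full),
                    file_type=full.suffix.lower().lstrip("."),
                    size_bytes=info.st_size,
                    permissions=stat.filemode(info.st_mode),
                )
            )
    return records
```

Calling `discover("/some/share")` returns one record per file, which can then feed the classification step. The mode string only reflects file system permissions; access granted through sharing links or platform-level roles would need to come from each platform's own API.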

Next, the solution will automatically classify your content by looking into your data to determine what it is and whether it contains sensitive information. Pre-trained AI is used to review and compare your unstructured data to known data types using advanced pattern matching. It can identify what each file is and search the content for sensitive information that needs to be flagged with a qualitative risk level (high, medium, low) for security purposes. It can then apply metadata, document classification, or other identifying tags or labels to unstructured data.

Intelligent data risk management includes the classification of:

  • Document type (resume, W-2, invoice, etc.)
  • PII including names, ages, addresses, dates of birth, phone numbers, social security numbers, banking information, etc.
  • More than 5,000 standard government forms
  • Foreign language detection
  • Data attributes unique to a specific organization 
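To make the idea of flagging sensitive content with a qualitative risk level concrete, here is a deliberately simple Python sketch. It uses a handful of regular expressions rather than the pre-trained AI described above, so it is only a stand-in for illustration. The patterns, the `HIGH_RISK_KINDS` weighting, and the `classify` function are all assumptions, and they would miss many real-world formats.

```python
import re

# Hypothetical patterns for illustration; real detection needs far more care.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Assumed weighting: PII kinds that count as high risk on their own.
HIGH_RISK_KINDS = {"ssn"}


def classify(text: str) -> dict:
    """Return the PII kinds found in text and a qualitative risk level."""
    found = sorted(kind for kind, pat in PII_PATTERNS.items() if pat.search(text))
    if any(kind in HIGH_RISK_KINDS for kind in found):
        risk = "high"
    elif found:
        risk = "medium"
    else:
        risk = "low"
    return {"pii": found, "risk": risk}


print(classify("Employee SSN: 123-45-6789"))
print(classify("Contact jane@example.com or 555-123-4567"))
print(classify("Quarterly roadmap draft"))
```

The returned dictionary is the kind of result that could be written back to a file as metadata or a label.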

2. Enterprise file migration

Data migration can be a part of regular data hygiene, as you automate the transfer of unstructured data to the appropriate storage location based on its classification, or part of a larger project to move a massive volume of data from one system to another. Enterprise data management solutions have both of these capabilities and can automate continuous content migration or a large-scale migration.

Enterprise data management helps you make intelligent migration decisions by helping you answer questions like these:

  • How much content are we managing? Where should we be migrating it?
  • What types of documents do I have? How sensitive are they? Where are they?
  • Who has access to my content? Is my content vulnerable?
  • If end users have been manually labeling content, are they doing it properly?
  • How can we stay on top of the rapid growth in unstructured data we’re facing?

3. Unstructured data governance

Advanced data management solutions can look at the data classification and actually calculate financial risk. This involves some configuration to agree on the variables and assumptions in the calculation, but essentially it assigns a value to each content type and matches that up with the financial liability of that data being lost or exposed. It will calculate only the liability for regulations that your business needs to adhere to based on industry and regions of operation. Organizations can use this information to balance their financial risk with the cost to control, manage, and mitigate that risk.
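The calculation described above can be sketched as a small Python function. This is one possible shape under stated assumptions, not a vendor's actual formula: it treats liability as a flat amount per exposed record, and the `Regulation` type, the content-type names, and the second regulation are hypothetical. The $2,500 figure reuses the per-infraction CCPA amount cited earlier in this guide.

```python
from dataclasses import dataclass


@dataclass
class Regulation:
    name: str
    penalty_per_record: float  # assumed liability per exposed record, in dollars
    content_types: frozenset   # content types the regulation covers


def financial_risk(record_counts: dict, regulations: list, applicable: set) -> float:
    """Sum per-record liability over the regulations the business must follow."""
    total = 0.0
    for reg in regulations:
        if reg.name not in applicable:
            continue  # skip regulations that do not apply to this business
        for content_type, count in record_counts.items():
            if content_type in reg.content_types:
                total += count * reg.penalty_per_record
    return total


regulations = [
    Regulation("CCPA", 2500.0, frozenset({"payroll", "resume"})),
    Regulation("EXAMPLE-REG", 100.0, frozenset({"invoice"})),  # made-up regulation
]
counts = {"payroll": 40, "invoice": 10}  # made-up record counts per content type
print(financial_risk(counts, regulations, applicable={"CCPA"}))
```

Comparing this figure across storage locations or business units gives a way to rank where mitigation effort should go first.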

Once data risk has been assessed, actions can be taken in order to safeguard unstructured data and mitigate risk. Data risk management can be automated by taking these actions to modify or transfer data appropriately, through configured workflows based on risk level and other data properties.

Actions that can be triggered by data risk assessment:

  • Flag or re-classify the data
  • Transfer to a new storage location
  • Quarantine the data for a set period of time
  • Change permissions/provisioning
  • Archive or delete data

Actions can be automated at a file level, but the information on data risk can also be used to make larger storage and security decisions to improve data security and protect data privacy.
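A configured workflow of this kind can be pictured as an ordered list of rules, each pairing a condition with the actions it triggers. The Python sketch below is an assumption about how such rules might be expressed. The `Assessment` fields, the action names, and the thresholds are hypothetical, and a real workflow engine would execute the actions rather than just return their names.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    path: str
    risk: str               # "high", "medium", or "low"
    contains_pii: bool
    days_since_access: int


# Ordered rules: the first matching condition decides the actions.
RULES = [
    (lambda a: a.risk == "high", ["quarantine", "change_permissions"]),
    (lambda a: a.risk == "medium" and a.contains_pii, ["transfer"]),
    (lambda a: a.days_since_access > 365, ["archive"]),
]


def choose_actions(assessment: Assessment) -> list[str]:
    """Return the actions for the first rule that matches, or none."""
    for matches, actions in RULES:
        if matches(assessment):
            return list(actions)
    return []


print(choose_actions(Assessment("/hr/payroll.xlsx", "high", True, 3)))
print(choose_actions(Assessment("/tmp/notes.txt", "low", False, 900)))
```

Keeping the rules in a plain list makes the order of precedence explicit, which matters when a file satisfies more than one condition.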

Perhaps the most interesting innovation in data governance is leveraging artificial intelligence processing to continuously monitor enterprise content for sensitive data or incorrectly applied classification labels that may expose the business to financial risk from data loss or non-compliance. The results of the monitoring from enterprise data management solutions are typically presented in a risk assessment dashboard or reports that reveal any sensitive or vulnerable data found, its financial risk, and any automated actions taken to mitigate that risk.

This capability provides continuous oversight to:

  • Ensure data is classified properly for both protection and discoverability
  • Ensure data is labeled properly so that DLP policies can be applied
  • Ensure that permissions are set correctly so that sensitive data is not exposed to unauthorized use
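The oversight checks listed above can be illustrated with a short Python sketch that compares the label a user applied with the label a classifier detected, and flags sensitive files that are broadly shared. The dictionary keys, the "confidential" label, and the "everyone" group name are hypothetical choices for this example.

```python
def audit(files):
    """Return (path, issue) pairs for label and permission problems.

    Each file is a dict with hypothetical keys: "path", "applied_label",
    "detected_label", and "shared_with" (a set of group names).
    """
    findings = []
    for f in files:
        if f["applied_label"] != f["detected_label"]:
            findings.append((f["path"], "label_mismatch"))
        if f["detected_label"] == "confidential" and "everyone" in f["shared_with"]:
            findings.append((f["path"], "overexposed"))
    return findings


sample = [
    {"path": "/hr/salaries.xlsx", "applied_label": "public",
     "detected_label": "confidential", "shared_with": {"everyone"}},
    {"path": "/mkt/brochure.pdf", "applied_label": "public",
     "detected_label": "public", "shared_with": {"everyone"}},
]
for path, issue in audit(sample):
    print(path, issue)
```

Run on a schedule or on file-change events, a check like this is the sort of input a risk dashboard or report could be built from.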

Data risk management also needs to stay current with data privacy regulations to be able to show compliance and the financial risk associated with each regulation. This is incredibly powerful for regulatory audits, because it gives auditors exactly the information they need for the specific regulation. Advanced data management solutions will stay current with data privacy regulations so IT has the confidence that they understand and are mitigating actual risks.


How the role of IT changes with automated data risk management

Today’s IT teams often take a project approach to data risk management. They utilize traditional data management tools to audit the data they can see and manually create reports for internal audits or to supply to regulatory auditors. They have a point person who contacts all the different data owners across business units. Each data owner gathers their known information and sends it to the point person who does their best to make sense of it and put it in a unified format. And when an official compliance audit happens, this project essentially becomes a fire drill.

Not only does this process waste the time of many valuable resources, but it also isn’t fully accurate—both because it isn’t comprehensive in scope and because the information isn’t real-time.

With enterprise data management that focuses on risk mitigation, IT and data owners can change the job from gathering information and creating reports to instead focus on maintaining their system and continually lowering risk. The unstructured data management solution can automatically provide holistic and granular reporting that satisfies auditors with proof of regulatory compliance and improvement.

With data risk management, IT spends their time:

  • Reviewing automated reporting
  • Easily providing real-time reports to auditors
  • Responding to data risk increases
  • Making improvements to lower data risk

The benefits of continuous unstructured data risk management

Having continuous unstructured data risk management as an operational part of IT brings a host of powerful benefits. The most basic is that visibility into and understanding of an enterprise’s data allows IT to do their jobs better. Whether that means protecting the important IP of their company or the privacy of their customers, they can do it with confidence.

Maintain data compliance: This includes being able to quickly adapt to new regulations or court interpretations as well as to easily provide information needed for audits.

Lower risk of data loss: Being able to identify private data and secure it appropriately mitigates the threat of data loss. It also brings investor confidence in their cyber resiliency.

Lower risk of regulatory fees: With evidence of efforts to maintain and improve compliance with data regulations, should an issue occur, financial penalties are much lower.

Reduce operational costs: Better managing unstructured data saves money in storage costs and by freeing up the time of data owners from running manual reports.

Happier IT employees: Both at senior and doer levels, IT can go back to what they do best: managing data operations and optimizing to continue to improve.
