Get a free assessment and gain valuable insights into your unstructured data to increase efficiency, minimize risk, and cut costs across your organization.
Every business has at least one thing in common: a lot of data. Whatever your industry, you handle information daily and accumulate records and files of many different types. One of the trickier categories to deal with is unstructured data. Unlike structured data, which is formatted and stored in tools like databases, unstructured data usually lives in individual files scattered across a variety of locations. As a result, it is difficult to search, organize, and analyze. Examples of unstructured data include:
- Live chat records
- Customer-generated content
- E-books and whitepapers
- Email correspondence
- Internal communications
- Media and multimedia (images, audio, video)
- Medical records
- Mobile data
- Social media content
- Text files
- Web server logs
- Website content
Unstructured data encompasses many important types of information that are often valuable and prone to security risks. Sensitive data discovery, the process of searching for and identifying information that should be restricted from unauthorized access, is vital. Without it, your business could run afoul of government regulations and compromise the privacy and security of your stakeholders. And once sensitive data is identified, how do you sort and organize unstructured data so that you continue to know what is sensitive and what is not?
Data classification levels are the answer to this question. Using these practices, you can establish an appropriate system and mitigate much of the risk of storing unstructured data.
Data classification is a way of defining and categorizing unstructured data for a business. Rather than storing files haphazardly in an unsecured way, data classification gives you the framework to sort, organize, and store unstructured data into an appropriate system.
Imagine that you have just moved from one apartment to another. When packing, you didn’t label or organize any of the boxes. Once you’ve arrived at your new apartment, it’s time to unpack. But how do you know which boxes go in which room? And how can you hand the right boxes to the right person? After all, setting up your kitchen to work for you is important. If everyone is pitching in to help unpack, your brother might end up with a box full of cooking utensils that he won’t know how to arrange to suit your workflow.
There are two problems that surface when you start looking at data classification:
- How should you “pack” or store data? What data should be organized together?
- Who should be allowed to access each “box” or be able to use which pieces of data?
Data classification can help answer these questions, giving greater access to and control over unstructured data.
What Are Data Classification Standards?
Data classification standards are the definitions of the data categories that a business will use. There are different sets of standards that can be established depending on the business and the type of data that needs to be classified. One of the most common approaches is classification based on data sensitivity—essentially, who should or should not be allowed to access the data.
Classification standards based on sensitivity most commonly use either a three-level or a four-level model, depending on organizational complexity, regulatory requirements, and risk tolerance. The difference is not about right or wrong but about granularity: three-level models prioritize simplicity, while four-level models provide greater precision and control, particularly for organizations managing regulated or high-risk data. Understanding both approaches helps organizations select a framework that aligns with their operational and compliance needs.
What Are the 3 Data Classification Levels?
Organizations classify data to understand risk, control access, and apply the right level of protection. While there’s no single universal standard, most data classification frameworks fall into one of two models: a simplified three-level model or a more granular four-level enterprise model. The three-level approach focuses on basic sensitivity and is often easier to adopt, while the four-level model adds clearer distinctions for organizations with stricter compliance, governance, or security requirements. Below, we’ll break down both models and when each one makes the most sense.
The 3 Data Classification Levels (Simple Sensitivity Model)
When data is classified with three levels, it is typically organized into:
Low sensitivity data:
This is information that is meant for anyone to access and use. For example, your business’ social media pages are filled with low sensitivity data. The public is welcome to engage with and consume this level of information, as there is nothing being shared that poses a security threat.
Medium sensitivity data:
Data that falls into this category should be considered for internal use only. While it doesn’t include highly confidential information, it still encompasses data that could be potentially harmful if accessed by unauthorized individuals. Inter-office emails or internal memos often fall into this category.
High sensitivity data:
This classification level encompasses business-critical data and customer-specific details. If this data were compromised, it could have serious business impacts. High sensitivity data may include:
- Personally Identifiable Information (PII): Any data that identifies an individual, such as name, address, Social Security number, or phone number.
- Protected Health Information (PHI): Identifiable health information, such as medical records or billing information.
- Nonpublic Personal Information (NPI): Personally identifiable financial information, such as bank account numbers.
- Material Non-Public Information (MNPI): Company data not released to the public, such as upcoming mergers or acquisitions.
- Confidential, regulated, or high-risk business information: Proprietary data such as patents or intellectual property, which may overlap with other categories.
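To make the three levels concrete, here is a minimal Python sketch of how a piece of text might be assigned a sensitivity level. The patterns, the `Sensitivity` enum, and the `classify_text` function are all hypothetical names chosen for illustration. Real sensitive data discovery covers far more data types and uses far more robust detection than two or three regular expressions.

```python
import re
from enum import IntEnum


class Sensitivity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Illustrative patterns only (assumptions, not a complete PII/NPI detector).
HIGH_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                            # SSN-like number (PII)
    re.compile(r"\baccount (?:number|no\.?)\s*:?\s*\d{6,}", re.I),   # bank account label (NPI)
]
MEDIUM_PATTERNS = [
    re.compile(r"\binternal (?:use only|memo)\b", re.I),             # internal-only marker
]


def classify_text(text: str) -> Sensitivity:
    """Return the highest sensitivity level whose patterns match the text."""
    if any(p.search(text) for p in HIGH_PATTERNS):
        return Sensitivity.HIGH
    if any(p.search(text) for p in MEDIUM_PATTERNS):
        return Sensitivity.MEDIUM
    # Nothing matched, so treat the text as low sensitivity.
    return Sensitivity.LOW


print(classify_text("Employee SSN: 123-45-6789").name)   # HIGH
print(classify_text("Internal memo: office hours").name)  # MEDIUM
print(classify_text("Follow us for product news!").name)  # LOW
```

Checking the high sensitivity patterns first means a file that contains both an internal-only marker and an SSN-like number is labeled with the stricter level.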
What Are the 4 Data Classification Levels? (Expanded Enterprise Model)
Another way to establish data classification standards is to use four levels of classification, a structure commonly adopted by organizations that require clearer separation between internal, confidential, and highly restricted data.
Public data:
Information that can be viewed, accessed, and used by anyone. This aligns closely with low sensitivity data.
Internal data:
Information used internally within an organization. This generally correlates with medium sensitivity data.
Confidential data:
Data intended for internal use and often limited to specific teams, individuals, or departments due to its sensitive nature. This category may include medium or high sensitivity data.
Restricted data:
The highest level of sensitivity. Access is tightly controlled and limited only to individuals who require the data to perform their job functions.
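The four levels can also be read as access rules. The sketch below is one possible interpretation, not a standard implementation: the `User` and `Document` classes, the `owning_team` field, and the per-document `named_grants` set are hypothetical, and the rule that restricted data requires an explicit grant is an assumption based on the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class User:
    name: str
    is_employee: bool
    teams: frozenset = frozenset()          # teams the user belongs to
    named_grants: frozenset = frozenset()   # document IDs granted individually


@dataclass(frozen=True)
class Document:
    doc_id: str
    level: Level
    owning_team: str = ""


def can_access(user: User, doc: Document) -> bool:
    """Apply one possible reading of the four-level model."""
    if doc.level == Level.PUBLIC:
        return True                      # anyone may view public data
    if doc.level == Level.INTERNAL:
        return user.is_employee          # internal use only
    if doc.level == Level.CONFIDENTIAL:
        # limited to the specific team that owns the data
        return user.is_employee and doc.owning_team in user.teams
    # RESTRICTED: only individuals explicitly granted access
    return doc.doc_id in user.named_grants


analyst = User("ana", is_employee=True, teams=frozenset({"finance"}))
payroll = Document("payroll-2024", Level.CONFIDENTIAL, owning_team="finance")
merger = Document("merger-plan", Level.RESTRICTED, owning_team="finance")

print(can_access(analyst, payroll))  # True
print(can_access(analyst, merger))   # False
```

In this sketch, team membership alone never unlocks restricted data, which mirrors the idea that access is limited to individuals who need the data to do their jobs.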
3-Level vs. 4-Level Data Classification Models
| Feature | 3-Level Classification Model | 4-Level Classification Model |
|---|---|---|
| Overall approach | Simplified sensitivity-based structure | More granular, enterprise-oriented structure |
| Primary goal | Ease of use and quick adoption | Greater control and risk differentiation |
| Typical levels | Low, Medium, High | Public, Internal, Confidential, Restricted |
| Best suited for | Smaller organizations or high-level frameworks | Mid-to-large organizations and regulated environments |
| Regulatory alignment | Common in government and legacy models | Common in GDPR and modern compliance programs |
| Access control precision | Broad access groupings | Clear separation of access and permissions |
| Risk management | Basic risk segmentation | Stronger protection for high-risk data |
| Scalability | Limited as data volume grows | Scales well with data growth and complexity |
What Is a Data Classification Framework?
A data classification framework is another way to describe a specific approach to setting data classification levels. This is often used interchangeably with data classification standards. Different frameworks may be adopted based on industry requirements or regulatory bodies, including those established by government entities.
There are many real-world examples of data classification practices in action, particularly within regulated environments.
GDPR data classification levels
The European Union’s General Data Protection Regulation (GDPR) took effect in 2018, influencing privacy and data protection practices globally. The regulation does not prescribe classification levels itself, but GDPR-aligned data classification commonly uses a four-level model: public, internal, confidential, and restricted. GDPR also requires organizations to delete personal data that is no longer necessary for the purpose it was collected for, making it essential to understand what types of unstructured data exist within the organization.
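A first step toward finding data that may no longer be in use is simply listing files that have not changed in a long time. The Python sketch below does that with the standard library. The `find_stale_files` function and the 365-day threshold are assumptions for illustration. Last-modified time is only a rough proxy for whether data is still needed, and GDPR does not define a fixed retention period, so results like these are candidates for review rather than a deletion list.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


# Example: list files in the current directory tree untouched for over a year.
for path in find_stale_files(".", max_age_days=365):
    print(path)
```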
US government data classification levels
The U.S. government provides a well-known example of a three-level data classification model:
- Confidential: Unauthorized disclosure could cause damage to national security.
- Secret: Unauthorized disclosure could cause serious damage to national security.
- Top Secret: Unauthorized disclosure could cause exceptionally grave damage to national security.
These levels align conceptually with the three-level classification approach, though the overall sensitivity threshold is significantly higher.
HIPAA data classification levels
The Health Insurance Portability and Accountability Act (HIPAA) is legislation designed to protect individuals’ health information. Unlike other frameworks, HIPAA does not mandate specific classification levels. Instead, it requires organizations to group data according to sensitivity and determine appropriate controls based on risk.
The primary reason to establish a data classification policy is to protect data security and privacy. Without clear classification, organizations may struggle to understand where sensitive data lives or how it should be protected. These gaps can lead to risks such as:
- Data silos
- Unauthorized access or mishandling of sensitive data
- Regulatory fines and penalties
- Legal exposure due to data breaches
A strong data classification policy starts by answering a few foundational questions:
What types of data are you collecting, processing, and storing?
Every industry has specific regulations governing data usage. Understanding these requirements is critical to building an effective policy.
What data do you already have, and who will conduct the data discovery process?
Before implementing classification standards, organizations need a clear picture of their existing data landscape.
Who should be able to access specific information?
Use sensitivity and access categories—public, internal, confidential, and restricted—to guide access controls and policy decisions.
With these questions answered, organizations can develop consistent rules for reliable data classification. Many data classification tools are available to help automate and enforce these policies.
DryvIQ empowers companies with the tools they need to classify, migrate, and manage unstructured data. Our AI-driven platform helps uncover hidden risks by analyzing, classifying, labeling, and cataloging unstructured data with accuracy, speed, and scale.
See a demo to get insights into data discovery and classification, sensitivity findings with vulnerability and risk by category, and recommended next steps. Get started with DryvIQ to discover and secure your sensitive unstructured data.