Improving the security posture of your Google Workspace environment - Data Classification

Editor’s Note: This is the third edition of Marcin’s series on securing your Workspace environment. Be sure to check out his articles discussing DNS records and BeyondCorp Enterprise.


In today’s world, enterprises generate and manage huge amounts of data every day. We share documents with coworkers and external parties, and the data they contain ranges from need-to-know information to critical confidential files. In some cases data can get out of control and create serious challenges. One effective strategy for managing collaboration data is data classification.

In this article we will explore the features we can use to classify data within Google Workspace environments, the functionality that helps us classify data at scale, and an example use case. Before we do that, it’s important to understand why you might want to classify data in the first place.

Why should we classify the data?

Data classification is an important part of managing and protecting information in any modern organization. Companies, however, may have different reasons for classifying their data. Let’s take a look at some of them.

Security and Compliance
When data is classified and categorized (e.g. by sensitivity level), it is much easier to control. Companies can implement security measures appropriate to each category; for example, sensitive data such as customer credit card numbers can be given stronger security controls than less critical data. Categorization also aids compliance with regulations such as GDPR and HIPAA, which mandate strict handling and protection of sensitive information.

Data Management & operational efficiency
Classified data is easier to manage: end users can search for specific information by category or sensitivity, while a litigation team can apply retention rules based on certain classification fields.

Classification also helps companies make better decisions by revealing patterns: auditing capabilities show which data is used and how.

Risk Mitigation
Data classification plays a crucial role in risk management. By identifying which data is sensitive, companies can apply different security policies and measures to protect against data breaches or oversharing.

Classification can also be the result of a data loss prevention (DLP) action: when PII is found, a rule can classify the file and prevent users from certain activities, such as sharing the file outside the company.

Data Governance
Effective data governance relies on a structured approach to data management. Data classification is a foundational element of that approach: it ensures that data-handling policies and procedures are applied consistently across the organization.

Google Workspace features for Data Classification

In this section we will look at the functionality that can be used for data classification within Google Workspace environments, explore different use cases, and cover the features that help admins classify data at scale.

Google Drive Labels

The main Google Workspace feature that enables users to classify files is Drive Labels.

As administrators, we define classification labels to apply to files stored in Drive. The primary purpose of labels is to store file metadata. Labels can be simple, such as a single-value tag that stores department information, or they can have many structured fields, including selections, dates, numbers or categories, depending on your company’s needs.

Drive Labels has different use cases, including:

  • Data classification. To follow an information governance strategy, we can use a sensitivity label and restrict access to files marked as “Confidential” by using labels as conditions in DLP rules.
  • Policy enforcement. To meet compliance requirements such as the handling of PII, we can label documents automatically or apply retention policies to labeled files.
  • Improved search. End users can find files more easily by searching on label fields.
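The search use case can also be exercised programmatically. The sketch below builds the `q` filter string for a Drive API `files.list` call, assuming the label query syntax described in the Drive API search documentation (`'labels/LABEL_ID' in labels` and `labels/LABEL_ID.FIELD_ID = 'VALUE'`). The helper name and all IDs are hypothetical placeholders; verify the syntax against the current documentation before relying on it.

```python
from typing import Optional


def label_query(label_id: str,
                field_id: Optional[str] = None,
                value: Optional[str] = None) -> str:
    """Build a Drive API `q` filter that matches files by label.

    With only `label_id`, match every file that carries the label.
    With `field_id` and `value`, match files whose field equals the value
    (for a selection field, the value is assumed to be the choice ID).
    """
    if field_id is None:
        return f"'labels/{label_id}' in labels"
    if value is None:
        raise ValueError("a value is required when field_id is given")
    return f"labels/{label_id}.{field_id} = '{value}'"


# Hypothetical IDs, for illustration only.
print(label_query("LABEL_ID"))
print(label_query("LABEL_ID", "FIELD_ID", "CHOICE_ID"))
```

The resulting string would be passed as the `q` parameter of a file listing request; the real label, field and choice IDs come from the label definition.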

Different types of classification

Companies classify data in different ways, and Workspace offers flexibility in how classification is applied. Depending on the requirements, we can use one method or a combination of several.

Manual Classification
Users who have been given access to labels can classify files manually, by applying either a badged label or a metadata label. Labels help them find files of a specific category more easily and quickly. End users might be required to pick an option for every new document; in such cases they see a notification banner.

DLP Classification
Data loss prevention rules can automatically label Drive files based on their findings (e.g. PII detected). Workspace DLP offers a variety of predefined content detectors as well as the option to use custom detectors (e.g. regex-based).
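To make the idea of a regex-based detector concrete, here is a minimal, standalone sketch of content detection driving a label suggestion. It is not how Workspace DLP is implemented: the pattern, the Luhn checksum filter, and the `suggest_label` helper are all assumptions made for illustration, and ‘Confidential’ stands in for whatever label option your rule would apply.

```python
import re
from typing import Optional

# Illustrative pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def suggest_label(text: str) -> Optional[str]:
    """Suggest 'Confidential' when a plausible card number is found, else None."""
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return "Confidential"
    return None


# 4111 1111 1111 1111 is a widely used test card number that passes the Luhn check.
print(suggest_label("Card on file: 4111 1111 1111 1111"))
print(suggest_label("Meeting notes for Tuesday"))
```

Pairing the regex with a checksum reduces false positives from arbitrary long digit sequences, which is the same trade-off to keep in mind when writing a custom detector.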

Default Classification
Admins can set a policy that automatically applies labels to files created in certain departments. In such a configuration every newly created file is classified; for example, files owned by finance teams get a sensitive label by default. These labels can later be adjusted by the end user if we allow it.

Programmatic Classification
Drive Labels offers an API that can be used to classify data at scale. Customers use it to apply classification labels in bulk or to integrate the feature with third-party solutions.
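As a rough sketch of what bulk labeling could look like, the snippet below builds the request body for the Drive API v3 `files.modifyLabels` method, which sets a selection field on a label for one file. The helper name and every ID are hypothetical placeholders, and the commented-out client call assumes the `google-api-python-client` library with suitable credentials; treat it as a starting point to check against the API reference rather than a finished integration.

```python
import json
from typing import Dict, List


def build_modify_labels_body(label_id: str, field_id: str, choice_id: str) -> Dict:
    """Build a files.modifyLabels body that sets one selection field on a label."""
    return {
        "labelModifications": [
            {
                "labelId": label_id,
                "fieldModifications": [
                    {"fieldId": field_id, "setSelectionValues": [choice_id]}
                ],
            }
        ]
    }


# Hypothetical IDs; real ones come from the published label definition.
file_ids: List[str] = ["FILE_ID_1", "FILE_ID_2"]
body = build_modify_labels_body("LABEL_ID", "FIELD_ID", "CHOICE_ID")

for file_id in file_ids:
    # Assumed call shape with an authorized google-api-python-client service:
    # service.files().modifyLabels(fileId=file_id, body=body).execute()
    print(file_id, json.dumps(body))
```

Building the body separately from the API call makes it easy to review or log exactly what will be applied before running a bulk change.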

AI Classification
Customers who use Gemini Enterprise and the AI Security add-on can benefit from AI classification. This feature uses artificial intelligence to automatically label sensitive content. The customer goes through an initial training period, during which the AI model is created and learns the organization’s criteria for which content to label. AI classification then classifies files at scale across all licensed users (both new and existing files).

Example use case

This example configuration prevents end users from sharing documents classified as ‘Sensitive’ or ‘Confidential’ outside the organization. Sensitivity labels can be applied manually, automatically, or as a result of DLP rule detection.

  1. Navigate to Apps > Google Workspace > Drive and Docs > Labels > Manage Labels.

  2. Select ‘Badged label’ and configure the options as required. In this example we let end users pick the right document sensitivity. When the label is prepared, publish it and adjust the permissions as needed in the right corner of the screen.

  3. When the label is ready, we can configure the data protection rule. Navigate to Rules and select Create Rule > Data protection. Name the rule and select the scope.

  4. Select Google Drive under Apps.

  5. In the condition fields, select the previously created Drive label and the field options that you want to restrict.

  6. Select the action to block external sharing. Define the alert severity and notifications. Save the rule.

  7. To test the blocking mechanism, navigate to Google Docs, create a test document and apply the previously created label.

  8. When you try to share the document with an external recipient, the DLP rule will prevent it.
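The decision this rule makes can be modelled in a few lines, which is handy for reasoning about test cases before running step 8. This is only a conceptual sketch of the rule’s logic, not Workspace’s implementation: the `ShareAttempt` type, the `should_block` helper, and the `example.com` domain are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Label options that the rule restricts (from the example use case).
RESTRICTED_OPTIONS = {"Sensitive", "Confidential"}

# Hypothetical primary domain of the organization.
ORG_DOMAIN = "example.com"


@dataclass
class ShareAttempt:
    sensitivity: Optional[str]  # selected label option, or None if unlabeled
    recipient: str              # email address of the invited person


def is_external(recipient: str, org_domain: str = ORG_DOMAIN) -> bool:
    """Treat any address outside the organization's domain as external."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    return domain != org_domain.lower()


def should_block(attempt: ShareAttempt) -> bool:
    """Block only when a restricted label meets an external recipient."""
    return attempt.sensitivity in RESTRICTED_OPTIONS and is_external(attempt.recipient)


print(should_block(ShareAttempt("Confidential", "partner@other.org")))
print(should_block(ShareAttempt("Confidential", "colleague@example.com")))
```

Unlabeled files and internal recipients fall through, so a test plan for the rule should cover at least one case from each combination.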

In conclusion, Google Workspace provides various data classification options to help organizations protect sensitive information and ensure compliance with regulations. By understanding the different classification labels and using the available tools, organizations can effectively manage and control access to their data.

Lastly, remember that data classification is an ongoing process that requires regular review and updates to ensure continued protection. By leveraging Google Workspace’s data classification capabilities, organizations can safeguard their sensitive information, enhance security, and maintain trust with their stakeholders.


Well explained.


@Marcin_Milewski You should be at Harvard. Nicely explained. Thanks, that was helpful.


Is there something similar for Gmail, because that’s where I really need it.

Thank you, interesting.

This is an excellent introduction to enhancing the security of Google Workspace. I am grateful for the practical counsel on harmonizing security with user experience, a frequently neglected aspect. The focus on context-aware access, including the ability to restrict login based on location, is particularly salient in the context of today’s increasingly mobile workforce.

One particularly intriguing aspect is the discussion on Data Loss Prevention (DLP). Although the post effectively emphasizes the significance of this topic, I am intrigued by its practical implementation within the context of complex data workflows. To what extent can Data Loss Prevention (DLP) policies be customized to align with the specific requirements of diverse departments or project teams, ensuring authorized data sharing while preventing sensitive data leakage? As an illustration, how might the necessity of a marketing team to disseminate customer data for advertising purposes be reconciled with the exacting confidentiality requirements of the legal department?

Moreover, as the reliance on third-party applications integrated with the workspace continues to grow, the necessity of securing the ecosystem increases concurrently. While the blog post does address this issue, I believe it merits further investigation. How might administrators most effectively oversee the permissions granted to these third-party apps to ensure they align with and uphold the organization’s established security standards? It would be beneficial to understand whether there are tools within Workspace that are capable of auditing the data access and activity of the aforementioned integrated apps.

Furthermore, in light of the increasing sophistication of phishing attacks, it is imperative to devote greater attention to this subject. In addition to basic email filtering, which protective measures are currently in place within Workspace to identify and prevent advanced threats such as phishing attacks? For instance, are machine learning techniques being used to detect phishing emails, or is there real-time analysis of email headers and URLs? It would be beneficial to understand how organizations can leverage these features to effectively combat spear-phishing and business email compromise (BEC) attacks targeting their employees. It would also be useful to understand the practical application of these security measures and how they can be fine-tuned to create a robust yet flexible security posture within Workspace. Any additional information or resources that could be provided to facilitate this understanding would be invaluable.

Nice overview of the capabilities available in GWS to manage data, thanks for that.

An important feature, well explained.

Does this already work with email, e.g., don’t allow files with a certain label to be attached or sent?

Hey @JoachimKoch, Classification Labels for Gmail are available in open beta; see the Help Center.

Hope this helps!
Best,
Marcin