Exploring best practices for data in use and data in motion
Author: Karen Madera
Organizations collect a tremendous amount of data from a variety of sources, and any of these sources could contain sensitive data. Data is often relocated for warehousing, reporting, analytics, storage, testing, and application use. As a result, data or AI models may be copied many times over, which increases the risk that sensitive data is misused. Gartner predicted that backup and archiving of personal data would represent the largest area of privacy risk for 70 percent of organizations, up from 10 percent in 2018.
The emergence of newer technology platforms such as cloud and data lakes can exacerbate the issue. Organizations often feel a natural tension between data governance, security and innovation, yet a well-governed, secure environment can actually spur innovation and increase organizational productivity. To understand how much sensitive data lives across the organization and to mitigate the associated risks, it is important to examine the entire data landscape and ensure that all regulatory requirements for its lifecycle and correct usage are met.
The data lifecycle should be managed from creation to disposal, and everything in between. There are three main states in which data exists across an enterprise: at rest, in motion, and in use. While most practitioners have a clear understanding of how to protect data at rest (that is, inactive data stored digitally, typically in databases), the other two states require more complex strategies.
Data in motion: Moving from a source to destination
Data in motion is data that is moving within or between information systems. As enterprises move to cloud, adopt big data technologies and introduce disparate tools from multiple vendors, maintaining a centralized view of that movement becomes more complex.
Data in motion is vulnerable to ransomware attacks and data breaches. Encryption is the most common safeguard, because it helps make data unusable if it is intercepted or stolen. Think of it as a last line of defense that can help protect your data from full exposure.
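As one illustration of encrypting data in motion, the following is a minimal sketch that uses Python's standard ssl module to build a client-side TLS configuration. The function name and the choice of TLS 1.2 as the minimum version are assumptions made for this example, not recommendations taken from any specific product.

```python
import ssl


def make_client_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context for encrypting data in transit."""
    # create_default_context() turns on certificate verification
    # and hostname checking for client connections.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_tls_context()
```

A context like this would typically be passed to whatever client opens the connection, so that data leaving one system is encrypted before it reaches the next.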
There are steps you can take to protect data in motion. A good place to start is identifying what data needs to be protected and where it is located. Customer and financial data are obvious choices for encryption, but many companies fail to realize that even older, seemingly non-critical data can contain sensitive information, partly because the definition of what constitutes personally identifiable information (PII) has broadened considerably in the last decade.
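To make the identification step concrete, here is a small sketch that scans records for values that look like PII and reports which columns contain them. The two patterns, the function name, and the sample rows are all hypothetical; real discovery tools rely on far richer detection than a pair of regular expressions.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii_columns(rows):
    """Return {column name: sorted list of PII kinds seen in that column}."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(kind)
    return {column: sorted(kinds) for column, kinds in findings.items()}


# Made-up "older, seemingly non-critical" data: support ticket notes.
legacy_rows = [
    {"ticket_id": 101, "notes": "Customer wrote from jane.doe@example.com"},
    {"ticket_id": 102, "notes": "Caller gave SSN 123-45-6789 to verify"},
]
findings = find_pii_columns(legacy_rows)
```

In this made-up data the free-text notes column is flagged even though nothing in its name suggests it holds personal information, which is the kind of gap the paragraph above describes.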
Controlling and monitoring data access and activity represents an important part of any data protection strategy. It’s something that organizations need to balance with frictionless access to data.
Dive deeper: Explore a smarter, integrated approach to data security. Read smartpaper.
Data in use: Access and preparation of data across applications
There are many situations in which data is considered to be in use. These include when it is being processed by applications, when it is being transformed or manipulated, and when it is being viewed by enterprise users. The primary goal in governing data in use is to minimize the likelihood of data misuse across the enterprise.
As more departments within the organization express the need to manage and access data, information leaders need to focus on streamlining data operations with increased efficiency, data quality, findability and governing rules in order to provide an efficient, self-service data pipeline to the right people at the right time from any source.
At the heart of a strategy for data in use typically lies a data catalog. This tool allows organizations to create and automate policies for categorizing and classifying all company data enterprise-wide, no matter where it resides. These policies ensure that the appropriate protection measures are applied while data remains at rest and are triggered when data classified as sensitive is accessed, used, or transferred. Additional capabilities like data masking, user-based access controls for discovery, and risk assessment of unstructured data are also critical to a robust approach to data in use.
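The catalog-driven policy described above can be sketched in a few lines: a classification is recorded per column, and masking is triggered when a sensitive column is read by a user without the right role. This is an illustration under stated assumptions, not how any particular catalog product works; the CatalogEntry type, the classification labels, and the data_steward role are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    column: str
    classification: str  # hypothetical labels: "public" or "sensitive"


# A toy catalog; a real one would be populated by automated classification.
CATALOG = {
    "name": CatalogEntry("name", "public"),
    "email": CatalogEntry("email", "sensitive"),
}


def mask(value: str) -> str:
    """Replace every character with an asterisk."""
    return "*" * len(value)


def read_record(record: dict, user_roles: set) -> dict:
    """Return the record, masking sensitive columns unless the user is a steward."""
    result = {}
    for column, value in record.items():
        entry = CATALOG.get(column)
        # Columns missing from the catalog are treated as sensitive (fail closed).
        sensitive = entry is None or entry.classification == "sensitive"
        if sensitive and "data_steward" not in user_roles:
            result[column] = mask(str(value))
        else:
            result[column] = value
    return result


record = {"name": "Ana", "email": "ana@example.com"}
analyst_view = read_record(record, {"analyst"})
steward_view = read_record(record, {"data_steward"})
```

Treating unclassified columns as sensitive is a design choice in this sketch: it keeps newly added data protected until someone classifies it.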
Dive deeper: Read a comprehensive guide to the modern data catalog. Open eBook.
Learn More
Organizations are using data governance and security not only to accelerate analytical processing and insights, but also for positive compliance with the regulations they face. While the data volumes are extensive, machine learning and artificial intelligence practices are helping to overcome the limits of human scale in such tasks as data mapping, activity monitoring, cataloging, matching large data volumes and sustaining data quality.
IBM is committed to helping clients deliver these operations at scale, covering millions of data assets with a unified privacy framework via IBM Cloud Pak for Data, which helps integrate tools like IBM Watson Knowledge Catalog and IBM Security Guardium.
Register for the upcoming workshop, Build your Business Case for Data Privacy on June 17th. IBM Data & AI experts will take you through a design thinking exercise where you’ll brainstorm your initial privacy use case for data and AI models. Examples will also be provided of how firms are seeing success with data discovery, data governance, continuous auditing and more to simplify privacy reporting, data protection and risk management to ensure compliance.
Learn more about IBM Cloud Pak for Data, by visiting https://www.ibm.com/products/cloud-pak-for-data
Learn more about IBM Security Guardium by visiting www.ibm.com/security/data-security/guardium
#governance #DataProtection #datasecurity #data-privacy #AI