The Ultimate Manual For Data Governance
Today, organizations worldwide are collecting massive amounts of data from multiple sources. How this data is managed, stored, and shared will be defined by data governance best practices.
What is Data Governance Anyway?
Data governance is a system that defines how an organization handles data. It’s the set of policies and procedures explaining who has authority and control over different types of data and how a company’s data assets can be used.
It’s worth noting that data governance is not the same as data management, although the two terms are often confused. Data governance is a subtopic within the overall scope of data management.
The implementation of data architecture and toolsets, for example, falls into the data management category, whereas data governance refers to the big-picture strategy that informs and directs that implementation.
At its core, data governance consists of official rules, policies, processes, and regulations that determine the way an organization collects, stores, and shares data at scale.
How Data Governance Works
The overall purpose of data governance is to create a consistent approach to managing data assets company-wide. There are several different reasons for implementing this type of framework, including regulatory concerns, compliance, and data security.
Certain people within a company won’t have access to all of the organization’s data—whether it be customer data, department financials, sales data, or something else.
For specific types of data, internal and external users may only be able to access part of the data while the rest remains hidden or confidential.
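As a rough illustration of partial access, here is a minimal Python sketch that filters a record down to the fields a given role is allowed to see. The role names, field names, and the FIELD_POLICY table are all hypothetical and stand in for whatever your own policy defines.

```python
from typing import Any, Dict, Set

# Hypothetical policy table: which fields each role may see in a customer record.
FIELD_POLICY: Dict[str, Set[str]] = {
    "marketing": {"customer_id", "city", "total_spend"},
    "finance": {"customer_id", "total_spend", "card_last4"},
    "support": {"customer_id", "name", "email", "city"},
}


def visible_view(record: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = FIELD_POLICY.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


# Made-up example record.
customer = {
    "customer_id": 1017,
    "name": "Ada Example",
    "email": "ada@example.com",
    "city": "Leeds",
    "total_spend": 412.50,
    "card_last4": "4242",
}

marketing_view = visible_view(customer, "marketing")
print(marketing_view)
```

Defaulting unknown roles to an empty view is a deliberate choice in this sketch: access has to be granted explicitly rather than removed after the fact.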
Data privacy and data protection have come under closer scrutiny worldwide as new regulations force companies to tighten the way sensitive data is protected. The GDPR (General Data Protection Regulation) in Europe and the CCPA (California Consumer Privacy Act) are two prime examples. Industry-specific requirements apply as well, such as PCI compliance for credit card processing and HIPAA compliance for patient health data in the medical industry.
To fully understand how data governance works, you must be able to apply these principles—often referred to as the data governance framework:
- Transparency — All data governance practices and procedures used by an organization must be transparent. Be open with stakeholders, customers, and users about how the company is collecting and using data.
- Integrity — Data governance stakeholders must be honest and truthful whenever discussing or using data. This includes ethical data collection best practices compliant with regulations like the GDPR and CCPA.
- Auditability — All data-related processes within the scope of data governance can and will be audited to ensure data is being used properly and for a specific purpose.
- Stewardship — Data stewards will make sure all regulations related to data governance have been implemented appropriately.
- Accountability — Data governance will determine who is accountable for each data asset and how different roles may use it. Cross-functional data will be assigned to specific stakeholders to ensure everyone can access the data they need appropriately.
- Standardization — Standardized procedures will be implemented. The usage of data for different use cases will all follow the same standard governance process.
- Change Management — Change management policies must be introduced to control how master data and metadata assets are changed and used over time.
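To make the auditability principle more concrete, the following Python sketch records who touched which dataset and why. It is an in-memory illustration only. The AuditLog and AuditEvent names, the fields, and the rule that every access needs a stated purpose are assumptions made for this example, not part of any standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditEvent:
    """One immutable entry in the audit trail (hypothetical schema)."""
    user: str
    dataset: str
    action: str
    purpose: str
    timestamp: datetime


@dataclass
class AuditLog:
    """In-memory audit trail; a real system would persist this somewhere durable."""
    events: List[AuditEvent] = field(default_factory=list)

    def record(self, user: str, dataset: str, action: str, purpose: str) -> AuditEvent:
        # Assumed rule for this sketch: access without a stated purpose is rejected.
        if not purpose.strip():
            raise ValueError("every data access needs a stated purpose")
        event = AuditEvent(user, dataset, action, purpose, datetime.now(timezone.utc))
        self.events.append(event)
        return event

    def history(self, dataset: str) -> List[AuditEvent]:
        """Return every recorded event for one dataset, oldest first."""
        return [event for event in self.events if event.dataset == dataset]


log = AuditLog()
log.record("jsmith", "sales_2023", "read", "quarterly revenue report")
log.record("mlee", "customer_master", "update", "address correction request")
print(len(log.history("sales_2023")))
```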
Data governance touches every aspect of your data management strategy, including architecture, storage, data warehousing, business intelligence, metadata, data security, data modeling, and more.
Many organizations already have some form of governance used to manage access for data, even if it hasn’t been formalized. Data governance takes that to the next level by establishing systematic and formal control over all data assets and responsibilities.
Example 1: Open Source Projects
Open source projects can be used or modified by anyone. This can be challenging to manage without data governance policies in place.
One of the best examples of this in the real world is OpenStreetMap (OSM), an open-source project that launched back in 2004.
The site provides map data for thousands of websites, apps, and hardware systems worldwide. It’s used by organizations like Apple, Facebook, Uber, Craigslist, Snapchat, Foursquare, Amazon Logistics, and more.
All of the data from the site comes from millions of contributors worldwide. These contributors use different technology to maintain data about trails, roads, railway stations, coffee shops, and more.
The only way for a project at this scale to be successful is with data governance policies built-in from the beginning.
OSM’s product is the data—not the map. This concept only works if the contributors adhere to OSM’s data governance standards. But by standardizing the data, this platform has had massive success over the years.
Example 2: Data Compliance and Privacy
As previously mentioned, there are certain instances where data protection practices are legally required. One example is the GDPR (General Data Protection Regulation) imposed by the European Union.
If you fall within this jurisdiction (or any jurisdiction with data privacy laws), you must have data privacy baked into your governance process.
Under the GDPR, any information that can be used to identify a natural person must be protected when data is collected, stored, and shared. So if your company is collecting customer data or employee data, certain pieces of information must be hidden or removed in order for you to stay compliant.
Let’s say you’re collecting customer data using POS software. If you’re going to share that information with your marketing team or a third-party analyst, the customer’s name, address, phone number, and other personal information must be removed or altered.
These types of regulations control the way you store data as well, even if you’re not sharing it. Removing personal identifiers and storing them separately from the rest of the data helps conceal personal identities in the event of a hack or data breach.
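One way to picture this separation is the Python sketch below, which splits a sales record into an identity part and a shareable part linked only by a random token. The field names, the PII_FIELDS set, and the identity_vault dictionary are hypothetical. Note that this is pseudonymization rather than full anonymization, since anyone holding the vault can re-link the data, so the vault itself still needs strict protection.

```python
import secrets
from typing import Any, Dict, Tuple

# Hypothetical list of fields treated as personal identifiers.
PII_FIELDS = {"name", "address", "phone"}


def split_record(record: Dict[str, Any]) -> Tuple[str, Dict[str, Any], Dict[str, Any]]:
    """Split a record into (token, identity part, shareable part)."""
    token = secrets.token_hex(16)  # random token, not derived from the personal data
    identity = {k: v for k, v in record.items() if k in PII_FIELDS}
    shareable = {k: v for k, v in record.items() if k not in PII_FIELDS}
    shareable["customer_token"] = token
    return token, identity, shareable


# Stand-in for a separate, access-restricted store of personal identifiers.
identity_vault: Dict[str, Dict[str, Any]] = {}

# Made-up POS record.
sale = {
    "name": "Ada Example",
    "address": "1 Sample Street",
    "phone": "555-010-1234",
    "items": 3,
    "total": 59.90,
}

token, identity, shareable = split_record(sale)
identity_vault[token] = identity

# Only `shareable` would go to the marketing team or a third-party analyst.
print(shareable)
```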
For companies that store credit card data, primary account numbers must be encrypted or otherwise rendered unreadable to remain PCI (Payment Card Industry) compliant. This protection process must be built into the data governance policies.
How to Get Started With Data Governance
Data governance is a broad term that can be applied to a wide range of use cases. Regardless of your business type or industry, here’s how you can implement data governance for your unique situation:
Step 1: Establish Your Data Governance Goals
The first thing you need to do is evaluate your current position and policies for managing data. Has any part of the process been formalized? Is there an informal process?
From here, you can map out your goals for why you’re implementing data governance in the first place. Once you have a clear end game, it will be easier to get through the remaining steps.
Potential goals associated with data governance include:
- Comply with regulatory requirements and avoid penalties by safely storing data
- Improve security policies by establishing ownership and responsibilities
- Boost data monetization efforts by collecting and storing data in a more optimal way
- Define data distribution policies for both internal and external entities
- Make data-driven business choices using governed data assets
- Improve planning by not having to restructure data for each unique purpose
- Optimize employee productivity by giving them greater access to data
- Create internal rules for data usage
- Reduce costs associated with data management
- Increase confidence in the quality of the data collected
The list goes on and on. This just barely scratches the surface in terms of what can be accomplished with proper data governance.
Some of you might want to implement data governance for more than one of these reasons. If you fall into that category, my best advice would be to start small.
Trying to implement all of this at once will be a disaster. So start with the must-haves, like compliance-related data governance, and scale from there.
Step 2: Identify the Data Governance Roles
Every organization will handle data governance slightly differently. But these are some of the common roles and responsibilities that are typically associated with this process:
Data Owners
Data owners, also known as data sponsors, are the individuals within an organization who will make decisions related to how the company handles data. It’s also the data owners’ job to enforce these decisions.
Owners and sponsors must approve data definitions, ensure the accuracy of data, and review all of the master data policies. They’ll also have input on what type of software will be used for data governance and which regulatory requirements apply.
Data Stewards
Data stewards handle the day-to-day tasks of managing organizational data.
It’s their job to report data to the owners, sponsors, and stakeholders. Data stewards typically work cross-functionally through different departments of a business to ensure all data is maintained properly at each level.
It’s common for data stewards to be the subject matter experts of different data entities.
Data Custodians
Data custodians, also known as data operators, are responsible for the technical onboarding and maintenance of data assets. They’re responsible for managing the data life cycle as well.
Data Governance Committee
Also known as the steering committee, the governance committee is typically made up of the C-suite, senior management, and other high-level executives within an enterprise.
It’s the committee’s job to set the overall standards and strategy associated with data governance. They will identify specific outcomes and goals that will be passed along to the data stewards for day-to-day operations.
Data Governance Team
In addition to the high-level roles mentioned above, many large-scale organizations will have a data governance team as well. Here are the basic roles and responsibilities of those team members:
- Data Managers — Lead the implementation and maintenance of master data control across the entire organization.
- Data Architects — Assist with and provide insight into the design and implementation methods of data governance.
- Data Analysts — Use trends, patterns, and analytics to review the information in data assets.
- Data Strategists — Create and execute plans for trend analysis, often working alongside data analysts.
- Data Compliance Specialists — Ensure the company is adhering to all legal standards and regulatory requirements for data compliance.
Again, these role titles and responsibilities will vary from business to business. Some of you may have more roles, fewer roles, or something slightly different.
The key aspect of this step is to clearly define those roles and responsibilities. Outline a process for how these different roles interact and who is held accountable for what. Once this step is completed, there should be no confusion about what is expected of everyone involved in data governance.
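A lightweight way to check that roles are actually defined is to keep an explicit register of data assets and flag any asset that is missing an owner, steward, or custodian. The Python sketch below assumes that every asset needs all three roles. The DataAsset class, the asset names, and the job titles are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumption for this sketch: every asset needs these three roles filled.
REQUIRED_ROLES = ("owner", "steward", "custodian")


@dataclass
class DataAsset:
    """One governed data asset and the people accountable for it."""
    name: str
    owner: Optional[str] = None
    steward: Optional[str] = None
    custodian: Optional[str] = None


def unassigned_roles(asset: DataAsset) -> List[str]:
    """List the required roles that nobody has been assigned to yet."""
    return [role for role in REQUIRED_ROLES if not getattr(asset, role)]


# Hypothetical asset register.
assets = [
    DataAsset("customer_master", owner="VP Sales", steward="CRM analyst", custodian="DBA team"),
    DataAsset("pos_transactions", owner="CFO"),
]

# Map each incomplete asset to the roles it is still missing.
gaps: Dict[str, List[str]] = {
    asset.name: unassigned_roles(asset) for asset in assets if unassigned_roles(asset)
}
print(gaps)
```

Running a check like this before rollout surfaces accountability gaps early, while they are still cheap to fix.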
Step 3: Identify Potential Challenges For Data Governance
Data governance can be complicated, especially at the enterprise level. One of the best ways to eliminate complexities or reduce the chance of failure is by identifying potential roadblocks before you get started.
Identifying these roadblocks up front will help you steer clear of them during the implementation and day-to-day use of data governance.
Integrating data governance with your big-picture IT governance policies is a common hurdle that most businesses face. Both of these initiatives must be coherent and work in conjunction with each other if you want to be successful.
Getting the end-users on board with your policy changes can be difficult too. Lots of employees don’t like change. You may need to offer incentives to motivate the staff company-wide to follow your new data governance policies.
Step 4: Apply Data Governance Best Practices
Once the planning process is over, it’s time to implement your new data governance processes. It’s a big step, and you should follow certain tips and best practices to ensure things go smoothly.
Here are a handful of considerations to keep in mind:
- Establish formatting standards for your data.
- Use technology to enforce formatting standards and data integrations from multiple sources.
- Classify your data and tag everything with metadata.
- Track KPIs to determine how your data is being used.
- Automate data requests, approvals, permission requests, and everything else possible in your workflow to ensure data governance initiatives don’t slow down your operation.
- Don’t forget about unstructured and unmanaged data in your archives.
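The first few practices above can be sketched in Python as follows: two small normalizers that enforce a formatting standard, plus a helper that tags columns with classification metadata. The chosen standards (ISO 8601 dates, US-style ten-digit phone numbers, month-first slashed dates) and the SENSITIVE_COLUMNS set are assumptions for this example. Your own standards may differ.

```python
import re
from datetime import datetime
from typing import Dict, Iterable

# Hypothetical set of columns that should be classified as confidential.
SENSITIVE_COLUMNS = {"name", "email", "phone"}


def normalize_date(raw: str) -> str:
    """Convert accepted input formats to the ISO 8601 standard (YYYY-MM-DD)."""
    # Assumption: slashed dates are month-first (MM/DD/YYYY).
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_phone(raw: str) -> str:
    """Reduce a phone number to digits and format it as NNN-NNN-NNNN."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def tag_columns(columns: Iterable[str], source: str) -> Dict[str, Dict[str, str]]:
    """Attach source and classification metadata to each column name."""
    return {
        column: {
            "source": source,
            "classification": "confidential" if column in SENSITIVE_COLUMNS else "internal",
        }
        for column in columns
    }


print(normalize_date("03/05/2024"))
print(normalize_phone("(555) 010-1234"))
print(tag_columns(["email", "total"], source="pos_system"))
```

Rejecting values that match no accepted format, instead of guessing, keeps bad data from silently entering governed datasets.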
Data governance is dynamic—it’s not a set-it-and-forget-it initiative. Be prepared to make changes and continuous improvements after the initial implementation.
As previously mentioned, you probably can’t achieve all of your goals at once. So start with one or two and scale from there.