By Leila Toplic, Head of Emerging Technologies Initiative, NetHope. In collaboration with James Eaton-Lee (Oxfam), Florian Edelmaier (SOS Children’s Villages), Pallavi Garg (PATH), Molly Hrudka (MSI Reproductive Choices), Kate Keator (The Carter Center), Bo Percival (Humanitarian OpenStreetMap Team), Joel Urbanowicz (Catholic Relief Services), and Joff Williams (Mercy Ships).
As digital transformation accelerates, the volume of available data is continuing to grow. As a result, there are now a number of important issues for nonprofit organizations to consider, from data utilization and management to data protection and security. Good data hygiene is key to data maturity in the nonprofit sector.
Every day, 2.5 quintillion bytes of data are generated globally. According to IDC, the amount of data created over the next three years will be more than the data created over the past three decades. Yet, despite its availability, much of this data remains underutilized – or not used at all. One of the reasons is poor data hygiene.[1]
Nonprofits are striving to be more data-driven in their approach to delivering on their missions, and the ability to leverage data responsibly and meaningfully is one of the most critical aspects of digital transformation. The value[2] of becoming a data-driven nonprofit is significant. The better you manage and apply data, the more your programs and services improve, serving more people in need responsibly and delivering more value to those you’re already supporting. That in turn improves future programs and helps organizations predict future needs, enabling them to access essential support from donors and partners.
But how do we get there?
Over the past couple of months, we’ve hosted a round of consultations with data experts at several INGOs about how nonprofits can make sure their data is accurate, credible, and properly utilized. What follows are insights on the key considerations and steps that can help the nonprofit sector better manage and utilize data.
While there are various definitions of data hygiene, for the purposes of this post we’ll use a definition from one of NetHope’s Members: “Data hygiene is a set of principles and processes we follow to ensure the data we collect, assess, report, and analyze is of the highest quality.” In essence, data hygiene is the process of getting to data that we can trust and can use to make good decisions.
Most nonprofit organizations will be the first to admit that there is room for improvement – they still struggle to realize the value of data and their challenges extend beyond the mechanics of data hygiene. Consultations with INGOs revealed that the biggest barriers[3] to data maturity are cultural and operational: a lack of trust surrounding the quality (ie, accuracy and credibility), flow, and use of data; an inability to operationalize data to use it strategically (due to skills, process,[4] and incentives); and the absence of an organization-wide data strategy and intentionality. As a result, there is a gap between available data and the ability to use it effectively.
The message from the consultations is clear: to unlock the critical value of data for the benefit of our society and the planet, the data we’re working with must be high quality (ie, clean, accurate, credible, useful). Working to achieve good data hygiene is a key step in making responsible and impactful decisions and being good data stewards.
On that basis, we have identified four considerations and six steps for nonprofits to make data hygiene achievable. We start from two premises. First, nonprofits have a responsibility to know what data they have, where it’s stored, who uses it, and how. And second, to do this successfully, nonprofits must have people, processes, and technology components working together and across the whole organization to ensure that all steps in data management (including data hygiene) are sound.
Here are four considerations for building an enabling environment for good data hygiene:
1. Build and operationalize data culture as a systematic process
This process begins at the top – with leaders who promote and model data-driven decision making and create the conditions for data-driven behavior across the whole organization. Building an organizational culture that values data is imperative: attempts at becoming data-driven will succeed or fail depending on the culture that has been created.
Nonprofits need to integrate data across their whole organization in a systematic, consistent, and responsible way backed by processes, policies, reinforcements, and incentives. As one of the nonprofit practitioners noted: “Good practices are there. They’re not secrets. But they are not being adopted consistently.” Also, “We’re rich in documentation in terms of what should be done, but the reinforcement is lacking.”
How: As a first step, it’s important to get people on board – the right level of awareness and understanding of the value of data for nonprofit work is critical. Next, create systemic conditions for individuals and teams to use data in their own work consistently. In organizations with a centralized governance model, that may mean getting all stakeholders on the same page regarding data: following the same processes, complying with the same set of policies, and functioning as one. This requires a number of things, such as: an accountability and support model (eg, a shared service model); policy and governance; technology (with some focus on what can be automated to save time and increase consistency); and resources/funding, with incentives and resourcing at the local level to clearly assert data usage as a top priority. In organizations with distributed, federated, and ‘open ecosystem’ governance models, building and operationalizing data culture might require ‘designing-for-open-boundaries’ (ie, building tools and resources for public consumption rather than just for internal users) and determining what capacity and resourcing models, metrics, and governance levers are appropriate for building/advancing data culture in your organization.
2. Develop data capacity
Realizing the value of data requires people with skills and time to teach those involved how to do a number of things well, including how to gather high-quality data from different sources, how to tag and de-identify data, clean the data, and manage and use it responsibly.
How: Invest in the capacity of both technical and non-technical teams. Reinforce the training through frequent engagement and practical application (eg, team check-ins, individual performance reviews, planning sessions). If your implementation is partner-driven, you will need to invest in the capacity of local partners to unlock collaboration and knowledge sharing across the whole ecosystem.
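As one illustration of the de-identification skill mentioned above, here is a minimal Python sketch of pseudonymization: direct identifiers are dropped and the record ID is replaced with a keyed hash, so records can still be linked without exposing the original ID. The field names (`client_id`, `name`, `phone`), the function name, and the key are hypothetical and not taken from any NetHope Member's systems. Pseudonymized data may still be re-identifiable from the remaining fields, so treat this as a starting point rather than full anonymization.

```python
import hashlib
import hmac

# Hypothetical field names that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "phone"}


def pseudonymize(record, secret_key, id_field="client_id"):
    """Drop direct identifiers and replace the ID with a keyed hash.

    secret_key must be bytes and should be stored separately from the data.
    """
    token = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Keep only fields that are neither direct identifiers nor the raw ID.
    safe = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != id_field
    }
    safe["pseudonym"] = token[:16]  # shortened for readability
    return safe


demo_key = b"demo-key-change-me"  # illustrative only, never hard-code a real key
first = pseudonymize(
    {"client_id": "A1", "name": "Jane", "phone": "555", "site": "North"}, demo_key
)
second = pseudonymize(
    {"client_id": "A1", "name": "J. Doe", "phone": "556", "site": "South"}, demo_key
)
```

Because the hash is keyed, the same `client_id` maps to the same pseudonym across records (so `first` and `second` can still be linked), while someone without the key cannot simply hash candidate IDs to reverse it.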
3. Center around the need
The data you collect should be the right data for the problem you need to solve and connected with your organization’s mission. As highlighted in the consultations, being clear from the outset about why you are collecting data and what you intend to do with it is also critical for INGOs’ compliance with GDPR.
How: Start with the right questions, such as: What is the problem you are trying to solve? What does success look like? Who benefits from the data you’re collecting? What could be the unintended consequences?
4. Put it into practice: use data responsibly to make decisions
For nonprofits to be able to use data to diagnose the effectiveness of their programs and interventions, uncover needs and opportunities, or assess risk, data-driven decision making needs to become a key design principle in any nonprofit initiative and program development. Furthermore, if we want to embrace data-driven decision making, with all its tremendous benefits, we must do so responsibly – considering the very real risks too, being intentional about understanding how data may be misused and abused, and taking action to minimize the negative consequences.
Operationalizing effective use of data across the whole organization can seem like a daunting task, but it is an important one. It’s through meaningful, responsible, and systemic use of data that nonprofits will chart the course for how their services and programs are built, implemented, and resourced.
How: Create and operationalize a workflow for decision making centered on making data usage a top priority. Design the workflow to enable you to scan for ethical risks, to use data in conjunction with human expertise, and to iterate by measuring and incorporating feedback from key stakeholders in the data lifecycle. Good data governance (ie, how you collect, validate, store, organize, protect, and maintain the data) is an iterative, agile learning process that embeds feedback loops on the validity and efficacy of data governance policies and processes.
Additionally, keep data at the top of everyone’s mind, every day, by building habits. For example, at one of NetHope’s nonprofit Members, reports are automatically emailed to people regularly, and data responsibilities are built into job descriptions and performance reviews (monthly data review meetings). Above all, understand how to make data work for the people that need support the most.
In summary, to create a viable enabling environment for data hygiene, we recommend pushing for specificity on the alignment between data and need, connecting leadership imperatives with the realities on the ground, and investing in sustainable capacity and resources.
While creating an enabling environment for data hygiene in your organization is important, you don’t need to have every part of this in place before you can start establishing data hygiene practices. But creating this kind of environment will help your whole organization become more data-minded and, as you take the steps below, it will help create more sustainable data practices.
Before we get into the practical steps, it’s important to first understand what challenges nonprofits face when it comes to data hygiene. Challenges include: freshness (and lifecycle), accuracy, timeliness, legibility, duplication, integrity, validation, visibility, fraud, and governance.[5] There are a number of reasons for these challenges, including where nonprofits are in their digital transformation journey and how quickly they’re able to scale up digital solutions globally. For example, at one of NetHope’s nonprofit Members, most clinical records are still captured on paper, which limits their ability to efficiently use the data to improve clinical quality and client experience.
To address data hygiene issues, nonprofits are doing a number of things, including: conducting user training focused on data literacy, developing and deploying methodologies for collecting, organizing and verifying data, conducting data audits (a step in understanding the quality of your data), and hiring dedicated resources. For example, one of NetHope’s nonprofit Members has a full-time Data Integrity Informatics role. This role is responsible for defining data validation standards and developing the tools for site-level spot checks (ie, diagnostics of data quality).
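To make the idea of a data audit or spot check more concrete, here is a minimal Python sketch that measures two of the challenges named above, completeness and duplication, on a small batch of records. The function name, the field names (`client_id`, `visit_date`, `site`), and the sample records are hypothetical; a real audit would use the validation standards your own organization has defined.

```python
from collections import Counter


def audit_records(records, required_fields, key_fields):
    """Summarize completeness and duplication for a list of dict records."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat None and blank strings as missing values.
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # Group records by their identifying key, ignoring case and whitespace.
    seen = Counter(
        tuple(str(record.get(field, "")).strip().lower() for field in key_fields)
        for record in records
    )
    # Every record beyond the first one in a group counts as a duplicate.
    duplicates = sum(count - 1 for count in seen.values() if count > 1)

    return {
        "total": len(records),
        "missing": {field: missing[field] for field in required_fields},
        "duplicates": duplicates,
    }


sample = [
    {"client_id": "A1", "visit_date": "2021-03-01", "site": "North"},
    {"client_id": "A1", "visit_date": "2021-03-01", "site": "North"},
    {"client_id": "B2", "visit_date": "", "site": "South"},
]
report = audit_records(
    sample,
    required_fields=["client_id", "visit_date", "site"],
    key_fields=["client_id", "visit_date"],
)
print(report)
```

On this sample the report shows three records, one missing `visit_date`, and one duplicate. Running a summary like this per site and per reporting period is one simple way to see where data quality issues concentrate.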
To get to data you can trust and use responsibly to make good decisions, here is a brief overview of several steps and resources to consider. We plan to build on this work in the coming months and curate a toolkit of resources and trainings for nonprofit organizations.
| Step | What to do | Questions NetHope Members recommend asking | Tools NetHope Members are using | Enablers to consider |
| --- | --- | --- | --- | --- |
| Data strategy | Determine the purpose of data by facilitating development of a data strategy. Your data strategy framework may include: data architecture, data analysis, data governance, and data management. | What are we hoping to use the data for? Should this data be re-used or connected and in what contexts? Does this data (need to) relate to other data in our org or ecosystem? | Excel, Word, Miro, Plectica; data strategy, data lifecycle and data architecture frameworks (eg, see The Carter Center’s data strategy, data lifecycle and data architecture frameworks). | |
| Data exploration | Start with an audit and assessment of your existing data. Assess for representativeness to mitigate the risk of bias. | What’s the state of the data we have? What data can we access? How is it organized? What is the right quality and format of data? | To assess what data you have and how it’s organized, consider using pivot tables, Power BI, Qlik Sense. | |
| Data cleaning | Define what constitutes clean data (ie, set the key indicators) and cleanse the data. Cleansing data includes several steps such as: data verification or validation, cleaning up duplicates, addressing incomplete data (eg, either complete incomplete data or delete incomplete records), and normalizing/formatting data to a common value. | What principles and indicators do we need to ensure to prevent common data quality issues? How do we balance data integrity with timelines as well as being able to correct errors? | OpenRefine, R to automate data cleaning, Power Query and M code. Also, business rules and data dictionaries. | |
| Data collection and validation | Get additional data and validate that it’s useful and meets the standard of quality for the project. | What data do we need? Is it already being collected elsewhere? How often is the data collected? Does the data meet the standard of quality for the project? | Survey tools, REDCap Cloud, ODK, Kobo Toolbox, M365 Power Platform, proprietary tools (eg, The Carter Center developed its own survey tools, Neemo and Elmo). | Data literacy of those who are collecting and analyzing data. Consider that people have different context and understanding of data use. |
| Data management and governance | Build a data governance program to ensure that everyone involved in managing data throughout the entire lifecycle has a consistent set of guidelines and processes. | How do we ensure that data can be discovered and used by different people? What is the data ownership approach (eg, do you have permission to use the data)? | Azure data lake, Neo4j (graph database), data management plans. | |
| Data protection and security | Develop a strategy, policies, and processes, and secure resources for data protection and security. | What is our responsible data strategy? What security policies need to be in place for each of the data sources? | CIS Controls, Data Privacy Impact Assessments screening form (eg, Mercy Ships' Privacy Impact Assessment Template and Global Financials - Data Privacy Impact Assessment). | Responsible data management and security training (eg, Oxfam’s Responsible Data Management Training Pack). |
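The data cleaning step can be sketched in a few lines of Python: normalize values to a common format, validate them, set aside incomplete records, and remove duplicates. This is an illustrative sketch only. The field names, the two accepted date formats, and the choice to reject incomplete records (rather than complete them) are assumptions; your own definition of clean data should drive these rules.

```python
from datetime import datetime

# Hypothetical required fields for a visit record.
REQUIRED = ("client_id", "visit_date")
# Assumed input formats; day-first is an assumption about the source data.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize(record):
    """Trim whitespace and bring the ID to one common format."""
    clean = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if isinstance(clean.get("client_id"), str):
        clean["client_id"] = clean["client_id"].upper()
    return clean


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except (TypeError, ValueError):
            continue
    return None


def clean_records(records):
    """Return (cleaned, rejected): deduplicated valid rows and rows needing review."""
    cleaned, rejected, seen = [], [], set()
    for raw in records:
        record = normalize(raw)
        record["visit_date"] = parse_date(record.get("visit_date"))
        # Incomplete or invalid rows are set aside for review, not silently dropped.
        if any(not record.get(field) for field in REQUIRED):
            rejected.append(raw)
            continue
        key = (record["client_id"], record["visit_date"])
        if key in seen:  # duplicate of a record already kept
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned, rejected


raw_rows = [
    {"client_id": " a1 ", "visit_date": "2021-03-01"},
    {"client_id": "A1", "visit_date": "01/03/2021"},
    {"client_id": "b2", "visit_date": "March 3"},
]
cleaned, rejected = clean_records(raw_rows)
```

Here the first two rows turn out to be the same visit once IDs and dates are normalized, so only one is kept, and the third row is returned in `rejected` because its date cannot be validated. Keeping rejected rows visible supports the balance between data integrity and being able to correct errors that the questions above raise.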
In closing. For a digital nonprofit, it is more important than ever to level up your data maturity in a strategic and intentional way that connects insights to action. As you take stock and tackle your organization’s data maturity, consider your approach to data hygiene as an essential step.
[1] In the private sector, according to the Experian 2021 Global Data Management Research, companies estimate that approximately one third of all business data about customers and potential customers is inaccurate, 55% of leaders do not trust the data their organizations own, and only 50% believe that their CRM / ERP data is clean and can be fully used.
[2] Value of data: (1) diagnostic (how are we doing), (2) opportunity enabling (what else can we do), (3) risk-management (are we doing our work responsibly).
[3] Challenges include: (1) Organizational maturity to work with data is lacking at all levels of the organization. (2) Willingness to share and use data. Lacking mandate, process / practice, and incentives for sharing / usage. (3) Standard definition of high quality data. Standardization is challenging in open ecosystem and federated organizational models. (4) Representation, including consent (how data is collected), transparency, privacy.
[4] Process issues include: (1) Lack of integration and sharing of records between systems. (2) Executive imperative disconnected from the reality on the ground, ie, responsible data-driven decision-making is not getting operationalized locally (eg, no training, resources, incentives, performance reviews).
[5] Challenges: (1) Accuracy - there are errors in the data and it’s difficult to correct them; (2) Timeliness - data is not reported on time; (3) Completeness - data is missing or incomplete; (4) Legibility - it’s difficult or impossible to read the data (eg, handwritten clinical records); (5) Duplication - data has been captured or reported twice; (6) Integrity - data has been made up or altered for fraudulent purposes; (7) Validation - the source data is hard to find and there is a lack of capacity/processes to validate it; (8) Visibility - NGOs can’t see data in one place, which limits their ability to spot issues; (9) Consistency - data is interpreted differently across users and countries; (10) Governance - data values are added, removed, or changed over time.