The State of Big Data Analytics in the Federal Space
Published: September 16, 2020
Government and commercial leaders come together to reflect on the federal government’s current and future priorities in data collection, curation and utilization.
- DOD, Navy and NASA data leaders joined members of the commercial sector at a Federal Executive Forum hosted by Federal News Network, to discuss current progress and top priorities in big data and analytics in the federal space.
- Government agencies are laying the foundation in strategic data management, moving the focus away from data collection to a higher order of data utilization.
- The COVID-19 pandemic spurred key programs in data interoperability and visualization in agencies such as DOD and NASA.
- Panelists envision a data-driven federal government in the future, leading to increased use of artificial intelligence and machine learning to extract actionable intelligence for key decisions and critical missions.
In an era of the Evidence Act and Federal Data Strategy, appointment of Chief Data Officers at all agencies, the need for real-time reliable data, and the continuous evolution of data services and technologies, it is no wonder that big data and analytics is a large focal point in the federal space. The Federal News Network hosted several government and industry leaders to discuss their big data analytics programs and strategies surrounding federal agencies under its Federal Executive Forum, Big Data Analytics in Government 2020/2021 “Progress & Best Practices”. Panelists of the forum included:
- Michael Conlin, Chief Business Analytics Officer, Department of Defense
- Tom Sasala, Chief Data Officer, Navy
- Ron Thompson, Chief Data Officer, NASA
- Nick Psaki, Principal System Engineer, Pure Storage
- Henry Sowell, Chief Information Officer, Cloudera Government Solutions
- Jon Harmon, Vice President, Worldwide Sales, BMC Software
Status of Federal Big Data and Analytics
Given the viewpoints shared by government and industry panelists, the underlying theme of federal big data in 2020 is moving past the initial order of data collection and towards data orchestration and automation. According to Harmon, data use at agencies is evolving from traditional areas such as job scheduling and business process management toward simplifying data complexity, delivering critical initiatives, and ingesting data directly into workflows.
For example, Conlin explained that his office grappled with over 10,000 separate systems with data from different sources and spent two years bringing it together to present to DOD leaders. Now, with DOD data running smoothly, his role has shifted to focus on business analytics and insight, building out DOD's first balanced scorecard to enable the Secretary and Deputy Secretary to make decisions across the defense ecosystem, in place of siloed decisions within individual department divisions.
Likewise, after spending a year getting the Navy's data program off the ground, Sasala is now focused on putting governance, policies and enterprise data management in place (specifically data quality, accessibility and availability) so that the Navy and the Marine Corps can plug into each other and eventually become interoperable with other parts of DOD.
COVID-19 Influences Current Data Initiatives at Agencies
The urgent demand for real-time, reliable health data during the COVID-19 pandemic spurred the creation of data-centered initiatives across several federal agencies in an effort to make sound, evidence-based decisions.
At NASA, Thompson’s team set up the Executive Decision Lens, a visualization layer for agency leaders to view real-time health-related statistics across all of its locations. The initiative consists of placing APIs on disparate data sets to allow for agility and reuse of the data. NASA also set up a contact tracing application, making it one of the first agencies to do so, normalizing its data against common standards shared with other agencies co-located at NASA offices.
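The pattern Thompson describes — putting a thin layer over disparate data sets so one visualization layer can reuse them — can be sketched with small adapter functions that map each source's record layout into a shared schema. This is a minimal, hypothetical illustration: the field names, record shapes, and data are invented, not NASA's actual systems.

```python
# Hypothetical sketch: normalizing disparate health-status records into a
# common schema so a single visualization layer can consume them.
# All field names and sample data below are illustrative, not NASA's.

def from_center_a(record):
    """Adapter for one source's record layout."""
    return {
        "location": record["site"],
        "date": record["report_date"],
        "active_cases": int(record["cases"]),
    }

def from_center_b(record):
    """Adapter for a second source with a different layout."""
    return {
        "location": record["facility_name"],
        "date": record["date"],
        "active_cases": int(record["positive_count"]),
    }

def normalize(records, adapter):
    """Map raw records into the shared schema via the given adapter."""
    return [adapter(r) for r in records]

# Example usage with made-up data from two sources:
a = normalize([{"site": "Center A", "report_date": "2020-09-01", "cases": "3"}],
              from_center_a)
b = normalize([{"facility_name": "Center B", "date": "2020-09-01",
                "positive_count": "5"}], from_center_b)
combined = a + b  # one uniform list the dashboard layer can render
```

Because each adapter isolates one source's quirks, adding a new co-located agency's feed only means writing one more adapter rather than changing the dashboard.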
Conlin’s team at DOD not only helped pull data in from internal sources, but needed to bring in external data sources as well (e.g., the Department of Health and Human Services) to help inform the agency about the pandemic. DOD was able to do this seamlessly and even look into its supply chain, specifically its suppliers’ suppliers, to detect shortages of health equipment across the country earlier. Moreover, DOD implemented the Installation Commander’s Dashboard to let those leaders make health protection decisions for their respective installations. The dynamic dashboard consists of data that varies from installation to installation, leading to decisions driven by data analytics rather than dates or politics, explained Conlin.
Next in the program, each government panelist shared the priorities they hope to achieve within the next six to twelve months:
- Data Quality: ensure the agency has a controlled vocabulary and then dig into the trustworthiness of the data and codify it
- Evolution of the Workforce: get people to understand the value of data, and how to exploit it and use it for purposeful decision-making once they have access to it
- Dashboard Sprawl: identify which dashboards are reliable for what information and what decision by certifying data sources as trustworthy
- Cultural Shift: change the mindset of people at the agency who claim ownership of data and withhold it, so that data is shared across the organization when needed
- Data Protection: classify data properly and apply different classification levels to different types of data (e.g., low classification for public scientific data, high classification for proprietary partner data)
- Standardization and Interoperability: know how to move data between purpose built containers to more open and interoperable standards
- Cultural Change: use a balanced scorecard to report on what has happened in the past, and aim for predictive reporting and beyond. Going through the levels of data maturity will help to shift the culture’s viewpoint on data
- Policy Analytics Tool: most DOD policy documents start out as unstructured data; OSD currently has over 1,900 policy documents and will use the new tool to perform semantic searches that not only find a policy but also surface its intent
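The policy analytics tool described above hinges on ranking documents by meaning rather than exact keywords. A production semantic search would use trained language-model embeddings; the hypothetical sketch below substitutes a crude bag-of-words cosine similarity (with invented policy snippets) just to illustrate the retrieval-and-ranking shape of the problem.

```python
# Hypothetical stand-in for semantic search over policy text: rank documents
# against a query by bag-of-words cosine similarity. A real system would use
# embeddings; the documents below are invented for illustration.
import math
import re
from collections import Counter

def vectorize(text):
    """Turn text into a term-frequency vector (lowercased word counts)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def search(query, documents):
    """Rank documents (title -> text) against the query, best match first."""
    qv = vectorize(query)
    ranked = sorted(documents.items(),
                    key=lambda kv: cosine(qv, vectorize(kv[1])),
                    reverse=True)
    return [title for title, _ in ranked]

# Made-up policy snippets for illustration:
policies = {
    "Telework Policy": "guidance on remote work and telework eligibility",
    "Travel Policy": "rules for official travel and reimbursement of expenses",
}
# search("remote work guidance", policies) ranks "Telework Policy" first
```

Understanding a policy's *intent*, as the panelists describe, is the harder step; it would replace the word-count vectors here with representations that capture meaning, but the query-ranking pipeline stays the same.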
Painting a picture of big data’s future, the panelists agreed that a general shift is occurring in the perception of data use. The COVID-19 pandemic is causing people to think about their data strategies, says Sowell. Data integration, data fusion and decreasing the time to market from data creation to decisions are top of mind, according to Sasala. Moreover, the federal government is undergoing a digital transformation that will lead to leveraging artificial intelligence and machine learning to extract even greater actionable intelligence from data, says Harmon.