Announcing the Next Phase of the Big Data Hubs

The National Science Foundation (NSF) is awarding a second round of funding for the four Big Data Hubs—organizations launched in 2015 to build and strengthen data science partnerships across industry, academia, nonprofits, and government.

Each of the hubs will receive $4 million over four years, for a total investment of $16 million. This is double the budget for the first round of Big Data Hubs awards made in 2015.

The Big Data Hubs connect innovators regionally to address scientific and societal challenges, focusing on data science activities and initiatives that inspire cross-sector collaboration and exemplify the need for multi-disciplinary approaches. “By catalyzing partnerships that integrate academic researchers into the fabric of communities across the U.S., we can accelerate and deepen the impact of basic research on a range of societal issues, from water management to efficient transportation systems,” says Beth Plale, one of the National Science Foundation program directors managing the Big Data Hubs awards.

The West Hub will continue to be coordinated by UC Berkeley, UC San Diego, and the University of Washington, with Principal Investigators David Culler (UC Berkeley), Ed Lazowska (UW), and Michael Norman (UCSD), Executive Director and Co-Principal Investigator Meredith Lee (UC Berkeley), Deputy Directors and Co-Principal Investigators Christine Kirkpatrick (UCSD) and Sarah Stone (UW), and Co-Principal Investigator Bill Howe (UW).

“We’re excited to spark new public-private partnerships and build upon our collective strengths in this next phase,” notes Meredith Lee, West Hub Executive Director and Co-Principal Investigator. “Each community member has a compelling story to share and the Hub can help amplify our constituents’ impact across a diverse network.”

Collaborative Data Science for Social Impact
The West Hub’s first three years of operation have included a diverse set of application-focused projects—developing data analysis and tools to support access to safe drinking water, better understand disease through all 20,000 human proteins, and facilitate new insights in transportation safety. The Hub also supports cross-cutting efforts to produce frameworks and resources useful to multiple areas of inquiry and practice, from data sharing and cloud computing to responsible data science.

The next four years will include an emphasis on developing and enabling translational data science, with signature initiatives including:  
  • Fire and Water: Regional Data Collaboratives for the Future of Natural Resource Management. Building upon momentum from regional roundtables, workshops, online tutorials, UW Waterhackweek, the open-to-all California Water Data Challenge, and other efforts, the West Hub will focus on collaborative, user-focused projects that leverage new shared data and open access tools. This summer and fall, with additional funding from the Water Foundation and Leonardo DiCaprio Foundation, the West Hub will work with journalists from mainstream and ethnic media, offering fellowships that connect impacted communities with research and education efforts around water data.
  • Stress-Testing Access for Road Video: Understanding Risk and Opportunity in Data Sharing. After hosting a 6-month nation-wide series of community problem-solving sessions, technology demonstrations, and discussions focused on transportation safety, the West Hub will strengthen a partnership with the NSF and the Federal Highway Administration to investigate the reversibility of tools used to de-identify video data from automobile drivers. Tied to a 3-year data collection effort that produced data for more than 3,000 drivers, including 1,500 crashes and 3,000 near-crashes, this project will include community dialogue about privacy and bias.
  • Housing Instability: Trusted Data Collaborative for Responsible Data Management. Racial biases in eviction practices, rapidly increasing housing prices, and complex interactions between services to support homeless families have led to neighborhood-level inequities in urban environments and a lack of transparency in the efficacy of interventions. Through a partnership with the Bill and Melinda Gates Foundation, Microsoft, and the Cascadia Urban Analytics Cooperative, the West Hub will integrate data from multiple jurisdictions to study questions about how neighborhood change, service delivery, and demographics influence outcomes for homeless families. In preliminary work, the West Hub supported a Seattle evictions study, which extracted information from thousands of evictions case reports and uncovered extreme racial disparity, leading directly to a policy change increasing the response time allowed to tenants. As part of this work, the West Hub is expanding the scope of the Trusted Data Collaborative, a socio-technical platform for responsible data governance initially used for mobility data, to support housing and population health data. The effort is designed to balance competing objectives among stakeholders, improving fairness in analytic methods, preserving privacy, protecting data owners’ proprietary information, and promoting transparency.
  • Inclusive Data Science Education and Training. Leveraging lessons learned from the undergraduate Discovery Research Program at UC Berkeley and four years of the University of Washington’s Data Science for Social Good (DSSG) program, the West Hub will host a training course and develop a guide for organizations interested in creating programs pairing student fellows with data scientist mentors and project leads from academia, government, or the private sector. The West Hub’s focus on societal-facing challenges will drive collaborations in topics such as transportation, public health, sustainable urban planning, and disaster recovery. As part of its efforts to increase workforce readiness in the region, the West Hub will partner with The Carpentries for three years to host data science Train-the-Trainer workshops, especially aiming to engage underrepresented groups and geographic areas that are not currently served by cognate programs. The partnership builds upon prior training workshops that included local government leaders across the Western region and the first Data Carpentry event with a tribal community. This month, UC Berkeley Division of Data Sciences, Microsoft, and the West Big Data Innovation Hub will extend a recent effort that convened undergraduate data science instructors spanning community colleges to graduate degree-granting institutions, to host 60 data science educators for a National Workshop on Data Science Education.
  • Access to Cloud Computing. The West Hub will expand regional and national-scale efforts in cloud computing for research and education by organizing educational training opportunities, facilitating the sharing of best practices, and sparking collaborations across the region. The Hub team from UC Berkeley, UC San Diego, and the University of Washington has collectively worked with more than 100 institutions across the world to understand opportunities to deploy cloud approaches for data science education, training, and research, emphasizing reproducibility and open science. Efforts originating from the Berkeley Institute for Data Science through Project Jupyter have led to 4.5 million publicly available Jupyter notebooks globally, and this next phase of the Hub will further connect the scientific community with cloud resources to broaden computing access.

“Developing innovative, effective solutions to grand challenges requires linking scientists and engineers with local communities. The Big Data Hubs provide the glue to achieve those links, bringing together teams of data science researchers with cities, municipalities and anchor institutions,” notes Jim Kurose, Assistant Director for Computer and Information Science and Engineering at the National Science Foundation.

Many of the Hub’s continuing initiatives and collaborations will highlight challenges surrounding data ethics and responsible data science, bringing communities together through opportunities such as workshops on translational data science, Data For Good Exchange efforts, and FAIR Data Awareness and Data Reuse Labs. “In the first years of the West Big Data Innovation Hub, we built capacity and brought focus to the data-driven challenges faced by our Western constituents. This next phase allows for fine tuning of offerings that give rise to new collaborations, ideas, and combinations of data resources,” says Christine Kirkpatrick, West Hub Deputy Director and Co-Principal Investigator.

As a new service to the community, each Big Data Hub will maintain a seed fund for translational data science as part of its project budget. This seed fund will provide grants to pilot early feasibility studies for innovative new solutions to grand challenges of importance to the region. The West Hub’s requests for collaborative seed projects will serve to gather compelling, timely, and actionable community ideas throughout the year. Embarking on the next phase of growth and national coordination, the Hubs will also work with the National Science Foundation and additional partners to host an All Hubs All Hands community data science meeting open to the public as a signature event in 2020.  

About the Hubs
Each Big Data Hub is located in one of the four U.S. Census regions (Northeast, South, Midwest, and West), and serves as a thought leader and convening force on social and economic challenges that are unique to the region—for example, fresh water in the West, agriculture in the Midwest, coastal flooding in the South, and aging urban infrastructure in the Northeast. Beyond their regional focus, the Big Data Hubs will act as a single national body as needed to respond to issues that cross regions, such as the evolution of U.S. transportation infrastructure and workforce development. Read more at the newly-launched, and see an early prototype of our work together at!

West Hub Contact:
NSF Media Contact:

Related Stories:
Smarter Together: NSF Awards $16 Million for Data Innovation Hubs to Drive Partnerships for Societal Impact
Partnerships for Impact: NSF Awards an additional $4M to the West Big Data Innovation Hub co-led by the UW eScience Institute
SDSC Receives New Funding for West Big Data Innovation Hub: National Science Foundation extends awards for all four U.S. hubs
