
The KnowWhereGraph improves data-driven decision making and data analytics, particularly analytics that involve geographic data. It is a knowledge graph that supports other data-analysis tools with a geospatial component.

Geo-enrichment describes the process by which data become augmented with a wide range of auxiliary information (such as demographic data) tailored to a geospatial study area. Geo-enrichment tools significantly reduce the costs involved in acquiring, entering, and cleaning geo-data. Unfortunately, currently available geo-enrichment services provide access to only predefined categories of information, do not effectively handle interconnected data, offer limited support for data integration, and are generally expensive.
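As a rough illustration of the geo-enrichment idea described above, the following Python sketch augments an analyst's records with auxiliary attributes for the region each record falls in. The function name `geo_enrich`, the county identifiers, and all values are hypothetical and chosen only for illustration; this is not the KnowWhereGraph implementation.

```python
from typing import Dict, List

# Hypothetical auxiliary data keyed by a county identifier (illustrative values only).
AUXILIARY: Dict[str, Dict[str, object]] = {
    "county-A": {"population": 1000, "median_age": 40.0},
    "county-B": {"population": 2500, "median_age": 35.5},
}


def geo_enrich(records: List[dict], auxiliary: Dict[str, dict], key: str = "county") -> List[dict]:
    """Return copies of the records augmented with auxiliary attributes for their region."""
    enriched = []
    for record in records:
        # Look up auxiliary attributes for the record's region; empty if none are known.
        extra = auxiliary.get(record.get(key), {})
        # The record's own fields win on a name clash, so the analyst's data is never overwritten.
        enriched.append({**extra, **record})
    return enriched


observations = [
    {"site": "s1", "county": "county-A", "yield_t_per_ha": 3.2},
    {"site": "s2", "county": "county-C", "yield_t_per_ha": 2.9},
]
enriched_observations = geo_enrich(observations, AUXILIARY)
```

In this sketch the second observation lies in a county with no auxiliary entry, so it passes through unchanged. A lookup table like this one also shows the limitation noted above: it offers only predefined categories and no links between them.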

The KnowWhereGraph makes data-driven decision making and data analytics substantially more effective, accessible, and affordable. The KnowWhereGraph merges novel Artificial Intelligence-based geo-enrichment technologies with a knowledge graph that brings together open, cross-domain, densely integrated data spanning the human-environment interface.

The KnowWhereGraph is enabled by an open, freely usable knowledge graph. The graph combines scalable, Web-standard technologies, specifications, and data cultures for representing densely interconnected statements derived from structured or unstructured data across domains, in both human- and machine-readable ways. The technology tools are designed to be useful to and usable by researchers, analysts, decision makers, and the interested public in any domain or cross-domain activity requiring geospatial intelligence.

The KnowWhereGraph includes strong partnerships with non-academic and academic stakeholders, including four for-profit organizations, two government agencies, and one non-profit, as well as five academic partnerships: Esri (Geographic Information Systems), Oliver Wyman (commodity markets and supply chains), and Hydronos Labs (weather, climate, and agriculture information services); the US Geological Survey (USGS) and the Natural Resources Conservation Service within the U.S. Department of Agriculture (USDA); Direct Relief (humanitarian aid); and the University of California Santa Barbara (UCSB), Kansas State University (K-State), Michigan State University (MSU), Arizona State University (ASU), and the University of Southern California (USC). Additional partnerships are expected to develop during this Phase II effort.

The KnowWhereGraph is a valuable element of the National Science Foundation's (NSF) Convergence Accelerator Phase II cohort, providing geospatial tools to other projects within the cohort. In addition, the project focuses on several strategic application areas that are likely to benefit US society, including: COVID-19-related supply-chain disruptions and the US food, agriculture, and energy sectors, and their attendant supply chains generally; environmental policy issues relative to interactions among agricultural sustainability, soil conservation practice, and farm labor; and delivery of emergency humanitarian aid, within the US and internationally. Any time knowing where is key, the KnowWhereGraph will be helpful.

Formally, a knowledge graph consists of a massive set of statements, constructed from inter-connected node- and edge-labeled resources, allowing multiple, heterogeneous edges for the same nodes. A collection of definitional statements specifying the meaning of the knowledge graph's vocabulary is called its (KG) schema or ontology. The ontology is critical for rigorous logical interpretation and machine-actionability. Co-PI Pascal Hitzler explains, "Knowledge graphs are industry's go-to methods for complex data integration and re-use scenarios. We use rigorous and open standards together with sophisticated quality control based on years of experience and research to produce the highly versatile KnowWhereGraph. Spatial information plays a key role, and we are significantly pushing the state of the art with our technology solutions."
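The formal definition above can be made concrete with a small Python sketch: a graph is a set of statements, two nodes may be joined by several differently labeled edges, and a schema holds definitional statements about the vocabulary. All node names, predicates, and the `schema` layout here are hypothetical and are not drawn from the KnowWhereGraph ontology.

```python
from typing import Dict, List, NamedTuple, Set


class Statement(NamedTuple):
    """One edge-labeled statement: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# A tiny graph. Note the two different edges between the same pair of nodes.
graph: Set[Statement] = {
    Statement("HurricaneX", "affected", "CountyA"),
    Statement("HurricaneX", "madeLandfallIn", "CountyA"),
    Statement("CountyA", "locatedIn", "StateS"),
}

# Hypothetical schema: definitional statements about each predicate in the vocabulary.
schema: Dict[str, Dict[str, str]] = {
    "affected": {"domain": "Hazard", "range": "Region"},
    "madeLandfallIn": {"domain": "Hazard", "range": "Region"},
    "locatedIn": {"domain": "Region", "range": "Region"},
}


def edges_between(statements: Set[Statement], source: str, target: str) -> List[str]:
    """List every predicate that links source to target, in alphabetical order."""
    return sorted(s.predicate for s in statements if s.subject == source and s.obj == target)


# Predicates used in the graph but not defined in the schema (should be empty).
undefined_predicates = {s.predicate for s in graph} - schema.keys()

parallel_edges = edges_between(graph, "HurricaneX", "CountyA")
```

Checking the graph against the schema, as `undefined_predicates` does, is a very simple stand-in for the logical interpretation an ontology enables in practice.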

Several innovations in knowledge graph technology will drive the project: (I) creating an open, Web-accessible knowledge graph, with attendant methods and tools, to enable contributions to the graph from a range of sources; (II) developing strategies for semantically lifting imagery data, such as remotely sensed imagery and drone imagery, into this graph, thereby integrating vast amounts of data; (III) developing novel spatially explicit AI-based methods, models, and services to enable geo-enrichment on top of this graph; and (IV) developing both programmatic (application programming interface, API) and human-accessible interfaces for the KnowWhereGraph. By merging the flexibility, expressive power, and community-driven features of open graph technologies with multi-format geospatial data and advanced geospatial intelligence, the KnowWhereGraph is designed to become a rich, integrative information resource that can transform and converge discovery, analysis, and synthesis within and across a multitude of fields and sectors.

Years ago, said lead PI Krzysztof Janowicz, it was enough for decision makers in some sectors to be concerned only with their local context, within the regional scope of an enterprise. "Now, whether you are an individual farmer, a retail company, or a humanitarian relief organization, you act in a global context," he said. "You have to track commodity prices, tariffs, listen to the pulse of society, be sensitive to the culture of your markets, monitor weather forecasts, and even be able to react quickly to freak events such as pandemics or terrorists."

What the KnowWhereGraph can do, according to Janowicz, is bring together a wealth of highly diverse sources of relevant information to form an open, spatially explicit knowledge graph (a model that integrates not just different kinds of data but, importantly, also their relationships) in a way that can be accessed by those for whom the information matters most.

To interlink and be able to query all these data sources requires a "universal" language. "To some degree, such language already exists," Janowicz said. "It's called the Resource Description Framework (RDF), and it enables us to describe the world around us in human- and machine-understandable terms." These resources can include things like maps, images, tabulated data, and text, all of which are built into a global and decentralized graph.
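In RDF, each statement is a triple of subject, predicate, and object. The sketch below uses plain Python tuples to show how statements from two independent sources can be merged into one graph and then queried with a simple pattern. The `http://example.org/` namespace, the resource names, and the `match` helper are assumptions made for illustration; real RDF also distinguishes IRIs from typed literals, which this sketch glosses over.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # hypothetical namespace, not a KnowWhereGraph IRI

# Statements published independently by two hypothetical contributors.
source_a: Set[Triple] = {
    (EX + "CountyA", EX + "hasPopulation", "1000"),
}
source_b: Set[Triple] = {
    (EX + "CountyA", EX + "experienced", EX + "Flood1"),
    (EX + "Flood1", EX + "occurredIn", "2020"),
}

# Because both sources use the same identifier for CountyA, a set union links them.
merged: Set[Triple] = source_a | source_b


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in triples:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


# Everything any contributor has said about CountyA.
about_county = sorted(match(merged, s=EX + "CountyA"))
```

Shared identifiers are what make the merge meaningful: neither source needs to know about the other for their statements to connect.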

"RDF triples, which are statements in a subject, predicate, object form, enable us to publish knowledge about the world around us, and irrespective of the fact who made these statements, what they are about, or when and where they were made," he said. "Everybody gets to contribute and connect statements to already existing ones."

Of course, the value of such a data graph relies heavily on the richness of the data, how current they are, and how the connections between disparate bits of information can become solutions to current problems or predictors of future scenarios. For that, there's artificial intelligence.

"We're going to develop AI methods to help decision makers communicate with our KnowWhereGraph," Janowicz said. "Essentially, the graph will deliver contextual background information about an analyst's study area using a process called geo-enrichment. We are particularly interested in graph summarization techniques to find task-relevant triples from a pool of billions of other statements."

"Building a domain knowledge graph is a critical step towards developing artificial general intelligence for future machines to reason like human beings," said Wenwen Li, Co-PI of the project, who specializes in smart cyberinfrastructure and geospatial big data analytics. "The IT giants, such as Google and Facebook, have developed enterprise-level knowledge graphs to better understand the world's information to improve web search and product recommendation."

Building a scientific knowledge graph that models research data is very challenging: data come from different sources, are encoded in different formats, are large in size, and are often short of metadata. Additionally, much of the existing data are hidden in the deep web, making their discovery and reuse even more difficult.

Co-PI Mark Schildhauer is excited about the KnowWhereGraph creating a framework to support deep interrogation into specific thematic areas, as well as enabling bridging across multiple disciplines. "The KnowWhereGraph has use cases we call 'Verticals', that involve detailed inquiry into highly focused topics, such as clarifying the linkages among soil health, agricultural productivity, and farming methods. However, we are also developing 'Horizontal' use cases, which are emerging through secondary or tertiary connections to nodes in our 'Verticals'. For example, hurricanes and floods disrupt communities through immediate impacts, but can have lasting effects on agricultural productivity, as well as community health and resilience. Our 'Horizontals' will reveal these connections, and be further enabled through some of the general 'design patterns' we're developing for interoperating with other Knowledge Graphs."
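The secondary and tertiary connections Schildhauer describes can be pictured as multi-hop paths through a graph. The following Python sketch runs a breadth-first search over a handful of made-up statements and reports how many hops separate a starting node from each node it reaches. The triples, predicate names, and the `hops_from` function are hypothetical, and edges are followed only in the subject-to-object direction.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative statements only; not taken from the KnowWhereGraph.
TRIPLES: List[Triple] = [
    ("Hurricane", "causes", "Flooding"),
    ("Flooding", "degrades", "SoilHealth"),
    ("SoilHealth", "influences", "AgriculturalProductivity"),
    ("Flooding", "disrupts", "CommunityHealth"),
]


def hops_from(triples: Iterable[Triple], start: str, max_hops: int) -> Dict[str, int]:
    """Map each node reachable from start within max_hops to its hop distance."""
    # Build a directed adjacency list from subject to object.
    adjacency: Dict[str, List[str]] = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, []).append(obj)

    distance: Dict[str, int] = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    del distance[start]  # report only the nodes reached, not the start itself
    return distance


reachable = hops_from(TRIPLES, "Hurricane", max_hops=3)
direct_only = hops_from(TRIPLES, "Hurricane", max_hops=1)
```

Here the immediate impact sits one hop from the hurricane, while the lasting effects on community health and agricultural productivity appear at two and three hops.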

"The great part of the project," Co-PI Dean Rehberger explains, "is that not only are we working on a project that has the potential for real social impact, but we also get to work with a great team of superb scholars and researchers from both the academic and the private sector as well as NGOs. This is also a fast-paced, new grant program for NSF that emphasizes 'accelerating' the use of research in the public sphere. Very exciting."