What are Knowledge Graphs, and when are they relevant?
The term "Knowledge Graph" has recently become mainstream, as information giants such as Google, Facebook, Yahoo and Microsoft have announced their move from traditional search and data management to "Knowledge Graphs". See here, here and here.
How does it differ from traditional information management?
Traditional DBs and data warehouses are centered around "records" and "tables". This is certainly efficient when a domain is well known in advance and discovery of new information is not expected.
However, in domains where one needs to flexibly connect all sorts of, possibly unexpected, information to those records, Knowledge Graph technology has distinctive advantages:
1) Entity centric
All bits of information are catalogued with respect to the entity or entities they are relevant to. A single piece of information is never artificially "forced" into a single bucket; rich connections are preserved, so facts are never lost.
Entities are understood to possibly have multiple names, so search precision is dramatically increased.
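The entity-centric idea above can be sketched in a few lines: facts attach to entities rather than to fixed table columns, and any known name (alias) resolves to the same entity. All class and entity names here are hypothetical illustrations, not part of any real product.

```python
# Minimal sketch of an entity-centric store: facts attach to entities,
# and entities may carry multiple names (aliases).

class Entity:
    def __init__(self, canonical_name, aliases=()):
        self.canonical_name = canonical_name
        self.aliases = set(aliases)
        self.facts = []          # arbitrary facts attached to this entity

    def add_fact(self, predicate, value):
        self.facts.append((predicate, value))

class EntityStore:
    def __init__(self):
        self.entities = {}
        self.name_index = {}     # any known name (lowercased) -> entity

    def add(self, entity):
        self.entities[entity.canonical_name] = entity
        for name in {entity.canonical_name} | entity.aliases:
            self.name_index[name.lower()] = entity

    def find(self, name):
        # Alias-aware lookup: searching by any known name finds the entity.
        return self.name_index.get(name.lower())

store = EntityStore()
ibm = Entity("International Business Machines", aliases={"IBM", "Big Blue"})
ibm.add_fact("headquartered_in", "Armonk, NY")
store.add(ibm)

# Searching by an alias and by the canonical name reach the same entity.
assert store.find("big blue") is store.find("IBM")
```

Because every name points back to one entity, facts recorded under one name are never lost when users search with another.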
2) Schema Free
Knowledge can be structured freely; no schema is required a priori.
Traditional databases force you to think in terms of a "schema": you must decide the table schema upfront, and modifying it is very expensive. In a Knowledge Graph, information of any kind can typically be added without creating "tables" or modifying a schema.
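One common way to realize this schema-free property is to store knowledge as (subject, predicate, object) triples, so a new kind of fact needs no table or schema change. This is a hedged sketch with made-up example data, not a description of any specific product's storage:

```python
# Sketch: a schema-free store as a set of (subject, predicate, object)
# triples. New kinds of information can be added at any time without
# declaring tables or altering a schema.

triples = set()

def add(subject, predicate, obj):
    triples.add((subject, predicate, obj))

def query(subject=None, predicate=None, obj=None):
    # None acts as a wildcard, as in a basic triple-pattern query.
    return {t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)}

add("acme", "type", "Company")
add("acme", "founded", 1999)
# A new, previously unplanned kind of fact needs no schema change:
add("acme", "sponsors", "local-marathon")

assert ("acme", "sponsors", "local-marathon") in query(subject="acme")
```

Compare this with a relational table, where adding the "sponsors" fact would require an ALTER TABLE or a new join table designed in advance.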
3) Metadata Rich, Self Describing.
Rich metadata is at the core of a Knowledge Graph. Metadata is kept so close to the data that, in practice, it is never separated from it. Streams of metadata-rich knowledge are thus easier to integrate and scale across departments, organizations and even domains.
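"Metadata close to the data" can be pictured as each fact carrying its own provenance. The field names below (source, recorded_at) are illustrative assumptions, not a standard:

```python
# Sketch: self-describing facts that travel with their own metadata,
# so downstream consumers need no out-of-band documentation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    source: str        # where the fact came from (provenance)
    recorded_at: str   # when it was recorded

facts = [
    Fact("acme", "revenue_2023", "12M",
         source="annual-report", recorded_at="2024-01-15"),
    Fact("acme", "ceo", "J. Doe",
         source="press-release", recorded_at="2023-06-02"),
]

# Because each fact is self-describing, consumers can filter by
# provenance directly:
from_reports = [f for f in facts if f.source == "annual-report"]
assert from_reports[0].predicate == "revenue_2023"
```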
The Knowledge Graph as a workflow
Graph databases have existed for a long time, with their share of success stories. While a Knowledge Graph will typically make use of one or more such databases, the core of these systems lies in the ability to continuously process big data streams as a "workflow" and to use many index and storage systems as needed, rather than a single "graph store".
A useful way to divide the process is into 4 steps (an EETL process):
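The four-step workflow could be sketched as a pipeline of generator stages. The source does not spell the steps out, so Extract, Enrich, Transform, Load is used here as one plausible reading of "EETL", purely for illustration:

```python
# Hypothetical sketch of a four-stage EETL pipeline over data streams.
# Stage names and record shapes are assumptions for illustration only.

def extract(sources):
    # Pull raw records from heterogeneous sources.
    for src in sources:
        yield from src

def enrich(records):
    # Attach extra knowledge (here, a naive entity link) to each record.
    for r in records:
        yield {**r, "entity": r["name"].lower()}

def transform(records):
    # Normalize enriched records into graph edges (s, p, o).
    for r in records:
        yield (r["entity"], "has_name", r["name"])

def load(edges, graph):
    # Write edges into one or more index/storage systems.
    graph.update(edges)

graph = set()
sources = [[{"name": "Acme"}], [{"name": "Globex"}]]
load(transform(enrich(extract(sources))), graph)
assert ("acme", "has_name", "Acme") in graph
```

Because each stage is a stream transformation, the same pipeline can feed several storage and index systems rather than a single graph store, as described above.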
Data and Applications of a typical Enterprise Knowledge Graph
In the enterprise, the typical goal of a Knowledge Graph is to collect information about every entity of interest in a domain (and their relationships) and make it "maximally easy" to reuse for any application, current or future, foreseen or unforeseen.
Knowledge-intensive enterprises across many sectors can benefit immensely from the ability to keep both the data and the structure of any piece of information they can collect about any entity of interest to their business.
Typically, an Enterprise Knowledge Graph will be created with the above EETL process from data ranging from relational databases, to public open data (including public knowledge graphs), to unstructured data processed via machine learning APIs, to customers' own datasets.
It is important to note that a knowledge graph will typically not replace existing databases and data warehouses; instead, it will complement them, adding knowledge exploration capabilities and the ability to quickly experiment with new knowledge-driven business ideas.
SindiceTech infrastructure for Knowledge Graph management
With 10+ years of experience on the topic, the SindiceTech team is the partner to work with to implement your Enterprise Knowledge Graph strategy.
The SindiceTech "CloudSpace" product suite combines proprietary Big Data Knowledge Graph technologies with best-of-breed complementary open source technologies. SindiceTech knowledge graphs can be delivered both on public clouds and in private enterprise environments.