Knowledge graphs are a ground-breaking development that is revolutionizing our ability to extract useful and insightful information from the immense amounts of data we are always generating and collecting. It is entirely possible that you've been using knowledge graphs for a long time, but didn't know it; indeed, Google Search utilizes a knowledge graph extensively to help users find answers to their questions, learn useful connections, and aid in distinguishing different objects with the same name. Roughly speaking, a knowledge graph is a data structure that takes a disparate collection of facts and organizes these into a network of interconnected entities (e.g., people, cities, subjects) in a way that can be easily understood and processed by humans and machines.
For a data scientist, a knowledge graph is a gold mine of predictive and actionable insights. In this guide, you will learn what a knowledge graph is, how it is used in many of the technologies we all know and love, and how to fruitfully apply machine learning to knowledge graphs.
You may remember from your computer science courses that a graph is a type of data structure consisting of nodes/vertices and connections/edges. A knowledge graph is pretty much just that, with vertices that are objects of knowledge and edges representing some linkage, such as a family connection or country of origin. For example, a knowledge graph might contain the fact that Pierre Curie and Marie Curie are related via the connection "marriage". It might know many more relationships between other objects, such as Marie Curie's thesis advisor.
One of the best-known knowledge graphs is the Google Search knowledge graph. Whenever Search recognizes that a query pertains to an object, it displays relevant and commonly sought-after information on the right. For example, searching for Python, we see a blurb describing the language, as well as other relevant information, such as release dates.
In addition, the knowledge graph is able to relate questions and their answers to the object Python.
Other well-known enterprises also maintain knowledge graphs:
The standard approach for employing machine learning over a knowledge graph utilizes embeddings. Here's how this approach works. First, know that traditionally, machine learning operates on vectors, or arrays of numbers. For example, an image is encoded as a vector via its pixel RGB values, or a house is encoded via its price, GPS coordinates, number of bedrooms, etc. An embedding takes an object and produces a vector for it. For example, it might take English dictionary words and produce a vector for each of them.
This same approach is one of the main ways machine learning is applied to knowledge graphs. The objects of the graph are embedded in a way that takes advantage of the structure of the graph. For example, two objects that are linked may be embedded closely together in coordinate space. Once the embedding is complete, utilizing machine learning on the data becomes a straightforward, traditional machine learning problem.
Knowledge graphs help solve the immensely important problem of making data and relationships within data transparent to humans. Moreover, they similarly aid machine learning algorithms to make a lot of their tasks much easier. Knowledge graphs are certainly with us to stay, and if you are interested in creating a knowledge graph, one popular framework you can use is Grakn. Whether or not you choose to pursue knowledge graphs further, you now know how Google Search is so effective.