Approach to the Problem
Learn how to solve a knowledge graph completion problem.
Our approach
After the dataset is built, we can use the PyKEEN Python library to train a knowledge graph embedding model and evaluate its performance. Let's take a look at the code below:
Note: The text at the end of the output is not an error message; it reports the time the evaluation takes.
Let’s look at the code explanation below:
Lines 5–7: Create a TriplesFactory from the dataset and split it into training and testing sets in a ratio of 80:20.
Lines 10–20: Train knowledge graph embeddings using the TransR algorithm and PyKEEN's pipeline method. This method performs the evaluation of embeddings and outputs the results.
Line 23: Saves the results to a DataFrame.
Line 25: Prints the DataFrame.
The resulting DataFrame shows different metrics and their values.
Evaluation
PyKEEN's pipeline method performs evaluations and provides results in an easy-to-read manner. The results in the DataFrame can look as follows (values might be different since the model was trained with different hyperparameters):
Let's find out what these metrics mean.
Process
By default, PyKEEN uses rank-based evaluation, which is standard for link prediction tasks. Once the TransR model has been trained, the pipeline method scores candidate triples with the learned embeddings and evaluates how highly the true test triples are ranked.
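To make the idea of rank-based evaluation concrete, here is a minimal sketch in plain Python. It is not PyKEEN's implementation: the helper names (`rank_of_true`, `rank_based_metrics`), the toy scores, and the pessimistic tie handling are all assumptions made for illustration. A real evaluator typically also scores both head and tail replacements and filters out other known true triples, which this sketch skips.

```python
from statistics import mean


def rank_of_true(scores, true_index):
    """Return the 1-based rank of the true candidate (higher score = better).

    Ties are resolved pessimistically: any other candidate with an equal
    score is counted as ranked ahead of the true one (an assumption here).
    """
    true_score = scores[true_index]
    better_or_equal = sum(
        1 for i, s in enumerate(scores) if i != true_index and s >= true_score
    )
    return better_or_equal + 1


def rank_based_metrics(ranks, ks=(1, 3, 10)):
    """Aggregate a list of ranks into common link prediction metrics."""
    metrics = {
        "mean_rank": mean(ranks),
        "mean_reciprocal_rank": mean(1 / r for r in ranks),
    }
    for k in ks:
        # Hits@k: fraction of test triples whose true entity is in the top k.
        metrics[f"hits_at_{k}"] = mean(1 if r <= k else 0 for r in ranks)
    return metrics


# Hypothetical model scores for three test triples. Each entry holds the
# scores of all candidate entities and the index of the true entity.
candidate_scores = [
    ([0.9, 0.1, 0.3, 0.2], 0),
    ([0.2, 0.8, 0.5, 0.1], 2),
    ([0.7, 0.6, 0.5, 0.4], 3),
]

ranks = [rank_of_true(scores, true_idx) for scores, true_idx in candidate_scores]
metrics = rank_based_metrics(ranks, ks=(1, 3))

print(ranks)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A lower mean rank is better, while a higher mean reciprocal rank and higher Hits@k values are better.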
Let's break this down into different steps.
We split the triples into training and testing sets so that both sets have common ...