Currently, two Knowledge Graph Embedding (KGE) models are implemented: TransE [1] and DistMult (Bilinear-Diag) [2].
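For orientation, the scoring functions behind the two models can be sketched in plain Scala. This is a minimal illustration based on the published definitions in [1] and [2], not SANSA's implementation: the object name `KgeScores`, the use of `Array[Double]` for embeddings, and the choice of the L1 norm for TransE (the model also allows L2) are assumptions made for the example.

```scala
// Illustrative sketch only: not SANSA's code. Embeddings are plain arrays here.
object KgeScores {
  // TransE [1]: a triple (h, r, t) is plausible when h + r lies close to t.
  // Returns the L1 distance ||h + r - t||; lower means more plausible.
  def transE(h: Array[Double], r: Array[Double], t: Array[Double]): Double = {
    require(h.length == r.length && r.length == t.length, "dimension mismatch")
    h.indices.map(i => math.abs(h(i) + r(i) - t(i))).sum
  }

  // DistMult [2]: bilinear score with a diagonal relation matrix,
  // i.e. the sum over i of h_i * r_i * t_i; higher means more plausible.
  def distMult(h: Array[Double], r: Array[Double], t: Array[Double]): Double = {
    require(h.length == r.length && r.length == t.length, "dimension mismatch")
    h.indices.map(i => h(i) * r(i) * t(i)).sum
  }
}
```

Note that the DistMult score is symmetric in `h` and `t`, so it assigns the same value to (h, r, t) and (t, r, h), whereas TransE does not.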
The following code snippet shows how to load a dataset and apply one of the cross-validation techniques supported by SANSA KGE (holdout, bootstrapping, or k-fold).
    // dataset to be loaded
    val input = "fb15k.txt"
    // technique used to split the data
    val technique = "holdout"
    val k = 5

    val data = new Triples(input, "\t", false, false, spark)

    // converting the original data to indexData
    val indexedData = new ByIndex(data.triples, spark)
    val numericData = indexedData.numeric()

    val (train, test) = technique match {
      case "holdout" => new Holdout(numericData, 0.6f).crossValidation()
      case "bootstrapping" => new Bootstrapping(numericData).crossValidation()
      case "kFold" => new kFold(numericData, k, spark).crossValidation()
      case _ =>
        throw new RuntimeException("'" + technique + "' - Not supported, yet.")
    }
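To make the two preparation steps more concrete, here is a minimal sketch, using plain Scala collections instead of Spark, of what converting triples to integer indices and performing a holdout split amount to. It is not SANSA's `ByIndex` or `Holdout` code: the names `SplitSketch`, `StringTriple`, `IntTriple`, `toNumeric`, and `holdout` are hypothetical, and the sketch assumes that the `0.6f` argument in the snippet above is the fraction of triples kept for training.

```scala
import scala.util.Random

// Illustrative sketch only: hypothetical names, plain collections instead of Spark.
object SplitSketch {
  final case class StringTriple(subject: String, predicate: String, obj: String)
  final case class IntTriple(subject: Int, predicate: Int, obj: Int)

  // Assign each distinct entity and each distinct relation an integer id,
  // numbered in order of first appearance, then rewrite the triples with those ids.
  def toNumeric(triples: Seq[StringTriple]): Seq[IntTriple] = {
    val entityIds = triples.flatMap(t => Seq(t.subject, t.obj)).distinct.zipWithIndex.toMap
    val relationIds = triples.map(_.predicate).distinct.zipWithIndex.toMap
    triples.map(t => IntTriple(entityIds(t.subject), relationIds(t.predicate), entityIds(t.obj)))
  }

  // Holdout: shuffle once, keep the first `trainRate` share for training
  // (assumed meaning of the rate) and the remainder for testing.
  def holdout[A](data: Seq[A], trainRate: Double, seed: Long = 42L): (Seq[A], Seq[A]) = {
    require(trainRate > 0.0 && trainRate < 1.0, "trainRate must be in (0, 1)")
    val shuffled = new Random(seed).shuffle(data)
    shuffled.splitAt((data.size * trainRate).toInt)
  }
}
```

With these helpers, `SplitSketch.holdout(SplitSketch.toNumeric(triples), 0.6)` would return a `(train, test)` pair analogous in shape to the one produced above. The fixed default seed only serves to make the sketch reproducible.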