10. How can I compute the PageRank of resources in RDF files?

  • The PageRank algorithm computes the importance of each vertex (here, each RDF resource) in a graph. Resource PageRank is built on top of Spark GraphX.

    Full example code: https://github.com/SANSA-Stack/SANSA-Examples/blob/master/sansa-examples-spark/src/main/scala/net/sansa_stack/examples/spark/rdf/PageRank.scala
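
    To illustrate the idea, here is a minimal sketch that uses only plain Spark and GraphX calls, not the SANSA API itself. It assumes the triples are already available as (subject, predicate, object) string tuples. The object name ResourcePageRankSketch, the rank helper and the ex: resources are hypothetical, made up for this illustration. Each triple becomes a directed edge from subject to object.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.sql.SparkSession

object ResourcePageRankSketch {

  /** Returns (resource, rank) pairs, highest rank first. Hypothetical helper. */
  def rank(spark: SparkSession,
           data: Seq[(String, String, String)]): Array[(String, Double)] = {
    val triples = spark.sparkContext.parallelize(data)

    // GraphX needs Long vertex ids, so give every distinct resource one.
    val ids = triples
      .flatMap { case (s, _, o) => Seq(s, o) }
      .distinct()
      .zipWithUniqueId()

    // Each triple becomes a directed edge subject -> object,
    // with the predicate kept as the edge attribute.
    val edges = triples
      .map { case (s, p, o) => (s, (p, o)) }
      .join(ids)
      .map { case (_, ((p, o), sId)) => (o, (sId, p)) }
      .join(ids)
      .map { case (_, ((sId, p), oId)) => Edge(sId, oId, p) }

    val graph = Graph(ids.map(_.swap), edges)

    // Run PageRank until the ranks change by less than the tolerance.
    val ranks = graph.pageRank(0.0001).vertices

    // Map vertex ids back to resource names and sort by rank, descending.
    ids.map(_.swap).join(ranks).values
      .sortBy(_._2, ascending = false)
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ResourcePageRankSketch")
      .master("local[*]")
      .getOrCreate()

    // Toy data for illustration only.
    val toy = Seq(
      ("ex:Alice", "ex:knows", "ex:Bob"),
      ("ex:Bob", "ex:knows", "ex:Carol"),
      ("ex:Carol", "ex:knows", "ex:Alice"),
      ("ex:Dave", "ex:knows", "ex:Alice")
    )

    rank(spark, toy).foreach { case (resource, score) =>
      println(s"$resource\t$score")
    }
    spark.stop()
  }
}
```

    In this toy graph ex:Alice has the most incoming edges, so it should end up with the highest rank. For real data, the linked example shows how SANSA loads the RDF file before ranking.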

9. How do I write RDF files?

8. Can I load several files without merging them beforehand?

  • In Spark, the textFile() method takes a URI for the input (either a local path or an hdfs:// URI). The URI may point to a single file or to a directory that contains several files, so multiple files can be loaded in one call without merging them first.
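
    As a rough sketch of the options, the following plain Spark program writes two tiny sample files to a temporary directory and reads them back in four ways: as a single file, as a directory, with a wildcard, and as a comma-separated list of paths. The object name LoadSeveralFilesSketch, its helpers, the file names and the example.org triples are hypothetical. On a cluster the same path expressions would typically carry an hdfs:// prefix.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import org.apache.spark.sql.SparkSession

object LoadSeveralFilesSketch {

  /** Writes two small N-Triples files into a fresh temp directory (toy data). */
  def writeSampleFiles(): Path = {
    val dir = Files.createTempDirectory("rdf-input")
    val a = "<http://example.org/s1> <http://example.org/p> <http://example.org/o1> .\n"
    val b = "<http://example.org/s2> <http://example.org/p> <http://example.org/o2> .\n" +
            "<http://example.org/s3> <http://example.org/p> <http://example.org/o3> .\n"
    Files.write(dir.resolve("a.nt"), a.getBytes(StandardCharsets.UTF_8))
    Files.write(dir.resolve("b.nt"), b.getBytes(StandardCharsets.UTF_8))
    dir
  }

  /** Counts the lines that textFile() reads for the given path expression. */
  def countLines(spark: SparkSession, path: String): Long =
    spark.sparkContext.textFile(path).count()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LoadSeveralFilesSketch")
      .master("local[*]")
      .getOrCreate()

    val dir = writeSampleFiles().toString

    println(countLines(spark, s"$dir/a.nt"))            // a single file
    println(countLines(spark, dir))                     // every file in the directory
    println(countLines(spark, s"$dir/*.nt"))            // wildcard pattern
    println(countLines(spark, s"$dir/a.nt,$dir/b.nt"))  // comma-separated list

    spark.stop()
  }
}
```

    All three multi-file forms return one RDD whose partitions span the input files, so later transformations do not need to know how many files were read.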

5. How can I count the number of subjects / predicates / objects / triples of my RDF file?

4. How can I filter all triples with a certain subject / predicate / object in an RDF file?

  • Full example code: https://github.com/SANSA-Stack/SANSA-Examples/blob/master/sansa-examples-spark/src/main/scala/net/sansa_stack/examples/spark/rdf/TripleOps.scala

  • Full example code: https://github.com/SANSA-Stack/SANSA-Examples/blob/master/sansa-examples-flink/src/main/scala/net/sansa_stack/examples/flink/rdf/TripleOps.scala
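
    The filtering and counting operations asked about in questions 5 and 4 can be sketched with plain Spark RDD calls. This is not the SANSA API: the Triple case class, the parse function and the object name TripleOpsSketch are hypothetical, and the parser is deliberately naive (it splits each N-Triples line on whitespace and does not implement the full grammar). The linked TripleOps examples show the approach actually used in SANSA.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object TripleOpsSketch {

  // Hypothetical container; terms are kept as raw N-Triples strings.
  final case class Triple(s: String, p: String, o: String)

  /** Naive line parser: "<s> <p> <o> ." -> Triple. Skips blanks and comments. */
  def parse(line: String): Option[Triple] = {
    val t = line.trim
    if (t.isEmpty || t.startsWith("#") || !t.endsWith(".")) None
    else t.dropRight(1).trim.split("\\s+", 3) match {
      case Array(s, p, o) => Some(Triple(s, p, o))
      case _              => None
    }
  }

  /** Returns (triples, distinct subjects, distinct predicates, distinct objects). */
  def counts(triples: RDD[Triple]): (Long, Long, Long, Long) = (
    triples.count(),
    triples.map(_.s).distinct().count(),
    triples.map(_.p).distinct().count(),
    triples.map(_.o).distinct().count()
  )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TripleOpsSketch")
      .master("local[*]")
      .getOrCreate()

    // Toy input; in practice the lines would come from sc.textFile(...).
    val lines = Seq(
      "<http://example.org/a> <http://example.org/knows> <http://example.org/b> .",
      "<http://example.org/a> <http://example.org/name> \"Alice A.\" .",
      "<http://example.org/b> <http://example.org/knows> <http://example.org/a> ."
    )
    val triples = spark.sparkContext.parallelize(lines)
      .flatMap(line => parse(line).toSeq)
      .cache()

    // Filter by predicate; filtering by subject or object works the same way.
    val knows = triples.filter(_.p == "<http://example.org/knows>")
    knows.collect().foreach(println)

    println(counts(triples))
    spark.stop()
  }
}
```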

3. How can I collect RDF dataset statistics on SANSA?

1. How can I read an RDF file and retrieve a Spark RDD representation of it?