Loads N-Triples data from a file or directory into an RDD.
The path may also be a comma-separated list of paths
and may contain wildcards, e.g.
"/my/dir1,/my/paths/part-00[0-5]*,/another/dir,/a/specific/file"
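As an illustration of the path syntax, the following minimal sketch splits a comma-separated path string and tests file names against a wildcard component using the JDK's glob matcher. It uses only the standard library, not the reader itself. The object name PathPatternSketch and the sample file names are hypothetical, and the glob dialect of the underlying file system layer may differ in details from the JDK's.

```scala
import java.nio.file.{FileSystems, Paths}

// Hypothetical helper, for illustration only; not part of the reader's API.
object PathPatternSketch {
  // Split a comma-separated path specification into its components.
  def components(spec: String): List[String] =
    spec.split(",").map(_.trim).filter(_.nonEmpty).toList

  // Check a file name against a glob pattern such as "part-00[0-5]*".
  def matches(glob: String, fileName: String): Boolean =
    FileSystems.getDefault.getPathMatcher("glob:" + glob).matches(Paths.get(fileName))
}
```

With the example path above, components yields four entries, and the pattern "part-00[0-5]*" accepts a name such as part-00042 but rejects part-00712.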
By default, loading stops at the first parse error, i.e. an org.apache.jena.riot.RiotException generated by the underlying parser is thrown.
The following options exist:
org.apache.jena.net.sansa_stack.rdf.spark.riot.RiotException
will be thrown.
If the additional checking of RDF terms is enabled, warnings can occur during parsing. For example, a literal whose lexical form is wrong w.r.t. its datatype will lead to a warning.
The following can be done with those warnings:
Set whether to perform checking of N-Triples; the default is no checking.
Checking adds warnings over and above basic syntax errors.
This can also be used to turn warnings into exceptions if the option stopOnWarnings is set to STOP or SKIP.
See also the optional errorLog argument to control the output. The default is to log.
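To make the STOP and SKIP behaviour concrete, here is a minimal sketch in plain Scala of a line-wise loader that either aborts or skips on a bad line. It is not the reader's implementation: ToyLoader, its Mode type, the use of IllegalArgumentException, and the deliberately crude regular expression standing in for the real parser are all assumptions made for illustration.

```scala
// Hypothetical toy loader, for illustration only.
object ToyLoader {
  sealed trait Mode
  case object Stop extends Mode // abort the whole load on the first bad line
  case object Skip extends Mode // drop the bad line and carry on

  // Crude stand-in for a real N-Triples parser: IRI subject, IRI predicate,
  // IRI or plain literal object, terminated by a dot.
  private val TripleLine =
    """^\s*<([^>\s]*)>\s+<([^>\s]*)>\s+(<[^>\s]*>|"[^"]*")\s*\.\s*$""".r

  def parseLine(line: String): Either[String, (String, String, String)] =
    line match {
      case TripleLine(s, p, o) => Right((s, p, o))
      case _                   => Left(s"cannot parse line: $line")
    }

  def load(lines: Seq[String], mode: Mode): Seq[(String, String, String)] =
    lines.flatMap { line =>
      parseLine(line) match {
        case Right(triple) => Some(triple)
        case Left(msg) =>
          mode match {
            case Stop => throw new IllegalArgumentException(msg)
            case Skip =>
              Console.err.println(s"skipped: $msg") // stand-in for the error log
              None
          }
      }
    }
}
```

In Skip mode the bad line is reported and the remaining triples are still returned; in Stop mode the first bad line ends the load with an exception.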
the Spark session
the path to the N-Triples file(s)
stop parsing on encountering a bad RDF term
stop parsing on encountering a warning
run with checking of literals and IRIs either on or off
the logger used for error message handling
the RDD of triples
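The kind of warning produced by the checking of RDF terms can be sketched as follows. This is a simplified, standard-library-only illustration covering a single datatype (xsd:integer); the object LiteralCheckSketch and its check method are hypothetical and do not reflect how the underlying parser performs its checks.

```scala
// Hypothetical checker, for illustration only; covers xsd:integer alone.
object LiteralCheckSketch {
  private val XsdInteger = "http://www.w3.org/2001/XMLSchema#integer"
  private val IntegerLexical = "[+-]?[0-9]+".r

  // Returns a warning message if the lexical form does not fit the datatype,
  // or None if it does (or if the datatype is not covered by this sketch).
  def check(lexicalForm: String, datatypeIri: String): Option[String] =
    datatypeIri match {
      case XsdInteger if !IntegerLexical.pattern.matcher(lexicalForm).matches() =>
        Some(s"""lexical form "$lexicalForm" is not valid for datatype <$datatypeIri>""")
      case _ => None
    }
}
```

A caller could then log the returned message, throw an exception, or drop the offending line, depending on how warnings are to be handled.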
Loads N-Triples data from a set of files or directories into an RDD.
The path may also be a comma-separated list of paths
and may contain wildcards, e.g.
"/my/dir1,/my/paths/part-00[0-5]*,/another/dir,/a/specific/file"
the Spark session
the path to the N-Triples file(s)
the RDD of triples
Loads N-Triples data from a file or directory into an RDD.
the Spark session
the path to the N-Triples file(s)
the RDD of triples
An N-Triples reader. One triple per line is assumed.
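Because each line is assumed to hold one complete triple, the input can be cut at line boundaries and every line handled on its own. The sketch below shows such a line-level pre-filter in plain Scala. LineSplitSketch is a hypothetical name, and whether the reader filters blank and comment lines in exactly this way is an assumption; the N-Triples syntax itself does allow blank lines and comments starting with '#'.

```scala
// Hypothetical pre-filter, for illustration only.
object LineSplitSketch {
  // Keep only lines that may contain a triple: drop blank lines and
  // comment lines, which start with '#' in N-Triples.
  def candidateLines(document: String): List[String] =
    document.linesIterator
      .map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .toList
}
```

Each remaining line can then be handed to a parser independently of the others, which is what makes line-oriented, distributed loading possible.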