Class

net.sansa_stack.rdf.spark.stats

StatsCriteria

implicit class StatsCriteria extends Logging

Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new StatsCriteria(triples: RDD[Triple])
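
    Because StatsCriteria is declared as an implicit class whose constructor takes an RDD[Triple], its methods can be called directly on such an RDD once the implicit class is in scope. The sketch below shows only the language mechanism on plain Scala collections; SimpleTriple, SimpleStatsCriteria and distinctSubjects are hypothetical names for illustration and are not part of this API.

    ```scala
    object ImplicitClassSketch {
      // Hypothetical stand-in for a triple; the documented class wraps RDD[Triple] instead.
      final case class SimpleTriple(s: String, p: String, o: String)

      // An implicit class makes its methods callable directly on the wrapped value.
      implicit class SimpleStatsCriteria(triples: Seq[SimpleTriple]) {
        def distinctSubjects(): Set[String] = triples.map(_.s).toSet
      }

      val sample: Seq[SimpleTriple] = Seq(
        SimpleTriple("ex:alice", "ex:knows", "ex:bob"),
        SimpleTriple("ex:alice", "ex:name", "Alice"),
        SimpleTriple("ex:bob", "ex:name", "Bob")
      )

      // The compiler rewrites this call to
      // new SimpleStatsCriteria(sample).distinctSubjects()
      val subjects: Set[String] = sample.distinctSubjects()
    }
    ```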


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  13. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  14. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  15. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  16. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  17. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  18. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  19. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  20. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  21. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  22. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  23. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  26. final def notify(): Unit

    Definition Classes
    AnyRef
  27. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  28. val spark: SparkSession

  29. def stats: RDD[String]

    Compute distributed RDF dataset statistics.

    returns

    VoID description of the given dataset

  30. def statsAvgPerProperty(): RDD[(Node, Double)]

    29. Average per property {int,float,time} criterion

    returns

    entities with their average values on the graph

  31. def statsAvgTypedStringLength(): Double

    22. Average typed string length criterion.

    returns

    the average typed string length used throughout the RDF graph.

  32. def statsAvgUntypedStringLength(): Double

    23. Average untyped string length criterion.

    returns

    the average untyped string length used throughout the RDF graph.

  33. def statsBlanksAsObject(): RDD[Triple]

    19. Blanks as object criterion

    returns

    RDD of triples in which blank nodes are used as objects.

  34. def statsBlanksAsSubject(): RDD[Triple]

    18. Blanks as subject criterion

    returns

    RDD of triples in which blank nodes are used as subjects.

  35. def statsClassHierarchyDepth(): RDD[(Node, Int)]

    4. Class hierarchy depth criterion

    returns

    the depth of the graph

  36. def statsClassUsageCount(): RDD[(Node, Int)]

    2. Class Usage Count Criterion
    Counts how often each class of a dataset is used. The filter rule applied to each triple is the same as in the first criterion. As an action, a map is created with class IRIs as keys and their usage counts as values; whenever a triple conforms to the filter rule, the count for its class is increased by one.
    Filter rule: ?p=rdf:type && isIRI(?o)
    Action: M[?o]++

    returns

    RDD of classes used in the dataset and their frequencies.
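
    The filter rule and action of this criterion can be sketched on plain Scala collections, without Spark. This is only an illustration of the rule as written: the Node, Iri, Literal and Triple types are simplified, hypothetical stand-ins for the Jena types in the signature, and classUsageCount is not the library's implementation.

    ```scala
    object ClassUsageCountSketch {
      // Hypothetical, simplified RDF model (the documented API uses Jena's Node and Triple).
      sealed trait Node
      final case class Iri(value: String) extends Node
      final case class Literal(lexical: String) extends Node
      final case class Triple(s: Node, p: Node, o: Node)

      val RdfType: Iri = Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

      // Filter rule: ?p = rdf:type && isIRI(?o); action: M[?o]++
      def classUsageCount(triples: Seq[Triple]): Map[Iri, Int] =
        triples
          .collect { case Triple(_, RdfType, o: Iri) => o } // keep only matching objects
          .groupBy(identity)
          .map { case (cls, hits) => cls -> hits.size }     // one counter per class IRI

      private val ex = "http://example.org/"
      val sample: Seq[Triple] = Seq(
        Triple(Iri(ex + "alice"), RdfType, Iri(ex + "Person")),
        Triple(Iri(ex + "bob"), RdfType, Iri(ex + "Person")),
        Triple(Iri(ex + "alice"), RdfType, Iri(ex + "Developer")),
        Triple(Iri(ex + "alice"), Iri(ex + "name"), Literal("Alice")) // rejected: not rdf:type
      )

      val counts: Map[Iri, Int] = classUsageCount(sample)
    }
    ```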

  37. def statsClassesDefined(): RDD[Node]

    3. Classes Defined Criterion
    Gets the set of classes that are defined within a dataset. In RDF/S and OWL, a class is usually defined by a triple that uses the predicate rdf:type with either rdfs:Class or owl:Class as object. The filter rule gives the condition used to analyze a triple; if the triple is accepted by the rule, the IRI used as its subject is added to the set of classes.
    Filter rule: ?p=rdf:type && isIRI(?s) && (?o=rdfs:Class || ?o=owl:Class)
    Action: S += ?s

    returns

    RDD of classes defined in the dataset.
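
    As a rough illustration of this rule, the following sketch collects subject IRIs into a set on plain Scala collections. The node model (Node, Iri, Blank, Triple) and the function classesDefined are hypothetical simplifications, not the library's code.

    ```scala
    object ClassesDefinedSketch {
      // Hypothetical, simplified RDF model standing in for Jena's Node and Triple.
      sealed trait Node
      final case class Iri(value: String) extends Node
      final case class Blank(id: String) extends Node
      final case class Triple(s: Node, p: Node, o: Node)

      val RdfType: Iri   = Iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
      val RdfsClass: Iri = Iri("http://www.w3.org/2000/01/rdf-schema#Class")
      val OwlClass: Iri  = Iri("http://www.w3.org/2002/07/owl#Class")

      // Filter rule: ?p = rdf:type && isIRI(?s) && (?o = rdfs:Class || ?o = owl:Class)
      // Action: S += ?s
      def classesDefined(triples: Seq[Triple]): Set[Iri] =
        triples.collect {
          case Triple(s: Iri, RdfType, o) if o == RdfsClass || o == OwlClass => s
        }.toSet

      private val ex = "http://example.org/"
      val sample: Seq[Triple] = Seq(
        Triple(Iri(ex + "Person"), RdfType, RdfsClass),
        Triple(Iri(ex + "Animal"), RdfType, OwlClass),
        Triple(Blank("b0"), RdfType, OwlClass),                // rejected: subject is not an IRI
        Triple(Iri(ex + "alice"), RdfType, Iri(ex + "Person")) // rejected: object is not a class type
      )

      val defined: Set[Iri] = classesDefined(sample)
    }
    ```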

  38. def statsDatatypes(): RDD[(String, Int)]

    20. Datatypes criterion

    returns

    histogram of types used for literals.

  39. def statsDistinctEntities(): RDD[Node]

    16. Distinct entities
    Collects the distinct entities of a dataset by selecting all IRIs that occur in its triples.
    Filter rule: S += iris({?s,?p,?o})
    Action: S

    returns

    RDD of distinct entities in the dataset.
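
    Read literally, the rule gathers every IRI that appears in subject, predicate or object position and keeps each one once. A minimal sketch on plain collections, assuming a simplified and hypothetical node model (Iri, Literal, Blank) in place of Jena's Node:

    ```scala
    object DistinctEntitiesSketch {
      // Hypothetical, simplified RDF model standing in for Jena's Node and Triple.
      sealed trait Node
      final case class Iri(value: String) extends Node
      final case class Literal(lexical: String) extends Node
      final case class Blank(id: String) extends Node
      final case class Triple(s: Node, p: Node, o: Node)

      // S += iris({?s, ?p, ?o}): keep the IRIs of all three positions, without duplicates.
      def distinctEntities(triples: Seq[Triple]): Set[Iri] =
        triples
          .flatMap(t => Seq(t.s, t.p, t.o))
          .collect { case iri: Iri => iri } // literals and blank nodes are dropped
          .toSet

      val sample: Seq[Triple] = Seq(
        Triple(Iri("ex:alice"), Iri("ex:knows"), Iri("ex:bob")),
        Triple(Iri("ex:alice"), Iri("ex:name"), Literal("Alice")),
        Triple(Blank("b0"), Iri("ex:knows"), Iri("ex:alice"))
      )

      val entities: Set[Iri] = distinctEntities(sample)
    }
    ```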

  40. def statsDistinctObjects(): RDD[Node]

    Distinct Objects
    Counts the distinct objects within triples.
    Filter rule: isURI(?o)
    Action: M[?o]++

    returns

    RDD of objects used in the dataset.

  41. def statsDistinctSubjects(): RDD[Node]

    Distinct Subjects
    Counts the distinct subjects within triples.
    Filter rule: isURI(?s)
    Action: M[?s]++

    returns

    RDD of subjects used in the dataset.

  42. def statsLabeledSubjects(): RDD[Node]

    24. Labeled subjects criterion.

    returns

    list of labeled subjects.

  43. def statsLanguages(): RDD[(String, Int)]

    21. Languages criterion

    returns

    histogram of languages used for literals.

  44. def statsLinks(): RDD[(String, String, Int)]

    26. Links criterion.

    returns

    list of namespaces and their frequencies.

  45. def statsLiterals(): RDD[Triple]

    17. Literals criterion

    returns

    RDD of triples that link a subject to a literal object.

  46. def statsMaxPerProperty(): RDD[(Node, Node)]

    28. Maximum per property {int,float,time} criterion

    returns

    entities with their maximum values on the graph

  47. def statsObjectVocabularies(): RDD[(String, Int)]

    32. Object vocabularies
    Computes the object vocabularies/namespaces used throughout the dataset.
    Filter rule: ns=ns(?o)
    Action: M[ns]++

    returns

    RDD of distinct object vocabularies used in the dataset and their frequencies.

  48. def statsPredicateVocabularies(): RDD[(String, Int)]

    31. Predicate vocabularies
    Computes the predicate vocabularies/namespaces used throughout the dataset.
    Filter rule: ns=ns(?p)
    Action: M[ns]++

    returns

    RDD of distinct predicate vocabularies used in the dataset and their frequencies.
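
    The vocabulary criteria all follow the same shape: map each node to its namespace, then count per namespace. The sketch below shows this for predicates on plain collections. How ns(...) splits an IRI is not stated here, so the rule used below (cut after the last '#' or '/') is an assumption, and Triple, ns and predicateVocabularies are hypothetical names.

    ```scala
    object PredicateVocabulariesSketch {
      // Hypothetical triple with IRIs kept as plain strings.
      final case class Triple(s: String, p: String, o: String)

      // Assumed namespace rule: everything up to and including the last '#' or '/'.
      def ns(iri: String): String = {
        val cut = math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'))
        if (cut >= 0) iri.substring(0, cut + 1) else iri
      }

      // ns = ns(?p); action: M[ns]++
      def predicateVocabularies(triples: Seq[Triple]): Map[String, Int] =
        triples
          .map(t => ns(t.p))
          .groupBy(identity)
          .map { case (namespace, hits) => namespace -> hits.size }

      val sample: Seq[Triple] = Seq(
        Triple("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
        Triple("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
        Triple("http://example.org/alice", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
          "http://xmlns.com/foaf/0.1/Person")
      )

      val vocabularies: Map[String, Int] = predicateVocabularies(sample)
    }
    ```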

  49. def statsPropertiesDefined(): RDD[Node]

    Properties Defined
    Counts the properties defined within triples.
    Filter rule: ?p=rdf:type && (?o=owl:ObjectProperty || ?o=rdf:Property) && !isIRI(?s)
    Action: M[?p]++

    returns

    RDD of predicates defined in the dataset.

  50. def statsPropertyHierarchyDepth(): RDD[(Node, Int)]

    12. Property hierarchy depth criterion

    returns

    the depth of the graph

  51. def statsPropertyUsage(): RDD[(Node, Int)]

    5. Property Usage Criterion
    Counts the usage of properties within triples. To do so, an RDD is created that contains all property IRIs as keys; their frequencies are then computed.
    Filter rule: none
    Action: M[?p]++

    returns

    RDD of predicates used in the dataset and their frequencies.
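
    With no filter, the action M[?p]++ is a plain frequency count keyed by predicate. A minimal sketch on Scala collections, using a fold so that each triple visibly increments one counter; Triple and propertyUsage are hypothetical names, and on an RDD the same shape is commonly expressed with map followed by reduceByKey.

    ```scala
    object PropertyUsageSketch {
      // Hypothetical triple with IRIs kept as plain strings.
      final case class Triple(s: String, p: String, o: String)

      // Filter rule: none; action: M[?p]++
      def propertyUsage(triples: Seq[Triple]): Map[String, Int] =
        triples.foldLeft(Map.empty[String, Int]) { (counts, t) =>
          counts.updated(t.p, counts.getOrElse(t.p, 0) + 1)
        }

      val sample: Seq[Triple] = Seq(
        Triple("ex:alice", "ex:knows", "ex:bob"),
        Triple("ex:alice", "ex:name", "Alice"),
        Triple("ex:bob", "ex:name", "Bob")
      )

      val usage: Map[String, Int] = propertyUsage(sample)
    }
    ```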

  52. def statsPropertyUsageDistinctPerObject(): RDD[(Iterable[Triple], Int)]

    7. Property usage distinct per object
    Counts the usage of properties within triples, grouped by object.
    Filter rule: none
    Action: M[?o] += ?p

    returns

    RDD of predicates used in the dataset and their frequencies.

  53. def statsPropertyUsageDistinctPerSubject(): RDD[(Iterable[Triple], Int)]

    6. Property usage distinct per subject
    Counts the usage of properties within triples, grouped by subject.
    Filter rule: none
    Action: M[?s] += ?p

    returns

    RDD of predicates used in the dataset and their frequencies.
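
    The action M[?s] += ?p can be read as: for each subject, accumulate the predicates it is used with. The sketch below follows that reading on plain collections and keeps each predicate once per subject, which is an interpretation of "distinct" rather than something stated here. It mirrors the rule text, not the declared return type, and Triple and propertiesPerSubject are hypothetical names.

    ```scala
    object PropertyUsagePerSubjectSketch {
      // Hypothetical triple with IRIs kept as plain strings.
      final case class Triple(s: String, p: String, o: String)

      // Filter rule: none; action: M[?s] += ?p (predicates collected per subject)
      def propertiesPerSubject(triples: Seq[Triple]): Map[String, Set[String]] =
        triples
          .groupBy(_.s)
          .map { case (subject, ts) => subject -> ts.map(_.p).toSet }

      val sample: Seq[Triple] = Seq(
        Triple("ex:alice", "ex:knows", "ex:bob"),
        Triple("ex:alice", "ex:name", "Alice"),
        Triple("ex:alice", "ex:name", "Alice B."), // same predicate again, kept once
        Triple("ex:bob", "ex:name", "Bob")
      )

      val perSubject: Map[String, Set[String]] = propertiesPerSubject(sample)
    }
    ```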

  54. def statsSameAs(): RDD[Triple]

    25. SameAs criterion.

    returns

    list of triples with owl:sameAs as predicate

  55. def statsSubjectVocabularies(): RDD[(String, Int)]

    30. Subject vocabularies
    Computes the subject vocabularies/namespaces used throughout the dataset.
    Filter rule: ns=ns(?s)
    Action: M[ns]++

    returns

    RDD of distinct subject vocabularies used in the dataset and their frequencies.

  56. def statsTypedSubjects(): RDD[Node]

    24. Typed subjects criterion.

    returns

    list of typed subjects.

  57. def statsUsedClasses(): RDD[Node]

    1. Used Classes Criterion
    Creates an RDD of the classes that are in use by instances of the analyzed dataset. An example of a triple accepted by the filter is sda:Gezim rdf:type distLODStats:Developer.
    Filter rule: ?p=rdf:type && isIRI(?o)
    Action: S += ?o

    returns

    RDD of classes/instances
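
    The example triple from the description can be run through the rule on plain collections. The sketch keeps the prefixed names unexpanded, exactly as written above, and uses a simplified, hypothetical node model (Iri, Literal, Triple); the second triple and its rdfs:label predicate are added only to show a rejected case.

    ```scala
    object UsedClassesSketch {
      // Hypothetical, simplified RDF model standing in for Jena's Node and Triple.
      sealed trait Node
      final case class Iri(value: String) extends Node
      final case class Literal(lexical: String) extends Node
      final case class Triple(s: Node, p: Node, o: Node)

      // Prefixed names are left unexpanded, as in the description.
      val RdfType: Iri = Iri("rdf:type")

      // Filter rule: ?p = rdf:type && isIRI(?o); action: S += ?o
      def usedClasses(triples: Seq[Triple]): Set[Iri] =
        triples.collect { case Triple(_, RdfType, o: Iri) => o }.toSet

      val sample: Seq[Triple] = Seq(
        Triple(Iri("sda:Gezim"), RdfType, Iri("distLODStats:Developer")), // accepted
        Triple(Iri("sda:Gezim"), Iri("rdfs:label"), Literal("Gezim"))     // rejected: not rdf:type
      )

      val classes: Set[Iri] = usedClasses(sample)
    }
    ```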

  58. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  59. def toString(): String

    Definition Classes
    AnyRef → Any
  60. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  61. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  62. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
