for Large-scale RDF Knowledge Graphs
What is SANSA?
SANSA is a big data processing engine for scalable processing of large-scale RDF data. SANSA uses Spark and Flink, which offer fault-tolerant, highly available, and scalable approaches for processing massive datasets efficiently. SANSA provides facilities for semantic data representation, querying, inference, and analytics.
The core of the SANSA-Stack is a data flow processing engine that provides data distribution and fault tolerance for distributed computations over large-scale RDF datasets.
SANSA includes several libraries for creating applications:
- Read/Write RDF/OWL library for RDF/OWL operations,
- Querying library, which supports a query language on top of the distributed RDF/OWL library,
- Inference library, which implements rule-based reasoning on RDF/OWL data,
- ML: the Machine Learning core library.
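To make the read, infer, and query steps listed above more concrete, here is a minimal single-machine sketch in plain Python. It is not SANSA's API and does not use Spark or Flink. The names `infer_rdfs` and `match`, the `ex:` identifiers, and the choice of two RDFS rules (rdfs9 and rdfs11) are assumptions made purely for illustration. SANSA applies this kind of processing to distributed datasets, whereas this sketch works on an in-memory set of triples.

```python
from typing import List, Optional, Set, Tuple

# A triple is (subject, predicate, object); prefixed names stand in for full IRIs.
Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def infer_rdfs(triples: Set[Triple]) -> Set[Triple]:
    """Apply two RDFS rules (rdfs9, rdfs11) until no new triples appear."""
    closure = set(triples)
    while True:
        sub_class = {(s, o) for s, p, o in closure if p == RDFS_SUBCLASS_OF}
        new: Set[Triple] = set()
        # rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
        for a, b in sub_class:
            for b2, c in sub_class:
                if b == b2:
                    new.add((a, RDFS_SUBCLASS_OF, c))
        # rdfs9: (x type A), (A subClassOf B) => (x type B)
        for s, p, o in closure:
            if p == RDF_TYPE:
                for a, b in sub_class:
                    if o == a:
                        new.add((s, RDF_TYPE, b))
        if new <= closure:  # fixpoint reached: nothing new was derived
            return closure
        closure |= new


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical example data.
data: Set[Triple] = {
    ("ex:Student", RDFS_SUBCLASS_OF, "ex:Person"),
    ("ex:Person", RDFS_SUBCLASS_OF, "ex:Agent"),
    ("ex:alice", RDF_TYPE, "ex:Student"),
}

inferred = infer_rdfs(data)
alice_types = match(inferred, s="ex:alice", p=RDF_TYPE)
```

In this sketch, `alice_types` contains the asserted type `ex:Student` plus the two types derived by the rules. A distributed engine would typically express the same rule applications as joins over partitioned triple collections rather than nested loops.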
What is the idea behind SANSA?
In SANSA, we combine distributed computing frameworks (specifically Spark and Flink) with the semantic technology stack.
The SANSA vision combines distributed analytics (left) and semantic technologies (right) into a scalable semantic analytics stack (top). The colours encode which part of the two original stacks influences which part of the SANSA stack. The main objective of SANSA is to investigate whether the characteristics of each technology stack (bottom) can be combined to retain the respective advantages.
SANSA inherits the following advantages from the semantic technology stack, machine learning research, and distributed computing.
Powerful Data Integration
Current analytics pipelines have to handle increasing data variety and complexity …
The vast majority of machine learning algorithms have to rely on simple input …
The usage of W3C standards can generally reduce pre-processing time in those …
A key driver for the success of machine learning is that its benefits are often directly …
Distributed in-memory computing can provide the horizontal scalability required …
If you have a question for the SANSA community, you can post it on one of the following channels:
- Mailing List. Subscribe to the mailing list via @SANSA-Stack or by sending an e-mail message to firstname.lastname@example.org. Once the subscription is confirmed, you can send questions to email@example.com.
- GitHub issues. Post your questions to GitHub Issues for the specific module.
Latest Blog Posts
- Quality Assessment of RDF Datasets at Scale - “Data is the new oil. It’s valuable, but if unrefined ...
- SANSA 0.6 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.6 - the sixth ...
- SANSA 0.5 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.5 - the fifth ...
- SANSA 0.4 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.4 - the fourth ...
- SANSA Parser Performance Improved - More efficient RDF N-Triples Parser introduced in SANSA: Parsing Improvements ...
SANSA is a research project of the Smart Data Analytics research group.