What is SANSA?
SANSA is a big data engine for scalable processing of large-scale RDF data. SANSA uses Spark and Flink, which offer fault-tolerant, highly available and scalable approaches to efficiently process massive datasets. SANSA provides facilities for semantic data representation, querying, inference, and analytics.
SANSA-Stack’s core is a data flow processing engine that provides data distribution and fault tolerance for distributed computations over large-scale RDF datasets.
SANSA includes several libraries for creating applications:
- Read / Write RDF / OWL library for RDF/OWL operations,
- Querying library supports a query language on top of the distributed RDF/OWL library, as well as querying heterogeneous non-RDF data,
- Inference library implements rule-based reasoning on RDF/OWL data,
- ML: Machine Learning core library.
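To make the querying and inference layers more concrete, here is a minimal single-machine sketch in plain Python. It does not use SANSA's API: the names `match` and `rdfs_closure`, the `ex:` identifiers, and the sample data are all hypothetical and chosen only for illustration. It shows triple-pattern matching and forward-chaining over two standard RDFS entailment rules (rdfs9 and rdfs11), which is the kind of rule-based reasoning a distributed engine would run over partitioned data.

```python
from typing import Iterable, Optional, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for IRIs.
Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Set[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def rdfs_closure(triples: Iterable[Triple]) -> Set[Triple]:
    """Forward-chain two RDFS rules until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        subclass = match(inferred, p=SUBCLASS_OF)
        # rdfs11: rdfs:subClassOf is transitive.
        for (a, _, b) in subclass:
            for (c, _, d) in subclass:
                if b == c:
                    new.add((a, SUBCLASS_OF, d))
        # rdfs9: an instance of a subclass is an instance of the superclass.
        for (x, _, cls) in match(inferred, p=RDF_TYPE):
            for (c, _, d) in subclass:
                if cls == c:
                    new.add((x, RDF_TYPE, d))
        if new <= inferred:  # fixpoint reached
            return inferred
        inferred |= new


# Hypothetical sample data.
data = {
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
closed = rdfs_closure(data)
animals = {s for (s, _, _) in match(closed, p=RDF_TYPE, o="ex:Animal")}
```

After the closure is computed, the pattern query finds `ex:rex` as an `ex:Animal` even though that triple was never stated explicitly. The nested loops are fine for a toy dataset; at scale the same rules would typically be expressed as joins over distributed collections.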
What is the idea behind SANSA?
In SANSA, we combine distributed computing frameworks (specifically Spark and Flink) with the semantic technology stack.
The SANSA vision combines distributed analytics (left) and semantic technologies (right) into a scalable semantic analytics stack (top). The colours encode which parts of the two original stacks influence which parts of the SANSA stack. The main objective of SANSA is to investigate whether the characteristics of each technology stack (bottom) can be combined to retain their respective advantages.
SANSA inherits the following advantages from the semantic technology stack, machine learning research, and distributed computing:
Powerful Data Integration
Current analytics pipelines have to handle increasing data variety and complexity …
The vast majority of machine learning algorithms have to rely on simple input …
The usage of W3C standards can generally reduce pre-processing time in those …
A key driver for the success of machine learning is that its benefits are often directly …
Distributed in-memory computing can provide the horizontal scalability required …
If you have questions related to the SANSA community, you can post them on various channels:
- Mailing List. Subscribe to the mailing list via @SANSA-Stack or by sending an e-mail message to firstname.lastname@example.org. Once the subscription is confirmed, you can send questions to email@example.com.
- GitHub issues. Post your questions to GitHub Issues for the specific module.
Latest Blog Posts
- SANSA 0.7.1 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.7.1 - the seventh release…
- Squerall to SANSA-DataLake - For over four decades, relational data management remained a dominant…
- Quality Assessment of RDF Datasets at Scale - “Data is the new oil. It’s valuable, but if unrefined…
- SANSA 0.6 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.6 - the sixth…
- SANSA 0.5 (Semantic Analytics Stack) Released - We are happy to announce SANSA 0.5 - the fifth…
SANSA is a research project of the Smart Data Analytics research group.