Getting Started with SANSA-Stack
This document collects the instructions that first-time users need to obtain and use SANSA-Stack.
- Set up SANSA
- Configuring the Computing Frameworks
- Configuring an Application
Set up SANSA
To get started quickly, SANSA provides project templates for the following build tools: Maven and SBT.
Maven and Spark:
git clone https://github.com/SANSA-Stack/SANSA-Template-Maven-Spark.git
cd SANSA-Template-Maven-Spark
mvn clean package
The subsequent steps depend on your IDE. Generally, just import this repository as a Maven project and start using SANSA / Spark.
Maven and Flink:
git clone https://github.com/SANSA-Stack/SANSA-Template-Maven-Flink.git
cd SANSA-Template-Maven-Flink
mvn clean package
The subsequent steps depend on your IDE. Generally, just import this repository as a Maven project and start using SANSA / Flink.
SBT and Spark:
git clone https://github.com/SANSA-Stack/SANSA-Template-SBT-Spark.git
cd SANSA-Template-SBT-Spark
sbt clean package
The subsequent steps depend on your IDE. Generally, just import this repository as an SBT project and start using SANSA / Spark.
SBT and Flink:
git clone https://github.com/SANSA-Stack/SANSA-Template-SBT-Flink.git
cd SANSA-Template-SBT-Flink
sbt clean package
The subsequent steps depend on your IDE. Generally, just import this repository as an SBT project and start using SANSA / Flink.
These templates help you to set up the project structure and to create the initial build files. Enjoy it!
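The four clone-and-build sequences above differ only in the build tool and the engine. As a rough sketch, they can be folded into one small shell helper. The function names `sansa_template_repo` and `sansa_template_setup` are made up for this example and are not part of SANSA; the repository naming pattern is taken from the URLs above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with SANSA): pick a template by
# build tool (maven|sbt) and engine (spark|flink).

# Print the template repository name, e.g. SANSA-Template-Maven-Spark.
sansa_template_repo() {
    case "$1" in
        maven) tool=Maven ;;
        sbt)   tool=SBT ;;
        *) echo "unknown build tool: $1 (use maven or sbt)" >&2; return 1 ;;
    esac
    case "$2" in
        spark) engine=Spark ;;
        flink) engine=Flink ;;
        *) echo "unknown engine: $2 (use spark or flink)" >&2; return 1 ;;
    esac
    echo "SANSA-Template-${tool}-${engine}"
}

# Clone the chosen template and build it with the matching tool.
sansa_template_setup() {
    repo=$(sansa_template_repo "$1" "$2") || return 1
    git clone "https://github.com/SANSA-Stack/${repo}.git" || return 1
    cd "$repo" || return 1
    if [ "$1" = maven ]; then
        mvn clean package
    else
        sbt clean package
    fi
}
```

For example, after sourcing the file, `sansa_template_setup sbt flink` would clone SANSA-Template-SBT-Flink and run `sbt clean package` inside it.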
- Make sure that you have Java 8 or higher installed.
- Install the Eclipse m2e plugin for Maven support, m2e-egit for Git, and m2eclipse-scala (each only if not already installed).
- Go to File → New Project → “Checkout Maven Projects from SCM”.
- Set the SCM URL type to “git” and enter the URL of your repository (e.g. for https://github.com/SANSA-Stack/SANSA-RDF it is https://github.com/SANSA-Stack/SANSA-RDF.git).
- Click on “OK” and wait a while.
- File → New → Project from version control → GitHub
- Log in to GitHub
- Choose github.com/SANSA-Stack/SANSA-Query.git (for example)
- A “Non-managed pom file found” prompt appears in the lower right
- Choose “Add as Maven project”
- Be patient while it is “Resolving dependencies” (in the status bar)
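The first prerequisite in the list above (Java 8 or higher) can be checked from a terminal before any IDE import. The following is a minimal sketch; the function names are hypothetical, and it assumes that `java -version` prints the version in double quotes after the word `version`, as common JDK distributions do.

```shell
#!/bin/sh
# Hypothetical helpers for checking the "Java 8 or higher" prerequisite.

# Turn a Java version string into its major version.
# "1.8.0_292" -> 8 (legacy 1.x scheme), "11.0.2" -> 11, "17" -> 17.
java_major_version() {
    v=${1#1.}        # drop the legacy "1." prefix if present
    v=${v%%[._-]*}   # keep everything before the first ".", "_" or "-"
    echo "$v"
}

# Read the version of the `java` found on the PATH.
# `java -version` writes to stderr, hence the 2>&1.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if the installed Java is version 8 or newer.
check_java() {
    major=$(java_major_version "$(installed_java_version)")
    if [ "${major:-0}" -ge 8 ] 2>/dev/null; then
        echo "Java $major found"
    else
        echo "Java 8 or higher is required" >&2
        return 1
    fi
}
```

Running `check_java` before the import steps gives an early, readable failure instead of a build error inside the IDE.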
For developers using SANSA:
- To generate Eclipse project files from the sbt project, install the sbteclipse plugin and run sbt eclipse in the root of the project.
- Once you have installed and generated the Eclipse project files using one of the above plug-ins, start Eclipse.
- File → Import → General/Existing Project into Workspace.
- Select the directory containing your project as root directory (e.g. https://github.com/SANSA-Stack/SANSA-Template-SBT-Spark), select the project and hit Finish.
- File → New → Project from Existing Sources.
- Select a project (e.g. https://github.com/SANSA-Stack/SANSA-Template-SBT-Spark) that you want to import and click OK.
- Select the “Import project from external model” option and choose “SBT project” from the list. Click Next.
- Select SBT options and click Finish.
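For the Eclipse route above, the sbteclipse plugin has to be declared in the build before `sbt eclipse` is available. A minimal sketch of that step follows. The helper name `enable_sbteclipse` is hypothetical, and the plugin version 5.2.4 is an assumption; choose a version that matches your sbt release.

```shell
#!/bin/sh
# Hypothetical helper: declare sbteclipse in <project>/project/plugins.sbt.
# The default version below is an assumption; override it via the
# SBTECLIPSE_VERSION environment variable if your sbt needs another one.
SBTECLIPSE_VERSION=${SBTECLIPSE_VERSION:-5.2.4}

enable_sbteclipse() {
    project_dir=${1:-.}
    plugins_file="$project_dir/project/plugins.sbt"
    mkdir -p "$project_dir/project" || return 1
    # Append the declaration only once so repeated runs stay idempotent.
    if ! grep -qs 'sbteclipse-plugin' "$plugins_file"; then
        printf '\naddSbtPlugin("com.typesafe.sbteclipse" %% "sbteclipse-plugin" %% "%s")\n' \
            "$SBTECLIPSE_VERSION" >> "$plugins_file"
    fi
}
```

A typical use would be `enable_sbteclipse SANSA-Template-SBT-Spark`, followed by `sbt eclipse` in that directory and the Eclipse import described above.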
Interactive Spark Notebooks can run SANSA-Examples and are easy to deploy with docker-compose. The deployment stack includes Hadoop for HDFS, Spark for running the SANSA examples, and Hue for navigating HDFS and copying files to it. The notebooks are created and run using Apache Zeppelin.
Clone the SANSA-Notebooks git repository:
git clone https://github.com/SANSA-Stack/SANSA-Notebooks
Get the SANSA Examples jar file (requires
Start the cluster (this will lead to downloading BDE docker images, will take a while):
When start-up is done you will be able to access the following interfaces:
- http://localhost:8080/ (Spark Master)
- http://localhost:8088/home (Hue HDFS Filebrowser)
- http://localhost/ (Zeppelin)
To load the data to your cluster simply do:
Go on and open Zeppelin, choose any available notebook and try to execute it.
For more information, refer to the SANSA-Notebooks GitHub repository. If you have questions or have found a bug, feel free to open an issue on GitHub.
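Because the first start-up downloads the Docker images and can take a while, it may help to poll the interfaces listed above instead of reloading them in a browser. The sketch below prints the HTTP status that curl receives for each one. The function names are hypothetical, and the URLs are copied from the list in this document.

```shell
#!/bin/sh
# Hypothetical helpers: check that the notebook stack's web interfaces respond.

# The three interfaces named in this guide, one URL per line.
sansa_ui_urls() {
    cat <<'EOF'
http://localhost:8080/
http://localhost:8088/home
http://localhost/
EOF
}

# Print "<status> <url>" for each interface.
# A status of 000 means curl could not connect at all.
check_sansa_uis() {
    sansa_ui_urls | while read -r url; do
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
        echo "$code $url"
    done
}
```

Calling `check_sansa_uis` every few seconds until no line starts with 000 is a simple way to tell when the cluster is reachable.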
Configuring the Computing Frameworks
Running SANSA on Apache Spark.
|SANSA Version|Spark Version|
Running SANSA on Apache Flink.
|SANSA Version|Flink Version|
Using SANSA in Maven Projects
If you want to directly write an application on top of SANSA, simply add the following dependencies to your pom.xml to include SANSA in your project.
|On Spark applications|On Flink applications|