Interactive Spark Notebooks can run SANSA-Examples and are easy to deploy with docker-compose. The deployment stack includes Hadoop for HDFS, Spark for running the SANSA examples, and Hue for browsing HDFS and copying files to it. The notebooks are created and run using Apache Zeppelin.
Getting started
Clone the SANSA-Notebooks git repository:
    git clone https://github.com/SANSA-Stack/SANSA-Notebooks
    cd SANSA-Notebooks
Get the SANSA Examples jar file (requires wget):
    make
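If one of the steps in this guide fails, a missing command-line tool is worth ruling out first. The following is a minimal sketch of such a check. The tool list is an assumption pieced together from this page (git for cloning, make for the targets, wget for the jar download, docker-compose for the cluster); the function name `missing_tools` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# missing_tools TOOL...: print the name of every given tool
# that cannot be found on PATH (hypothetical helper).
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list, based on the steps described on this page.
missing=$(missing_tools git make wget docker-compose)

if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names are joined onto one line.
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

The check only reports what is missing and does not exit, so it can be pasted into a terminal without closing the session.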
Start the cluster (this downloads the BDE Docker images, so it will take a while):
    make up
When start-up is done you will be able to access the following interfaces:
- http://localhost:8080/ (Spark Master)
- http://localhost:8088/home (Hue HDFS Filebrowser)
- http://localhost/ (Zeppelin)

To load the data into your cluster, run:
    make load-data
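Before opening Zeppelin, it can help to confirm that the three interfaces listed above actually respond. The sketch below assumes `curl` is installed and that the default ports from the list are unchanged; the names `check_url` and `report_endpoints` are hypothetical helpers, not part of SANSA-Notebooks.

```shell
#!/bin/sh
# Endpoints copied from the interface list above, as "label|url" pairs.
ENDPOINTS='Spark Master|http://localhost:8080/
Hue HDFS Filebrowser|http://localhost:8088/home
Zeppelin|http://localhost/'

# check_url URL: succeed if the URL answers within 5 seconds
# with an HTTP status below 400 (redirects count as reachable).
check_url() {
    curl --silent --fail --output /dev/null --max-time 5 "$1"
}

# report_endpoints: print one "up" or "down" line per endpoint.
report_endpoints() {
    printf '%s\n' "$ENDPOINTS" | while IFS='|' read -r label url; do
        if check_url "$url"; then
            echo "up   $label $url"
        else
            echo "down $label $url"
        fi
    done
}

report_endpoints
```

A service reported as "down" shortly after `make up` may simply still be starting, so re-running the check after a minute is a reasonable first step.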
Then open Zeppelin, choose any available notebook, and run it.
For more information, refer to the SANSA-Notebooks GitHub repository. If you have questions or find bugs, feel free to open an issue on GitHub.