Databricks Community Edition
Setting up a Spark cluster in Databricks.
One of the quickest ways to get up and running with PySpark is to use a hosted notebook environment.
Databricks is the largest Spark vendor and offers a free tier, called Community Edition, for getting started. We’ll use this environment to get started with Spark and to build AWS and GCP model pipelines.
Getting started
The first step is to create a Community Edition account on the Databricks website. After logging in, perform the following steps to spin up a test cluster:
- Click on “Clusters” on the left navigation bar.
- Click “Create Cluster”.
- Name the cluster “DSP”.