Google Cloud Platform

The Google Cloud Platform provider is used to connect the Core Engine with your Google Cloud Platform account and to enable various backends.

Quickstart

If you're already familiar with the Core Engine and Google Cloud, feel free to skip this block.

CLI

Create a Google Cloud Platform provider by providing the path to a Google Cloud Storage bucket and the path to a Service Account JSON key file.

Please make sure the service account has the required permissions to ensure we can orchestrate pipelines properly.

cengine provider create gcp PROVIDER_NAME \
  --artifact_store=BUCKET_NAME \
  --service_account=SERVICE_ACCOUNT_FILE

where:

  • PROVIDER_NAME is a name of your choice for this specific provider instance.

  • BUCKET_NAME is the path of a Google Cloud Storage bucket, in the format gs://bucket_name. Note: The bucket needs to be within the same Google Cloud Project as the Service Account.

  • SERVICE_ACCOUNT_FILE is the path to the JSON key file of the Service Account, as created by Google Cloud.
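
For illustration, a filled-in call could look like the following; the provider name, bucket name, and key-file path are placeholder values:

cengine provider create gcp my-gcp-provider \
  --artifact_store=gs://my-ce-artifacts \
  --service_account=/path/to/core-engine.json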

CE Dashboard

Currently, this feature is unavailable.

We are working hard to create a dashboard for the Core Engine. Please see our roadmap for an indication of when this feature will be released.

Prerequisites

In order to connect the Core Engine with a Google Cloud project, you will need a Google Cloud account with the necessary permissions to

  • Enable Services

  • Create a Service Account

Required permissions

To successfully run the Core Engine in your Google Cloud project, you need to provide us with a Service Account that has the following roles:

  • roles/iam.serviceAccountUser so we can ...

  • roles/compute.instanceAdmin.v1 to run pipelines with the default orchestrator

  • roles/bigquery.admin so we can persist datasource commits in BigQuery (our default version store)

  • roles/storage.admin so we can persist artifacts in a storage bucket (our default artifact store)

Note: Some of the processing backends for Google Cloud require additional permissions.

Enabling services

For full functionality, please enable the following services within your Google Cloud project:

  • Compute

  • Storage

  • BigQuery

  • Dataflow

  • AI Platform

gcloud CLI
gcloud services enable compute.googleapis.com \
storage.googleapis.com \
bigquery.googleapis.com \
dataflow.googleapis.com \
ml.googleapis.com
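
To double-check that the services are active, one option is to list the enabled services and filter for the ones above:

gcloud services list --enabled | grep -E "compute|storage|bigquery|dataflow|ml\.googleapis"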
Google Cloud Web Console

Please go to the Web Dashboard of Google Cloud and follow these steps to enable the required APIs for Compute, Storage, BigQuery, Dataflow and AI Platform:

https://cloud.google.com/endpoints/docs/openapi/enable-api

Creating a Service Account

gcloud CLI

The most convenient way is through the gcloud command line.

For convenience, set the following environment variables to avoid retyping:

export PROJECT_ID=<enter your Google Cloud Project name>
export SA_NAME="core-engine"
export SA_PATH="$(pwd)/core-engine.json"
export BUCKET_NAME="ce-artifacts-$(date +%s)"  # e.g. a timestamp suffix to keep the bucket name globally unique

Creating the actual Service Account

gcloud iam service-accounts create ${SA_NAME} \
  --description="Client for Core Engine"
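
If you want to confirm that the Service Account was created, one way (assuming the environment variables set above) is to describe it by its e-mail address:

gcloud iam service-accounts describe ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com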

Creating a JSON key for the new Service Account

gcloud iam service-accounts keys create ${SA_PATH} \
--iam-account ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
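
Optionally, you can verify that the key was registered by listing the keys of the Service Account:

gcloud iam service-accounts keys list \
  --iam-account ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com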

Give permissions to the new Service Account

# Note: add-iam-policy-binding accepts a single --role per call, so the
# required roles (see "Required permissions" above) are granted in a loop.
for ROLE in roles/iam.serviceAccountUser \
            roles/compute.instanceAdmin.v1 \
            roles/bigquery.admin \
            roles/storage.admin
do
  gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="${ROLE}"
done
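
To double-check which roles the Service Account ended up with, one option is to query the project's IAM policy (a standard gcloud filter pattern, using the variables from above):

gcloud projects get-iam-policy ${PROJECT_ID} \
  --flatten="bindings[].members" \
  --format="table(bindings.role)" \
  --filter="bindings.members:serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

The output should list the four roles granted above.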
Google Cloud Web Console

Please go to the Web Dashboard of Google Cloud and follow these steps to create a Service Account with a JSON key and grant it the roles listed above:

https://cloud.google.com/iam/docs/creating-managing-service-accounts

Create a Google Storage Bucket

gcloud CLI
gsutil mb -p ${PROJECT_ID} gs://${BUCKET_NAME}
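
To confirm that the bucket exists and to tie everything back to the Quickstart command, a final sketch could look like this (my-gcp is a placeholder provider name):

gsutil ls -b gs://${BUCKET_NAME}   # prints the bucket URL if it exists

cengine provider create gcp my-gcp \
  --artifact_store=gs://${BUCKET_NAME} \
  --service_account=${SA_PATH}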