Introduction

The Core Engine - the home of reproducible machine learning

Whether you're a new or an experienced user, there is plenty to discover about the Core Engine. We've collected (and continue to expand) this information and present it in a digestible way so you can get started and build.

Why use the Core Engine?

Most people reading this will want to know why they should use yet another MLOps tool that claims to solve all production problems. The simple answer is that there is still no single solution out there that really cures the ML-in-production headache: most tools solve either for very Ops-y problems (CI/CD, deployments, feature stores) or for very Data Science-y problems (remote kernels, metadata tracking, hyper-parameter tuning). The tools that are truly state-of-the-art and come close are not approachable (financially or technologically) for hobbyists or smaller teams that just want to get models into production. The result is that 87% of ML models never make it into production, and those that do tend to be looked after by enormous engineering teams with big budgets.

The team behind the Core Engine has been through the wringer putting models in production, and has built the Core Engine from the perspective of both ML and Ops people. Our goal with the Core Engine is to provide a neat interface for data scientists to write production-ready code from training day 0, and to provide a configurable, extensible and managed backend for Ops people to keep things chugging along.

Last but not least, our hope is that the Core Engine provides hobbyists and smaller companies with a golden path to putting models in production. With our free plan, you can start writing production-ready ML pipelines immediately.

For Data Science people...

For the people who actually create models and run experiments: you get a simple interface to plug your models and data into. You can run experiments remotely as easily as possible, and use the built-in automatic evaluation mechanisms to analyze what happened. The goal is for you to follow the pandas/numpy/scikit paradigm you are familiar with as closely as possible, but to end up with production-ready, scalable and deployable models. Every parameter is tracked and all artifacts are reproducible.
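
To make this concrete, here is a minimal sketch of the idea. The identifiers below (core_engine, Pipeline, the add_* methods) are hypothetical and for illustration only; they are not the actual Core Engine API:

    from core_engine import Pipeline   # hypothetical import, not the real API

    def my_model_fn(train_data, epochs, learning_rate):
        ...  # your usual pandas/numpy/scikit training code

    pipeline = Pipeline(name="customer-churn")
    pipeline.add_datasource("churn-data")        # plug in a versioned datasource
    pipeline.add_split(train=0.7, eval=0.3)      # built-in split step
    pipeline.add_trainer(my_model_fn, epochs=10, learning_rate=0.001)
    pipeline.add_evaluator()                     # built-in automatic evaluation
    pipeline.run()                               # executes remotely, fully tracked

The point of the sketch is the workflow, not the names: you compose familiar steps in Python, and the platform takes care of tracking and remote execution.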

For Ops people...

For the people who are responsible for managing the infrastructure and tasked with negotiating the ever-changing ML ecosystem, the Core Engine should be seen as a platform that provides high-level integrations to various backends that are tedious to build and maintain. If you want to swap out some components of the Core Engine for others, you are free to do so! For example, if you want to deploy on a different cloud provider (AWS, GCP, Azure), or use a different data processing backend (Spark, Dataflow, etc.), the Core Engine provides this ability natively.
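
To give a feel for it, swapping a backend could be little more than a configuration change. This is a minimal sketch; the keys and values below are hypothetical, not the exact schema:

    # Hypothetical configuration fragment; keys and values are illustrative.
    backend:
      orchestrator: gcp        # AWS and Azure support is upcoming
      processing: dataflow     # or, e.g.: spark
      serving: gcp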

What is the Core Engine?

The Core Engine is an end-to-end MLOps platform that serves multiple roles in your machine learning workflow. It is:

  • A workload processing engine - it processes and executes your code (in a distributed environment)

  • An orchestrator - it automates configuration, management, and coordination of your ML workloads.

  • An ML framework - it provides built-in plug-ins for common tasks (like evaluation and serving).

  • A standardized interface - to quickly configure and run pipelines from data ingestion to training, evaluation, and finally serving.

If you are used to writing Jupyter notebooks, scripts and glue code to get your ML experiments or pipelines going, you should give the Core Engine a try. The Core Engine provides an easy way to run your code in a distributed, transparent and tracked environment. You can leverage all the perks of running in a production-ready environment, without the overhead of setting up the Ops and the datasources, organizing all of it, and writing the code that brings it together into one coherent environment for your organization.

The Core Engine takes care of much of the hassle of ML development, so you can focus on writing your app without needing to reinvent the wheel. By providing an easy-to-use and powerful computing platform, we expedite the transition of ML models to production services.

How does this all work?

To simplify things, the Core Engine lets you create ML pipelines either through Python or the command line. No matter how you create a pipeline, the end result is an easy-to-read YAML configuration file containing all the information required to uniquely identify what that pipeline is set up to do. This YAML file is immutable ground truth for you and your colleagues: you can always trust it, no matter when it was produced or by whom.
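
As an illustration of the idea (the exact schema may differ; the field names below are hypothetical):

    # Hypothetical pipeline configuration; field names are illustrative.
    version: 1
    pipeline:
      name: customer-churn-training
      datasource:
        id: churn-data
        commit: 4f2a9c          # immutable datasource snapshot (see below)
      split:
        train: 0.7
        eval: 0.3
      trainer:
        fn: my_model_fn
        params:
          epochs: 10
          learning_rate: 0.001
      evaluator: {}

Because everything needed to identify the run is captured in one file, re-running the pipeline against the same datasource commit reproduces the same artifacts.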

Each pipeline is connected to a datasource commit: an immutable snapshot of any supported datasource. By versioning datasources through the Core Engine, you can track precisely what flows through your pipelines at any moment in time. The Core Engine supports multiple data types (images, tabular, text) and sources (relational databases, blob storage, etc.) of datasources.

The code that is actually executed lives in different types of user-defined functions, which can be created independently of a pipeline or during its creation. This gives a complete separation of code from configuration, and lets the Core Engine automatically track the important metadata that you need to keep an eye on as you progress through the ML life-cycle.
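
For example, a user function could be an ordinary Python function whose run-specific values arrive from the YAML configuration. This is a sketch under that assumption; the signature convention below is illustrative, not the actual Core Engine API:

    # Hypothetical sketch; the function and its signature are illustrative.
    import pandas as pd

    def my_model_fn(train_df: pd.DataFrame, epochs: int, learning_rate: float):
        # The code carries no configuration of its own: values such as
        # epochs and learning_rate come from the pipeline's YAML file,
        # so the same function is reusable across runs and every parameter
        # it is called with can be tracked automatically.
        model = ...  # your usual training code goes here
        return model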

At the end of each training pipeline, the model is deployed on a supported backend as an endpoint. You can then schedule recurring training pipelines based on time or data triggers, run batch inference pipelines (also on a schedule if needed), and run evaluation pipelines on other datasources according to your requirements. This way, every pipeline produces artifacts that are battle-tested and production-ready from day 1.
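
Schematically, time- and data-based triggers could themselves live in the pipeline configuration. The fields below are hypothetical and only illustrate the idea:

    # Hypothetical scheduling fragment; field names are illustrative.
    schedule:
      trigger: cron            # or a data trigger, e.g. on new datasource commits
      cron: "0 3 * * 1"        # retrain every Monday at 03:00
    batch_inference:
      schedule: "0 4 * * *"    # score fresh data daily, after retraining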

All of the computation, training and deployment in the Core Engine runs on multiple supported backends, which can be swapped in and out as you wish. Currently, all our supported backends are on the Google Cloud Platform, but they will soon be available for AWS and Azure as well. We will publish a full list of supported environments and backends soon!

If all of this appeals to you, sign up now to get the Core Engine. We will always have a free tier that lets anyone use our platform. So what's the hold-up? Sign up and create some ML pipelines!