Introduction
Konduit Serving is a serving system and framework focused on deploying machine learning pipelines to production.
Konduit Serving provides building blocks for developers to write their own production machine learning pipelines from pre-processing to model serving, exposable as a simple REST API.
The core abstraction is the pipeline step. A pipeline step performs one task involved in using a machine learning model in a deployment scenario, such as:
pre-processing the input;
running one or more machine learning models; and
post-processing: transforming the output into a form that humans can understand, such as labels in a classification example.
For instance, a ModelStep performs inference on TensorFlow, Keras, Deeplearning4j (DL4J) or Predictive Model Markup Language (PMML) models, or on a mix of them.
A custom pipeline step can be built using a PythonStep. This allows you to embed pre- or post-processing steps into your machine learning pipeline, or to serve models built in frameworks that do not have a built-in ModelStep, such as scikit-learn and PyTorch.
Konduit Serving also contains functionality for other pre-processing tasks, such as DataVec transform processes and image transforms.
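To make the pipeline step idea concrete, here is a minimal sketch in plain Python. It does not use the Konduit Serving API: the names `preprocess`, `toy_model`, `postprocess`, `run_pipeline` and `LABELS` are all hypothetical, and the "model" is a stand-in that averages its inputs. It only illustrates how pre-processing, inference and post-processing can be chained as steps that each take and return named values.

```python
from typing import Callable, Dict, List

# A step maps a dictionary of named values to a new dictionary of named values.
Step = Callable[[Dict[str, object]], Dict[str, object]]

LABELS = ["cat", "dog"]  # hypothetical class labels, for illustration only


def preprocess(data: Dict[str, object]) -> Dict[str, object]:
    # Pre-processing: scale raw pixel values from [0, 255] to [0, 1].
    return {"features": [v / 255.0 for v in data["pixels"]]}


def toy_model(data: Dict[str, object]) -> Dict[str, object]:
    # Stand-in for a real model: the "dog" score is the mean feature value.
    features = data["features"]
    score = sum(features) / len(features)
    return {"scores": [1.0 - score, score]}


def postprocess(data: Dict[str, object]) -> Dict[str, object]:
    # Post-processing: turn raw scores into a human-readable label.
    scores = data["scores"]
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": LABELS[best], "score": scores[best]}


def run_pipeline(steps: List[Step], data: Dict[str, object]) -> Dict[str, object]:
    # Feed the output of each step into the next one, in order.
    for step in steps:
        data = step(data)
    return data


result = run_pipeline(
    [preprocess, toy_model, postprocess],
    {"pixels": [0, 255, 255, 255]},
)
print(result)
```

In a real deployment, each of these functions would correspond to a configured step (for example, a PythonStep for custom code or a ModelStep for a supported model format) rather than a local Python function.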
One way to configure a Konduit Serving instance is by using a YAML file. The following YAML file configures a Konduit Serving instance to run a short Python script as specified in the python_code argument:
Installing the Konduit Serving Python SDK exposes the konduit command line interface (CLI). Assuming the YAML file above is saved in the current directory as hello-world.yaml, start a Konduit Serving instance by running the following code in the command line:
This exposes a REST API for sending data to the server for inference. Inputs can be sent using the CLI, the Python SDK or any other application that supports sending HTTP POST requests such as requests or UiPath (for RPA-based workflows).
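As a rough sketch of what "any application that supports sending HTTP POST requests" might look like, the following uses only the Python standard library to build a JSON POST request. The URL, port, path and the input name `input` are assumptions made up for illustration; the actual endpoint and input format depend on how your Konduit Serving instance is configured. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_inference_request(url: str, payload: dict) -> urllib.request.Request:
    # Serialize the payload as JSON and wrap it in an HTTP POST request.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and input name: replace with the values
# that match your own server configuration.
request = build_inference_request(
    "http://localhost:8080/predict",
    {"input": [1.0, 2.0]},
)

# Sending the request requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```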
Finally, stop the Konduit Serving instance:
To get started with Konduit Serving, check out the Quickstart page.
We strive to provide a Python-first SDK that makes it easy to integrate Konduit Serving into a Python-first workflow.
We want to expose modern standards for monitoring everything from your GPU to your inference time. Konduit Serving supports visualization applications such as Grafana that support the Prometheus standard for visualizing data.
Konduit Serving was built with the goal of providing proper low-level interoperability with native math libraries such as TensorFlow and DL4J's core math library libnd4j. At the core of Konduit Serving are the JavaCPP Presets, Vert.x and DL4J for running Keras models in Java.
Combining JavaCPP's low-level access to C-like APIs from Java with Java's robust server-side application development (Vert.x on top of Netty) gives better access to fast math code in production while minimizing the surface area of native code, which tends to introduce more security flaws (mainly in server-side networked applications). This allows us to do things like zero-copy memory access of NumPy arrays or Arrow records for consumption straight from the server without copy or serialization overhead. Extending that to the Python SDK, we know when to return a raw Arrow record and return it as a pandas DataFrame.
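The general idea behind zero-copy access can be illustrated with Python's standard library alone. This is not how Konduit Serving, NumPy or Arrow implement it; it is a small sketch, using `array` and `memoryview`, of the difference between a view that shares the underlying buffer and a copy that does not.

```python
from array import array

# A contiguous buffer of 64-bit floats.
values = array("d", [1.0, 2.0, 3.0, 4.0])

# A memoryview exposes the same memory without copying it.
view = memoryview(values)
window = view[1:3]  # slicing a memoryview also shares memory

# A copy, by contrast, duplicates the data.
copied = values.tolist()

# Writing to the original buffer is visible through the view...
values[1] = 20.0
shared_value = window[0]

# ...but not through the copy.
copied_value = copied[1]

print(shared_value, copied_value)
```

The view costs no extra memory proportional to the data size, which is the property that matters when large arrays or record batches cross a language boundary.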
When dealing with deep learning, we can handle proper inference on the GPU (batching large workloads).
A Vert.x-based model server and pipeline development framework allows a thin abstraction that can be embedded in a Java microservice.
We aim to provide integrations with more enterprise platforms typically seen outside the big data space.