MNIST

This notebook illustrates a simple client-server interaction to perform inference on a TensorFlow model using the Python SDK for Konduit Serving.

This tutorial is split into three parts:

  1. Freezing models

  2. Configuration

  3. Running the server

This tutorial is tested on TensorFlow 1.14, 1.15 and 2.0.

from konduit import ParallelInferenceConfig, ServingConfig, ModelConfigType, TensorFlowConfig
from konduit import TensorDataTypesConfig, ModelStep, InferenceConfiguration
from konduit.server import Server
from konduit.client import Client

import tensorflow as tf

# Use the TF1-style Keras API; on TensorFlow 2.x, fall back to the compat.v1 module
if tf.__version__[0] == "1":
    from tensorflow import keras
elif tf.__version__[0] == "2":
    import tensorflow.compat.v1 as tf
    from tensorflow.compat.v1 import keras
else:
    print("No valid TensorFlow version detected")

# Import layers, models and datasets from the Keras bundled with TensorFlow
# rather than from the standalone keras package
from tensorflow.keras.layers import Flatten, Dense, Dropout, Lambda
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist

from PIL import Image
import numpy as np
import imageio
import os
import matplotlib.pyplot as plt 
import pandas as pd

Creating frozen models (TensorFlow 1.x)

In TensorFlow 1.x, "frozen" models can be exported in the TensorFlow Graph format. For deployment, we only need information about the graph and checkpoint variables. Freezing a model allows you to discard information that is not required for deploying your model.

The following code is adapted from tf-import-examples in the deeplearning4j-examples repository.

In the following code, we build a model using TensorFlow's Keras API and save it as a TensorFlow Graph. The architecture is adapted from the following Kaggle kernel: https://inclass.kaggle.com/charel/learn-by-example-neural-networks-hello-world/notebook.
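
The sketch below shows the general shape of this code. The layer sizes, the single training epoch and the mnist.pb file name are illustrative assumptions rather than the exact values used in the original notebook:

if hasattr(tf, "disable_eager_execution"):
    tf.disable_eager_execution()  # required when using tensorflow.compat.v1 on TF 2.x

(x_train, y_train), (x_test, y_test) = mnist.load_data()

model = Sequential([
    Lambda(lambda x: x / 255.0, input_shape=(28, 28)),  # scale pixel values inside the graph
    Flatten(),
    Dense(256, activation="relu"),
    Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1)

# Note the input and output node names: they are needed when configuring the ModelStep.
print(model.inputs[0].op.name, model.outputs[0].op.name)

# Freeze the graph: fold the checkpoint variables into constants, keep only the
# subgraph needed for inference, and serialize it to a .pb file.
sess = keras.backend.get_session()
frozen_graph = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [model.outputs[0].op.name]
)
tf.train.write_graph(frozen_graph, ".", "mnist.pb", as_text=False)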

Overview

Konduit Serving works by defining a series of steps. These include operations such as

  1. Pre- or post-processing steps

  2. One or more machine learning models

  3. Transforming the output in a way that can be understood by humans

If deploying your model does not require pre- or post-processing, only one step - a machine learning model - is required. This configuration is defined using a single ModelStep.

Before running this notebook, run the build_jar.py script or the konduit init command. Refer to the Building from source page for details.

Configure the step

Define the TensorFlow configuration as a TensorFlowConfig object.

  • tensor_data_types_config: The TensorFlowConfig object requires a dictionary input_data_types. Its keys should represent column names, and the values should represent data types as strings, e.g. "INT32". See here for a list of supported data types.

  • model_config_type: This argument requires a ModelConfigType object. Specify model_type as TENSORFLOW, and model_loading_path to point to the location of TensorFlow weights saved in the PB file format.
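
A sketch of this configuration follows. The input name input_layer, the FLOAT data type and the mnist.pb path are assumptions for illustration; use the node name, data type and path that match your own frozen graph:

tf_config = TensorFlowConfig(
    tensor_data_types_config=TensorDataTypesConfig(
        # the type string (e.g. "INT32" or "FLOAT") must match the graph's input tensor
        input_data_types={"input_layer": "FLOAT"}
    ),
    model_config_type=ModelConfigType(
        model_type="TENSORFLOW",
        model_loading_path=os.path.abspath("mnist.pb")
    )
)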

Now that we have a TensorFlowConfig defined, we can define a ModelStep. The following parameters are specified:

  • model_config: pass the TensorFlowConfig object here

  • parallel_inference_config: specify the number of workers to run in parallel. Here, we specify workers=1.

  • input_names: names for the input data

  • output_names: names for the output data

Konduit Serving requires input and output names to be specified. In TensorFlow, you can find the names of your input and output nodes by printing model.inputs[0].op.name and model.outputs[0].op.name respectively. For more details, please refer to this StackOverflow answer.
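
Putting this together, the step might look like the sketch below; the input and output names shown are assumptions and should be replaced by the node names printed from your own model:

tf_step = ModelStep(
    model_config=tf_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=["input_layer"],
    output_names=["output_layer/Softmax"]
)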

Configure the server

Specify the following:

  • http_port: the HTTP port the server listens on; any free port can be used.

  • input_data_format, output_data_format: Specify input and output data formats as strings.

The ServingConfig has to be passed to Server in addition to the steps as a Python list. In this case, there is a single step: tf_step.

By default, Server() looks for the Konduit Serving JAR konduit.jar in the directory the script is run in. To change this default, use the jar_path argument.

Accepted input and output data formats are as follows:

  • Input: JSON, ARROW, IMAGE, ND4J (not yet implemented) and NUMPY.

  • Output: NUMPY, JSON, ND4J (not yet implemented) and ARROW.
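
A sketch combining these settings is shown below. The port number 1337 is an arbitrary choice, and the keyword arguments follow the descriptions above:

serving_config = ServingConfig(
    http_port=1337,
    input_data_format="NUMPY",
    output_data_format="NUMPY"
)

server = Server(
    serving_config=serving_config,
    steps=[tf_step]
)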

Start the server

Start the server:
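
With the Server object defined above, this is a single call:

server.start()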

Configure the client

To configure the client, create a Client object by specifying the port number:
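
For example, using the same (arbitrary) port as the server sketch above:

client = Client(port=1337)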

The Client's attributes will be obtained from the Server.

Inference

We obtain test images from the test set defined by keras.datasets.
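
A sketch of single-image inference follows, assuming the client defined above and the hypothetical input name input_layer; the images are cast to float32 to match the frozen-graph sketch:

(_, _), (x_test, y_test) = mnist.load_data()

for image, label in zip(x_test[:3], y_test[:3]):
    plt.imshow(image, cmap="gray")
    plt.show()
    output = client.predict({"input_layer": image.astype("float32")})
    print("Label:", label, "Predicted:", np.argmax(output))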

(The first three test images are displayed here.)

Batch prediction

To predict in batches when sending images to the client in NDARRAY format, the arrays passed in the data_input dictionary have to be shaped differently. To input a batch of observations, ensure that your inputs are in NCHW format: number of observations, channels (optional if single channel), height and width.

An example is as follows:
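
The sketch below stacks the first three test images loaded above into a single array of shape (3, 28, 28), i.e. batch, height and width, with the channel dimension omitted for single-channel images (the input name is again the hypothetical input_layer):

batch = x_test[:3].astype("float32")           # shape (3, 28, 28): three observations
predictions = client.predict({"input_layer": batch})
print(predictions.shape)                       # one row of class probabilities per image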

We compare the predicted probabilities and the corresponding labels:
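
One way to tabulate the returned probabilities, assuming predictions is the array from the batch call above:

pd.DataFrame(predictions).round(3)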

     0      1      2      3    4    5    6      7      8    9
0  0.0  0.000  0.001  0.001  0.0  0.0  0.0  0.998  0.000  0.0
1  0.0  0.000  0.998  0.002  0.0  0.0  0.0  0.000  0.000  0.0
2  0.0  0.986  0.005  0.000  0.0  0.0  0.0  0.005  0.003  0.0

Finally, note that the complete inference configuration can be converted to a dictionary using the as_dict() method:
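
For instance, assuming an InferenceConfiguration built from the serving configuration and step defined earlier (the keyword arguments here are assumptions):

inference_config = InferenceConfiguration(serving_config=serving_config, steps=[tf_step])
inference_config.as_dict()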
