Keras (TensorFlow 2.0)
This page illustrates a simple client-server interaction to perform inference on a Keras LSTM model using the Python SDK for Konduit Serving.
Saving models in Keras HDF5 (.h5) format
HDF5 model files can be saved with the .save() method. Refer to the TensorFlow documentation for Keras for details.
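For instance, a toy LSTM model can be built and saved as follows (the architecture and file name here are illustrative):

```python
import tensorflow as tf

# A small illustrative LSTM model; replace with your own architecture.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(10, 5)),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")

# A file name ending in .h5 saves the model in HDF5 format.
model.save("model.h5")
```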
Keras model loading functionality in Konduit Serving converts Keras models to Deeplearning4J models. As a result, Keras models containing operations not supported in Deeplearning4J cannot be served in Konduit Serving. See issue 8348.
Overview
Konduit Serving works by defining a series of steps. These include operations such as:
Pre- or post-processing steps
One or more machine learning models
Transforming the output in a way that can be understood by humans
If deploying your model requires neither pre- nor post-processing, only one step, a machine learning model, is required. This configuration is defined using a single ModelStep.
Before running this notebook, run the build_jar.py script or the konduit init command. Refer to the Building from source page for details.
Configure the step
Define the Keras configuration as a ModelConfig object.
model_config_type: This argument requires a ModelConfigType object. Specify model_type as KERAS, and model_loading_path to point to the location of Keras weights saved in the HDF5 file format.
For the ModelStep object, the following parameters are specified:
model_config: pass the ModelConfig object here.
parallel_inference_config: specify the number of workers to run in parallel. Here, we specify workers=1.
input_names: names for the input nodes.
output_names: names for the output nodes.
Input and output names can be obtained by visualizing the graph in Netron.
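The following sketch puts these pieces together, assuming the konduit package exposes the classes named above; the file path and node names are placeholders:

```python
from konduit import ModelConfig, ModelConfigType, ModelStep, ParallelInferenceConfig

keras_config = ModelConfig(
    model_config_type=ModelConfigType(
        model_type="KERAS",
        model_loading_path="model.h5",  # path to the saved HDF5 weights
    )
)

keras_step = ModelStep(
    model_config=keras_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=["input"],    # placeholder node names; verify yours in Netron
    output_names=["output"],
)
```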
Configure the server
In the ServingConfig, specify a port number.
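For example (the http_port argument name below is an assumption; check the SDK reference for the exact parameter):

```python
from konduit import ServingConfig

# Any free port works; 1337 is used throughout this sketch.
serving_config = ServingConfig(http_port=1337)
```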
The ServingConfig has to be passed to Server in addition to the steps as a Python list. In this case, there is a single step: keras_step.
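A minimal sketch, reusing serving_config and keras_step from the snippets above (the module layout is assumed from the konduit Python SDK):

```python
from konduit.server import Server

server = Server(serving_config=serving_config, steps=[keras_step])
```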
Start the server
Use the .start() method:
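```python
# Launch the Konduit Serving instance configured above.
server.start()
```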
Configure the client
To configure the client, create a Client object with the port argument.
Note that the Client object should be created after the Server has started, so that the Client can inherit the Server's attributes.
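A minimal sketch, assuming the server above is listening on port 1337:

```python
from konduit.client import Client

client = Client(port=1337)  # must match the port in the ServingConfig
```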
Inference
NDARRAY inputs to ModelSteps must be specified with a preceding batchSize dimension. For batches with a single observation, this can be done by using numpy.expand_dims() to add an additional dimension to your array.
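A sketch of a single-observation inference call; the (timesteps, features) shape and the "input" key are illustrative, and the dict-of-arrays form of predict() is an assumption to verify against the SDK reference:

```python
import numpy as np

x = np.random.rand(10, 5).astype("float32")  # one observation: (timesteps, features)
batch = np.expand_dims(x, axis=0)            # add the batchSize dimension: (1, 10, 5)

prediction = client.predict({"input": batch})
print(prediction)
```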