Keras models can be saved in Python with the model's .save() method. Refer to the TensorFlow documentation for Keras for details. These saved models can then be loaded and served in Java.
Keras model loading functionality in Konduit Serving converts Keras models to Deeplearning4J models. As a result, Keras models containing operations not supported in Deeplearning4J cannot be served in Konduit Serving. See issue 8348.
Overview
Konduit Serving works by defining a series of steps. These include operations such as:
Pre- or post-processing steps
One or more machine learning models
Transforming the output into a form that humans can understand
If deploying your model requires neither pre- nor post-processing, only one step - a machine learning model - is required. This configuration is defined using a single ModelStep.
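Conceptually, a serving pipeline is just the composition of its steps. The sketch below illustrates that idea in plain Java; it is not the Konduit Serving API, and the scaling, model, and formatting functions are hypothetical stand-ins:

```java
import java.util.function.Function;

public class PipelineSketch {
    // A serving pipeline is the composition of its steps:
    // pre-processing, then the model, then post-processing.
    static String run(double[] input) {
        // Hypothetical pre-processing step: scale raw pixel values into [0, 1]
        Function<double[], double[]> preprocess = x -> {
            double[] out = new double[x.length];
            for (int i = 0; i < x.length; i++) out[i] = x[i] / 255.0;
            return out;
        };
        // Stand-in for the machine learning model step
        Function<double[], double[]> model = x -> new double[]{x[0] + x[1]};
        // Post-processing step: render the raw output in human-readable form
        Function<double[], String> postprocess = y -> "prediction: " + y[0];
        return preprocess.andThen(model).andThen(postprocess).apply(input);
    }

    public static void main(String[] args) {
        System.out.println(run(new double[]{255.0, 0.0}));
    }
}
```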
A reference Java project is provided in the examples repository (https://github.com/KonduitAI/konduit-serving-examples), with dependencies managed in a Maven pom.xml file. If using the IntelliJ IDEA IDE, open the java folder as a Maven project and run the main function of the InferenceModelStepKeras class.
Configure the step
Define the Keras configuration as a ModelConfig object.
modelConfigType: This argument requires a ModelConfigType object. Specify modelType as ModelConfig.ModelType.KERAS, and set modelLoadingPath to the location of the Keras weights saved in the HDF5 file format.
For the ModelStep object, the following parameters are specified:
modelConfig: pass the ModelConfig object here
parallelInferenceConfig: specify the number of workers to run in parallel. Here, we specify workers = 1.
inputName, outputName: names for the input and output nodes, as lists
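Serialized to JSON, a single-step configuration built from the parameters above might look roughly like the following. This is an illustrative sketch only: the exact field names and nesting vary between Konduit Serving versions, and the model path is a placeholder.

```json
{
  "steps": [
    {
      "modelConfig": {
        "modelConfigType": {
          "modelType": "KERAS",
          "modelLoadingPath": "/path/to/model.h5"
        }
      },
      "parallelInferenceConfig": { "workers": 1 },
      "inputNames": ["input"],
      "outputNames": ["output"]
    }
  ]
}
```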
The inference configuration is stored as a JSON file. Build a KonduitServingMainArgs object, passing the path of the saved config.json file as configPath along with any other required server configuration arguments.
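The sketch below shows the write-then-point-at-it pattern using only the JDK: it writes an illustrative configuration to a temporary config.json and returns its contents. In the real workflow the JSON is produced by the inference configuration object's own serialization, and the resulting path is what configPath in KonduitServingMainArgs would point to; the JSON body here is a placeholder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteConfigSketch {
    // Writes a placeholder inference configuration to a temp file and reads it back.
    // The returned path of the temp file is what configPath would be set to.
    static String writeConfig() {
        try {
            String json = "{ \"steps\": [ { \"modelConfig\": { \"modelConfigType\": {"
                    + " \"modelType\": \"KERAS\","
                    + " \"modelLoadingPath\": \"/path/to/model.h5\" } } } ] }";
            Path configPath = Files.createTempFile("config", ".json");
            Files.writeString(configPath, json);
            return Files.readString(configPath);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeConfig());
    }
}
```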
Start the server by calling KonduitServingMain with the configuration held in KonduitServingMainArgs, supplying a callback function (as shown in the Inference section below).
Inference
NDARRAY inputs to the ModelStep must be specified with a shape.
To configure the client, set the URL used to connect to the server, using the same port specified in the server configuration (choose a port number that is not reserved).
An onSuccess callback function is implemented so that the client request is posted and the HttpResponse retrieved only after the KonduitServingMain server has started successfully.
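Once the server is up, the client request is an HTTP POST to the serving endpoint. The sketch below builds such a request with the JDK's HttpClient API; the host, port, endpoint path (/predict), and JSON body are assumptions for illustration, not the exact Konduit Serving endpoint, and actually sending the request requires a running server.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ClientRequestSketch {
    // Builds (but does not send) a POST request to a hypothetical serving endpoint.
    static HttpRequest buildRequest(String host, int port, String body) {
        return HttpRequest.newBuilder()
                // The "/predict" path is an assumption for illustration
                .uri(URI.create("http://" + host + ":" + port + "/predict"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("localhost", 3000, "{\"input\": [[1.0, 2.0]]}");
        System.out.println(request.method() + " " + request.uri());
        // Sending would look like the following, inside the onSuccess callback,
        // once the server is confirmed to be running:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```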