Konduit Serving works by defining a series of steps. These include operations such as:
- Pre- or post-processing steps
- One or more machine learning models
- Transforming the output in a way that can be understood by humans
If deploying your model requires neither pre- nor post-processing, only a single step, a machine learning model, is required. This configuration is defined using a single ModelStep.
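Conceptually, such a single-step configuration boils down to a JSON document with one model step. The sketch below is illustrative only: the field names shown here are assumptions, not the exact Konduit Serving schema; inspect the `config.json` generated by `inferenceConfiguration.toJson()` for the authoritative structure.

```json
{
  "servingConfig": { "httpPort": 3000 },
  "steps": [
    { "type": "MODEL" }
  ]
}
```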
A reference Java project is provided in the examples repository ( https://github.com/KonduitAI/konduit-serving-examples ) with a Maven pom.xml file listing the required dependencies. If you use the IntelliJ IDEA IDE, open the java folder as a Maven project and run the main function of the InferenceModelStepMNIST class.
Configure the step
Define the TensorFlow configuration as a TensorFlowConfig object
tensorDataTypesConfig: The TensorFlowConfig object requires a HashMap, input_data_types. Its keys should represent column names, and the values should represent data types as strings, e.g. "INT32", "FLOAT", etc. See here for a list of supported data types.
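As a minimal sketch of the name-to-type mapping described above: the input name "input_layer" below is hypothetical (use your model's actual input tensor names), and the real TensorFlowConfig may expect its own data-type representation rather than plain strings.

```java
import java.util.HashMap;
import java.util.Map;

public class InputDataTypes {
    // Build the input_data_types map: input/column names -> data type strings.
    // "input_layer" is a hypothetical input name for illustration only.
    public static Map<String, String> inputDataTypes() {
        Map<String, String> types = new HashMap<>();
        types.put("input_layer", "FLOAT");
        return types;
    }

    public static void main(String[] args) {
        System.out.println(inputDataTypes());
    }
}
```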
modelConfigType: This argument requires a ModelConfigType object. Specify modelType as TENSORFLOW, and modelLoadingPath to point to the location of TensorFlow weights saved in the PB file format.
The inferenceConfiguration is saved as a JSON file. Build the KonduitServingMainArgs with the saved config.json file path as configPath, along with the other necessary server configuration arguments.
```java
File configFile = new File("config.json");
FileUtils.write(configFile, inferenceConfiguration.toJson(), Charset.defaultCharset());

// Set and start the inference server as per the above configuration
KonduitServingMainArgs args1 = KonduitServingMainArgs.builder()
        .configStoreType("file")
        .ha(false)
        .multiThreaded(false)
        .configPort(port)
        .verticleClassName(InferenceVerticle.class.getName())
        .configPath(configFile.getAbsolutePath())
        .build();
```
Start the server by calling KonduitServingMain with the configuration held in KonduitServingMainArgs, using a callback function (see the code in the Inference section below).
Inference
The image files must be converted to NDARRAY format using an ImageLoadingStep and passed as input for inference.
To configure the client, set the URL used to connect to the server and specify the same unreserved port number used in the server configuration.
An onSuccess callback function is implemented so that the client request is posted and the HttpResponse retrieved only after the KonduitServingMain server has started successfully.
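The request the client ultimately sends can be sketched with the standard Java HTTP client. Note that the "/raw/numpy" endpoint path below is a hypothetical illustration of a prediction route, not a confirmed Konduit Serving URL; consult the client documentation for the actual route and request body format.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ClientRequest {
    // Build a POST request against the local inference server.
    // "/raw/numpy" is a hypothetical endpoint path for illustration only.
    public static HttpRequest buildRequest(int port) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:" + port + "/raw/numpy"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildRequest(3000);
        System.out.println(req.method() + " " + req.uri());
    }
}
```

In the reference project, the request is issued from inside the onSuccess callback so that it only fires once the server is up.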
Accepted input and output data formats are as follows:
Input: JSON, ARROW, IMAGE, ND4J and NUMPY.
Output: NUMPY, JSON, ND4J and ARROW.
Note that this example uses only a single test image.