This page provides a Java example of deploying a model built in Python using the Open Neural Network Exchange (ONNX) format. The ONNX format is supported by other deep learning frameworks such as TensorFlow, PyTorch, etc. In this example, an ONNX model of the Iris classifier is deployed to the server.
We'll need to include an ONNXStep in the pipeline and specify the following:

- modelUri : the model file path
- inputNames : the names of the model's input layers
- outputNames : the names of the model's output layers
```java
// include pipeline step into the Inference Configuration
inferenceConfiguration.pipeline(
        SequencePipeline.builder()
                .add(new ONNXStep() // add ONNXStep into the pipeline
                        .modelUri(modelTrainResult.modelPath())
                        .inputNames(modelTrainResult.inputNames())
                        .outputNames(modelTrainResult.outputNames())
                )
                .build()
);
```
Deploy the server
Let's deploy the model to the server by calling DeployKonduitServing with the configuration created above. A callback function is passed so that we respond inside the handler block only after the server deployment has succeeded or failed.
```java
// deploy the model in the server
DeployKonduitServing.deploy(new VertxOptions(), new DeploymentOptions(),
        inferenceConfiguration, handler -> {
            if (handler.succeeded()) { // if the server is successfully running
                // get the result of the deployment
                InferenceDeploymentResult inferenceDeploymentResult = handler.result();
                int runningPort = inferenceDeploymentResult.getActualPort();
                String deploymentId = inferenceDeploymentResult.getDeploymentId();

                System.out.format("The server is running on port %s with deployment id of %s%n",
                        runningPort, deploymentId);

                try {
                    String result = Unirest.post(String.format("http://localhost:%s/predict", runningPort))
                            .header("Content-Type", "application/json")
                            .header("Accept", "application/json")
                            .body(new JSONObject().put("input",
                                    new JSONArray().put(Arrays.asList(1.0, 1.0, 1.0, 1.0)))
                            )
                            .asString().getBody();

                    System.out.format("Result from server : %s%n", result);
                    System.exit(0);
                } catch (UnirestException e) {
                    e.printStackTrace();
                    System.exit(1);
                }
            } else { // if the server failed to run
                System.out.println(handler.cause().getMessage());
                System.exit(1);
            }
        });
```
Note that we use only a single test input array for inference in this example, to demonstrate the model's deployment in Konduit-Serving. After execution, a successful server deployment produces the output below.
```
The server is running on port 44301 with deployment id of 775bfbd3-2d18-435b-86c6-e9fbe7303cad
Result from server : {
  "output" : [ [ 0.035723433, 0.27029678, 0.69397974 ] ]
}

Process finished with exit code 0
```
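The "output" field holds one row of class probabilities per input row; since we sent a single Iris observation, there is a single row of three probabilities. If you want to turn the raw JSON response into a predicted class index, a minimal sketch using the same org.json classes already used in the deployment snippet might look like this (the hard-coded response string below is copied from the sample output above, and the mapping from index to Iris species depends on how the model was trained, so it is not asserted here):

```java
import org.json.JSONArray;
import org.json.JSONObject;

public class ParsePrediction {
    public static void main(String[] args) {
        // Sample response body, copied from the server output shown above
        String result = "{ \"output\" : [ [ 0.035723433, 0.27029678, 0.69397974 ] ] }";

        // The server returns one probability row per input row; we sent a single row
        JSONArray probabilities = new JSONObject(result)
                .getJSONArray("output")
                .getJSONArray(0);

        // Pick the class with the highest probability (arg max)
        int predictedClass = 0;
        for (int i = 1; i < probabilities.length(); i++) {
            if (probabilities.getDouble(i) > probabilities.getDouble(predictedClass)) {
                predictedClass = i;
            }
        }

        System.out.format("Predicted class index: %d (probability %.3f)%n",
                predictedClass, probabilities.getDouble(predictedClass));
    }
}
```

For the sample output above this prints class index 2, the class with probability 0.694.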
The complete inference configuration in YAML format is as follows.