Deeplearning4j (DL4J)

This page illustrates a simple client-server interaction to perform inference on a DL4J image classification model using the Python SDK for Konduit Serving.
import numpy as np
import os

This page documents two ways to create Konduit Serving configurations with the Python SDK:

  1. Using Python to create a configuration, and

  2. Writing the configuration as a YAML file, then serving it using the Python SDK.

Both approaches are shown throughout this page under the labels Python and YAML (or Python from YAML). For example, the imports for each approach are:

Python
from konduit import ModelConfig, TensorDataTypesConfig, ModelConfigType, \
    ModelStep, ParallelInferenceConfig, ServingConfig, InferenceConfiguration
from konduit.server import Server
from konduit.client import Client
Python from YAML
from konduit.load import server_from_file, client_from_file

Saving models in Deeplearning4j

The following is a short Java program that loads a simple CNN model from DL4J's model zoo, initializes weights, then saves the model to a new file, SimpleCNN.zip.

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.zoo.ZooModel;
import org.deeplearning4j.zoo.model.SimpleCNN;

import java.io.File;

public class SaveSimpleCNN {

    private static int nClasses = 5;
    private static boolean saveUpdater = false;

    public static void main(String[] args) throws Exception {
        // Build the SimpleCNN zoo model for 5 classes and 3 x 224 x 224 inputs.
        ZooModel zooModel = SimpleCNN.builder()
                .numClasses(nClasses)
                .inputShape(new int[]{3, 224, 224})
                .build();

        // SimpleCNN is defined as a MultiLayerNetwork; initialize its weights.
        MultiLayerConfiguration conf = ((SimpleCNN) zooModel).conf();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        System.out.println(net.summary());

        // Save the network (without updater state) to SimpleCNN.zip.
        File locationToSave = new File("SimpleCNN.zip");
        net.save(locationToSave, saveUpdater);
    }
}

A reference Java project using DL4J 1.0.0-beta6 is provided in this repository with a Maven pom.xml dependencies file. If using the IntelliJ IDEA IDE, open the java folder as a Maven project and run the main function of the SaveSimpleCNN class.

Overview

Konduit Serving works by defining a series of steps. These include operations such as

  1. Pre- or post-processing steps

  2. One or more machine learning models

  3. Transforming the output in a way that can be understood by humans

If deploying your model requires neither pre- nor post-processing, only one step, a machine learning model, is required. This configuration is defined using a single ModelStep.
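To make the idea of a series of steps concrete, the following plain-Python sketch chains three callables in order. It does not use the Konduit Serving API: normalize, fake_model, to_label and run_pipeline are hypothetical stand-ins for a pre-processing step, a model, a post-processing step and the pipeline runner.

```python
from functools import reduce


def normalize(pixels):
    """Pre-processing: scale 0-255 pixel values into the range [0, 1]."""
    return [p / 255 for p in pixels]


def fake_model(features):
    """Hypothetical stand-in for a model: returns two made-up class scores."""
    mean = sum(features) / len(features)
    return [1 - mean, mean]


def to_label(scores):
    """Post-processing: turn scores into a human-readable label."""
    labels = ["dark", "bright"]
    return labels[scores.index(max(scores))]


def run_pipeline(steps, data):
    """Apply each step to the output of the previous one."""
    return reduce(lambda value, step: step(value), steps, data)


steps = [normalize, fake_model, to_label]
result = run_pipeline(steps, [255, 255, 0, 255])
print(result)  # bright
```

With only a model and no pre- or post-processing, the list would contain a single element, which mirrors the single ModelStep used on this page.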

Before running this notebook, run the build_jar.py script or the konduit init command. Refer to the Building from source page for details.

Configure the step

Python

Define the DL4J configuration as a ModelConfig object.

  • tensor_data_types_config: This argument requires a TensorDataTypesConfig object, which takes a dictionary input_data_types. Its keys should represent column names, and its values should represent data types as strings, e.g. "INT32". See here for a list of supported data types.

  • model_config_type: This argument requires a ModelConfigType object. In the Java program above, we recognized that SimpleCNN is configured as a MultiLayerNetwork, in contrast with the ComputationGraph class, which is used for more complex networks. Specify model_type as MULTI_LAYER_NETWORK, and model_loading_path to point to the location of DL4J weights saved in the ZIP file format.

input_data_types = {"image_array": "FLOAT"}
input_names = list(input_data_types.keys())
output_names = ["output"]
port = np.random.randint(1024, 65535)  # ports below 1024 are usually privileged
dl4j_config = ModelConfig(
    tensor_data_types_config=TensorDataTypesConfig(
        input_data_types=input_data_types
    ),
    model_config_type=ModelConfigType(
        model_type="MULTI_LAYER_NETWORK",
        model_loading_path=os.path.abspath("../data/multilayernetwork/SimpleCNN.zip")
    )
)
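Because model_loading_path is built from a relative path, a wrong working directory only shows up later, when the server tries to load the model. As a minimal sketch, the hypothetical helper below (resolve_model_path is not part of Konduit Serving) resolves the path and fails early if the file is missing.

```python
import os


def resolve_model_path(relative_path):
    """Return the absolute path, raising early if the file does not exist."""
    absolute_path = os.path.abspath(relative_path)
    if not os.path.isfile(absolute_path):
        raise FileNotFoundError(f"Model file not found: {absolute_path}")
    return absolute_path
```

It could be used in place of os.path.abspath above, e.g. model_loading_path=resolve_model_path("../data/multilayernetwork/SimpleCNN.zip").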

For the ModelStep object, the following parameters are specified:

  • model_config: pass the ModelConfig object here.

  • parallel_inference_config: specify the number of workers to run in parallel. Here, we specify workers=1.

  • input_names: names for the input data.

  • output_names: names for the output data.

dl4j_step = ModelStep(
    model_config=dl4j_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=input_names,
    output_names=output_names
)
YAML

In the Java program above, we recognized that SimpleCNN is configured as a MultiLayerNetwork, in contrast with the ComputationGraph class, which is used for more complex networks. Hence, we create a step named dl4j_mln_step of type MULTI_LAYER_NETWORK.

  • model_loading_path denotes the location of the model file.

  • input_names and output_names denote the names of the input and output nodes, as lists.

  • input_data_types maps the name of each input node to its data type. See here for a list of supported data types.

  • parallel_inference_config: specify the number of workers to run in parallel. Here, we specify workers=1.

steps:
  dl4j_mln_step:
    type: MULTI_LAYER_NETWORK
    model_loading_path: ../data/multilayernetwork/SimpleCNN.zip
    input_names:
    - image_array
    output_names:
    - output
    input_data_types:
      image_array: FLOAT
    parallel_inference_config:
      workers: 1

To find the names of input and output nodes in DL4J,

  • for input_names: print the first element of net.getLayerNames().

  • for output_names: check the last layer when printing net.summary().

Configure the server

Specify the following:

  • http_port: select a random port.

  • input_data_format, output_data_format: specify input and output data formats as strings.
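A randomly drawn port number may already be in use. As an alternative sketch that uses only the standard library, the hypothetical helper find_free_port below asks the operating system for an unused port by binding to port 0. The port could still be taken by another process before the server starts, so treat this as a convenience rather than a guarantee.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```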

Python

Pass the ServingConfig to Server together with the steps, which are given as a Python list. In this case, there is a single step: dl4j_step.

serving_config = ServingConfig(
    http_port=port,
    input_data_format='NUMPY',
    output_data_format='NUMPY'
)
server = Server(
    serving_config=serving_config,
    steps=[dl4j_step]
)
YAML
serving:
  http_port: 1337
  input_data_format: NUMPY
  output_data_format: NUMPY
  log_timings: True
  extra_start_args: -Xmx8

Accepted input and output data formats are as follows:

  • Input: JSON, ARROW, IMAGE, ND4J (not yet implemented) and NUMPY.

  • Output: NUMPY, JSON, ND4J (not yet implemented) and ARROW.
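Since the formats are passed as plain strings, a typo is easy to miss. The sketch below checks a pair of format strings against the lists above before they are used in a configuration. The helper check_formats and the constant names are hypothetical and not part of Konduit Serving.

```python
# Formats copied from the lists above; ND4J is listed but not yet implemented.
INPUT_FORMATS = {"JSON", "ARROW", "IMAGE", "ND4J", "NUMPY"}
OUTPUT_FORMATS = {"NUMPY", "JSON", "ND4J", "ARROW"}
NOT_YET_IMPLEMENTED = {"ND4J"}


def check_formats(input_data_format, output_data_format):
    """Return the pair unchanged, or raise ValueError if either is unusable."""
    if input_data_format not in INPUT_FORMATS:
        raise ValueError(f"Unknown input format: {input_data_format}")
    if output_data_format not in OUTPUT_FORMATS:
        raise ValueError(f"Unknown output format: {output_data_format}")
    for name in (input_data_format, output_data_format):
        if name in NOT_YET_IMPLEMENTED:
            raise ValueError(f"Format not yet implemented: {name}")
    return input_data_format, output_data_format


print(check_formats("NUMPY", "NUMPY"))  # ('NUMPY', 'NUMPY')
```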

Start the server

Python
server.start()
Starting server...
Server has started successfully.
<subprocess.Popen at 0x2723b619ac8>
Python from YAML
konduit_yaml_path = "../yaml/deeplearning4j.yaml"
server = server_from_file(konduit_yaml_path)
server.start()
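In scripts that send requests right after starting the server, it can help to confirm that the port accepts connections first. The following is a generic standard-library sketch, not a Konduit Serving function: wait_for_port is a hypothetical helper, and a successful TCP connection only shows that something is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; retry until the deadline passes.
            time.sleep(interval)
    return False
```

For the YAML configuration on this page, the call would be wait_for_port("localhost", 1337).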

Configure the client

To configure the client, create a Client object with the following arguments:

  • input_data_format: data format passed to the server for inference.

  • output_data_format: data format returned by the server endpoint.

  • return_output_data_format: data format to be returned to the client. Note that this argument can be used to convert the output returned from the server to the client into a different format, e.g. NUMPY to JSON.

Python
client = Client(
    input_data_format='NUMPY',
    output_data_format='NUMPY',
    return_output_data_format="NUMPY",
    host='http://localhost',
    port=port
)
YAML

Add the following to your YAML configuration file:

client:
  input_data_format: NUMPY
  output_data_format: NUMPY
  return_output_data_format: NUMPY
  host: http://localhost
  port: 1337

Use client_from_file to create a Client object:

konduit_yaml_path = "../yaml/deeplearning4j.yaml"
client = client_from_file(konduit_yaml_path)

Inference

We generate a (3, 224, 224) array of random numbers between 0 and 255 as input to the model for prediction.

NDARRAY inputs to ModelSteps must be specified with a preceding batchSize dimension. For batches with a single observation, this can be done by using numpy.expand_dims() to add an additional dimension to your array.
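As a short illustration of the batch dimension described above, this sketch starts from a single image of shape (3, 224, 224) and uses numpy.expand_dims() to obtain the (1, 3, 224, 224) shape the model step expects. The variable names image and batch are illustrative only.

```python
import numpy as np

# A single image in channels-first layout: (channels, height, width).
image = np.random.randint(255, size=(3, 224, 224))

# Add a leading batch dimension so the shape becomes (1, 3, 224, 224).
batch = np.expand_dims(image, axis=0)

print(image.shape)  # (3, 224, 224)
print(batch.shape)  # (1, 3, 224, 224)
```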

Before requesting a prediction, we normalize the image to be between 0 and 1:

rand_image = np.random.randint(255, size=(1, 3, 224, 224)) / 255
prediction = client.predict({"image_array": rand_image})
print(prediction)
server.stop()
[[4.1741084e-02 3.2335979e-01 2.5368158e-02 3.9881383e-05 6.0949111e-01]]
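The printed row has one value per class (the model was saved with 5 classes), and the values sum to approximately 1, so they appear to be class probabilities. Under that assumption, the sketch below picks the most likely class index from the values printed above. Because the input was random noise, the resulting class carries no meaning here.

```python
# Values copied from the printed prediction above (one row per input image).
prediction = [[4.1741084e-02, 3.2335979e-01, 2.5368158e-02, 3.9881383e-05, 6.0949111e-01]]

probabilities = prediction[0]

# Index of the largest value, i.e. the most likely class.
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)

print(predicted_class)  # 4
print(round(sum(probabilities), 3))  # 1.0
```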

We can use the as_dict() method of the server's config attribute to view the overall configuration:

server.config.as_dict()
{'@type': 'InferenceConfiguration',
'steps': [{'@type': 'ModelStep',
'inputNames': ['image_array'],
'outputNames': ['output'],
'modelConfig': {'@type': 'ModelConfig',
'tensorDataTypesConfig': {'@type': 'TensorDataTypesConfig',
'inputDataTypes': {'image_array': 'FLOAT'}},
'modelConfigType': {'@type': 'ModelConfigType',
'modelType': 'MULTI_LAYER_NETWORK',
'modelLoadingPath': 'C:\\Users\\Skymind AI Berhad\\Documents\\konduit-serving-examples\\data\\multilayernetwork\\SimpleCNN.zip'}},
'parallelInferenceConfig': {'@type': 'ParallelInferenceConfig',
'workers': 1}}],
'servingConfig': {'@type': 'ServingConfig',
'httpPort': 57441,
'inputDataFormat': 'NUMPY',
'outputDataFormat': 'NUMPY',
'logTimings': True}}
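Assuming the returned dictionary contains only JSON-compatible values, as the output above suggests, it can be serialized with the standard json module, for example to keep a record of a deployment. The sketch below uses an abbreviated copy of the dictionary printed above rather than a live server object.

```python
import json

# Abbreviated copy of the dictionary printed above.
config = {
    "@type": "InferenceConfiguration",
    "servingConfig": {
        "@type": "ServingConfig",
        "httpPort": 57441,
        "inputDataFormat": "NUMPY",
        "outputDataFormat": "NUMPY",
        "logTimings": True,
    },
}

# Serialize to a JSON string and read it back.
text = json.dumps(config, indent=2)
restored = json.loads(text)

print(restored["servingConfig"]["httpPort"])  # 57441
```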