Deeplearning4j (DL4J)
This page illustrates a simple client-server interaction to perform inference on a DL4J image classification model using the Python SDK for Konduit Serving.
This page documents two ways to create Konduit Serving configurations with the Python SDK:
- using Python to create a configuration, and
- writing the configuration as a YAML file, then serving it using the Python SDK.
These approaches are documented in separate tabs throughout this page. For example, the following code block shows the imports for each approach in separate tabs:
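The original tabs are not reproduced here, but the Python-side imports would look roughly as follows. This is a sketch assuming the konduit pip package; exact module paths and class names may differ between SDK versions.

```python
import numpy as np

# Konduit Serving Python SDK imports (assumed package layout)
from konduit import (ModelConfig, TensorDataTypesConfig, ModelConfigType,
                     ModelStep, ParallelInferenceConfig, ServingConfig)
from konduit.server import Server
from konduit.client import Client
```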
Saving models in Deeplearning4j
The following is a short Java program that loads a simple CNN model from DL4J's model zoo, initializes its weights, then saves the model to a new file, SimpleCNN.zip.
A reference Java project using DL4J 1.0.0-beta6 is provided in this repository with a Maven pom.xml dependencies file. If using the IntelliJ IDEA IDE, open the java folder as a Maven project and run the main function of the SaveSimpleCNN class.
Overview
Konduit Serving works by defining a series of steps. These include operations such as:
- pre- or post-processing steps
- one or more machine learning models
- transforming the output in a way that can be understood by humans
If deploying your model requires neither pre- nor post-processing, only one step - a machine learning model - is required. This configuration is defined using a single ModelStep.
Before running this notebook, run the build_jar.py script or the konduit init command. Refer to the Building from source page for details.
Configure the step
Define the DL4J configuration as a ModelConfig object.
- tensor_data_types_config: The ModelConfig object requires a dictionary input_data_types. Its keys should represent column names, and the values should represent data types as strings, e.g. "INT32". See here for a list of supported data types.
- model_config_type: This argument requires a ModelConfigType object. In the Java program above, we noted that SimpleCNN is configured as a MultiLayerNetwork, in contrast with the ComputationGraph class, which is used for more complex networks. Specify model_type as MULTI_LAYER_NETWORK, and set model_loading_path to point to the location of the DL4J weights saved in the ZIP file format.
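Putting this together, the configuration might be sketched as follows. This assumes the konduit pip package; the input name image_array, the data type, and the model path are placeholders to adjust for your model.

```python
from konduit import ModelConfig, TensorDataTypesConfig, ModelConfigType

# Placeholder input name and data type; adjust to your model
input_data_types = {'image_array': 'FLOAT'}

dl4j_config = ModelConfig(
    tensor_data_types_config=TensorDataTypesConfig(
        input_data_types=input_data_types
    ),
    model_config_type=ModelConfigType(
        model_type='MULTI_LAYER_NETWORK',
        # Placeholder path to the saved DL4J weights
        model_loading_path='SimpleCNN.zip'
    )
)
```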
For the ModelStep object, the following parameters are specified:
- model_config: pass the ModelConfig object here.
- parallel_inference_config: specify the number of workers to run in parallel. Here, we specify workers=1.
- input_names: names for the input data.
- output_names: names for the output data.
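A sketch of the step, again assuming the konduit pip package; dl4j_config is the ModelConfig object built earlier, and the input and output names are placeholders.

```python
from konduit import ModelStep, ParallelInferenceConfig

# dl4j_config is the ModelConfig object defined previously
dl4j_step = ModelStep(
    model_config=dl4j_config,
    parallel_inference_config=ParallelInferenceConfig(workers=1),
    input_names=['image_array'],  # placeholder input name
    output_names=['output']       # placeholder output name
)
```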
To find the names of input and output nodes in DL4J:
- for input_names: print the first element of net.getLayerNames().
- for output_names: check the last layer when printing net.summary().
Configure the server
Specify the following:
- http_port: select a random port.
- input_data_format, output_data_format: specify the input and output data formats as strings.
The ServingConfig has to be passed to Server in addition to the steps, as a Python list. In this case, there is a single step: dl4j_step.
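A sketch of the server configuration, assuming the konduit pip package and the dl4j_step object defined earlier:

```python
import random

from konduit import ServingConfig
from konduit.server import Server

port = random.randint(1024, 65535)  # select a random port

serving_config = ServingConfig(
    http_port=port,
    input_data_format='NUMPY',
    output_data_format='NUMPY'
)

# The steps are passed as a Python list; here, a single dl4j_step
server = Server(serving_config=serving_config, steps=[dl4j_step])
```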
Accepted input and output data formats are as follows:
- Input: JSON, ARROW, IMAGE, ND4J (not yet implemented) and NUMPY.
- Output: NUMPY, JSON, ND4J (not yet implemented) and ARROW.
Start the server
Configure the client
To configure the client, create a Client object with the following arguments:
- input_data_format: the data format passed to the server for inference.
- output_data_format: the data format returned by the server endpoint.
- return_output_data_format: the data format to be returned to the client. Note that this argument can be used to convert the server's output into a different format, e.g. NUMPY to JSON.
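A client sketch using these arguments, assuming the konduit pip package; the host and port values mirror the server configuration above.

```python
from konduit.client import Client

client = Client(
    input_data_format='NUMPY',
    output_data_format='NUMPY',
    return_output_data_format='NUMPY',
    host='http://localhost',
    port=port  # the same port as configured for the server
)
```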
Inference
We generate a (3, 224, 224) array of random numbers between 0 and 255 as input to the model for prediction.
NDARRAY inputs to ModelStep objects must be specified with a leading batch-size dimension. For batches with a single observation, this can be done by using numpy.expand_dims() to add an additional dimension to your array.
Before requesting a prediction, we normalize the image to values between 0 and 1:
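The input preparation described above can be sketched in pure NumPy, independently of the SDK (array names are illustrative):

```python
import numpy as np

# Random (3, 224, 224) "image" with integer values between 0 and 255
rand_image = np.random.randint(0, 256, size=(3, 224, 224))

# Add a leading batch dimension: shape becomes (1, 3, 224, 224)
batched = np.expand_dims(rand_image, axis=0)

# Normalize pixel values to the [0, 1] range before prediction
normalized = batched / 255.0
```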
Again, we can use the as_dict() method of the config attribute of server to view the overall configuration: