The serve command is used to deploy a Konduit-Serving application and must be followed by a configuration file in either JSON or YAML format. The other options available with the serve command can be listed by running konduit serve --help.
Examples
The server is identified by an ID that can be set using the --serving-id or -id option, for example:
$ konduit serve -id inf_server -c config.json
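For orientation, here is a minimal sketch of what a configuration file such as config.json might contain. The field names below (host, port, pipeline, steps) are illustrative assumptions for this example, not the authoritative Konduit-Serving schema; consult the generated or documented configuration for the exact structure:

```json
{
  "host": "localhost",
  "port": 0,
  "pipeline": {
    "steps": []
  }
}
```

Setting the port to 0 would let the server pick a free port, which is consistent with the randomly assigned port (42823) visible in the log output below.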
You'll be able to see the following output (trimmed for brevity):
...
16:52:08.812 [vert.x-worker-thread-0] INFO o.d.nn.multilayer.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
16:52:08.838 [vert.x-worker-thread-0] INFO a.k.s.v.verticle.InferenceVerticle - (ASCII art banner trimmed)
16:52:08.838 [vert.x-worker-thread-0] INFO a.k.s.v.verticle.InferenceVerticle - Pending server start, please wait...
16:52:09.052 [vert.x-eventloop-thread-0] INFO a.k.s.v.p.h.v.InferenceVerticleHttp - Inference HTTP server is listening on host: 'localhost'
16:52:09.052 [vert.x-eventloop-thread-0] INFO a.k.s.v.p.h.v.InferenceVerticleHttp - Inference HTTP server started on port 42823 with 4 pipeline steps
To start with a specific profile, run the following command, which starts a server in the foreground with an id of 'inf_server', using 'config.json' as the configuration file and the GPU profile:
$ konduit serve -id inf_server -c config.json -p GPU
To learn more about profiles navigate to the following section:
You'll see output like this, although details such as the version numbers may differ depending on your local machine:
Starting konduit server...
...
16:06:49.704 [vert.x-worker-thread-0] INFO org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
16:06:49.722 [vert.x-worker-thread-0] INFO o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Linux]
16:06:49.722 [vert.x-worker-thread-0] INFO o.n.l.a.o.e.DefaultOpExecutioner - Cores: [12]; Memory: [5.2GB];
16:06:49.722 [vert.x-worker-thread-0] INFO o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
16:06:49.729 [vert.x-worker-thread-0] INFO o.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 11.0.221
16:06:49.730 [vert.x-worker-thread-0] INFO o.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [GeForce RTX 2060]; cc: [7.5]; Total memory: [6222839808]
16:06:49.731 [vert.x-worker-thread-0] INFO o.nd4j.linalg.jcublas.JCublasBackend - Backend build information:
GCC: "9.3.0"
STD version: 201402L
CUDA: 11.0.221
DEFAULT_ENGINE: samediff::ENGINE_CUDA
HAVE_FLATBUFFERS..
The following starts a server in the background with an id of 'inf_server', using 'config.yaml' as the configuration file, without creating the manifest JAR file before launching the server:
$ konduit serve -id inf_server -c config.yaml -b -rwm
The output will look like this, showing that the server is running in the background: