OpenVINO Model Server Adapter¶
The OVMSAdapter implements the ModelAdapter interface, making it possible to use Model API with models hosted in OpenVINO Model Server.
Prerequisites¶
The OVMSAdapter enables inference via gRPC calls to OpenVINO Model Server, so in order to use it you need two things:
- OpenVINO Model Server that serves your model
- ovmsclient package installed to enable communication with the model server
Deploy OpenVINO Model Server¶
Model Server is distributed as a Docker image available on Docker Hub, so you can start it with the docker run command. See the model server documentation to learn how to deploy OpenVINO optimized models with OpenVINO Model Server.
Install ovmsclient¶
The ovmsclient package is distributed on PyPI, so the easiest way to install it is via:
pip3 install ovmsclient
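
Once the package is installed and the server is running, you can verify connectivity with a quick status query. Below is a minimal sketch, assuming the server setup used in the examples in this document (gRPC on localhost:9000, serving a model named model1):

from ovmsclient import make_grpc_client

# Create a gRPC client pointing at the model server
client = make_grpc_client("localhost:9000")

# Query the status of the served model; an AVAILABLE state means
# the model is loaded and ready to accept inference requests
status = client.get_model_status(model_name="model1")
print(status)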
Model configuration¶
When using OpenVINO Model Server, the model cannot be accessed directly from the client application (such as the OMZ demos), so any configuration must be done on the model server side.
Input reshaping¶
For some use cases you may want the model to be reshaped to match inputs of a certain size. In that case, provide the --shape auto parameter in the model server startup command. With that option, the model server will reshape the model input on demand to match the incoming data.
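
To check what input shape the server currently exposes, you can query the model metadata with ovmsclient. A minimal sketch, assuming the server address and model name used elsewhere in this document:

from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# The metadata lists the input and output names, shapes and precisions
# of the served model, so you can inspect the effect of --shape auto
metadata = client.get_model_metadata(model_name="model1")
print(metadata["inputs"])
print(metadata["outputs"])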
Inference options¶
It’s possible to configure inference-related options for the model in OpenVINO Model Server with the following options:
- --target_device - name of the device to load the model to
- --nireq - number of InferRequests
- --plugin_config - configuration of the device plugin
See model server configuration parameters for more details.
Example OVMS startup command¶
docker run -d --rm -v /home/user/models:/models -p 9000:9000 openvino/model_server:latest --model_path /models/model1 --model_name model1 --port 9000 --shape auto --nireq 32 --target_device CPU --plugin_config "{\"CPU_THROUGHPUT_STREAMS\": \"CPU_THROUGHPUT_AUTO\"}"
Note: In demos, when using --adapter ovms, inference options like -nireq, -nstreams and -nthreads, as well as device specification with -d, will be ignored.
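
With a server started as in the example command above, you can send a quick inference request directly through ovmsclient to make sure everything is wired up. This is a minimal sketch; the input name "data" and the input shape are placeholders that depend on your actual model:

import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")

# Placeholder input: replace the name and shape with those of your model
# (with --shape auto the server reshapes the model to match the data)
inputs = {"data": np.zeros((1, 3, 224, 224), dtype=np.float32)}

# Returns the output as a numpy array (or a dict of arrays for
# models with multiple outputs)
results = client.predict(inputs=inputs, model_name="model1")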
Running demos with OVMSAdapter¶
To run a demo with a model served in OpenVINO Model Server, you need to provide the --adapter ovms option and set the -m parameter to point to the model inference service instead of the model files. The model parameter for the OVMSAdapter follows this schema:
<service_address>/models/<model_name>[:<model_version>]
- <service_address> - OVMS gRPC service address in the form <address>:<port>
- <model_name> - name of the target model (the one specified by the model_name parameter in the model server startup command)
- <model_version> (optional) - version of the target model (default: latest)
Assuming that the model server runs on the same machine as the demo, exposes a gRPC service on port 9000 and serves a model called model1, the value of the -m parameter would be:
- localhost:9000/models/model1 - requesting the latest model version
- localhost:9000/models/model1:2 - requesting model version number 2
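
For illustration, here is a sketch of how an adapter could be created programmatically from those model addresses. The import path below assumes the Model API package bundled with the OMZ demos; adjust it to match your installation:

from openvino.model_zoo.model_api.adapters import OVMSAdapter

# Latest version of model1 served at localhost:9000
model_adapter = OVMSAdapter("localhost:9000/models/model1")

# Specific model version (here: version 2)
model_adapter_v2 = OVMSAdapter("localhost:9000/models/model1:2")

The resulting adapter is then passed to a Model API model wrapper in place of the adapter used for local inference.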