YOLO_BBOX

YoloToBoundingBoxStep converts an NDArray containing the predictions of a YOLO model into a list of bounding boxes. The NDArray is assumed to be in standard YOLO output format, after activation functions (sigmoid/softmax) have been applied.

Input must be a float or double NDArray with shape [minibatch, B*(5+C), H, W] if nchw is true, or [minibatch, H, W, B*(5+C)] if nchw is false, where B is the number of bounding box priors, C is the number of classes, H is the output/label height and W is the output/label width.
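To make the shape concrete, here is a minimal NumPy sketch (the head size is hypothetical: B = 5 box priors, C = 80 classes, on a 13x13 output grid):

```python
import numpy as np

# Hypothetical YOLO head: B = 5 box priors, C = 80 classes, 13x13 grid.
B, C, H, W = 5, 80, 13, 13
minibatch = 1

nchw_preds = np.zeros((minibatch, B * (5 + C), H, W), dtype=np.float32)  # nchw = true
nhwc_preds = np.zeros((minibatch, H, W, B * (5 + C)), dtype=np.float32)  # nchw = false

assert nchw_preds.shape == (1, 425, 13, 13)
assert nhwc_preds.shape == (1, 13, 13, 425)
```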

Along the channel dimension for each box prior, we have the following values:

  • 0: px = predicted x location within grid cell, 0.0 to 1.0

  • 1: py = predicted y location within grid cell, 0.0 to 1.0

  • 2: pw = predicted width, in grid cell units, 0.0 to W (for example, pw = 2.0 -> 2.0/W fraction of the image)

  • 3: ph = predicted height, in grid cell units, 0.0 to H (for example, ph = 2.0 -> 2.0/H fraction of the image)

  • 4: c = object confidence - i.e., the probability that an object is present, 0.0 to 1.0

  • 5 to 4+C = probability of each class (given that an object is present), 0.0 to 1.0, with values summing to 1.0
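The following plain NumPy sketch (not the Konduit Serving API; the array contents are random placeholders) shows how this per-prior layout can be sliced out of an nchw = true input:

```python
import numpy as np

B, C, H, W = 5, 80, 13, 13                  # hypothetical head, as above
preds = np.random.rand(1, B * (5 + C), H, W).astype(np.float32)

# Split the channel dimension into (box priors, 5 + C values per prior).
per_prior = preds.reshape(1, B, 5 + C, H, W)

px      = per_prior[:, :, 0]   # x center within grid cell, 0.0 to 1.0
py      = per_prior[:, :, 1]   # y center within grid cell, 0.0 to 1.0
pw      = per_prior[:, :, 2]   # width, in grid cell units
ph      = per_prior[:, :, 3]   # height, in grid cell units
conf    = per_prior[:, :, 4]   # object confidence, 0.0 to 1.0
classes = per_prior[:, :, 5:]  # per-class probabilities, summing to 1.0

# Boxes below the object confidence threshold are dropped (cf. the
# threshold config below), then non-max suppression is applied.
keep = conf > 0.5
```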

Note that the height/width dimensions are in grid cell units - for example, with a 416x416 input and 32x downsampling by the network, we have 13x13 grid cells (each corresponding to 32 pixels in the input image). Thus, an x center of 5.5 would be xPixels = 5.5 x 32 = 176 pixels from the left. Widths and heights are similar: in this example, a width of 13 would span the entire image (416 pixels), and a height of 6.5 would be 6.5/13 = 0.5 of the image (208 pixels).
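As a sketch of this conversion (all values hypothetical, matching the 416x416 / 13x13 example above):

```python
grid_w, img_w = 13, 416
cell = img_w // grid_w              # 32 pixels per grid cell

i, j = 5, 3                         # grid cell column/row indices
px, py = 0.5, 0.5                   # center offsets within the cell
pw, ph = 2.0, 6.5                   # width/height in grid cell units

x_center = (i + px) * cell          # 5.5 * 32 = 176.0 pixels from the left
y_center = (j + py) * cell
w_pixels = pw * cell                # 64 pixels (2.0/13 of the image width)
h_pixels = ph * cell                # 208 pixels (6.5/13 = 0.5 of the image)

# Corner coordinates of the resulting bounding box, in pixels:
x1, y1 = x_center - w_pixels / 2, y_center - h_pixels / 2
x2, y2 = x_center + w_pixels / 2, y_center + h_pixels / 2
```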

Configs

  • input - Name of the input (optional). If not set, the input is inferred (assuming a single NDArray exists in the input).

  • output - Name of the output (optional). If not set, bounding_boxes is used.

  • nchw - The data format: NCHW (true) or NHWC (false), also known as channels first (true) or channels last (false).

  • threshold - The object confidence threshold, in range 0.0 to 1.0. Any boxes with object confidence less than this are ignored. Default value is 0.5.

  • nmsThreshold - Non-max suppression threshold used to filter closely overlapping boxes. Default value is 0.5.

  • numClasses - Number of classes. Not required if classLabels are provided.

  • classLabels - Optional: names of the object classes.

  • keepOtherValues - If true, all other input fields in the data instance are kept; if false, only the list of bounding boxes is returned. Default is true.
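For illustration, a YOLO_BBOX step using these configs might look as follows in a JSON pipeline definition. This is a minimal sketch, assuming the @type step-discriminator convention used by the JSON configurations elsewhere in these docs; the values shown are hypothetical (see the Configurations section for the exact syntax).

```json
{
  "@type": "YOLO_BBOX",
  "input": "preds",
  "output": "bounding_boxes",
  "nchw": true,
  "threshold": 0.5,
  "nmsThreshold": 0.5,
  "numClasses": 80,
  "keepOtherValues": true
}
```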
