What does «faster_rcnn» → «image_resizer» → «keep_aspect_ratio_resizer» mean in TensorFlow?

dmitrii_fediuk · May 13, 2019, 3:34am

github.com/tensorflow/models/blob/v1.13.0/research/object_detection/samples/configs/faster_rcnn_resnet101_voc07.config#L11-L14


keep_aspect_ratio_resizer {
  min_dimension: 600
  max_dimension: 1024
}

dmitrii_fediuk · May 13, 2019, 3:35am

What does «faster_rcnn» → «image_resizer» mean in TensorFlow?

github.com/tensorflow/models/blob/v1.13.0/research/object_detection/protos/image_resizer.proto#L22-L44


      
// Configuration proto for image resizer that keeps aspect ratio.
message KeepAspectRatioResizer {
  // Desired size of the smaller image dimension in pixels.
  optional int32 min_dimension = 1 [default = 600];

  // Desired size of the larger image dimension in pixels.
  optional int32 max_dimension = 2 [default = 1024];

  // Desired method when resizing image.
  optional ResizeType resize_method = 3 [default = BILINEAR];

  // Whether to pad the image with zeros so the output spatial size is
  // [max_dimension, max_dimension]. Note that the zeros are padded to the
  // bottom and the right of the resized image.
  optional bool pad_to_max_dimension = 4 [default = false];

  // Whether to also resize the image channels from 3 to 1 (RGB to grayscale).
  optional bool convert_to_grayscale = 5 [default = false];

  // Per-channel pad value. This is only used when pad_to_max_dimension is True.

This file has been truncated.

Specifying the keep_aspect_ratio_resizer follows the image resizing scheme described in the Faster R-CNN paper.
In this case it resizes an image so that the smaller edge is 600 pixels, unless that would make the longer edge greater than 1024 pixels, in which case it resizes such that the longer edge is 1024 pixels.
The resulting image always has the same aspect ratio as the input image.

github.com/tensorflow/models/issues/1794#issuecomment-311569473
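
The rule quoted above can be sketched in plain Python to see which output size a given input ends up with. This is only an illustration of the described behaviour, not the library's code: the function name `keep_aspect_ratio_size` is hypothetical, and rounding to the nearest integer is an assumption.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return the (height, width) targeted by the resizing rule described above.

    Hypothetical helper for illustration; not part of the Object Detection API.
    """
    # First try: scale so that the smaller side equals min_dimension.
    scale = min_dimension / min(height, width)

    # If that pushes the larger side past max_dimension,
    # scale so that the larger side equals max_dimension instead.
    if round(max(height, width) * scale) > max_dimension:
        scale = max_dimension / max(height, width)

    # The same scale is applied to both sides, so the aspect ratio is kept
    # (up to rounding to whole pixels, which is an assumption of this sketch).
    return round(height * scale), round(width * scale)


print(keep_aspect_ratio_size(480, 640))   # (600, 800)
print(keep_aspect_ratio_size(500, 2000))  # (256, 1024)
```

In the second call the image is very elongated, so the longer edge is capped at 1024 and the shorter edge ends up below 600.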

The bounding box coordinates are normalised to the range [0 .. 1], so resizing the images won't affect those annotations.
Masks are resized or otherwise transformed along with the image, and always have the same dimensions as the image.

github.com/tensorflow/models/issues/1794#issuecomment-432929170
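
A small sketch of why normalised boxes survive a resize: the same [0 .. 1] coordinates simply map to different pixel positions once the image size changes. The helper `to_pixels` is hypothetical, and the `(ymin, xmin, ymax, xmax)` ordering is assumed here for illustration.

```python
def to_pixels(box, height, width):
    """Convert a normalised (ymin, xmin, ymax, xmax) box to pixel coordinates.

    Hypothetical helper for illustration only.
    """
    ymin, xmin, ymax, xmax = box
    return (round(ymin * height), round(xmin * width),
            round(ymax * height), round(xmax * width))


# The annotation is stored once, in normalised form.
box = (0.25, 0.5, 0.75, 1.0)

# The same annotation applied to the original and to the resized image.
print(to_pixels(box, 480, 640))  # (120, 320, 360, 640)
print(to_pixels(box, 600, 800))  # (150, 400, 450, 800)
```

The annotation itself never changes; only its pixel interpretation follows the new image size.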

dmitrii_fediuk · May 19, 2019, 1:44am