cd.ops

This submodule contains PyTorch operations.

Drawing Operations

draw_contours_(canvas, contours, close=True)

Draw contours.

Draw contours on canvas.

Note

This is an in-place operation: canvas is modified directly.

Parameters
  • canvas – Tensor[h, w].

  • contours – Contours in (x, y) format. Tensor[num_contours, num_points, 2].

  • close – Whether to close contours. This is necessary if the last point of a contour is not equal to the first.

Box Operations

contours2boxes(contours, axis=-2)

Contours to boxes.

Converts contours to bounding boxes in (x0, y0, x1, y1) format.

Parameters
  • contours – Contours as Tensor[(…, )num_points, 2].

  • axis – The num_points axis.

Returns

Bounding boxes in (x0, y0, x1, y1) format.
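
The conversion described above amounts to taking the minimum and maximum coordinates over the points axis. The following is a minimal sketch of that idea in plain PyTorch; contours_to_boxes_sketch is a hypothetical name used for illustration, not the library's implementation, and it assumes the coordinate axis is the last one.

```python
import torch


def contours_to_boxes_sketch(contours: torch.Tensor, axis: int = -2) -> torch.Tensor:
    """Illustrative stand-in: reduce the num_points axis with min and max."""
    xy_min = contours.min(dim=axis).values  # (..., 2): smallest x and y
    xy_max = contours.max(dim=axis).values  # (..., 2): largest x and y
    return torch.cat([xy_min, xy_max], dim=-1)  # (..., 4) as (x0, y0, x1, y1)


# One contour with three points in (x, y) format.
contours = torch.tensor([[[1., 2.], [4., 0.], [3., 5.]]])
boxes = contours_to_boxes_sketch(contours)  # tensor([[1., 0., 4., 5.]])
```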

nms(boxes, scores, thresh=0.5) → Tensor

Non-maximum suppression.

Perform non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).

Notes

  • Use torchvision.ops.boxes.nms if possible; this is just a “pure-python” alternative.

  • cd.ops.boxes.nms for 8270 boxes: 13.9 ms ± 131 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

  • tv.ops.boxes.nms for 8270 boxes: 1.84 ms ± 4.91 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  • cd.ops.boxes.nms for 179 boxes: 265 µs ± 1.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

  • tv.ops.boxes.nms for 179 boxes: 103 µs ± 2.61 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Parameters
  • boxes – Boxes. Tensor[num_boxes, 4] in (x0, y0, x1, y1) format.

  • scores – Scores. Tensor[num_boxes].

  • thresh – Threshold. Discards all overlapping boxes with IoU > thresh.

Returns

Keep indices. Tensor[num_keep].
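
For readers unfamiliar with the procedure, greedy NMS can be sketched as follows: repeatedly keep the highest-scoring remaining box and discard every other box whose IoU with it exceeds thresh. The names nms_sketch and iou_one_to_many are hypothetical and only illustrate the technique; this is not the library's code and is not optimized.

```python
import torch


def iou_one_to_many(box: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
    """IoU between one box [4] and several boxes [m, 4], all in (x0, y0, x1, y1)."""
    lt = torch.maximum(box[:2], others[:, :2])  # top-left of the intersection
    rb = torch.minimum(box[2:], others[:, 2:])  # bottom-right of the intersection
    wh = (rb - lt).clamp(min=0)  # zero width/height if boxes do not overlap
    inter = wh[:, 0] * wh[:, 1]
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)


def nms_sketch(boxes: torch.Tensor, scores: torch.Tensor, thresh: float = 0.5) -> torch.Tensor:
    order = scores.argsort(descending=True)  # best score first
    keep = []
    while order.numel() > 0:
        best = order[0]
        keep.append(best.item())
        rest = order[1:]
        if rest.numel() == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= thresh]  # discard boxes with IoU > thresh
    return torch.tensor(keep, dtype=torch.long)


boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [20., 20., 30., 30.]])
scores = torch.tensor([0.9, 0.8, 0.7])
keep = nms_sketch(boxes, scores)  # second box overlaps the first (IoU ≈ 0.68)
```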

pairwise_box_iou(boxes1: Tensor, boxes2: Tensor) → Tensor
pairwise_generalized_box_iou(boxes1: Tensor, boxes2: Tensor) → Tensor

Losses

iou_loss(boxes, boxes_targets, reduction='mean', generalized=True, method='linear', min_size=None)
log_margin_loss(inputs: Tensor, targets: Tensor, m_pos=0.9, m_neg=None, exponent=1, reduction='mean', eps=1e-06)
margin_loss(inputs: Tensor, targets: Tensor, m_pos=0.9, m_neg=None, exponent=2, reduction='mean')
r1_regularization(logits, inputs, gamma=1.0, reduction='sum')

R1 regularization.

A gradient penalty regularization. This regularization may for example be applied to a discriminator with real data:

R_1(\psi) := \frac{\gamma}{2} \mathbb{E}_{p_{\mathcal{D}}(x)} \left[\|\nabla D_\psi(x)\|^2\right]

Examples

>>> import torch.nn.functional as F
>>> real.requires_grad_(True)  # the penalty needs gradients w.r.t. the inputs
>>> real_logits = discriminator(real)
>>> loss_d_real = F.softplus(-real_logits)
>>> loss_d_r1 = r1_regularization(real_logits, real)
>>> loss_d_real = (loss_d_r1 + loss_d_real).mean()
>>> loss_d_real.backward()
>>> real.requires_grad_(False)

Parameters
  • logits – Logits.

  • inputs – Inputs.

  • gamma – Gamma.

  • reduction – How to reduce all non-batch dimensions. E.g. 'sum' or 'mean'.

Returns

Penalty Tensor[n].
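
To make the formula concrete, the penalty can be sketched with torch.autograd.grad: differentiate the logits with respect to the inputs and sum the squared gradient per sample. The function r1_penalty_sketch is a hypothetical illustration that assumes a 'sum' reduction over non-batch dimensions; it is not the library's implementation.

```python
import torch


def r1_penalty_sketch(logits: torch.Tensor, inputs: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # Gradient of the logits w.r.t. the inputs; create_graph=True keeps the
    # penalty itself differentiable so it can be part of a training loss.
    grads, = torch.autograd.grad(logits.sum(), inputs, create_graph=True)
    # gamma / 2 * squared gradient norm, summed over all non-batch dimensions.
    return 0.5 * gamma * grads.pow(2).flatten(1).sum(dim=1)


# Toy "discriminator": a fixed linear map, so the gradient equals w for every sample.
w = torch.tensor([1., 2., 3.])
x = torch.randn(2, 3, requires_grad=True)
logits = x @ w
penalty = r1_penalty_sketch(logits, x)  # 0.5 * (1 + 4 + 9) = 7 per sample
```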

reduce_loss(x: Tensor, reduction: str, **kwargs)

Reduce loss.

Reduces Tensor according to reduction.

Parameters
  • x – Input.

  • reduction – Reduction method. Must be the name of a symbol in torch, e.g. 'mean' or 'sum'.

  • **kwargs – Additional keyword arguments.

Returns

Reduced Tensor.
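
One way to read "a symbol of torch" is that the reduction name is looked up in the torch namespace and called on the input. The sketch below follows that reading; reduce_sketch is a hypothetical name, and the pass-through for 'none' is an assumption rather than documented behavior.

```python
import torch


def reduce_sketch(x: torch.Tensor, reduction, **kwargs) -> torch.Tensor:
    # Assumption: 'none' (or None) returns the input unchanged.
    if reduction is None or reduction == 'none':
        return x
    # Look up e.g. torch.mean or torch.sum by name and forward extra arguments.
    return getattr(torch, reduction)(x, **kwargs)


loss = torch.tensor([[1., 2.], [3., 4.]])
mean_loss = reduce_sketch(loss, 'mean')       # scalar 2.5
row_sums = reduce_sketch(loss, 'sum', dim=1)  # tensor([3., 7.])
```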

Normalization

pixel_norm(x, dim=1, eps=1e-08)

Pixel normalization.

Parameters
  • x – Input Tensor.

  • dim – Dimension to normalize.

  • eps – Epsilon.

Returns

Normalized Tensor.
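
Pixel normalization is commonly defined (as in progressive GAN training) as dividing each feature vector by its root mean square along the chosen dimension. The sketch below assumes that definition applies here; pixel_norm_sketch is a hypothetical name and not the library's code.

```python
import torch


def pixel_norm_sketch(x: torch.Tensor, dim: int = 1, eps: float = 1e-8) -> torch.Tensor:
    # Divide by the root mean square over `dim`; eps avoids division by zero.
    return x * torch.rsqrt(x.pow(2).mean(dim=dim, keepdim=True) + eps)


x = torch.full((2, 4, 3, 3), 2.0)
y = pixel_norm_sketch(x)  # every feature vector now has unit mean square
```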

Common Operations

downsample_labels(inputs, size: List[int])

Down-sample labels via max-pooling and interpolation.

Notes

  • Downsampling can lead to loss of labeled instances, both during max pooling and interpolation.

  • Typical timing: 0.08106 ms for 256x256

Parameters
  • inputs – Label Tensor to resize. Shape (n, c, h, w)

  • size – Tuple containing target height and width.

Returns

Down-sampled label Tensor.
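
A rough sketch of the two-step idea, max-pooling by an integer factor and then nearest-neighbor interpolation if the size still does not match, could look like this. How the kernel size is chosen and which interpolation mode is used are assumptions; downsample_labels_sketch is a hypothetical name.

```python
import torch
import torch.nn.functional as F


def downsample_labels_sketch(inputs: torch.Tensor, size) -> torch.Tensor:
    h, w = inputs.shape[-2:]
    th, tw = size
    # Assumption: pool with the integer ratio between input and target size.
    kernel = (max(h // th, 1), max(w // tw, 1))
    out = F.max_pool2d(inputs.float(), kernel_size=kernel, stride=kernel)
    # Assumption: fix any remaining size mismatch with nearest-neighbor sampling,
    # which keeps label values intact but may drop small instances.
    if tuple(out.shape[-2:]) != (th, tw):
        out = F.interpolate(out, size=(th, tw), mode='nearest')
    return out


labels = torch.zeros(1, 1, 4, 4)
labels[0, 0, 1, 1] = 3  # a single labeled pixel
small = downsample_labels_sketch(labels, [2, 2])  # label 3 survives in the top-left cell
```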

minibatch_std_layer(x, channels=1, group_channels=None, epsilon=1e-08)

Minibatch standard deviation layer.

The minibatch standard deviation layer first splits the batch dimension into slices of size group_channels. The channel dimension is split into channels slices. For each group, the standard deviation is calculated and averaged over the spatial dimensions and the channel slice depth. The result is broadcast to the spatial dimensions, repeated along the batch dimension, and then concatenated to the channel dimension of x.

Parameters
  • x – Input Tensor[n, c, h, w].

  • channels – Number of averaged standard deviation channels.

  • group_channels – Number of channels per group. Default: batch size.

  • epsilon – Epsilon.

Returns

Tensor[n, c + channels, h, w].
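
The steps described above can be sketched as follows. This follows the formulation commonly used in StyleGAN-style discriminators and may differ in detail from the library; minibatch_std_sketch is a hypothetical name, and the batch size is assumed to be divisible by the group size.

```python
import torch


def minibatch_std_sketch(x, channels=1, group_channels=None, epsilon=1e-8):
    n, c, h, w = x.shape
    g = n if group_channels is None else min(group_channels, n)  # group size
    # Split batch into groups of size g and channels into `channels` slices.
    y = x.reshape(g, n // g, channels, c // channels, h, w)
    y = y - y.mean(dim=0, keepdim=True)          # center within each group
    y = (y.pow(2).mean(dim=0) + epsilon).sqrt()  # std over the group members
    y = y.mean(dim=(2, 3, 4))                    # average over slice depth, h, w
    # Broadcast to spatial dimensions and repeat for every group member.
    y = y.reshape(-1, channels, 1, 1).repeat(g, 1, h, w)
    return torch.cat([x, y], dim=1)              # (n, c + channels, h, w)


x = torch.randn(4, 8, 5, 5)
out = minibatch_std_sketch(x, channels=2)
const_out = minibatch_std_sketch(torch.ones(4, 8, 5, 5))  # no variation across batch
```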

padded_stack2d(*images, dim=0) → Tensor

Padding stack.

Stacks 2d images along given axis. Spatial dimensions are padded according to largest height/width.

Parameters
  • *images – Tensor[…, h, w]

  • dim – Stack dimension.

Returns

Tensor
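
A minimal sketch of the padding-then-stacking idea, using torch.nn.functional.pad, is shown below. Padding with zeros at the bottom and right is an assumption, and padded_stack2d_sketch is a hypothetical name rather than the library's implementation.

```python
import torch
import torch.nn.functional as F


def padded_stack2d_sketch(*images: torch.Tensor, dim: int = 0) -> torch.Tensor:
    height = max(i.shape[-2] for i in images)
    width = max(i.shape[-1] for i in images)
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    padded = [F.pad(i, (0, width - i.shape[-1], 0, height - i.shape[-2])) for i in images]
    return torch.stack(padded, dim=dim)


a = torch.ones(2, 3)
b = torch.ones(4, 2)
stacked = padded_stack2d_sketch(a, b)  # shape (2, 4, 3)
```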

split_spatially(x, height, width=None)

Split spatially.

Splits spatial dimensions of Tensor x into patches of size (height, width) and adds the patches to the batch dimension.

Parameters
  • x – Input Tensor[n, c, h, w].

  • height – Patch height.

  • width – Patch width.

Returns

Tensor[n * h//height * w//width, c, height, width].
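
The reshaping involved can be sketched with reshape and permute. The patch order in the batch dimension (row-major over the patch grid) is an assumption, split_spatially_sketch is a hypothetical name, and h and w are assumed to be divisible by the patch size.

```python
import torch


def split_spatially_sketch(x: torch.Tensor, height: int, width=None) -> torch.Tensor:
    width = height if width is None else width
    n, c, h, w = x.shape
    # (n, c, rows, height, cols, width): separate patch grid from patch content.
    x = x.reshape(n, c, h // height, height, w // width, width)
    # Move the patch grid next to the batch dimension, then merge them.
    x = x.permute(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, c, height, width)


x = torch.arange(16.).reshape(1, 1, 4, 4)
patches = split_spatially_sketch(x, 2)  # four 2x2 patches
```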

strided_upsampling2d(x, factor=2, const=0)

Strided upsampling.

Upsample by inserting rows and columns filled with constant.

Parameters
  • x – Tensor[n, c, h, w].

  • factor – Upsampling factor.

  • const – Constant used to fill inserted rows and columns.

Returns

Tensor[n, c, h*factor, w*factor].
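
As an illustration, strided upsampling can be written as filling a larger tensor with the constant and copying the input into every factor-th position. Placing the original values at the top-left of each block is an assumption; strided_upsampling2d_sketch is a hypothetical name.

```python
import torch


def strided_upsampling2d_sketch(x: torch.Tensor, factor: int = 2, const=0) -> torch.Tensor:
    n, c, h, w = x.shape
    out = torch.full((n, c, h * factor, w * factor), const, dtype=x.dtype, device=x.device)
    out[..., ::factor, ::factor] = x  # original values on a strided grid
    return out


x = torch.tensor([[[[1., 2.], [3., 4.]]]])
up = strided_upsampling2d_sketch(x)
# up[0, 0] is
# [[1, 0, 2, 0],
#  [0, 0, 0, 0],
#  [3, 0, 4, 0],
#  [0, 0, 0, 0]]
```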

CPN Operations

batched_box_nms(boxes: List[Tensor], scores: List[Tensor], *args, iou_threshold: float) → Tuple[List[Tensor], ...]
fouriers2contours(fourier, locations, samples=64, sampling=None, cache: Optional[Dict[str, Tensor]] = None, cache_size: int = 16)
Parameters
  • fourier – Tensor[…, order, 4]

  • locations – Tensor[…, 2]

  • samples – Number of samples. Only used for default sampling, ignored otherwise.

  • sampling – Sampling t. Default is linspace 0..1. Device should match fourier and locations.

  • cache – Cache for initial zero tensors. When fourier shapes are consistent this can reduce execution time.

  • cache_size – Cache size.

Returns

Contours.
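
To illustrate how a Fourier descriptor of shape […, order, 4] can be turned into contour points, the sketch below evaluates a truncated Fourier series at samples evenly spaced sampling positions and offsets it by the location. The coefficient layout (the first two coefficients drive x, the last two drive y, paired with cosine and sine respectively) is an assumption made for this example and may not match the library; fourier_to_contour_sketch is a hypothetical name.

```python
import math
import torch


def fourier_to_contour_sketch(fourier: torch.Tensor, locations: torch.Tensor, samples: int = 64) -> torch.Tensor:
    order = fourier.shape[-2]
    t = torch.linspace(0, 1, samples)                 # default sampling 0..1
    n = torch.arange(1, order + 1, dtype=t.dtype)     # harmonic numbers
    phase = 2 * math.pi * n[:, None] * t[None, :]     # (order, samples)
    cos, sin = phase.cos(), phase.sin()
    a, b, c, d = fourier.unbind(-1)                   # each [..., order] (assumed layout)
    x = (a[..., None] * cos + b[..., None] * sin).sum(-2)  # [..., samples]
    y = (c[..., None] * cos + d[..., None] * sin).sum(-2)
    return torch.stack([x, y], dim=-1) + locations[..., None, :]  # [..., samples, 2]


# A first-order descriptor that traces a circle of radius 3 around (10, 20).
fourier = torch.tensor([[[3., 0., 0., 3.]]])
locations = torch.tensor([[10., 20.]])
contour = fourier_to_contour_sketch(fourier, locations)
radii = (contour - locations[:, None]).norm(dim=-1)
```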

get_scale(actual_size, original_size, flip=True, dtype=torch.float32)
order_weighting(order, max_w=5, min_w=1, spread=None) → Tensor
refinement_bucket_weight(index, base_index)
rel_location2abs_location(locations, cache: Optional[Dict[str, Tensor]] = None, cache_size: int = 16)
Parameters
  • locations – Tensor[…, 2, h, w]. In xy format.

  • cache – Optional cache. Can be None.

  • cache_size – Cache size.

Returns

Absolute locations.
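
Judging by the function name and the input shape, relative locations are presumably offsets from each pixel's own position. Under that assumption, the conversion can be sketched by adding a pixel coordinate grid in xy order; rel2abs_sketch is a hypothetical name, and caching is omitted.

```python
import torch


def rel2abs_sketch(locations: torch.Tensor) -> torch.Tensor:
    h, w = locations.shape[-2:]
    xs = torch.arange(w, dtype=locations.dtype, device=locations.device)
    ys = torch.arange(h, dtype=locations.dtype, device=locations.device)
    # Grid of shape (2, h, w): channel 0 holds x indices, channel 1 holds y indices.
    grid = torch.stack([xs[None, :].expand(h, w), ys[:, None].expand(h, w)])
    return locations + grid


rel = torch.zeros(1, 2, 2, 3)  # zero offsets: result equals the pixel grid
abs_loc = rel2abs_sketch(rel)
```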

resolve_refinement_buckets(samplings, num_buckets)
scale_contours(actual_size, original_size, contours)
Parameters
  • actual_size – Image size. E.g. (256, 256)

  • original_size – Original image size. E.g. (512, 512)

  • contours – Contours that are to be scaled from actual_size to original_size. E.g. array of shape (1, num_points, 2) for a single contour or tuple/list of (num_points, 2) arrays. Last dimension is interpreted as (x, y).

Returns

Rescaled contours.
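
The rescaling amounts to multiplying each coordinate by the ratio of original to actual size. The sketch below assumes sizes are given as (height, width), so the ratio is flipped to match the (x, y) order of contours; scale_contours_sketch is a hypothetical name and handles only a single tensor input.

```python
import torch


def scale_contours_sketch(actual_size, original_size, contours: torch.Tensor) -> torch.Tensor:
    actual = torch.tensor(actual_size, dtype=contours.dtype)
    original = torch.tensor(original_size, dtype=contours.dtype)
    # Assumption: sizes are (height, width); flip to (x, y) scale factors.
    scale = (original / actual).flip(0)
    return contours * scale


contours = torch.tensor([[[10., 20.]]])  # one contour with one point (x=10, y=20)
scaled = scale_contours_sketch((256, 128), (512, 512), contours)  # x * 4, y * 2
```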

scale_fourier(actual_size, original_size, fourier, location)
Parameters
  • actual_size – Image size. E.g. (256, 256)

  • original_size – Original image size. E.g. (512, 512)

  • fourier – Fourier descriptor. E.g. array of shape (…, order, 4).

  • location – Location. E.g. array of shape (…, 2). Last dimension is interpreted as (x, y).

Returns

Rescaled fourier, rescaled location.