cd.ops
This submodule contains PyTorch operations.
Drawing Operations
- draw_contours_(canvas, contours, close=True)
Draw contours.
Draw contours on canvas.
Note
This is an inplace operation.
- Parameters:
canvas – Tensor[h, w].
contours – Contours in (x, y) format. Tensor[num_contours, num_points, 2].
close – Whether to close contours. This is necessary if the last point of a contour is not equal to the first.
Box Operations
- contours2boxes(contours, axis=-2)
Contours to boxes.
Converts contours to bounding boxes in (x0, y0, x1, y1) format.
- Parameters:
contours – Contours as Tensor[(…, )num_points, 2]
axis – The num_points axis.
Returns:
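The conversion above can be sketched as a min/max reduction over the num_points axis. This is a minimal illustration, not the library's implementation; the name contours_to_boxes_sketch is hypothetical.

```python
import torch

def contours_to_boxes_sketch(contours: torch.Tensor, axis: int = -2) -> torch.Tensor:
    """Reduce the num_points axis to boxes in (x0, y0, x1, y1) format."""
    mins = contours.min(dim=axis).values  # (..., 2): smallest x and y
    maxs = contours.max(dim=axis).values  # (..., 2): largest x and y
    return torch.cat((mins, maxs), dim=-1)  # (..., 4)

contours = torch.tensor([[[1., 2.], [4., 3.], [2., 6.]]])  # Tensor[1, 3, 2]
boxes = contours_to_boxes_sketch(contours)
print(boxes)  # tensor([[1., 2., 4., 6.]])
```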
- filter_by_box_voting(boxes, thresh, min_vote, return_votes: bool = False)
Filter by box voting.
Filter boxes by popular vote. A box receives a vote if it has an IoU larger than thresh with another box. Each box also votes for itself, hence, the smallest possible vote is 1.
- Parameters:
boxes – Boxes.
thresh – IoU threshold for two boxes to be considered redundant, counting as a vote for both boxes.
min_vote – Minimum voting for a box to be accepted. A vote is the sum of IoUs of a box compared to all boxes, including itself. Hence, the smallest possible vote is 1.
return_votes – Whether to return voting results.
- Returns:
Keep indices and optionally voting results.
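A minimal sketch of the voting idea, assuming that a vote is counted for every pair with IoU above thresh (the reading given in the function description) and that a box is accepted when its vote count is at least min_vote. Both helper names are hypothetical and the library may compute votes differently.

```python
import torch

def pairwise_iou_sketch(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """IoU between every box in a and every box in b, (x0, y0, x1, y1) format."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    lt = torch.max(a[:, None, :2], b[None, :, :2])  # top-left of intersection
    rb = torch.min(a[:, None, 2:], b[None, :, 2:])  # bottom-right of intersection
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def box_voting_sketch(boxes, thresh, min_vote):
    iou = pairwise_iou_sketch(boxes, boxes)
    votes = (iou > thresh).sum(dim=1)  # the diagonal contributes the self-vote
    keep = torch.nonzero(votes >= min_vote).flatten()  # assumed acceptance rule
    return keep, votes

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [0., 1., 10., 11.],
                      [50., 50., 60., 60.]])
keep, votes = box_voting_sketch(boxes, thresh=0.5, min_vote=2)
print(votes)  # tensor([3, 3, 3, 1])
print(keep)   # tensor([0, 1, 2])
```

The isolated fourth box only receives its own vote and is filtered out.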
- nms(boxes, scores, thresh=0.5) → Tensor
Non-maximum suppression.
Perform non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).
Notes
Use torchvision.ops.boxes.nms if possible; this is just a “pure-python” alternative.
cd.ops.boxes.nms for 8270 boxes: 13.9 ms ± 131 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
tv.ops.boxes.nms for 8270 boxes: 1.84 ms ± 4.91 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
cd.ops.boxes.nms for 179 boxes: 265 µs ± 1.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
tv.ops.boxes.nms for 179 boxes: 103 µs ± 2.61 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
- Parameters:
boxes – Boxes. Tensor[num_boxes, 4] in (x0, y0, x1, y1) format.
scores – Scores. Tensor[num_boxes].
thresh – Threshold. Discards all overlapping boxes with IoU > thresh.
- Returns:
Keep indices. Tensor[num_keep].
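For readers unfamiliar with the procedure, greedy NMS can be sketched in plain PyTorch as follows. This is an illustrative version under the parameter conventions listed above, not the library's code; nms_sketch is a hypothetical name.

```python
import torch

def nms_sketch(boxes: torch.Tensor, scores: torch.Tensor, thresh: float = 0.5) -> torch.Tensor:
    """Keep the best-scoring box, drop boxes with IoU > thresh, repeat."""
    order = scores.argsort(descending=True)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the kept box with all remaining candidates
        lt = torch.max(boxes[i, :2], boxes[rest, :2])
        rb = torch.min(boxes[i, 2:], boxes[rest, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]  # survivors for the next round
    return torch.tensor(keep, dtype=torch.long)

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [0., 1., 10., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7, 0.6])
keep = nms_sketch(boxes, scores, thresh=0.5)
print(keep)  # tensor([0, 3])
```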
- pairwise_box_iou(boxes1: Tensor, boxes2: Tensor) → Tensor
- pairwise_generalized_box_iou(boxes1: Tensor, boxes2: Tensor) → Tensor
Losses
- iou_loss(boxes, boxes_targets, reduction='mean', generalized=True, method='linear', min_size=None)
- log_margin_loss(inputs: Tensor, targets: Tensor, m_pos=0.9, m_neg=None, exponent=1, reduction='mean', eps=1e-06)
- margin_loss(inputs: Tensor, targets: Tensor, m_pos=0.9, m_neg=None, exponent=2, reduction='mean')
- r1_regularization(logits, inputs, gamma=1.0, reduction='sum')
R1 regularization.
A gradient penalty regularization. This regularization may for example be applied to a discriminator with real data:
References
Examples
>>> real.requires_grad_(True)
... real_logits = discriminator(real)
... loss_d_real = F.softplus(-real_logits)
... loss_d_r1 = r1_regularization(real_logits, real)
... loss_d_real = (loss_d_r1 + loss_d_real).mean()
... loss_d_real.backward()
... real.requires_grad_(False)
- Parameters:
logits – Logits.
inputs – Inputs.
gamma – Gamma.
reduction – How to reduce all non-batch dimensions. E.g. 'sum' or 'mean'.
- Returns:
Penalty Tensor[n].
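A minimal sketch of a gradient penalty of this kind, assuming the common R1 formulation (gamma / 2 times the squared gradient norm of the logits with respect to the inputs, summed over non-batch dimensions). The scaling factor and the name r1_sketch are assumptions; the library's exact definition may differ.

```python
import torch

def r1_sketch(logits: torch.Tensor, inputs: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # Gradient of the summed logits with respect to the inputs
    grads, = torch.autograd.grad(logits.sum(), inputs, create_graph=True)
    # Squared gradient norm per sample (sum over all non-batch dimensions)
    penalty = grads.pow(2).flatten(1).sum(dim=1)
    return 0.5 * gamma * penalty  # assumed gamma / 2 scaling

real = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)
weight = torch.tensor([0.5, -1.0])
logits = (real * weight).sum(dim=1)  # toy linear stand-in for a discriminator
penalty = r1_sketch(logits, real)
print(penalty)
```

For the linear toy model the gradient equals weight for every sample, so each entry is 0.5 * (0.25 + 1.0) = 0.625.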
- reduce_loss(x: Tensor, reduction: str, **kwargs)
Reduce loss.
Reduces Tensor according to reduction.
- Parameters:
x – Input.
reduction – Reduction method. Must be a symbol of torch.
**kwargs – Additional keyword arguments.
- Returns:
Reduced Tensor.
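One way to read "must be a symbol of torch" is a name lookup in the torch namespace. The sketch below illustrates that reading; reduce_sketch is a hypothetical name and the library may handle additional cases.

```python
import torch

def reduce_sketch(x: torch.Tensor, reduction: str, **kwargs) -> torch.Tensor:
    # Look up the reduction by name, e.g. 'mean' resolves to torch.mean
    return getattr(torch, reduction)(x, **kwargs)

x = torch.tensor([[1., 2.], [3., 6.]])
total = reduce_sketch(x, 'sum')
row_mean = reduce_sketch(x, 'mean', dim=1)
print(total)     # tensor(12.)
print(row_mean)  # tensor([1.5000, 4.5000])
```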
Normalization
- pixel_norm(x, dim=1, eps=1e-08)
Pixel normalization.
References
https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf (“Local Response Norm”)
- Parameters:
x – Input Tensor.
dim – Dimension to normalize.
eps – Epsilon.
- Returns:
Normalized Tensor.
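A minimal sketch, assuming the usual definition of pixel normalization: each position is divided by the root mean square of its values along dim. The name pixel_norm_sketch is hypothetical.

```python
import torch

def pixel_norm_sketch(x: torch.Tensor, dim: int = 1, eps: float = 1e-8) -> torch.Tensor:
    # Divide by the root mean square along dim; eps guards against division by zero
    return x * torch.rsqrt(x.pow(2).mean(dim=dim, keepdim=True) + eps)

torch.manual_seed(0)
x = torch.randn(2, 8, 4, 4)
y = pixel_norm_sketch(x)
rms = y.pow(2).mean(dim=1).sqrt()  # close to 1 at every pixel after normalization
print(rms.min().item(), rms.max().item())
```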
Common Operations
- downsample_labels(inputs, size: List[int])
Down-sample via max-pooling and interpolation.
Notes
Downsampling can lead to loss of labeled instances, both during max pooling and interpolation.
Typical timing: 0.08106 ms for 256x256
- Parameters:
inputs – Label Tensor to resize. Shape (n, c, h, w)
size – Tuple containing target height and width.
Returns:
- interpolate_vector(v, size, **kwargs)
Interpolate vector.
- Parameters:
v – Vector as Tensor[d].
size – Target size.
**kwargs – Keyword arguments for F.interpolate.
Returns:
- minibatch_std_layer(x, channels=1, group_channels=None, epsilon=1e-08)
Minibatch standard deviation layer.
The minibatch standard deviation layer first splits the batch dimension into slices of size group_channels. The channel dimension is split into channels slices. For the groups the standard deviation is calculated and averaged over spatial dimensions and channel slice depth. The result is broadcast to the spatial dimensions, repeated for the batch dimension and then concatenated to the channel dimension of x.
References
- Parameters:
x – Input Tensor[n, c, h, w].
channels – Number of averaged standard deviation channels.
group_channels – Number of channels per group. Default: batch size.
epsilon – Epsilon.
- Returns:
Tensor[n, c + channels, h, w].
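The steps described above can be sketched as follows. The grouping order follows a layout commonly used in GAN implementations and is an assumption, as is the name minibatch_std_sketch; the batch size must be divisible by the group size and c by channels.

```python
import torch

def minibatch_std_sketch(x, channels=1, group_channels=None, epsilon=1e-8):
    n, c, h, w = x.shape
    g = n if group_channels is None else min(group_channels, n)
    # Split the batch into groups of size g and the channels into `channels` slices
    y = x.reshape(g, -1, channels, c // channels, h, w)
    y = y - y.mean(dim=0, keepdim=True)
    y = (y.pow(2).mean(dim=0) + epsilon).sqrt()  # standard deviation within each group
    y = y.mean(dim=(2, 3, 4))                    # average over slice depth, h and w
    # Broadcast to the spatial dimensions and repeat for the batch dimension
    y = y.reshape(-1, channels, 1, 1).repeat(g, 1, h, w)
    return torch.cat((x, y), dim=1)

torch.manual_seed(0)
x = torch.randn(4, 6, 5, 5)
out = minibatch_std_sketch(x, channels=2, group_channels=2)
print(out.shape)  # torch.Size([4, 8, 5, 5])
```

The first c channels of the output are the unchanged input; the appended channels are constant over the spatial dimensions.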
- pad_to_div(v, div=32, nd=2, return_pad=False, **kwargs)
Pad to div.
Applies padding to input Tensor to make it divisible by div.
- Parameters:
v – Input Tensor.
div – Div tuple. If single integer, nd is used to define number of dimensions to pad.
nd – Number of dimensions to pad. Only used if div is not a tuple or list.
return_pad – Whether to return padding values.
**kwargs – Additional keyword arguments for F.pad.
- Returns:
Padded Tensor.
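A minimal sketch of the padding arithmetic, assuming that padding is appended at the end of each of the last nd dimensions (the placement is not stated for this function). pad_to_div_sketch is a hypothetical name.

```python
import torch
import torch.nn.functional as F

def pad_to_div_sketch(v: torch.Tensor, div=32, nd=2, **kwargs) -> torch.Tensor:
    divs = (div,) * nd if isinstance(div, int) else tuple(div)
    pad = []
    # F.pad expects (before, after) pairs starting from the last dimension
    for size, d in zip(reversed(v.shape[-len(divs):]), reversed(divs)):
        pad += [0, (-size) % d]  # amount missing to the next multiple of d
    return F.pad(v, pad, **kwargs)

x = torch.zeros(1, 3, 50, 70)
y = pad_to_div_sketch(x, div=32)
print(y.shape)  # torch.Size([1, 3, 64, 96])
```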
- pad_to_size(v, size, return_pad=False, **kwargs)
Pad to size.
Applies padding to end of each dimension.
- Parameters:
v – Input Tensor.
size – Size tuple. Last element corresponds to last dimension of input v.
return_pad – Whether to return padding values.
**kwargs – Additional keyword arguments for F.pad.
- Returns:
Padded Tensor.
- padded_stack2d(*images, dim=0) → Tensor
Padding stack.
Stacks 2d images along given axis. Spatial dimensions are padded according to largest height/width.
- Parameters:
*images – Tensor[…, h, w]
dim – Stack dimension.
- Returns:
Tensor
- split_spatially(x, size)
Split spatially.
Splits spatial dimensions of Tensor x into patches of given size and adds the patches to the batch dimension.
- Parameters:
x – Input Tensor[n, c, h, w, …].
size – Patch size of the splits.
- Returns:
Tensor[n * h//height * w//width, c, height, width].
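For the 2d case, the split can be sketched with a reshape and a permute. The sketch assumes that h and w are divisible by the patch size and that patches are ordered row by row per sample; both are assumptions, and split_spatially_sketch is a hypothetical name.

```python
import torch

def split_spatially_sketch(x: torch.Tensor, size) -> torch.Tensor:
    height, width = size
    n, c, h, w = x.shape
    # Cut h and w into non-overlapping tiles
    x = x.reshape(n, c, h // height, height, w // width, width)
    # Move the tile grid next to the batch axis, then merge it into the batch
    x = x.permute(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, c, height, width)

x = torch.arange(2 * 1 * 4 * 4, dtype=torch.float32).reshape(2, 1, 4, 4)
patches = split_spatially_sketch(x, (2, 2))
print(patches.shape)  # torch.Size([8, 1, 2, 2])
print(patches[0, 0])  # top-left 2x2 patch of the first sample
```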
- strided_upsampling2d(x, factor=2, const=0)
Strided upsampling.
Upsample by inserting rows and columns filled with constant.
- Parameters:
x – Tensor[n, c, h, w].
factor – Upsampling factor.
const – Constant used to fill inserted rows and columns.
- Returns:
Tensor[n, c, h*factor, w*factor].
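A minimal sketch of the idea, assuming the original values are placed at the top-left position of each factor x factor cell. The placement and the name strided_upsampling2d_sketch are assumptions.

```python
import torch

def strided_upsampling2d_sketch(x: torch.Tensor, factor: int = 2, const=0) -> torch.Tensor:
    n, c, h, w = x.shape
    out = torch.full((n, c, h * factor, w * factor), const, dtype=x.dtype, device=x.device)
    out[..., ::factor, ::factor] = x  # original values land on a strided grid
    return out

x = torch.tensor([[[[1., 2.], [3., 4.]]]])
y = strided_upsampling2d_sketch(x)
print(y[0, 0])
# tensor([[1., 0., 2., 0.],
#         [0., 0., 0., 0.],
#         [3., 0., 4., 0.],
#         [0., 0., 0., 0.]])
```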
CPN Operations
- batched_box_nms(boxes: List[Tensor], scores: List[Tensor], *args, iou_threshold: float) → Tuple[List[Tensor], ...]
- batched_box_nmsi(boxes: List[Tensor], scores: List[Tensor], iou_threshold: float) → List[Tensor]
- filter_contours_by_stitching_rule(contours, tile_size, overlaps, rule='ex_br', offsets=None, indices=False)
Notes
The implemented stitching rules are considered greedy algorithms.
Border exclusion rules assume border behaviour of models to be consistent, which may not be the case in practice.
- Parameters:
contours – Contours. Tensor[num_contours, num_points, 2]
tile_size – Tile size. Tensor[2] or tuple as (height, width).
overlaps – Overlaps for start and end of each spatial dimension. Tensor[2, 2].
rule – Stitching rule. Comma separation allowed.
offsets – Optional offsets for contours.
indices – Whether to return keep indices instead of a keep mask.
- Returns:
Keep indices or mask.
- fouriers2contours(fourier, locations, samples=64, sampling=None, cache: Dict[str, Tensor] | None = None, cache_size: int = 16)
- Parameters:
fourier – Tensor[…, order, 4]
locations – Tensor[…, 2]
samples – Number of samples. Only used for default sampling, ignored otherwise.
sampling – Sampling t. Default is linspace 0..1. Device should match fourier and locations.
cache – Cache for initial zero tensors. When fourier shapes are consistent this can reduce execution time.
cache_size – Cache size.
- Returns:
Contours.
- get_scale(actual_size, original_size, flip=True, dtype=torch.float32)
- order_weighting(order, max_w=5, min_w=1, spread=None) → Tensor
- refinement_bucket_weight(index, base_index)
- rel_location2abs_location(locations, cache: Dict[str, Tensor] | None = None, cache_size: int = 16)
- Parameters:
locations – Tensor[…, 2, h, w]. In xy format.
cache – can be None.
cache_size –
Returns:
- remove_border_contours(contours, size, padding=1, top=True, right=True, bottom=True, left=True, offsets=None)
Remove border contours.
Remove contours that touch border regions.
- Parameters:
contours – Contours as Tensor[num_contours, num_points, 2].
size – Context size.
padding – Padding. Determines the thickness of the border region. padding=1 removes all contours that overlap with the outer 1px frame.
top – Whether to test top border.
right – Whether to test right border.
bottom – Whether to test bottom border.
left – Whether to test left border.
offsets – Optional contour offsets in xy format.
- Returns:
Keep mask as Tensor[num_contours].
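A simplified sketch of such a keep mask, testing all four borders at once. It assumes size is given as (height, width) and that a contour is dropped as soon as one of its points lies within padding pixels of the frame; the exact comparisons used by the library may differ, and border_keep_mask_sketch is a hypothetical name.

```python
import torch

def border_keep_mask_sketch(contours: torch.Tensor, size, padding: int = 1) -> torch.Tensor:
    h, w = size  # assumed (height, width)
    x, y = contours[..., 0], contours[..., 1]
    inside = (x >= padding) & (x < w - padding) & (y >= padding) & (y < h - padding)
    return inside.all(dim=-1)  # keep only contours with every point inside

contours = torch.tensor([
    [[5., 5.], [9., 5.], [9., 9.]],  # fully inside a 16x16 image
    [[0., 4.], [3., 4.], [3., 8.]],  # touches the left border column
])
keep = border_keep_mask_sketch(contours, size=(16, 16))
print(keep)  # tensor([ True, False])
```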
- resolve_refinement_buckets(samplings, num_buckets)
- scale_contours(actual_size, original_size, contours)
- Parameters:
actual_size – Image size. E.g. (256, 256)
original_size – Original image size. E.g. (512, 512)
contours – Contours that are to be scaled from actual_size to original_size. E.g. array of shape (1, num_points, 2) for a single contour or tuple/list of (num_points, 2) arrays. Last dimension is interpreted as (x, y).
- Returns:
Rescaled contours.
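The rescaling can be sketched as a per-axis multiplication. The sketch assumes both sizes are given as (height, width), so the ratio has to be flipped to match the (x, y) order of the contours; this ordering and the name scale_contours_sketch are assumptions.

```python
import torch

def scale_contours_sketch(actual_size, original_size, contours: torch.Tensor) -> torch.Tensor:
    actual = torch.tensor(actual_size, dtype=torch.float32)
    original = torch.tensor(original_size, dtype=torch.float32)
    scale = (original / actual).flip(0)  # (h, w) ratio reordered to (x, y)
    return contours * scale

contours = torch.tensor([[[10., 20.], [30., 40.]]])  # one contour, two points
scaled = scale_contours_sketch((256, 256), (512, 1024), contours)
print(scaled)  # x is scaled by 4 (width), y by 2 (height)
```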
- scale_fourier(actual_size, original_size, fourier, location)
- Parameters:
actual_size – Image size. E.g. (256, 256)
original_size – Original image size. E.g. (512, 512)
fourier – Fourier descriptor. E.g. array of shape (…, order, 4).
location – Location. E.g. array of shape (…, 2). Last dimension is interpreted as (x, y).
- Returns:
Rescaled fourier, rescaled location