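I. Main contributions

Using RetinaNet and FCOS as examples, the authors analyze why anchor-based and anchor-free detectors differ in performance:

1. The number of anchors per location differs. RetinaNet tiles several anchor boxes at each location, while FCOS has a single anchor point per location.
2. Positive and negative samples are defined differently. RetinaNet uses two IoU thresholds, while FCOS uses spatial and scale constraints.
3. The starting state for regression differs. RetinaNet refines a preset anchor box, while FCOS regresses from an anchor point.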
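
To make the second difference concrete, here is a minimal sketch of the two labeling rules in plain Python. The function names and the scale range in the usage note are made up for illustration. The 0.5 / 0.4 thresholds are the commonly cited RetinaNet defaults, and the FCOS rule is simplified to "point inside the box, largest regression target within the level's range". It is a sketch of the idea, not either detector's actual code.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def retinanet_style_label(anchor, gt, pos_thr=0.5, neg_thr=0.4):
    """Two IoU thresholds: positive above pos_thr, negative below neg_thr."""
    value = iou(anchor, gt)
    if value >= pos_thr:
        return "positive"
    if value < neg_thr:
        return "negative"
    return "ignored"  # in between: not used for training


def fcos_style_label(point, gt, scale_range):
    """Spatial constraint (point inside gt) plus scale constraint (per level)."""
    x, y = point
    left, top = x - gt[0], y - gt[1]
    right, bottom = gt[2] - x, gt[3] - y
    if min(left, top, right, bottom) <= 0:
        return "negative"  # point is outside the ground-truth box
    low, high = scale_range  # hypothetical range handled by this pyramid level
    if low <= max(left, top, right, bottom) <= high:
        return "positive"
    return "negative"
```

For a ground-truth box `(0, 0, 100, 100)`, the anchor `(10, 10, 110, 110)` has an IoU of about 0.68 and is positive under the first rule. Under the second rule the point `(50, 50)` is positive only on a level whose range contains 50.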

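The main contributions of the ATSS paper:

1. It shows that the essential difference between anchor-based and anchor-free detectors is how positive and negative training samples are defined.
2. It proposes adaptive training sample selection, which assigns positive and negative samples automatically during training according to the statistical characteristics of each object.
3. It demonstrates that tiling multiple anchors per location to detect objects is inefficient.
4. It achieves state-of-the-art performance on COCO without introducing any additional overhead.

The paper thereby raises a core question in object detection, label assignment: how should positive and negative samples be assigned?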

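II. Analyzing the gap between anchor-free and anchor-based methods

To compare the two fairly, the authors train both with the same settings and tricks and reduce RetinaNet to one anchor per location. A gap of 0.8% AP still remains between them.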


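The authors then look into what causes the remaining gap:

1. How positive and negative samples are defined.
2. The starting state for regression, that is, whether the box is regressed from an anchor box or from a center point.

The paper's experiments lead to the conclusion that the definition of positive and negative samples is the real cause.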


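III. Proposing Adaptive Training Sample Selection

During training, positive and negative samples are split automatically according to the statistical characteristics of each object. The procedure:

1. For each ground-truth box $g$, on each pyramid level select the $k$ anchors whose centers are closest to the center of $g$ by $L2$ distance. With $\mathcal L$ feature pyramid levels, this gives $k \times \mathcal L$ candidate positive samples.
2. Compute the IoU between each candidate and $g$, then the mean $m_g$ and the standard deviation $v_g$ of these IoUs.
3. Combine the two statistics into the IoU threshold $t_g = m_g + v_g$.
4. Mark a candidate as positive if its IoU is greater than or equal to $t_g$ and its center lies inside the ground-truth box.
5. If an anchor box is assigned to several ground-truth boxes, keep only the one with the highest IoU.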
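
Steps 1 to 4 can be sketched for a single ground-truth box in plain Python. This is a simplified illustration, not the paper's or mmdetection's code. `atss_select` and its arguments are hypothetical names, the sample standard deviation is assumed (it needs at least two candidates), and step 5 is left out because only one ground-truth box is handled.

```python
import math
import statistics


def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def center(box):
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def atss_select(gt, anchors_per_level, k=9):
    """Return ([(level, index), ...], threshold) for one ground-truth box."""
    gt_center = center(gt)

    # Step 1: on every level, keep the k anchors closest to the gt center.
    candidates = []
    for level, anchors in enumerate(anchors_per_level):
        order = sorted(
            range(len(anchors)),
            key=lambda i: math.dist(center(anchors[i]), gt_center))
        candidates += [(level, i) for i in order[:k]]

    # Steps 2 and 3: threshold = mean + standard deviation of candidate IoUs.
    ious = [iou(anchors_per_level[lvl][i], gt) for lvl, i in candidates]
    threshold = statistics.mean(ious) + statistics.stdev(ious)

    # Step 4: keep candidates above the threshold whose center is inside gt.
    positives = []
    for (lvl, i), value in zip(candidates, ious):
        cx, cy = center(anchors_per_level[lvl][i])
        inside = gt[0] < cx < gt[2] and gt[1] < cy < gt[3]
        if value >= threshold and inside:
            positives.append((lvl, i))
    return positives, threshold
```

Calling `atss_select(gt, anchors_per_level, k)` with one list of anchor boxes per pyramid level returns the selected `(level, index)` pairs together with the threshold $t_g$ that was used.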

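1. Why select candidates by the Euclidean distance between centers?
For both RetinaNet and FCOS, anchors closer to the center of the ground-truth box give better predictions.
2. Why use the mean plus the standard deviation as the IoU threshold?
It lets the threshold adapt to each object. A high standard deviation usually means that one FPN level has clearly higher IoUs, so that level suits the object well and the final positives all come from it. A low standard deviation means that several FPN levels suit the object, so positives are taken from several levels.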
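
A small numeric illustration of this behavior, using made-up IoU values for three pyramid levels with three candidates each. The helper name is hypothetical and the sample standard deviation is assumed.

```python
import statistics


def levels_with_positives(ious_per_level):
    """Return (threshold, set of levels that contribute at least one positive)."""
    all_ious = [value for level in ious_per_level for value in level]
    threshold = statistics.mean(all_ious) + statistics.stdev(all_ious)
    levels = {
        lvl for lvl, ious in enumerate(ious_per_level)
        if any(value >= threshold for value in ious)
    }
    return threshold, levels


# High standard deviation: only the middle level overlaps the object well.
one_level_dominates = [[0.05, 0.05, 0.10], [0.70, 0.75, 0.80], [0.05, 0.10, 0.10]]

# Low standard deviation: every level has one reasonably good candidate.
several_levels_fit = [[0.30, 0.30, 0.45], [0.30, 0.30, 0.46], [0.30, 0.30, 0.44]]
```

With these numbers the first case yields a threshold of roughly 0.64 and positives from the middle level only. The second yields a threshold of roughly 0.43 and positives from all three levels.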

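3. Why must the center of the anchor box lie inside the ground-truth box?
Anchor boxes whose centers fall outside the ground-truth box are usually poor candidates, because they would predict the object from features outside it.
4. Does this label assignment strategy split positives and negatives effectively?
Statistically, although the IoUs do not follow an exact normal distribution, about 16% of the candidates still end up as positives. Every ground-truth box therefore receives about $0.2 \times k \times \mathcal L$ positive samples regardless of its scale, aspect ratio, or location. With the RetinaNet and FCOS strategies, by contrast, larger objects get more positive samples, which is not fair across objects.
5. How should the hyperparameter $k$ be chosen?
The results are not sensitive to the choice of $k$.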
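
The 16% figure in answer 4 is the upper tail of a normal distribution beyond one standard deviation, which a few lines of Python can check. The helper names are hypothetical, and the values $k = 9$ and $\mathcal L = 5$ in the usage note are assumed purely for illustration. The 0.2 factor is taken from the text as is.

```python
import math


def fraction_above_mean_plus_std():
    """P(X >= mean + std) for a normally distributed X."""
    return 0.5 * math.erfc(1.0 / math.sqrt(2.0))


def expected_positives(k, num_levels, fraction=0.2):
    """Rough number of positives per ground-truth box: fraction * k * L."""
    return fraction * k * num_levels
```

`fraction_above_mean_plus_std()` evaluates to roughly 0.159. With the assumed $k = 9$ and $\mathcal L = 5$, `expected_positives(9, 5)` gives about 9 positives per object, independent of the object's size.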


IV. Results

1. With ATSS, there is no obvious gap between RetinaNet and FCOS.


2. Results are robust to anchor boxes of different scales and aspect ratios.


3. With ATSS, the number of anchors per location has no obvious effect on the results.


4. Performance of ATSS

[Figure: ATSS performance results]

V. Source code

The code follows the mmdetection implementation:

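import torch

# Relative imports, assuming the mmdetection 2.x package layout
# (this file lives in mmdet/core/bbox/assigners/).
from ..builder import BBOX_ASSIGNERS
from ..iou_calculators import build_iou_calculator
from .assign_result import AssignResult
from .base_assigner import BaseAssigner


@BBOX_ASSIGNERS.register_module()
class ATSSAssigner(BaseAssigner):
    """Assign a corresponding gt bbox or background to each bbox.

    Each proposal will be assigned `0` or a positive integer
    indicating the ground truth index.

    - 0: negative sample, no assigned gt
    - positive integer: positive sample, index (1-based) of assigned gt

    Args:
        topk (int): number of bbox selected in each level
    """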

    def __init__(self,
                 topk,
                 iou_calculator=dict(type='BboxOverlaps2D'),
                 ignore_iof_thr=-1):
        self.topk = topk
        self.iou_calculator = build_iou_calculator(iou_calculator)
        self.ignore_iof_thr = ignore_iof_thr

    # https://github.com/sfzhang15/ATSS/blob/master/atss_core/modeling/rpn/atss/loss.py

    def assign(self,
               bboxes,
               num_level_bboxes,
               gt_bboxes,
               gt_bboxes_ignore=None,
               gt_labels=None):
        """Assign gt to bboxes.

        The assignment is done in the following steps:

        1. compute iou between all bboxes (bboxes of all pyramid levels)
           and gt
        2. compute center distance between all bboxes and gt
        3. on each pyramid level, for each gt, select the k bboxes whose
           centers are closest to the gt center, so k*l bboxes in total are
           selected as candidates for each gt
        4. get the corresponding iou for these candidates, compute the
           mean and std, and set mean + std as the iou threshold
        5. select the candidates whose iou is greater than or equal to
           the threshold as positive
        6. limit the positive sample's center to lie inside the gt


        Args:
            bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
            num_level_bboxes (List): num of bboxes in each level
            gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
            gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
                labelled as `ignored`, e.g., crowd boxes in COCO.
            gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).

        Returns:
            :obj:`AssignResult`: The assign result.
        """
        INF = 100000000
        bboxes = bboxes[:, :4]
        num_gt, num_bboxes = gt_bboxes.size(0), bboxes.size(0)

        # compute iou between all bbox and gt
        overlaps = self.iou_calculator(bboxes, gt_bboxes)

        # assign 0 by default
        assigned_gt_inds = overlaps.new_full((num_bboxes, ),
                                             0,
                                             dtype=torch.long)

        if num_gt == 0 or num_bboxes == 0:
            # No ground truth or boxes, return empty assignment
            max_overlaps = overlaps.new_zeros((num_bboxes, ))
            if num_gt == 0:
                # No truth, assign everything to background
                assigned_gt_inds[:] = 0
            if gt_labels is None:
                assigned_labels = None
            else:
                assigned_labels = overlaps.new_full((num_bboxes, ),
                                                    -1,
                                                    dtype=torch.long)
            return AssignResult(
                num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels)

        # compute center distance between all bbox and gt
        gt_cx = (gt_bboxes[:, 0] + gt_bboxes[:, 2]) / 2.0
        gt_cy = (gt_bboxes[:, 1] + gt_bboxes[:, 3]) / 2.0
        gt_points = torch.stack((gt_cx, gt_cy), dim=1)

        bboxes_cx = (bboxes[:, 0] + bboxes[:, 2]) / 2.0
        bboxes_cy = (bboxes[:, 1] + bboxes[:, 3]) / 2.0
        bboxes_points = torch.stack((bboxes_cx, bboxes_cy), dim=1)

        distances = (bboxes_points[:, None, :] -
                     gt_points[None, :, :]).pow(2).sum(-1).sqrt()

        if (self.ignore_iof_thr > 0 and gt_bboxes_ignore is not None
                and gt_bboxes_ignore.numel() > 0 and bboxes.numel() > 0):
            ignore_overlaps = self.iou_calculator(
                bboxes, gt_bboxes_ignore, mode='iof')
            ignore_max_overlaps, _ = ignore_overlaps.max(dim=1)
            ignore_idxs = ignore_max_overlaps > self.ignore_iof_thr
            distances[ignore_idxs, :] = INF
            assigned_gt_inds[ignore_idxs] = -1

        # Selecting candidates based on the center distance
        candidate_idxs = []
        start_idx = 0
        for level, bboxes_per_level in enumerate(num_level_bboxes):
            # on each pyramid level, for each gt,
            # select the k bboxes whose centers are closest to the gt center
            end_idx = start_idx + bboxes_per_level
            distances_per_level = distances[start_idx:end_idx, :]
            selectable_k = min(self.topk, bboxes_per_level)
            _, topk_idxs_per_level = distances_per_level.topk(
                selectable_k, dim=0, largest=False)
            candidate_idxs.append(topk_idxs_per_level + start_idx)
            start_idx = end_idx
        candidate_idxs = torch.cat(candidate_idxs, dim=0)

        # get the corresponding iou for these candidates, compute the
        # mean and std, and set mean + std as the iou threshold
        candidate_overlaps = overlaps[candidate_idxs, torch.arange(num_gt)]
        overlaps_mean_per_gt = candidate_overlaps.mean(0)
        overlaps_std_per_gt = candidate_overlaps.std(0)
        overlaps_thr_per_gt = overlaps_mean_per_gt + overlaps_std_per_gt

        is_pos = candidate_overlaps >= overlaps_thr_per_gt[None, :]

        # limit the positive sample's center in gt
        for gt_idx in range(num_gt):
            candidate_idxs[:, gt_idx] += gt_idx * num_bboxes
        ep_bboxes_cx = bboxes_cx.view(1, -1).expand(
            num_gt, num_bboxes).contiguous().view(-1)
        ep_bboxes_cy = bboxes_cy.view(1, -1).expand(
            num_gt, num_bboxes).contiguous().view(-1)
        candidate_idxs = candidate_idxs.view(-1)

        # calculate the left, top, right, bottom distance between positive
        # bbox center and gt side
        l_ = ep_bboxes_cx[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 0]
        t_ = ep_bboxes_cy[candidate_idxs].view(-1, num_gt) - gt_bboxes[:, 1]
        r_ = gt_bboxes[:, 2] - ep_bboxes_cx[candidate_idxs].view(-1, num_gt)
        b_ = gt_bboxes[:, 3] - ep_bboxes_cy[candidate_idxs].view(-1, num_gt)
        is_in_gts = torch.stack([l_, t_, r_, b_], dim=1).min(dim=1)[0] > 0.01
        is_pos = is_pos & is_in_gts

        # if an anchor box is assigned to multiple gts,
        # the one with the highest IoU will be selected.
        overlaps_inf = torch.full_like(overlaps,
                                       -INF).t().contiguous().view(-1)
        index = candidate_idxs.view(-1)[is_pos.view(-1)]
        overlaps_inf[index] = overlaps.t().contiguous().view(-1)[index]
        overlaps_inf = overlaps_inf.view(num_gt, -1).t()

        max_overlaps, argmax_overlaps = overlaps_inf.max(dim=1)
        assigned_gt_inds[
            max_overlaps != -INF] = argmax_overlaps[max_overlaps != -INF] + 1

        if gt_labels is not None:
            assigned_labels = assigned_gt_inds.new_full((num_bboxes, ), -1)
            pos_inds = torch.nonzero(
                assigned_gt_inds > 0, as_tuple=False).squeeze()
            if pos_inds.numel() > 0:
                assigned_labels[pos_inds] = gt_labels[
                    assigned_gt_inds[pos_inds] - 1]
        else:
            assigned_labels = None
        return AssignResult(
            num_gt, assigned_gt_inds, max_overlaps, labels=assigned_labels)
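
The per-level loop in `assign` works on one flat list of boxes and uses `num_level_bboxes` to slice it level by level, shifting each level's local top-k indices by `start_idx` so they index the flat list again. The following plain-Python sketch mirrors that bookkeeping for a single gt. It uses lists instead of tensors, and `per_level_topk` is a hypothetical helper, not part of mmdetection.

```python
def per_level_topk(distances, num_level_bboxes, topk):
    """Pick the topk smallest distances on every level, as flat indices.

    distances: center distances from one gt to all boxes, levels concatenated.
    num_level_bboxes: number of boxes on each pyramid level, in the same order.
    """
    candidate_idxs = []
    start_idx = 0
    for bboxes_per_level in num_level_bboxes:
        end_idx = start_idx + bboxes_per_level
        level_distances = distances[start_idx:end_idx]
        # A level may hold fewer than topk boxes.
        selectable_k = min(topk, bboxes_per_level)
        local = sorted(range(bboxes_per_level),
                       key=level_distances.__getitem__)[:selectable_k]
        # Shift local indices back into the flat index space.
        candidate_idxs += [i + start_idx for i in local]
        start_idx = end_idx
    return candidate_idxs
```

For example, with distances `[5.0, 1.0, 3.0, 9.0, 2.0, 8.0, 0.5]` split into levels of 4, 2 and 1 boxes and `topk=2`, the last level contributes only its single box.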
