深度学习论文: Learning to Resize Images for Computer Vision Tasks及其PyTorch实现

深度学习论文: Learning to Resize Images for Computer Vision Tasks及其PyTorch实现Learning to Resize Images for Computer Vision TasksPDF: https://arxiv.org/pdf/2103.09950.pdfPyTorch代码: https://github.com/shanglia

mingo_敏

2357人浏览 · 2022-01-10 10:33:34

mingo_敏 · 2022-01-10 10:33:34 发布

深度学习论文: Learning to Resize Images for Computer Vision Tasks及其PyTorch实现
Learning to Resize Images for Computer Vision Tasks
PDF: https://arxiv.org/pdf/2103.09950.pdf
PyTorch代码: https://github.com/shanglianlm0525/CvPytorch
PyTorch代码: https://github.com/shanglianlm0525/PyTorch-Networks

1 概述

图像预处理的一个重要操作就是resize，把不同大小的图像缩放到同一尺寸，但目前用到的resize技术仍然是老旧的，无法根据数据变换。Google Research提出一个可学习的resizer，只需在预处理部分略作修改，即可提升CV模型性能！
在这里插入图片描述

2 Resizer

提出的resizer模型架构如下图：
在这里插入图片描述
主要包括了两个重要的特性：（1）双线性特征调整大小（bilinear feature resizing），以及（2）跳过连接（skip connection），该连接可容纳双线性调整大小的图像和CNN功能的组合。

第一个特性考虑到以原始分辨率计算的特征与模型的一致性。跳过连接可以简化学习过程，因为重定大小器模型可以直接将双线性重定大小的图像传递到基线任务中。

与一般的编码器-解码器架构不同，这篇论文中所提出的体系结构允许将图像大小调整为任何目标大小和纵横比。并且可学习的resizer性能几乎不依赖于双线性重定器的选择，这意味着它可以直接替换其他现成的方法。

3 Experiments

在这里插入图片描述
PyTorch代码:

import torch
import torch.nn as nn
import torch.nn.functional as F
from functools import partial

"""
    Learning to Resize Images for Computer Vision Tasks
    https://arxiv.org/pdf/2105.04714.pdf
"""

def conv1x1(in_chs, out_chs = 16):
    return nn.Conv2d(in_chs, out_chs, kernel_size=1, stride=1, padding=0)


def conv3x3(in_chs, out_chs = 16):
    return nn.Conv2d(in_chs, out_chs, kernel_size=3, stride=1, padding=1)


def conv7x7(in_chs, out_chs = 16):
    return nn.Conv2d(in_chs, out_chs, kernel_size=7, stride=1, padding=3)


class ResBlock(nn.Module):
    def __init__(self, in_chs,out_chs = 16):
        super(ResBlock, self).__init__()
        self.layers = nn.Sequential(
            conv3x3(in_chs, out_chs),
            nn.BatchNorm2d(out_chs),
            nn.LeakyReLU(0.2),
            conv3x3(out_chs, out_chs),
            nn.BatchNorm2d(out_chs)
        )
    def forward(self, x):
        identity = x
        out = self.layers(x)
        out += identity
        return out


class Resizer(nn.Module):
    def __init__(self, in_chs, out_size, n_filters = 16, n_res_blocks = 1, mode = 'bilinear'):
        super(Resizer, self).__init__()
        self.interpolate_layer = partial(F.interpolate, size=out_size, mode=mode,
            align_corners=(True if mode in ('linear', 'bilinear', 'bicubic', 'trilinear') else None))
        self.conv_layers = nn.Sequential(
            conv7x7(in_chs, n_filters),
            nn.LeakyReLU(0.2),
            conv1x1(n_filters, n_filters),
            nn.LeakyReLU(0.2),
            nn.BatchNorm2d(n_filters)
        )
        self.residual_layers = nn.Sequential()
        for i in range(n_res_blocks):
            self.residual_layers.add_module(f'res{i}', ResBlock(n_filters, n_filters))
        self.residual_layers.add_module('conv3x3', conv3x3(n_filters, n_filters))
        self.residual_layers.add_module('bn', nn.BatchNorm2d(n_filters))
        self.final_conv = conv7x7(n_filters, in_chs)

    def forward(self, x):
        identity = self.interpolate_layer(x)
        conv_out = self.conv_layers(x)
        conv_out = self.interpolate_layer(conv_out)
        conv_out_identity = conv_out
        res_out = self.residual_layers(conv_out)
        res_out += conv_out_identity
        out = self.final_conv(res_out)
        out += identity
        return out

AtomGit 开源协作平台测评赛

瓜分20万奖金获得内推名额丰厚实物奖励易参与易上手

更多推荐

mac 使用brew卸载安装node

mac 使用brew卸载安装node卸载1. 查看当前安装的node版本：node -v2. 卸载node：brew uninstall node@版本号 --force比如安装的是12.18.1，使用brew uninstall node@12 --force。还有另外两种现在不能用的方法：使用brew uninstall node，会报错：Error: No such keg: /usr/lo