Computer Vision Papers - 2021-06-14

SophiaCV · Published 2021-06-21 21:43:28

This column collects papers in computer vision. Date: June 14, 2021. Source: Paper Digest.

Follow the WeChat official account 【计算机视觉联盟】 and reply 【西瓜书手推笔记】 to get my fully hand-derived machine learning notes!

Direct link to the notes: hand-derived machine learning notes (GitHub link)

1, TITLE: Recovery of Meteorites Using An Autonomous Drone and Machine Learning
AUTHORS: Robert I. Citron et al.
CATEGORY: astro-ph.EP [astro-ph.EP, cs.CV, cs.LG]
HIGHLIGHT: Here, we describe a proof-of-concept meteorite classifier that deploys off-line a combination of different convolution neural networks to recognize meteorites from images taken by drones in the field.

2, TITLE: A Modular Framework for Object-based Saccadic Decisions in Dynamic Scenes
AUTHORS: Nicolas Roth ; Pia Bideau ; Olaf Hellwich ; Martin Rolfs ; Klaus Obermayer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Here, we present a new model for simulating human eye-movement behavior in dynamic real-world scenes.

3, TITLE: Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection
AUTHORS: Jeffri M. Llerena ; Luis Felipe Zeni ; Lucas N. Kristen ; Claudio Jung
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we explore a fuzzy representation of object regions using Gaussian distributions, which provides an implicit binary representation as (potentially rotated) ellipses.

4, TITLE: Conterfactual Generative Zero-Shot Semantic Segmentation
AUTHORS: Feihong Shen ; Jun Liu ; Ping Hu
CATEGORY: cs.CV [cs.CV, 68T07, I.2.10]
HIGHLIGHT: In this work, we consider counterfactual methods to avoid the confounder in the original model.

5, TITLE: ViT-Inception-GAN for Image Colourising
AUTHORS: Tejas Bana ; Jatan Loya ; Siddhant Kulkarni
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In our proposed method, we attempt to colourise images using Vision Transformer - Inception - Generative Adversarial Network (ViT-I-GAN), which has an Inception-v3 fusion embedding in the generator.

6, TITLE: Learning The Precise Feature for Cluster Assignment
AUTHORS: Yanhai Gan ; Xinghui Dong ; Huiyu Zhou ; Feng Gao ; Junyu Dong
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Based on this, we propose a general-purpose deep clustering framework which radically integrates representation learning and clustering into a single pipeline for the first time.

7, TITLE: MlTr: Multi-label Classification with Transformer
AUTHORS: Xing Cheng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we point out the three crucial problems that CNN-based methods encounter and explore the possibility of conducting specific transformer modules to settle them.

8, TITLE: Team RUC_AIM3 Technical Report at ActivityNet 2021: Entities Object Localization
AUTHORS: Ludan Ruan ; Jieting Chen ; Yuqing Song ; Shizhe Chen ; Qin Jin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, in this work, we propose to divide these two modules into two stages and improve them respectively to boost the whole system performance.

9, TITLE: Calibration and Auto-Refinement for Light Field Cameras
AUTHORS: Yuriy Anisimov ; Gerd Reis ; Didier Stricker
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents an approach for light field camera calibration and rectification, based on pairwise pattern-based parameters extraction.

10, TITLE: Part-aware Panoptic Segmentation
AUTHORS: Daan de Geus ; Panagiotis Meletis ; Chenyang Lu ; Xiaoxiao Wen ; Gijs Dubbelman
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing.

11, TITLE: Spectral Unsupervised Domain Adaptation for Visual Recognition
AUTHORS: Jingyi Zhang ; Jiaxing Huang ; Shijian Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose Spectral UDA (SUDA), an efficient yet effective UDA technique that works in the spectral space and is generic across different visual recognition tasks in detection, classification and segmentation.

12, TITLE: Small Object Detection for Near Real-Time Egocentric Perception in A Manual Assembly Scenario
AUTHORS: Hooman Tavakoli ; Snehal Walunj ; Parsha Pahlevannejad ; Christiane Plociennik ; Martin Ruskowski
CATEGORY: cs.CV [cs.CV, cs.AI, cs.HC]
HIGHLIGHT: We describe a near real-time small object detection pipeline for egocentric perception in a manual assembly scenario: We generate a training data set based on CAD data and realistic backgrounds in Unity.

13, TITLE: AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation
AUTHORS: Mingxiang Chen et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In our work, we propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.

14, TITLE: Step-Wise Hierarchical Alignment Network for Image-Text Matching
AUTHORS: Zhong Ji ; Kexin Chen ; Haoran Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Different from them, in this work, we propose a step-wise hierarchical alignment network (SHAN) that decomposes image-text matching into a multi-step cross-modal reasoning process.

15, TITLE: Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG By Synthetic Augmentation
AUTHORS: Yunhao Ba ; Zhen Wang ; Kerim Doruk Karinca ; Oyku Deniz Bozkurt ; Achuta Kadambi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: A joint optimization framework is utilized to translate real videos from light-skinned subjects to dark skin tones while retaining their pulsatile signals.

16, TITLE: Pedestrian Attribute Recognition in Video Surveillance Scenarios Based on View-attribute Attention Localization
AUTHORS: Weichen Chen ; Xinyi Yu ; Linlin Ou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel view-attribute localization method based on attention (VALA), which relies on the strong relevance between attributes and views to capture specific view-attributes and to localize attribute-corresponding areas by attention mechanism.

17, TITLE: SimSwap: An Efficient Framework For High Fidelity Face Swapping
AUTHORS: Renwang Chen ; Xuanhong Chen ; Bingbing Ni ; Yanhao Ge
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose an efficient framework, called Simple Swap (SimSwap), aiming for generalized and high fidelity face swapping.

18, TITLE: Efficient Deep Learning Architectures for Fast Identification of Bacterial Strains in Resource-Constrained Devices
AUTHORS: R. Gallardo García ; S. Jarquín Rodríguez ; B. Beltrán Martínez ; C. Hernández Gracidas ; R. Martínez Torres
CATEGORY: cs.CV [cs.CV, 68T07 (Primary), 68U10 (Secondary), I.4; J.3]
HIGHLIGHT: This work presents twelve fine-tuned deep learning architectures to solve the bacterial classification problem over the Digital Image of Bacterial Species Dataset.

19, TITLE: Neural Network Modeling of Probabilities for Coding The Octree Representation of Point Clouds
AUTHORS: Emre Can Kaya ; Ioan Tabus
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: This paper describes a novel lossless point cloud compression algorithm that uses a neural network for estimating the coding probabilities for the occupancy status of voxels, depending on wide three dimensional contexts around the voxel to be encoded.

20, TITLE: A Framework to Enhance Generalization of Deep Metric Learning Methods Using General Discriminative Feature Learning and Class Adversarial Neural Networks
AUTHORS: Karrar Al-Kaabi ; Reza Monsefi ; Davood Zabihzadeh
CATEGORY: cs.CV [cs.CV, cs.IR, cs.LG, 6804 (Primary)]
HIGHLIGHT: To address this limitation, we propose a framework to enhance the generalization power of existing DML methods in a Zero-Shot Learning (ZSL) setting by general yet discriminative representation learning and employing a class adversarial neural network.

21, TITLE: Shallow Optical Flow Three-Stream CNN for Macro- and Micro-Expression Spotting from Long Videos
AUTHORS: Gen-Bing Liong ; John See ; Lai-Kuan Wong
CATEGORY: cs.CV [cs.CV, cs.MM, I.4; I.5.1]
HIGHLIGHT: In this paper, we propose a shallow optical flow three-stream CNN (SOFTNet) model to predict a score that captures the likelihood of a frame being in an expression interval.

22, TITLE: Attention-based Partial Face Recognition
AUTHORS: Stefan Hörmann ; Zeyuan Zhang ; Martin Knoche ; Torben Teepe ; Gerhard Rigoll
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel approach to partial face recognition capable of recognizing faces with different occluded areas.

23, TITLE: Refining Pseudo Labels with Clustering Consensus Over Generations for Unsupervised Object Re-identification
AUTHORS: Xiao Zhang ; Yixiao Ge ; Yu Qiao ; Hongsheng Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle the challenge, we propose to properly estimate pseudo label similarities between consecutive training generations with clustering consensus and refine pseudo labels with temporally propagated and ensembled pseudo labels.

24, TITLE: Scale-invariant Scale-channel Networks: Deep Networks That Generalise to Previously Unseen Scales
AUTHORS: Ylva Jansson ; Tony Lindeberg
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present a systematic study of this methodology by implementing different types of scale channel networks and evaluating their ability to generalise to previously unseen scales.

25, TITLE: Bridge The Gap Between Model-based and Model-free Human Reconstruction
AUTHORS: Lixiang Lin ; Jianke Zhu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address these issues, we propose a novel topology-preserved human reconstruction approach by bridging the gap between model-based and model-free human reconstruction.

26, TITLE: Instance-Level Task Parameters: A Robust Multi-task Weighting Framework
AUTHORS: Pavan Kumar Anasosalu Vasu ; Shreyas Saxena ; Oncel Tuzel
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Recent works have shown that deep neural networks benefit from multi-task learning by learning a shared representation across several related tasks.

27, TITLE: Learning Compositional Shape Priors for Few-Shot 3D Reconstruction
AUTHORS: Mateusz Michalkiewicz et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work we experimentally demonstrate that naive baselines fail in this few-shot learning setting, in which the network must learn informative shape priors for inference of new categories.

28, TITLE: A Deep Learning Approach to Clustering Visual Arts
AUTHORS: Giovanna Castellano ; Gennaro Vessio
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address these issues, in this paper we propose DELIUS: a DEep learning approach to cLustering vIsUal artS.

29, TITLE: K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
AUTHORS: Xiu Su et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, instead of counting on a single supernet, we introduce $K$-shot supernets and take their weights for each operation as a dictionary.

30, TITLE: An Image Forensic Technique Based on JPEG Ghosts
AUTHORS: Divakar Singh
CATEGORY: cs.CV [cs.CV, cs.CR]
HIGHLIGHT: In this paper, we propose a digital image forensic technique for JPEG images.

31, TITLE: A Self-adapting Super-resolution Structures Framework for Automatic Design of GAN
AUTHORS: Yibo Guo ; Haidi Wang ; Yiming Fan ; Shunyao Li ; Mingliang Xu
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: In the existing works, experts have gradually explored a set of optimal model parameters based on empirical values or performing brute-force search.

32, TITLE: Predicting Next Local Appearance for Video Anomaly Detection
AUTHORS: Pankaj Raj Roy ; Guillaume-Alexandre Bilodeau ; Lama Seoud
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a local anomaly detection method in videos.

33, TITLE: Survey of Image Based Graph Neural Networks
AUTHORS: Usman Nazir ; He Wang ; Murtaza Taj
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this survey paper, we analyze image based graph neural networks and propose a three-step classification approach.

34, TITLE: Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning
AUTHORS: Liangqiong Qu et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we demonstrate that attention-based architectures (e.g., Transformers) are fairly robust to distribution shifts and hence improve federated learning over heterogeneous data.

35, TITLE: Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds
AUTHORS: Maurice Weiler ; Patrick Forré ; Erik Verlinde ; Max Welling
CATEGORY: cs.LG [cs.LG, cs.CG, cs.CV, stat.ML]
HIGHLIGHT: To exemplify the design of coordinate independent convolutions, we implement a convolutional network on the Möbius strip.

36, TITLE: Sparse and Imperceptible Adversarial Attack Via A Homotopy Algorithm
AUTHORS: Mingkang Zhu ; Tianlong Chen ; Zhangyang Wang
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this paper, we address this challenge by proposing a homotopy algorithm, to jointly tackle the sparsity and the perturbation bound in one unified framework.

37, TITLE: Progressive-Scale Boundary Blackbox Attack Via Projective Gradient Estimation
AUTHORS: Jiawei Zhang et al.
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this paper, we show that such efficiency highly depends on the scale at which the attack is applied, and attacking at the optimal scale significantly improves the efficiency.

38, TITLE: Within-layer Diversity Reduces Generalization Gap
AUTHORS: Firas Laakom ; Jenni Raitoharju ; Alexandros Iosifidis ; Moncef Gabbouj
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage diversity of the activations within the same layer.

39, TITLE: PyGAD: An Intuitive Genetic Algorithm Python Library
AUTHORS: Ahmed Fawzy Gad
CATEGORY: cs.NE [cs.NE, cs.CV, cs.LG, math.OC]
HIGHLIGHT: This paper introduces PyGAD, an open-source easy-to-use Python library for building the genetic algorithm.

40, TITLE: KRADA: Known-region-aware Domain Alignment for Open World Semantic Segmentation
AUTHORS: Chenhong Zhou et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Hence, in this paper, we consider a new, more realistic, and more challenging problem setting where the pixel-level classifier has to be trained with labeled images and unlabeled open-world images -- we name it open world semantic segmentation (OSS).
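
A note on entry 3: its highlight describes representing object regions as Gaussian distributions that implicitly define (potentially rotated) ellipses. The sketch below illustrates that idea in general terms and is not the authors' code. The function names `box_to_gaussian` and `gaussian_to_ellipse` are hypothetical, and the choice of variances (`w**2 / 12` and `h**2 / 12`, the moments of a uniform distribution over the box) is an assumption made for illustration.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta=0.0):
    """Return (mean, covariance) of a 2D Gaussian matched to a rotated box.

    Assumption: the variances along the box axes are those of a uniform
    distribution over the box, i.e. w**2 / 12 and h**2 / 12.
    """
    var_u = w * w / 12.0
    var_v = h * h / 12.0
    c, s = math.cos(theta), math.sin(theta)
    # Covariance = R * diag(var_u, var_v) * R^T for a rotation R by theta.
    sxx = var_u * c * c + var_v * s * s
    syy = var_u * s * s + var_v * c * c
    sxy = (var_u - var_v) * c * s
    return (cx, cy), ((sxx, sxy), (sxy, syy))


def gaussian_to_ellipse(cov, n_sigma=1.0):
    """Return (semi_major, semi_minor, angle) of the n-sigma contour ellipse."""
    (sxx, sxy), (_, syy) = cov
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    lam_max = half_trace + root
    lam_min = max(half_trace - root, 0.0)
    # Orientation of the major axis, in radians.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return n_sigma * math.sqrt(lam_max), n_sigma * math.sqrt(lam_min), angle


if __name__ == "__main__":
    mean, cov = box_to_gaussian(50.0, 40.0, 12.0, 6.0, theta=math.pi / 4)
    print(mean, cov)
    print(gaussian_to_ellipse(cov))
```

Because the covariance encodes both extent and orientation, a rotated box needs no separate angle parameter in this representation; the ellipse is recovered from the eigen-decomposition of the covariance.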