Some interesting Computer Vision papers from ICCV 2017
The International Conference on Computer Vision (ICCV) is one of the top-tier conferences in computer vision. This year it was held in Venice, Italy. Of the 2143 valid submissions, 621 papers were accepted, an acceptance rate of roughly 29%. Highlights include the application of reinforcement learning to object tracking, visual dialog, and activity forecasting, along with improvements in generation with, and new applications of, Generative Adversarial Networks (GANs). The novel loss functions proposed to address class imbalance (focal loss, range loss) were also exciting.
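To give a rough feel for how focal loss tackles class imbalance, here is a minimal sketch of the binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), written in plain Python. The function names are my own, and the defaults gamma=2 and alpha=0.25 are the values usually quoted from the paper, so treat them as assumptions rather than a reference implementation.

```python
import math

def cross_entropy(p, y):
    """Standard binary cross-entropy for one prediction.
    p is the predicted probability of the positive class, y is 0 or 1."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, 1e-12))  # clamp to avoid log(0)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative sketch).
    The (1 - p_t) ** gamma factor shrinks the loss of well-classified
    examples, so the many easy examples contribute less to the total."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = max(p_t, 1e-12)  # clamp to avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    # Compare an easy positive (p = 0.9) with a hard positive (p = 0.1).
    for p in (0.9, 0.1):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With gamma set to 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy. Larger gamma values push the easy examples further down relative to the hard ones.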
There were many exciting papers. Here is my list of some of the interesting ones from ICCV 2017, categorized by application.
Segmentation:
- Mask R-CNN (Best paper award)
- Segmentation-Aware Convolutional Networks Using Local Attention Masks
- Learning Video Object Segmentation with Visual Memory
- Predicting Deeper into the Future of Semantic Segmentation
- SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
- Universal Adversarial Perturbations Against Semantic Image Segmentation
- Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation
- Unsupervised object segmentation in video by efficient selection of highly probable positive features
Tracking:
- Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
- Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning
- Robust Object Tracking based on Temporal and Spatial Deep Networks
- Non-Markovian Globally Consistent Multi-Object Tracking
- Learning Policies for Adaptive Tracking with Deep Feature Cascades
Detection:
- Focal Loss for Dense Object Detection (Best student paper award)
- Flow-Guided Feature Aggregation for Video Object Detection
- Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
- Spatial Memory for Context Reasoning in Object Detection
- Temporal Dynamic Graph LSTM for Action-driven Video Object Detection
- Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
- Soft-NMS — Improving Object Detection With One Line of Code
- Chained Cascade Network for Object Detection
Optimization:
- A discriminative view of MRF pre-processing algorithms
- High Order Tensor Formulation for Convolutional Sparse Coding
- Performance Guaranteed Network Acceleration via High-Order Residual Quantization
- Robust Kronecker-Decomposable Component Analysis for Low-Rank Modeling
Generative Adversarial Networks:
- Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis
- Generative Adversarial Networks Conditioned by Brain Signals
- StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
- Least Squares Generative Adversarial Networks
Face Analysis:
- Fast Face-swap Using Convolutional Neural Networks
- Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos
- Realistic Dynamic Facial Textures from a Single Image using GANs
- Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources
- Learning Discriminative Aggregation Network for Video-based Face Recognition
- Range Loss for Deep Face Recognition with Long-Tailed Training Data
- Temporal Non-Volume Preserving Approach to Facial Age-Progression and Age-Invariant Face Recognition
- DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
Action Analysis:
- Online Real-time Multiple Spatiotemporal Action Localisation and Prediction
- Temporal Action Detection with Structured Segment Networks
- Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-supervised Object and Action Localization
- Learning long-term dependencies for action recognition with a biologically-inspired deep network
- Spatial-Aware Object Embeddings for Zero-Shot Localization and Classification of Actions
- Unsupervised Action Discovery and Localization in Videos
- TORNADO: A Spatio-Temporal Convolutional Regression Network for Video Action Proposal
General/Other interesting problems:
- One Network to Solve Them All — Solving Linear Inverse Problems using Deep Projection Models
- Deformable Convolutional Networks
- Open Set Domain Adaptation
- Makeup-Go: Blind Reversion of Portrait Edit
- EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis
- Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
- Representation Learning by Learning to Count
- Towards Diverse and Natural Image Descriptions via a Conditional GAN
That’s all. Feel free to add more in the comments. Enjoy!