@diyer22 2017-12-02T15:43:15.000000Z

review of segmentation

dl



2.2 Transfer Learning

2.3 Data Preprocessing and Augmentation

Despite the power and flexibility of the FCN model, it still lacks various features, which hinders its application to certain problems and situations:

4.1 Decoder Variants

4.2 Integrating Context Knowledge
Even pure CNNs, without pooling layers, are limited, since the receptive field of their units can only grow linearly with the number of layers.
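
The linear growth can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the default 3x3 kernel are assumptions, not something taken from the survey.

```python
def receptive_field(num_layers, kernel_size=3, stride=1):
    """Receptive field (in input pixels) of a stack of identical conv layers."""
    rf = 1    # a single input pixel
    jump = 1  # distance, in input pixels, between neighbouring units of the current layer
    for _ in range(num_layers):
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# With stride 1 (no pooling, no striding) each 3x3 layer adds exactly 2 pixels.
for n in (1, 2, 4, 8):
    print(n, receptive_field(n))
```

With stride 1 the result is 1 + 2n for n layers, so covering a large image region takes a very deep stack.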

Many approaches can be taken to make CNNs aware of that global information:

4.2.1 Conditional Random Fields

4.2.2 Dilated Convolutions
Those works also show a common trend: dilated convolutions are tightly coupled to multi-scale context aggregation
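
A minimal NumPy sketch of a 1D dilated convolution may help here. The helper names and the dilation schedule 1, 2, 4, 8 are illustrative assumptions; the notes do not say which schedule the cited works use.

```python
import numpy as np


def dilated_conv1d(x, w, dilation=1):
    """'Valid' 1D correlation whose kernel taps are `dilation` samples apart."""
    k = len(w)
    span = (k - 1) * dilation + 1      # input samples covered by one output value
    out_len = len(x) - span + 1
    out = np.zeros(out_len)
    for i in range(out_len):
        out[i] = np.dot(x[i:i + span:dilation], w)
    return out


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 layers stacked with the given dilation rates."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


x = np.arange(10, dtype=float)
w = np.array([1.0, 0.0, -1.0])
print(dilated_conv1d(x, w, dilation=1))   # compares samples 2 apart
print(dilated_conv1d(x, w, dilation=2))   # compares samples 4 apart
print(stacked_receptive_field(3, [1, 1, 1, 1]))   # plain convolutions
print(stacked_receptive_field(3, [1, 2, 4, 8]))   # doubling the dilation per layer
```

Four plain 3-tap layers see 9 input samples, while the same four layers with doubling dilation see 31, with the same number of weights. Each rate also looks at the input at a different spacing, which is one way to read the link to multi-scale context.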

4.2.3 Multi-scale Prediction
The filters will implicitly learn to detect features at specific scales (presumably with a certain degree of invariance).

One option is to use multi-scale networks, which generally combine multiple networks that target different scales and then merge their predictions into a single output.
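
The merge step can be sketched as follows. This is a toy under stated assumptions: `toy_network` is a hypothetical stand-in for a trained per-scale network (a real system would use a different network per scale), and averaging the upsampled score maps is only one possible way to merge.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool an (H, W) image by an integer factor that divides H and W."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def upsample(scores, factor):
    """Nearest-neighbour upsampling of a (C, h, w) score map."""
    return scores.repeat(factor, axis=1).repeat(factor, axis=2)


def toy_network(img):
    """Hypothetical per-scale model: two class scores derived from intensity."""
    return np.stack([1.0 - img, img])   # shape (2, h, w)


def multi_scale_predict(img, factors=(1, 2, 4)):
    merged = np.zeros((2,) + img.shape)
    for f in factors:
        scores = toy_network(downsample(img, f))   # predict at a coarser scale
        merged += upsample(scores, f)              # bring back to full resolution
    merged /= len(factors)                         # merge by averaging
    return merged.argmax(axis=0)                   # one label per pixel


img = np.zeros((8, 8))
img[:, 4:] = 1.0
labels = multi_scale_predict(img)
print(labels)
```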

[74] 2015, “Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture,” in Proceedings of the IEEE International Conference on Computer Vision, 2015
Similar to SegNet.

[76] 2015, “Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture,” in Proceedings of the IEEE International Conference on Computer Vision, 2015
Feature fusion.

4.2.4 Feature Fusion

[77],[84]

4.3 Instance Segmentation

4.4 RGB-D Data
Techniques such as Horizontal Height Angle (HHA) [11] encode the depth into three channels: horizontal disparity, height above ground, and the angle between the local surface normal and the inferred gravity direction.
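
As a small illustration of the third channel only, the angle between a surface normal and the gravity direction follows from a dot product. The sketch assumes the normals and the gravity vector are already available; estimating them from a depth map is the hard part and is not shown. The function name and the example vectors are made up for this illustration.

```python
import numpy as np


def angle_with_gravity(normals, gravity):
    """Angle in degrees between each normal in an (N, 3) array and the gravity direction."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    g = gravity / np.linalg.norm(gravity)
    cos = np.clip(n @ g, -1.0, 1.0)   # clip guards against rounding outside [-1, 1]
    return np.degrees(np.arccos(cos))


# Assumed convention for this example: gravity points along -y.
gravity = np.array([0.0, -1.0, 0.0])
normals = np.array([
    [0.0, 1.0, 0.0],    # floor-like surface, normal pointing up
    [1.0, 0.0, 0.0],    # wall-like surface
    [0.0, -1.0, 0.0],   # ceiling-like surface, normal pointing down
])
angles = angle_with_gravity(normals, gravity)
print(angles)
```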

Some works leverage a multi-view approach to improve existing single-view works.

4.5 3D Data
One approach takes a point cloud and parses it through a dense voxel grid, generating a set of occupancy voxels which are used as input to a 3D CNN to produce one label per voxel. The labels are then mapped back to the point cloud.
This has some disadvantages:
* quantization
* loss of spatial information
* unnecessarily large representations
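
The voxel pipeline and its drawbacks can be sketched in NumPy. The 3D CNN is replaced by a hand-written label grid, and the voxel size, grid shape and points are arbitrary assumptions chosen for the example.

```python
import numpy as np


def voxelize(points, voxel_size, origin):
    """Integer index of the voxel containing each (x, y, z) point."""
    return np.floor((points - origin) / voxel_size).astype(int)


def occupancy_grid(voxel_idx, grid_shape):
    """Dense boolean grid that is True wherever at least one point falls."""
    grid = np.zeros(grid_shape, dtype=bool)
    grid[voxel_idx[:, 0], voxel_idx[:, 1], voxel_idx[:, 2]] = True
    return grid


def labels_to_points(voxel_labels, voxel_idx):
    """Map voxel labels back: each point receives the label of its voxel."""
    return voxel_labels[voxel_idx[:, 0], voxel_idx[:, 1], voxel_idx[:, 2]]


points = np.array([
    [0.1, 0.1, 0.1],
    [0.4, 0.2, 0.3],   # lands in the same voxel as the first point
    [1.6, 0.2, 0.1],
    [3.9, 3.9, 3.9],
])
idx = voxelize(points, voxel_size=1.0, origin=np.zeros(3))
grid = occupancy_grid(idx, (4, 4, 4))

# Hypothetical stand-in for the 3D CNN output: one label per voxel.
voxel_labels = np.zeros((4, 4, 4), dtype=int)
voxel_labels[1:, :, :] = 1
point_labels = labels_to_points(voxel_labels, idx)

print(grid.sum(), "of", grid.size, "voxels occupied")
print(point_labels)
```

The first two points collapse into one voxel and must share a label (quantization and loss of spatial detail), and only 3 of the 64 cells are occupied (a needlessly large representation).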

PointNet is based on fully connected layers instead of convolutional ones
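
A rough sketch of the idea: the same fully connected layers are applied to every point, and the per-point features are then pooled into one global feature. The max-pooling step is not mentioned in these notes; it is how the PointNet paper aggregates points. Layer sizes and random weights are arbitrary assumptions, and this is not the full architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialised weights for two fully connected layers shared by all points.
W1, b1 = rng.normal(size=(3, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 32)), np.zeros(32)


def global_feature(points):
    """points: (N, 3) -> (32,) feature that does not depend on point order."""
    h = np.maximum(points @ W1 + b1, 0.0)   # fully connected + ReLU, per point
    h = np.maximum(h @ W2 + b2, 0.0)        # second fully connected + ReLU
    return h.max(axis=0)                    # symmetric pooling over the points


cloud = rng.normal(size=(100, 3))
shuffled = cloud[rng.permutation(100)]
print(np.allclose(global_feature(cloud), global_feature(shuffled)))
```

Because the pooling is symmetric, reordering the points leaves the feature unchanged, which suits an unordered point cloud where a convolution over a regular grid does not apply.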

4.6 Video Sequences
Features from shallow layers change faster than deep ones, so layers can be processed at different update rates depending on their depth. By doing this, deep features can be persisted over frames thanks to their semantic stability, thus saving inference time.
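
The update-rate idea can be sketched as a small cache around the deep part of a network. Everything here is hypothetical scaffolding: the class name, the three callables and the fixed period of 4 frames are assumptions made for illustration, and the lambdas stand in for real network stages.

```python
class ClockworkSegmenter:
    """Recompute shallow features every frame, deep features every `deep_period` frames."""

    def __init__(self, shallow_fn, deep_fn, head_fn, deep_period=4):
        self.shallow_fn = shallow_fn
        self.deep_fn = deep_fn
        self.head_fn = head_fn
        self.deep_period = deep_period
        self.frame_idx = 0
        self.cached_deep = None
        self.deep_calls = 0   # counts how often the expensive part actually ran

    def step(self, frame):
        shallow = self.shallow_fn(frame)               # cheap, runs on every frame
        if self.frame_idx % self.deep_period == 0:
            self.cached_deep = self.deep_fn(shallow)   # expensive, runs occasionally
            self.deep_calls += 1
        self.frame_idx += 1
        return self.head_fn(shallow, self.cached_deep)


# Toy stand-ins: frames are plain integers, "features" are simple functions of them.
seg = ClockworkSegmenter(
    shallow_fn=lambda frame: frame * 2,
    deep_fn=lambda shallow: shallow + 100,
    head_fn=lambda shallow, deep: (shallow, deep),
    deep_period=4,
)
outputs = [seg.step(t) for t in range(8)]
print(outputs)
print(seg.deep_calls)
```

Over 8 frames the deep stage runs only twice; the other frames reuse the cached deep features together with fresh shallow ones.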

5 DISCUSSION
We will gather the results of the methods on the most representative datasets using the previously described metrics.
5.1 Evaluation Metrics
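
These notes do not list the metrics themselves. As an assumption, the sketch below computes two that are widely used for semantic segmentation, pixel accuracy and mean intersection over union (mIoU), from a confusion matrix; the tiny label maps are made up.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes):
    """cm[i, j] = number of pixels with ground-truth class i predicted as class j."""
    idx = gt.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)


def pixel_accuracy(cm):
    return np.diag(cm).sum() / cm.sum()


def mean_iou(cm):
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0                      # skip classes absent from both maps
    return (tp[valid] / union[valid]).mean()


gt = np.array([[0, 0, 1, 1],
               [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1]])
cm = confusion_matrix(gt, pred, num_classes=2)
print(cm)
print(pixel_accuracy(cm), mean_iou(cm))
```

One wrong pixel out of eight gives a pixel accuracy of 0.875, while the per-class IoUs are 0.8 and 0.75, so the mIoU is 0.775.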

As we have observed, many methods report results on non-standard datasets or are not even tested at all. That makes comparisons impossible.

5.3 Summary

5.4 Future Research Directions

3D datasets: there is a lack of data; ILSVRC will feature 3D data in 2018

Sequence datasets: lack of large-scale data

Point cloud segmentation using Graph Convolutional Networks (GCNs): treat point clouds as graphs and apply convolutions over them
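
A minimal sketch of that idea, under assumptions: the graph is built by connecting each point to its k nearest neighbours, and the layer uses a commonly seen normalised form, ReLU(D^-1/2 A D^-1/2 X W). The notes do not say which graph construction or GCN variant is meant, and the weights here are random.

```python
import numpy as np


def knn_adjacency(points, k):
    """Symmetric k-nearest-neighbour adjacency, with self-loops, for an (N, 3) cloud."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    nearest = np.argsort(dist, axis=1)[:, :k + 1]   # each point itself plus k neighbours
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k + 1), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)                       # make the graph undirected


def graph_conv(A, X, W):
    """One graph convolution: average neighbour features, then a linear map and ReLU."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)


rng = np.random.default_rng(0)
points = rng.normal(size=(50, 3))
A = knn_adjacency(points, k=4)
W = rng.normal(size=(3, 8))       # random weights, for illustration only
H = graph_conv(A, points, W)      # per-point features of size 8
print(H.shape)
```

Here the point coordinates double as input features; a per-point classifier on top of a few such layers would yield one label per point.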

Context knowledge:

Real-time segmentation:

Memory: pruning to simplify a network
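
One simple form of pruning, shown here as an assumption since the notes do not name a method, is magnitude pruning: weights with the smallest absolute values are set to zero. The function name and the example matrix are made up.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero out roughly the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(round(sparsity * weights.size))       # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.sort(np.abs(weights).ravel())[k - 1]
    mask = np.abs(weights) > threshold            # ties at the threshold are pruned too
    return weights * mask, mask


w = np.array([[0.05, -0.8, 0.3],
              [-0.01, 0.6, -0.2]])
pruned, mask = magnitude_prune(w, sparsity=0.5)
print(pruned)
print(mask.sum(), "of", mask.size, "weights kept")
```

The zeroed weights save memory only if the result is stored in a sparse format or the corresponding units are removed outright; in practice the network is usually fine-tuned afterwards to recover accuracy.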

Temporal coherency on sequences: it is important to work on video streams

Multi-view integration: use of multiple views in recently proposed segmentation works is mostly limited to RGB-D cameras and in particular focused on single-object segmentation.

2D generic images
2D traffic images
2D other datasets
2.5D (RGB-D) datasets
3D CAD datasets
3D point cloud datasets

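Introduction to FCN
Decoder variants (SegNet)
Integrating global information
Conditional random fields (CRF)
Dilated convolutions
Multi-scale
Feature fusion
RNN
Instance segmentation
2.5D (RGB-D) data
3D data
Video sequences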
