@wanghuijiao 2021-03-24T16:16:54.000000Z · Word count: 4262 · Reads: 2060

A Survey of Few-Shot Learning for Video Understanding

Study Notes


0 Preface

1. Taxonomy

Taxonomy proposed by Action Genome (Fei-Fei Li et al., 2019):

Taxonomy proposed by ProtoGAN

2. Summary of Common Datasets

| Dataset | Action classes | Total videos | train:val:test (or train:test) | SOTA |
| --- | --- | --- | --- | --- |
| UCF101 | 101 | 13,320 | 51:50 | 95.5% (AMeFu-Net) |
| HMDB51 | 51 | 6,766 | 26:25 | 75.5% (AMeFu-Net) |
| Olympic-Sports | 16 | 783 | 8:8 | 86.3% (ProtoGAN) |
| miniMIT | 200 | 200×550 | 120:40:40 | 56.7% (ARN) |
| Kinetics-100 | 100 | 100×100 | 64:12:24 | 86.8% (AMeFu-Net) |
| Something-Something V2-100 | 100 | 100×100 | 64:12:24 | 52.3% (OTAM) |

- What do the few-shot datasets look like?
  - Kinetics-100: drawn from Kinetics by sampling 100 classes with 100 videos per class. For few-shot video recognition, CMN first split it by class into 64:12:24 as the train/val/test sets, respectively.
  - Something-Something-V2: OTAM adopts the same protocol as Kinetics-100: 100 classes are sampled at random from SSV2, with 100 video clips drawn at random per class; the split by class is 64 classes for training, 12 for validation, and 24 for testing.
  - UCF101:
  - HMDB51:
- How is training done?
  - Episodic training: many methods use so-called episodic training, which originates in meta-learning. The dataset is divided into small units called episodes; during both training and testing, data loading and accuracy computation are done per episode, and the final reported accuracy is the average over episodes.
  - What does an episode look like? Each episode consists of a support set and query videos, mimicking the way humans learn. For example, when a human sees an elephant, they will
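The episodic protocol described above can be sketched in a few lines. This is a minimal illustration, not code from any of the surveyed papers: `videos_by_class`, `split_classes`, and `sample_episode` are hypothetical names, and the 64/12/24 class split follows the CMN convention for Kinetics-100 noted earlier.

```python
import random

def split_classes(classes, seed=0):
    """Split 100 classes by class into 64 train / 12 val / 24 test (CMN-style)."""
    rng = random.Random(seed)
    shuffled = rng.sample(classes, len(classes))
    return shuffled[:64], shuffled[64:76], shuffled[76:]

def sample_episode(videos_by_class, classes, n_way=5, k_shot=1, n_query=1, rng=None):
    """Draw one N-way K-shot episode: a support set and a set of query clips.

    Returns two lists of (clip_id, label) pairs; the query clips never
    overlap with the support clips of the same class.
    """
    rng = rng or random.Random()
    episode_classes = rng.sample(classes, n_way)
    support, query = [], []
    for label in episode_classes:
        clips = rng.sample(videos_by_class[label], k_shot + n_query)
        support += [(c, label) for c in clips[:k_shot]]
        query += [(c, label) for c in clips[k_shot:]]
    return support, query

# Toy stand-in for Kinetics-100: 100 classes, 100 clip ids per class.
videos_by_class = {f"class_{i}": [f"clip_{i}_{j}" for j in range(100)]
                   for i in range(100)}
train_cls, val_cls, test_cls = split_classes(list(videos_by_class))

# One 5-way 1-shot test episode: 5 support clips, 5 query clips.
support, query = sample_episode(videos_by_class, test_cls, n_way=5, k_shot=1)
print(len(support), len(query))  # 5 5
```

At test time, many of the surveyed methods report accuracy as the mean over a few thousand such episodes, which is why the table above lists a single percentage per method.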

Conclusion

3. Open-Source Code

3.1 TRX

3.2 Few-shot-action-recognition

3.3 SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition

4. Paper Summaries

4.1 ProtoGAN: Towards Few Shot Learning for Action Recognition

4.2 A Generative Approach to Zero-Shot and Few-Shot Action Recognition

4.3 TARN: Temporal Attentive Relation Network for Few-Shot and Zero-Shot Action Recognition

4.4 CMN: Compound Memory Networks for Few-Shot Video Classification

4.5 OTAM: Few-shot video classification via temporal alignment

4.6 ARN: Few-shot Action Recognition with Permutation-invariant Attention

4.7 AMeFu-Net: Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition

4.8 Temporal-Relational CrossTransformers for Few-Shot Action Recognition
