@Matrixzhu
2021-11-18T17:32:16.000000Z
In this experiment we manually labeled videos of three dogs. From each video we selected three 30-second clips: 30-60 s, 165-195 s, and 270-300 s. Frames were extracted at 2 FPS, i.e. 2 frames per second of video, which gives 180 frames per video and 540 labeled frames across the three videos. Labeling took about 30 s per frame on average. The raw videos are 2164 × 1624 pixels (width × height) at 30 FPS; we rescaled the extracted frames to 541 × 406 pixels (width × height), a quarter of the original size in each dimension.
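The frame counts and the rescaled size can be checked with a short sketch. It assumes that the 2 frames per second are taken at even spacing (every 15th frame of the 30 FPS source), which the text does not state; the function names `clip_frame_indices` and `rescaled_size` are hypothetical and only serve the illustration.

```python
def clip_frame_indices(clips_s, video_fps=30, extract_fps=2):
    """Return source-frame indices sampled at extract_fps inside each clip.

    Assumption: frames are taken at even spacing, i.e. every
    (video_fps // extract_fps)-th frame of the source video.
    """
    step = video_fps // extract_fps  # 30 // 2 = 15
    indices = []
    for start_s, end_s in clips_s:
        # Clip boundaries are given in seconds; convert to frame numbers.
        indices.extend(range(start_s * video_fps, end_s * video_fps, step))
    return indices


def rescaled_size(width, height, factor=0.25):
    """Scale a (width, height) pair by the same factor in both dimensions."""
    return round(width * factor), round(height * factor)


# The three clips per video, in seconds, as described in the text.
CLIPS = [(30, 60), (165, 195), (270, 300)]

frames = clip_frame_indices(CLIPS)
print(len(frames))                 # 180 frames per video
print(3 * len(frames))             # 540 frames for three videos
print(3 * len(frames) * 30 / 3600) # 4.5 hours of labeling at 30 s per frame
print(rescaled_size(2164, 1624))   # (541, 406)
```

At 30 s per frame, the 540 frames correspond to roughly 4.5 hours of manual labeling.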
In this experiment we selected 5 dog videos from the OFT dataset, 900 frames in total. We applied 5-fold cross-validation to evaluate model performance. The table below presents the PCK result for each fold before and after applying the Savitzky–Golay filter. The PCK score is the percentage of joints whose predicted locations lie within a normalized distance of the ground-truth locations.
Fold | Train PCK smoothed | Test PCK smoothed | Train PCK score @ 0.1 | Test PCK score @ 0.1 |
---|---|---|---|---|
1 | 0.92 | 0.82 | 0.90 | 0.81 |
2 | 0.88 | 0.78 | 0.87 | 0.76 |
3 | 0.93 | 0.82 | 0.90 | 0.80 |
4 | 0.90 | 0.79 | 0.89 | 0.77 |
5 | 0.89 | 0.80 | 0.88 | 0.78 |
mean | 0.90 | 0.80 | 0.88 | 0.78 |
std | 0.018 | 0.016 | 0.011 | 0.018 |
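As a minimal sketch of how a PCK score at threshold 0.1 could be computed from the definition above: the text does not say which length is used for normalization, so `norm_length` is an assumed parameter (for example a bounding-box size), and the function name `pck` and the sample coordinates are made up for illustration.

```python
import math


def pck(pred, truth, norm_length, threshold=0.1):
    """Fraction of joints predicted within threshold * norm_length of the truth.

    pred, truth: equal-length lists of (x, y) joint coordinates in pixels.
    norm_length: reference length for normalization (an assumption here;
                 the text does not specify which length is used).
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equal in length")
    max_dist = threshold * norm_length
    hits = 0
    for (px, py), (tx, ty) in zip(pred, truth):
        # Euclidean distance between prediction and ground truth.
        if math.hypot(px - tx, py - ty) <= max_dist:
            hits += 1
    return hits / len(pred)


# Made-up example: four joints, normalization length of 200 pixels,
# so a prediction counts as correct within 20 pixels of the truth.
truth = [(100.0, 100.0), (200.0, 150.0), (300.0, 120.0), (250.0, 300.0)]
pred = [(103.0, 104.0), (200.0, 150.0), (330.0, 160.0), (251.0, 299.0)]

score = pck(pred, truth, norm_length=200.0)
print(score)  # 0.75: three of four joints fall within 20 pixels
```

In this example the third joint is 50 pixels off and is the only miss. Averaging such per-frame scores over a fold would give values comparable to one table cell, provided the normalization assumption holds.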