@Matrixzhu 2021-09-22T08:59:13.000000Z

Experiment result

Representation

Key-points selection

In total, 19 keypoints are selected. The left and right eyes are not included because they are not labeled in the StanfordExtra dataset. For each image, only the visible keypoints are labeled.

Keypoint name	Proportion in dataset
Left front leg: paw	0.92
Left front leg: middle joint	0.93
Left rear leg: paw	0.86
Left rear leg: middle joint	0.69
Left rear leg: top	0.65
Right front leg: paw	0.55
Right front leg: middle joint	0.87
Right front leg: top	0.88
Right rear leg: paw	0.84
Right rear leg: middle joint	0.68
Right rear leg: top	0.66
Tail start	0.57
Tail end	0.58
Base of left ear	0.86
Base of right ear	0.85
Nose	0.99
Left ear tip	0.44
Right ear tip	0.45
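
As a rough illustration of how a column like the one above could be computed, here is a minimal sketch. It assumes that "proportion in dataset" means the fraction of images in which a keypoint is labeled as visible. The function name `keypoint_proportions`, the input layout (one dict per image), and the toy data are all hypothetical; they are not the StanfordExtra annotation format.

```python
from collections import Counter


def keypoint_proportions(annotations):
    """Return {keypoint name: fraction of images in which it is visible}.

    `annotations` is a hypothetical structure: one dict per image,
    mapping keypoint name -> True/False (visible or not).
    """
    counts = Counter()
    for visible in annotations:
        for name, is_visible in visible.items():
            if is_visible:
                counts[name] += 1
    total = len(annotations)
    names = {name for visible in annotations for name in visible}
    return {name: counts[name] / total for name in sorted(names)}


# Toy data for illustration only, not taken from the dataset.
toy = [
    {"Nose": True, "Tail end": True},
    {"Nose": True, "Tail end": False},
    {"Nose": True, "Tail end": False},
    {"Nose": True, "Tail end": True},
]
print(keypoint_proportions(toy))
```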

Selected dog breeds

Six dog breeds were selected at random.

Dandie_Dinmont
Tibetan_terrier
bluetick
Rhodesian_ridgeback
Brittany_spaniel
Brabancon_griffon

Example image with ground truth for each breed

Training dataset description

The image dataset I use is called StanfordExtra. It contains 20,580 dog images across 120 breeds, of which 12,538 are labeled with ground-truth keypoints. For this experiment I selected the 6 breeds mentioned above, with 60 images for each breed. Because the dogs appear against different backgrounds and in different poses, some images contain only a small proportion of my selected keypoints. My selection criterion is therefore that an image is used as training data only if more than 60 percent of the keypoints in my label set are visible in it. In total, 353 images were selected. The training fraction used is 0.9. There are 302 training images and 51 test images.
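
The selection rule and the train/test split described above can be sketched as follows. This is only an illustration under stated assumptions: the visibility flags are stored as lists of 0/1 values, the threshold is applied as "strictly more than 60 percent", and the split is a simple seeded shuffle. The names `visible_fraction`, `select_images`, and `split_train_test` are hypothetical, and the sketch is independent of how DeepLabCut performs its own split.

```python
import random


def visible_fraction(visibility):
    """Fraction of the label set that is visible in one image."""
    return sum(visibility) / len(visibility)


def select_images(images, threshold=0.6):
    """Keep images in which strictly more than `threshold` of the keypoints are visible."""
    return [name for name, vis in images.items() if visible_fraction(vis) > threshold]


def split_train_test(names, train_fraction=0.9, seed=0):
    """Shuffle reproducibly, then cut the list at the training fraction."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Toy visibility flags (1 = visible), 10 keypoints per image, for illustration only.
images = {
    "a.jpg": [1] * 8 + [0] * 2,   # 80% visible, kept
    "b.jpg": [1] * 6 + [0] * 4,   # exactly 60%, not kept
    "c.jpg": [1] * 3 + [0] * 7,   # 30% visible, not kept
    "d.jpg": [1] * 10,            # 100% visible, kept
}
selected = select_images(images)
train, test = split_train_test([f"img_{i}.jpg" for i in range(10)])
print(selected, len(train), len(test))
```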

Model result

The model was trained for 200,000 iterations, as recommended by the official DeepLabCut documentation. Mean squared error is used as the loss function; it is measured from the Euclidean distance between the coordinates of each ground-truth keypoint and the coordinates of the corresponding predicted keypoint. The unit is pixels.
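
To make the error measure concrete, here is a minimal sketch of the per-keypoint Euclidean distance and two summaries of it. It assumes keypoints are `(x, y)` pixel coordinates and that a keypoint without a ground-truth label is stored as `None` and skipped. The function names and the toy coordinates are hypothetical; this is not DeepLabCut's own evaluation code.

```python
import math


def euclidean_errors(ground_truth, predictions):
    """Pixel distance for each keypoint that has a ground-truth label."""
    errors = []
    for gt, pred in zip(ground_truth, predictions):
        if gt is None:  # keypoint not labeled in this image
            continue
        errors.append(math.hypot(gt[0] - pred[0], gt[1] - pred[1]))
    return errors


def mean_error(errors):
    """Mean Euclidean distance, in pixels."""
    return sum(errors) / len(errors)


def mean_squared_error(errors):
    """Mean of the squared distances, in squared pixels."""
    return sum(e * e for e in errors) / len(errors)


# Toy coordinates for illustration only.
gt = [(10, 10), (20, 20), None]
pred = [(13, 14), (20, 25), (0, 0)]
errors = euclidean_errors(gt, pred)
print(errors, mean_error(errors), mean_squared_error(errors))
```

Both summaries are shown because they differ in unit: the mean distance is in pixels, while the mean of the squared distances is in squared pixels.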