Search results for: openpose

3 Multiperson Drone Control with Seamless Pilot Switching Using Onboard Camera and Openpose Real-Time Keypoint Detection

Authors: Evan Lowhorn, Rocio Alba-Flores

Abstract:

Traditional classification Convolutional Neural Networks (CNN) attempt to classify an image in its entirety. This becomes problematic when trying to perform classification with a drone’s camera in real-time due to unpredictable backgrounds. Object detectors with bounding boxes can be used to isolate individuals and other items, but the original backgrounds remain within these boxes. These basic detectors have been regularly used to determine what type of object an item is, such as “person” or “dog.” Recent advancement in computer vision, particularly with human imaging, is keypoint detection. Human keypoint detection goes beyond bounding boxes to fully isolate humans and plot points, or Regions of Interest (ROI), on their bodies within an image. ROIs can include shoulders, elbows, knees, heads, etc. These points can then be related to each other and used in deep learning methods such as pose estimation. For drone control based on human motions, poses, or signals using the onboard camera, it is important to have a simple method for pilot identification among multiple individuals while also giving the pilot fine control options for the drone. To achieve this, the OpenPose keypoint detection network was used with body and hand keypoint detection enabled. OpenPose supports the ability to combine multiple keypoint detection methods in real-time with a single network. Body keypoint detection allows simple poses to act as the pilot identifier. The hand keypoint detection with ROIs for each finger can then offer a greater variety of signal options for the pilot once identified. For this work, the individual must raise their non-control arm to be identified as the operator and send commands with the hand on their other arm. The drone ignores all other individuals in the onboard camera feed until the current operator lowers their non-control arm. When another individual wish to operate the drone, they simply raise their arm once the current operator relinquishes control, and then they can begin controlling the drone with their other hand. This is all performed mid-flight with no landing or script editing required. When using a desktop with a discrete NVIDIA GPU, the drone’s 2.4 GHz Wi-Fi connection combined with OpenPose restrictions to only body and hand allows this control method to perform as intended while maintaining the responsiveness required for practical use.

Keywords: computer vision, drone control, keypoint detection, openpose

Procedia PDF Downloads 184

2 Real Time Multi Person Action Recognition Using Pose Estimates

Authors: Aishrith Rao

Abstract:

Human activity recognition is an important aspect of video analytics, and many approaches have been recommended to enable action recognition. In this approach, the model is used to identify the action of the multiple people in the frame and classify them accordingly. A few approaches use RNNs and 3D CNNs, which are computationally expensive and cannot be trained with the small datasets which are currently available. Multi-person action recognition has been performed in order to understand the positions and action of people present in the video frame. The size of the video frame can be adjusted as a hyper-parameter depending on the hardware resources available. OpenPose has been used to calculate pose estimate using CNN to produce heap-maps, one of which provides skeleton features, which are basically joint features. The features are then extracted, and a classification algorithm can be applied to classify the action.

Keywords: human activity recognition, computer vision, pose estimates, convolutional neural networks

Procedia PDF Downloads 141

1 Fine Grained Action Recognition of Skateboarding Tricks

Authors: Frederik Calsius, Mirela Popa, Alexia Briassouli

Abstract:

In the field of machine learning, it is common practice to use benchmark datasets to prove the working of a method. The domain of action recognition in videos often uses datasets like Kinet-ics, Something-Something, UCF-101 and HMDB-51 to report results. Considering the properties of the datasets, there are no datasets that focus solely on very short clips (2 to 3 seconds), and on highly-similar fine-grained actions within one specific domain. This paper researches how current state-of-the-art action recognition methods perform on a dataset that consists of highly similar, fine-grained actions. To do so, a dataset of skateboarding tricks was created. The performed analysis highlights both benefits and limitations of state-of-the-art methods, while proposing future research directions in the activity recognition domain. The conducted research shows that the best results are obtained by fusing RGB data with OpenPose data for the Temporal Shift Module.

Keywords: activity recognition, fused deep representations, fine-grained dataset, temporal modeling

Procedia PDF Downloads 231