Know the Angles: An Efficient Approach for the Automatic Recognition of Human Activities

ETRI Journal Editorial Office
Published in ETRI Journal · May 16, 2022

Scientists develop AI model that can accurately and efficiently determine human activities from video footage

Scientists have developed an innovative approach for automatic human activity recognition — a computer vision tool that allows a system to classify a person’s action or state using video footage. The proposed solution is based on training a machine-learning model to analyze patterns in key skeletal joint angles in order to recognize activities. Their approach is low in cost yet highly efficient, paving the way for applications in video surveillance, healthcare, and human–computer interaction.

This study paves the way for computer vision applications involving human activity recognition

Computer vision is one of the most researched fields in artificial intelligence because of its many applications and the prospect of automating countless human tasks. In particular, the subfield of human activity recognition (HAR) has garnered much attention. It involves training a machine-learning model to automatically differentiate between various human activities or states based solely on video footage. The HAR problem presents many challenges, including rotation and scale variations, camera motion, large inter-class variation, and data margin issues. While several solutions have been proposed, striking the right balance between cost, computational efficiency, and accuracy has proven difficult because available HAR systems typically operate on large amounts of data.

In a recent study published in ETRI Journal, a team of scientists including Dr. Ömer Faruk İnce of Korea Institute of Science and Technology, Korea, proposed a new and highly practical approach to HAR. Their strategy involves the use of a Kinect v2, a commercially available device that uses a camera and a depth sensor and keeps track of a person’s skeleton. The word ‘skeleton’ here refers not to human bones, but to a series of 25 connected points that denote the position of certain key locations in the human body, such as the head, neck, shoulders, spine, arms, knees, feet, and so on.
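The angle at any skeletal joint can be recovered from the 3D coordinates of that joint and its two neighbors. As a minimal sketch (the coordinates below are hypothetical, not Kinect output), the angle is the arccosine of the normalized dot product of the two bone vectors:

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Return the 3D angle (in degrees) at `joint`, formed by the
    bone vectors toward its neighboring skeleton points."""
    v1 = np.asarray(parent, dtype=float) - np.asarray(joint, dtype=float)
    v2 = np.asarray(child, dtype=float) - np.asarray(joint, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against tiny floating-point excursions outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Example: a right angle at the elbow (made-up coordinates)
shoulder = (0.0, 1.0, 0.0)
elbow = (0.0, 0.0, 0.0)
wrist = (1.0, 0.0, 0.0)
print(joint_angle(shoulder, elbow, wrist))  # 90.0
```

Because this angle depends only on the relative geometry of the three points, it is unchanged when the whole skeleton is rotated or uniformly scaled — the invariance property the authors exploit.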

The researchers wanted to build a machine-learning model capable of recognizing various activities when trained with just the 3D angles of a few skeletal joints. “The main motivation of our study was to develop a HAR system offering low cost and high efficiency,” explains Dr. İnce. “Thus, our method uses angle patterns between skeletal joints, which are features invariant to both scale and rotation, to bring a relatively simple yet efficient solution to the HAR problem.” Their model stored the 3D angle values between the selected joints using the sliding kernel method, which ‘cuts’ small slices of time according to a predefined time window.
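The sliding-kernel idea can be sketched as follows — a fixed-length window stepped along the per-frame angle sequence, yielding overlapping slices of time. The window and step sizes below are illustrative, not the paper's values:

```python
import numpy as np

def sliding_windows(series, window, step=1):
    """Cut a time series of per-frame angle vectors into overlapping
    fixed-length slices (a simple 'sliding kernel')."""
    return [series[i:i + window] for i in range(0, len(series) - window + 1, step)]

# 10 frames, each holding 3 joint angles (random stand-in data)
frames = np.random.rand(10, 3)
slices = sliding_windows(frames, window=4, step=2)
print(len(slices), slices[0].shape)  # 4 slices, each of shape (4, 3)
```

Each slice then becomes one training sample, so the model sees short, uniformly sized snippets of motion rather than whole variable-length recordings.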

Once the time-series data is collected, most of the feature processing is conducted in the frequency domain, which is accessed using a mathematical transformation called the wavelet transform. On the frequency side, a very important step in the algorithm takes place, as Dr. İnce explains: “A major contribution of our method involves reducing the dimension of the data in the frequency domain so that analyses can be conducted in a lower dynamic range, which makes the HAR problem easier to handle and solve.”
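To illustrate the flavor of this step, here is a minimal single-level discrete wavelet transform implemented by hand. The Haar wavelet is used purely as an illustrative choice (the paper's exact wavelet and reduction scheme are not specified here); keeping only the approximation coefficients halves the feature length:

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform: pairwise sums
    (approximation) and differences (detail), each scaled by 1/sqrt(2).
    Assumes an even-length input."""
    x = np.asarray(signal, dtype=float)
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
    return approx, detail

# One window of angle values for a single joint (stand-in data)
window = np.sin(np.linspace(0, 2 * np.pi, 32))
approx, detail = haar_dwt(window)

# Discarding the detail coefficients halves the feature dimension
print(len(window), len(approx))  # 32 16
```

The approximation coefficients capture the low-frequency shape of the motion, so discarding the high-frequency detail gives a compact, smoother feature vector in the spirit of the dimension reduction Dr. İnce describes.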

The team showcased the potential of their HAR system through extensive experiments, which verified that it performs adequately. They also highlighted that HAR systems like the one developed here could find uses in elderly care, lifestyle disease monitoring, public video surveillance, and human–computer interaction, the latter of which could in turn be leveraged for video games, robot learning, and fitness. Let us hope computer vision technology keeps evolving so that we can reap the benefits from all angles!

Reference

Title of original paper: Human activity recognition with analysis of angles between skeletal joints using a RGB‐depth sensor

DOI: 10.4218/etrij.2018-0577

Names of authors: Ömer Faruk İnce1, Ibrahim Furkan Ince2, Mustafa Eren Yıldırım2,3, Jang Sik Park2, Jong Kwan Song2, Byung Woo Yoon2

Affiliation:

1 Center for Intelligent and Interactive Robotics, Korea Institute of Science and Technology

2 Department of Electronics Engineering, Kyungsung University

3 Department of Electrical and Electronics Engineering, Bahçeşehir University

About Dr. Ömer Faruk İnce

Ömer Faruk İnce received a BS degree in Electrical and Electronics Engineering from Isik University, Turkey, in 2012. He then received MS and PhD degrees in Electronics Engineering from Kyungsung University, Korea, in 2015 and 2018, respectively. Afterwards, he joined and continues to work for the Korea Institute of Science and Technology as a post-doctoral researcher at the Center for Intelligent and Interactive Robotics.


ETRI Journal is an international, peer-reviewed multidisciplinary journal edited by Electronics and Telecommunications Research Institute (ETRI), Rep. of Korea.