Human action analysis models in artificial intelligence based proctoring systems and dataset for them
DOI:
https://doi.org/10.15276/aait.06.2023.14
Keywords:
Computer vision, neural networks, dataset, transformer, action understanding, video understanding, artificial intelligence based proctoring systems, online proctoring, proctoring system, distance learning, online learning
Abstract
This paper describes an approach to building a specialized model for human action analysis in AI-based proctoring systems
and proposes a prototype dataset containing data specific to the application area. Rapid progress in machine learning
technologies, the availability of devices and access to the Internet are accelerating the development of the field of distance
learning. In parallel with distance learning systems, AI-based proctoring systems, which provide functional analysis of
student work by imitating the teacher's assessment, are developing as well. However, despite advances in image processing
and machine learning technology, the functionality of modern proctoring systems is still at a primitive level. Within the image
processing functionality, they focus entirely on tracking students' faces and do not track postures and actions. At the same time,
assessment of physical activity is necessary not only as part of the learning process, but also to keep students healthy according to
regulatory requirements, as they spend the entire duration of the learning process in front of computers or other devices during
distance learning. In existing implementations, this assessment falls entirely on teachers or even on the students themselves,
who work through the lesson materials or tests on their own. Teachers, meanwhile, have to establish contact through
video communication systems and social media (TikTok, Instagram) and/or analyse videos of students performing certain physical
activities in order to organise the evaluation of physical activity. The lack of such functionality in AI-based proctoring systems slows
down the learning process and potentially harms students' health in the long run. This paper presents additional functional
requirements for AI-based proctoring systems, including human action analysis functionality to assess physical activity and to
monitor compliance with hygiene rules for working with computers during the educational process. For this purpose, a foundation model called
InternVideo was used to process and analyse students' actions. Based on it, an approach to building a specialized model for
student action analysis was proposed. It includes two modes of student activity evaluation during the distance learning process: static
and dynamic. The static mode (also called the working phase) analyses and evaluates the student's behavior during the learning and examination
process, where physical activity is not the main component of learning. The dynamic mode (also called the physical education mode) analyses
and assesses a student who is purposefully performing physical activity (a physical education lesson, exercises for children during a
lesson, etc.). A prototype dataset designed specifically for this application area is also proposed.