Browsing by Author "Gawesha, A"
Now showing 1 - 2 of 2
Publication Embargo
DS-HPE: Deep Set for Head Pose Estimation (IEEE, 2023-04-18)
Menan, V; Gawesha, A; Samarasinghe, P; Kasthurirathna, D

Head pose estimation is a critical task that is fundamental to a variety of real-world applications, such as virtual and augmented reality, as well as human behavior analysis. In the past, facial landmark-based methods were the dominant approach to head pose estimation. However, recent research has demonstrated the effectiveness of landmark-free methods, which have achieved state-of-the-art (SOTA) results. In this study, we utilize the Deep Set architecture for the first time in the domain of head pose estimation. Deep Set is a specialized architecture that operates on a “set” of data, which is possible because the model is built from permutation-invariant operators. As a result, the model is a simple yet powerful and edge-computation-friendly method for estimating head pose. We evaluate our proposed method on two benchmark data sets, and we compare it against SOTA methods on a challenging video-based data set. Our results indicate that our proposed method not only achieves accuracy comparable to these SOTA methods but also requires less computational time. Furthermore, the simplicity of our proposed method allows for its deployment in resource-constrained environments without the need for expensive hardware such as Graphics Processing Units (GPUs). This work underscores the importance of accurate and resource-efficient head pose estimation in the fields of computer vision and human-computer interaction, and the Deep Set architecture presents a promising approach to achieving this goal.

Publication Embargo
Spatio-temporal graph neural network based child action recognition using data-efficient methods: A systematic analysis (Elsevier Inc, 2025-06-03)
Mohottala, S; Gawesha, A; Kasthurirathna, D; Samarasinghe, P; Abhayaratne, C

This paper presents implementations of child activity recognition (CAR) using spatial–temporal graph neural network (ST-GNN)-based deep learning models with the skeleton modality. Prior implementations in this domain have predominantly utilized CNN, LSTM, and other methods, despite the superior performance potential of graph neural networks. To the best of our knowledge, this study is the first to use an ST-GNN model for child activity recognition employing in-the-lab, in-the-wild, and in-the-deployment skeleton data. To overcome the challenges posed by small publicly available child action datasets, transfer learning methods such as feature extraction and fine-tuning were applied to enhance model performance. As a principal contribution, we developed an ST-GNN-based skeleton modality model that, despite using a relatively small child action dataset, achieved superior performance (94.81%) compared to implementations trained on a significantly larger (×10) adult action dataset (90.6%) for a similar subset of actions. With ST-GCN-based feature extraction and fine-tuning methods, accuracy improved by 10%–40% compared to vanilla implementations, reaching a maximum accuracy of 94.81%. Additionally, implementations with other ST-GNN models demonstrated further accuracy improvements of 15%–45% over the ST-GCN baseline. The results on activity datasets empirically demonstrate that class diversity, dataset size, and careful selection of pre-training datasets significantly enhance accuracy. In-the-wild and in-the-deployment implementations confirm the real-world applicability of the above approaches, with the ST-GNN model achieving 11 FPS on streaming data. Finally, preliminary evidence on the impact of graph expressivity and graph rewiring on the accuracy of models trained on small datasets is provided, outlining potential directions for future research. The code is available at https://github.com/sankamohotttala/ST_GNN_HAR_DEML.
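The permutation-invariance property at the core of the Deep Set architecture described in the first abstract can be illustrated with a minimal sketch. This is not the paper's model: the dimensions, weights, and names below are hypothetical stand-ins with random parameters. A per-element network phi embeds each set element, a sum pools the embeddings (the step that makes the output independent of element order), and a decoder rho maps the pooled vector to an output, here three values standing in for yaw/pitch/roll angles.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny Deep Set: phi embeds each set element, a permutation-
# invariant sum pools the embeddings, and rho decodes the pooled vector.
# Weights are random; 2-D "landmark" inputs and 3 "pose angle" outputs are
# illustrative choices, not the paper's configuration.
D_IN, D_HID, D_OUT = 2, 8, 3
W_phi = rng.normal(size=(D_IN, D_HID))
W_rho = rng.normal(size=(D_HID, D_OUT))

def deep_set(points: np.ndarray) -> np.ndarray:
    """points: (n, D_IN) set of elements; output does not depend on order."""
    h = np.tanh(points @ W_phi)   # phi: per-element embedding
    pooled = h.sum(axis=0)        # permutation-invariant pooling
    return pooled @ W_rho         # rho: decode the pooled representation

pts = rng.normal(size=(5, D_IN))
out1 = deep_set(pts)
out2 = deep_set(pts[::-1])        # same set, reversed element order
assert np.allclose(out1, out2)    # order does not change the output
```

Because the only interaction between elements is a sum, any reordering of the input set produces the same pooled vector, which is what makes the model simple and cheap enough for edge deployment.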
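The feature-extraction variant of transfer learning mentioned in the second abstract (freeze a pretrained backbone, train only a new classifier head on the small target dataset) can be sketched as follows. This is a toy illustration under stated assumptions: the "backbone" here is a fixed random projection standing in for a real pretrained ST-GCN, the data and labels are synthetic, and every name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy feature-extraction transfer learning: the "pretrained backbone" is a
# frozen random projection (a stand-in for a pretrained ST-GCN); only a small
# linear head is trained, by gradient descent on a softmax cross-entropy loss.
N, D_SKEL, D_FEAT, N_CLS = 64, 10, 16, 3
W_backbone = rng.normal(size=(D_SKEL, D_FEAT))   # frozen: never updated below
X = rng.normal(size=(N, D_SKEL))                 # synthetic skeleton features

feats = np.tanh(X @ W_backbone)                  # extract frozen features once

# Synthetic labels made linearly decodable from the frozen features, so that
# training the head alone can succeed.
W_true = rng.normal(size=(D_FEAT, N_CLS))
y = np.argmax(feats @ W_true, axis=1)

W_head = np.zeros((D_FEAT, N_CLS))               # the only trainable weights
for _ in range(300):
    logits = feats @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = feats.T @ (p - np.eye(N_CLS)[y]) / N  # softmax cross-entropy grad
    W_head -= 0.5 * grad                         # head updated; backbone frozen

train_acc = (np.argmax(feats @ W_head, axis=1) == y).mean()
```

Fine-tuning differs from this sketch only in that some or all backbone weights are also unfrozen and updated, usually at a smaller learning rate, which is where the abstract's 10%–40% gains over vanilla training come in.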
