Browsing by Author "Sharma, K"

Now showing 1 - 1 of 1
Publication (Embargo)
    Data-centric single teacher guided knowledge distillation for alleviating sub-optimal supervision in image classification
    (Elsevier Ltd, 2026-02-23) Sharma, K; Silva, B. N
    In recent years, larger, deeper, and more complex deep learning models have emerged as a result of advancements in deep learning techniques. Nevertheless, the computational costs have also increased with the growing model size. Thus, Knowledge Distillation has evolved into a cornerstone in contemporary machine learning, facilitating the transfer of knowledge from cumbersome teacher models to more compact student models. However, student learning is persistently challenged by sub-optimal supervision caused by erroneous and ambiguous teacher predictions. Moreover, the learning process is further deteriorated by the complications introduced through frequently encountered noisy labels in real-world datasets. Existing methods often resort to the ensemble of teachers, introducing additional complexity. We propose a novel, simple, and efficient learning method, Corrective Knowledge Distillation (CKD), to alleviate these drawbacks while relying solely on a single-teacher model. The proposed work employs a two-phase learning paradigm. In the initial phase, the teacher selectively teaches extremely confident knowledge to the student, and in the subsequent phase, the student leverages its own past learning experiences, conditioning its knowledge acquisition on the guidance of the teacher. The proposed method consistently exhibits superior performance in addressing sub-optimal supervision, as evidenced by comprehensive experiments on benchmark datasets such as CIFAR-100, CIFAR-100N-Fine, and ImageNet-1K. Notably, CKD surpasses established baselines, achieving substantial accuracy gains of up to 3.53% in real-world scenarios. Furthermore, CKD exhibits exceptional robustness in highly noisy environments, outperforming ensemble techniques by a significant margin of up to 5.18%. Our code is available at https://github.com/Karthick47v2/ckd.
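The abstract describes a first phase in which the teacher transfers only its most confident predictions to the student. As an illustration only — the paper does not give its exact loss here, and the threshold `tau`, temperature `T`, and function names below are assumptions — a confidence-filtered distillation loss might be sketched as:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def confidence_filtered_kd_loss(teacher_logits, student_logits, tau=0.9, T=2.0):
    """Distill only on samples where the teacher's top probability
    exceeds tau, mimicking 'selectively teaching extremely confident
    knowledge'. Returns the mean KL divergence over kept samples."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    mask = p_t.max(axis=-1) >= tau  # keep only confident teacher predictions
    if not mask.any():
        return 0.0  # no sample passes the confidence filter
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return float((kl * mask).sum() / mask.sum()) * T * T  # usual T^2 scaling

# Usage: a confident teacher row is kept, a uniform (ambiguous) row is dropped.
teacher = np.array([[10.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
student = np.array([[10.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
loss = confidence_filtered_kd_loss(teacher, student)
```

With identical teacher and student logits the KL term is zero on the kept samples, so the loss is zero; ambiguous teacher rows never contribute, which is the mechanism the abstract credits for avoiding sub-optimal supervision.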

Copyright 2025 © SLIIT. All Rights Reserved.
