Publication: Data-centric single teacher guided knowledge distillation for alleviating sub-optimal supervision in image classification
Type
Article
Date
2026-02-23
Publisher
Elsevier Ltd
Abstract
In recent years, advances in deep learning have produced ever larger, deeper, and more complex models, and computational costs have grown with model size. Knowledge distillation has therefore become a cornerstone of contemporary machine learning, enabling the transfer of knowledge from cumbersome teacher models to more compact student models. However, student learning is persistently challenged by sub-optimal supervision caused by erroneous and ambiguous teacher predictions, and it is further degraded by the noisy labels frequently encountered in real-world datasets. Existing methods often resort to ensembles of teachers, introducing additional complexity. We propose a novel, simple, and efficient learning method, Corrective Knowledge Distillation (CKD), that alleviates these drawbacks while relying on a single teacher model. The proposed method employs a two-phase learning paradigm: in the initial phase, the teacher selectively transfers only its most confident knowledge to the student, and in the subsequent phase, the student leverages its own past learning experiences, conditioning its knowledge acquisition on the teacher's guidance. CKD consistently exhibits superior performance in addressing sub-optimal supervision, as evidenced by comprehensive experiments on benchmark datasets such as CIFAR-100, CIFAR-100N-Fine, and ImageNet-1K. Notably, CKD surpasses established baselines, achieving accuracy gains of up to 3.53% in real-world scenarios, and it exhibits exceptional robustness in highly noisy environments, outperforming ensemble techniques by a margin of up to 5.18%. Our code is available at https://github.com/Karthick47v2/ckd.
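To make the first phase of the described paradigm concrete, the sketch below illustrates one plausible reading of "the teacher selectively transfers only its most confident knowledge": distillation is applied only to samples whose teacher top-1 softmax probability exceeds a threshold. This is a minimal illustration under stated assumptions, not the paper's implementation; the threshold `tau_conf`, temperature `T`, and weight `alpha` are hypothetical hyperparameters, and the second phase (the student conditioning on its own past predictions) is omitted.

```python
# Minimal sketch of confidence-gated distillation (phase one, as interpreted
# from the abstract). Assumes "extremely confident knowledge" means samples
# where the teacher's top-1 softmax probability exceeds a threshold; the
# actual CKD selection rule may differ (see the linked repository).
import torch
import torch.nn.functional as F

def confidence_gated_kd_loss(student_logits, teacher_logits, targets,
                             tau_conf=0.9, T=4.0, alpha=0.5):
    """Cross-entropy on all samples; KL distillation only on samples the
    teacher is confident about. tau_conf, T, alpha are illustrative values."""
    # Standard supervised loss on every sample in the batch.
    ce = F.cross_entropy(student_logits, targets)

    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits, dim=1)
        # Boolean mask: keep only samples the teacher predicts confidently.
        confident = teacher_probs.max(dim=1).values >= tau_conf

    if confident.any():
        # Temperature-scaled KL divergence on the confident subset only,
        # rescaled by T^2 as in standard knowledge distillation.
        kd = F.kl_div(
            F.log_softmax(student_logits[confident] / T, dim=1),
            F.softmax(teacher_logits[confident] / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
    else:
        # No confident teacher predictions in this batch: no distillation term.
        kd = student_logits.new_zeros(())

    return (1 - alpha) * ce + alpha * kd
```

The gating design matters because it simply withholds the distillation signal on ambiguous samples rather than correcting it, leaving the student supervised by the ground-truth labels alone there; how CKD handles those withheld samples in its second phase is described in the paper itself.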
Keywords
Confident learning, Knowledge distillation, Model compression
