Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning
…Zihao Han, Tiangang Zhang, …

Abstract (AI-generated summary): Adaptive Teacher Exposure for Self-Distillation (ATESD) improves large language model reasoning by dynamically adjusting teacher exposure during training through a learnable policy controller. …
