Model Promotion: Using EMA to Balance Learning and Forgetting in IIL | HackerNoon

News Room
Last updated: 2025/11/05 at 1:43 PM
News Room Published 5 November 2025

Table of Links

Abstract and 1 Introduction

  2. Related works

  3. Problem setting

  4. Methodology

    4.1. Decision boundary-aware distillation

    4.2. Knowledge consolidation

  5. Experimental results and 5.1. Experiment Setup

    5.2. Comparison with SOTA methods

    5.3. Ablation study

  6. Conclusion and future work and References

Supplementary Material

  1. Details of the theoretical analysis on KCEMA mechanism in IIL
  2. Algorithm overview
  3. Dataset details
  4. Implementation details
  5. Visualization of dusted input images
  6. More experimental results

4.2. Knowledge consolidation

Unlike existing IIL methods, which focus only on the student model, we propose consolidating knowledge from the student into the teacher to better balance learning and forgetting. The consolidation is implemented not through learning but through a model exponential moving average (EMA). Model EMA was initially introduced by Tarvainen et al. [28] to enhance the generalizability of models. In vanilla model EMA, the model is trained from scratch and EMA is applied after every iteration. The underlying mechanism of model EMA has not been thoroughly explained before. In this work, we leverage model EMA for knowledge consolidation (KC) in the IIL task and explain its mechanism theoretically. Based on our theoretical analysis, we propose a new KC-EMA for knowledge consolidation. Mathematically, model EMA can be formulated as a weighted average of the teacher's and the student's parameters, governed by a coefficient α.
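
For reference, the mean-teacher update of Tarvainen et al. [28] has the general form below. The symbols θ_t (teacher parameters), θ_s (student parameters) and the step index k are naming assumptions made here for illustration; α is the EMA coefficient discussed in the following paragraph.

```latex
\theta_t^{(k)} = \alpha\,\theta_t^{(k-1)} + (1-\alpha)\,\theta_s^{(k)}, \qquad 0 \le \alpha < 1
```

A value of α close to 1 keeps the teacher near its previous state, while a smaller α moves it faster toward the student.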

Hence, the teacher model can achieve a minimal training loss on both the old task and the new task, which indicates improved generalization on both the old data and the new observations. This has been verified by our experiments in Sec. 5. However, since α < 1, it is noteworthy that the gradient of the teacher model, whether on the old task or the new task, is larger than the initial gradient on the old task or the final gradient of the student model on the new task. That is, the resulting teacher model sacrifices some unilateral performance on either the old data or the new data in order to achieve better generalization on both. From this perspective, the mechanism of vanilla EMA can also be partially explained. In vanilla EMA, where the model starts from scratch and only the new task is considered, we only need to focus on the second term in Equation 13. Since the teacher model has a larger gradient on the training data than the student model, it is less likely to overfit to the training data. As a result, the teacher model generalizes better, as Tarvainen et al. [28] observed.
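
The trade-off described above can be illustrated with a toy sketch. It is not the authors' implementation or their analysis: the quadratic losses, the parameter names `w1` and `w2`, the value `alpha = 0.9`, and the helper functions are all assumptions chosen for illustration. The teacher starts at the optimum of a hypothetical old task, the student sits at the optimum of a hypothetical new task, and one EMA step moves the teacher between the two.

```python
def ema_update(teacher, student, alpha):
    """Return alpha * teacher + (1 - alpha) * student, parameter by parameter."""
    return {name: alpha * teacher[name] + (1.0 - alpha) * student[name]
            for name in teacher}


def quadratic_grad(params, optimum):
    """Gradient of the toy loss 0.5 * sum((p - o)^2) with respect to params."""
    return {name: params[name] - optimum[name] for name in params}


def grad_norm(grad):
    """Euclidean norm of a gradient stored as a dict of floats."""
    return sum(g * g for g in grad.values()) ** 0.5


# Hypothetical optima of the old and new tasks (assumed values).
old_optimum = {"w1": 0.0, "w2": 0.0}
new_optimum = {"w1": 1.0, "w2": 2.0}

# Teacher begins at the old-task optimum, student at the new-task optimum,
# so each has zero gradient on its own task before consolidation.
teacher = dict(old_optimum)
student = dict(new_optimum)

alpha = 0.9  # assumed EMA coefficient, alpha < 1
teacher = ema_update(teacher, student, alpha)

# After the EMA step the teacher has a nonzero gradient on both toy tasks.
g_old = grad_norm(quadratic_grad(teacher, old_optimum))
g_new = grad_norm(quadratic_grad(teacher, new_optimum))
print(teacher, g_old, g_new)
```

In this toy setting the consolidated teacher lies between the two optima, so its gradient on each task is larger than the zero gradient that the old teacher had on the old task and that the student has on the new task. This mirrors the statement that the teacher gives up some unilateral performance in exchange for a compromise between both tasks.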

Authors:

(1) Qiang Nie, Hong Kong University of Science and Technology (Guangzhou);

(2) Weifu Fu, Tencent Youtu Lab;

(3) Yuhuan Lin, Tencent Youtu Lab;

(4) Jialin Li, Tencent Youtu Lab;

(5) Yifeng Zhou, Tencent Youtu Lab;

(6) Yong Liu, Tencent Youtu Lab;

(7) Qiang Nie, Hong Kong University of Science and Technology (Guangzhou);

(8) Chengjie Wang, Tencent Youtu Lab.

This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.
