Instance-Aware Grouped Quantization (IGQ-ViT) Sets New Benchmarks for ViT PTQ | HackerNoon

News Room
Published 17 November 2025 (last updated 5:03 PM)

Table of Links

Abstract and 1. Introduction

2. Related Work

3. Method

  3.1. Uniform quantizer

  3.2. IGQ-ViT

  3.3. Group size allocation

4. Experiments

  4.1. Implementation details and 4.2. Results

  4.3. Discussion

5. Conclusion, Acknowledgements, and References

Supplementary Material

A. More implementation details

B. Compatibility with existing hardware

C. Latency on practical devices

D. Application to DETR

4. Experiments

In this section, we describe our experimental settings (Sec. 4.1) and evaluate IGQ-ViT on image classification, object detection, and instance segmentation (Sec. 4.2). We then present a detailed analysis of our approach (Sec. 4.3).

4.1. Implementation details

We evaluate our IGQ-ViT framework on the tasks of image classification, object detection, and instance segmentation. We use the ImageNet [8] dataset for image classification, which contains approximately 1.2M images for training and 50K for validation. We use COCO [22] for object detection and instance segmentation, which includes 118K training, 5K validation, and 20K test images. We adopt various transformer architectures, including ViT [10], DeiT [34], and Swin transformer [25], for image classification. For object detection and instance segmentation, we use Mask R-CNN [14] and Cascade Mask R-CNN [5] with Swin transformers as the backbone. Following [9, 21], we randomly sample 32 images from the ImageNet [8] dataset for image classification, and a single image from COCO [22] for object detection and instance segmentation, to calibrate the quantization parameters. We apply our instance-aware grouping technique to all input activations of FC layers and to softmax attentions. More detailed settings are available in the supplement.
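Calibrating quantization parameters from a handful of images, as described above, amounts to estimating a scale and zero point for a uniform quantizer from sample statistics. The sketch below is illustrative only, not the authors' code: the function names and the simple min/max calibration rule are assumptions, shown for a 4-bit asymmetric uniform quantizer.

```python
import numpy as np

def calibrate_uniform_quantizer(x, n_bits=4):
    """Derive scale and zero point for an asymmetric uniform quantizer
    from calibration data x (e.g., activations from a few sampled images)."""
    qmax = 2 ** n_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, n_bits=4):
    """Simulated quantization: round to the integer grid, then de-quantize."""
    qmax = 2 ** n_bits - 1
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale

# Usage: calibrate on a small sample, then quantize unseen activations.
rng = np.random.default_rng(0)
calib = rng.normal(size=(32, 197, 768))   # 32 calibration "images" of ViT-style activations
scale, zp = calibrate_uniform_quantizer(calib, n_bits=4)
x = rng.normal(size=(197, 768))           # activations for a new input
x_q = quantize(x, scale, zp, n_bits=4)
```

With 4 bits, every de-quantized value falls on one of at most 16 grid points, and the reconstruction error for in-range inputs is bounded by the step size, which is why a quantizer calibrated on a few images can generalize as long as activation statistics are stable across inputs.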

4.2. Results

Results on ImageNet. We show in Table 2 the top-1 accuracy (%) on the validation split of ImageNet [8] with various ViT architectures. We report the accuracy with average group sizes of 8 and 12. We summarize our findings as follows: (1) Our IGQ-ViT framework with 8 groups already outperforms the state of the art, except for ViT-B [10] and Swin-S [25] under the 6/6-bit setting, while using more groups boosts the performance further. (2) Our approach under the 4/4-bit setting consistently outperforms RepQ-ViT [21] by a large margin. Similar to ours, RepQ-ViT also addresses the scale variations between channels, but it can only be applied to activations preceded by LayerNorm. In contrast, our method handles the scale variations on all input activations of FC layers and softmax attentions, providing better results. (3) Our group size allocation technique boosts the quantization performance for all models, indicating that using the same number of groups for all layers is suboptimal. (4) Exploiting 12 groups incurs an accuracy drop of less than 0.9% compared to the upper bound under the 6/6-bit setting. Note that the upper-bound results are obtained by using a separate quantizer for each channel of activations and each row of softmax attentions.
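The advantage of grouping over a single layer-wise quantizer can be made concrete with a toy sketch. This is not the authors' implementation: the range-sorting heuristic, the contiguous splitting, and all names are assumptions, meant only to show why channels with similar dynamic ranges benefit from sharing quantization parameters per instance.

```python
import numpy as np

def instance_aware_group_quantize(x, n_groups=8, n_bits=4):
    """Toy per-instance grouped quantization for x of shape (tokens, channels).
    Channels are sorted by their per-instance dynamic range and split into
    n_groups groups; each group gets its own uniform quantizer."""
    qmax = 2 ** n_bits - 1
    ranges = x.max(axis=0) - x.min(axis=0)   # per-channel range for THIS instance
    order = np.argsort(ranges)               # group channels of similar range together
    out = np.empty_like(x)
    for idx in np.array_split(order, n_groups):
        lo, hi = x[:, idx].min(), x[:, idx].max()
        scale = (hi - lo) / qmax if hi > lo else 1.0
        zp = round(-lo / scale)
        q = np.clip(np.round(x[:, idx] / scale) + zp, 0, qmax)
        out[:, idx] = (q - zp) * scale
    return out

rng = np.random.default_rng(1)
# Channels with very different scales, mimicking the scale variations in ViT activations.
x = rng.normal(size=(197, 64)) * rng.uniform(0.1, 10.0, size=64)
x_grouped = instance_aware_group_quantize(x, n_groups=8, n_bits=4)
x_single = instance_aware_group_quantize(x, n_groups=1, n_bits=4)  # layer-wise baseline
```

With n_groups=1 the function degenerates to a single layer-wise quantizer, whose step size is dictated by the widest channel; with several groups, narrow-range channels get a much finer grid, which is the intuition behind the gap over layer-wise baselines reported above.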

Results on COCO. We show in Table 3 the quantization results for object detection and instance segmentation on COCO [22]. We quantize the backbones of Swin transformers [25] and the convolutional layers in the neck and head of Mask R-CNN [14] and Cascade Mask R-CNN [5]. We observe that PTQ4ViT [40] and APQ-ViT [9], which use layer-wise quantizers for activations, do not perform well. In contrast, IGQ-ViT outperforms the state of the art with only 8 groups, and the quantization performance improves further with more groups. In particular, it provides results nearly identical to the full-precision ones under the 6/6-bit setting. This suggests that scale variations across different channels or tokens are much more critical for object detection and instance segmentation.

Table 4. Quantitative comparison of our instance-aware group quantization technique with various configurations under a 4/4-bit setting. We denote by 'Linear' and 'Attention' the quantization method for linear operations and softmax attentions, respectively. For applying our method, we use a group size of 8 for all layers.

:::info
Authors:

(1) Jaehyeon Moon, Yonsei University and Articron;

(2) Dohyung Kim, Yonsei University;

(3) Junyong Cheon, Yonsei University;

(4) Bumsub Ham, Yonsei University (corresponding author).

:::


:::info
This paper is available on arXiv under a CC BY-NC-ND 4.0 Deed (Attribution-Noncommercial-Noderivs 4.0 International) license.

:::
