World of Software

Valve’s ACO Compiler Used By AMD Drivers Optimize Scheduling Heuristic For Newer GPUs

News Room
Last updated: 2025/08/06 at 6:39 AM
News Room Published 6 August 2025

Improved scheduling heuristics for the ACO compiler back-end developed by Valve were merged today into Mesa 25.3-devel, benefiting the RADV Vulkan and RadeonSI Gallium3D drivers for AMD GPUs.

The new heuristics should help performance on newer AMD Radeon graphics processors. The existing ACO scheduling heuristics were tuned for the aging Polaris GPUs, whereas the reworked code is better adapted to more recent hardware.

Daniel Schürmann opened the merge request nearly a year ago to improve the ACO scheduling heuristic, and this morning it finally landed in Mesa Git.

Schürmann explains in the merge request:

The ACO scheduling heuristic stems from the era of dinosaurs, more precisely the Polaris family, and wasn’t touched since.

small introduction: Given the instruction sequence of a shader, the ACO scheduler works by gradually moving up memory load instructions until it cannot find any other independent instructions to move down or until the register pressure exceeds certain predetermined limits (more on that later). Then, it tries to move down the first use of the loaded value, so that the distance between load and use is as high as possible. Of course, there are lots of small corner cases and small adjustments for how multiple memory loads are ordered with each other, but that is the rough idea (and there are no plans to ever change that).
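
To make the upward move concrete, here is a minimal Python sketch that is not taken from the merge request or from Mesa. It models only the first half of the description: a load is swapped upward past independent instructions until a dependency or a register-pressure limit stops it. The `Instr` type, the `make` helper, the liveness-based `peak_pressure` estimate and the toy program are all assumptions made for illustration; the real scheduler is written in C++ and handles many more cases, including moving the first use downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical toy instruction: the values it defines and uses."""
    name: str
    defs: frozenset
    uses: frozenset
    is_load: bool = False


def make(name, defs, uses, is_load=False):
    return Instr(name, frozenset(defs), frozenset(uses), is_load)


def peak_pressure(instrs):
    """Largest number of simultaneously live values in a straight-line block."""
    live, peak = set(), 0
    for ins in reversed(instrs):
        live |= ins.defs              # a result occupies a register
        peak = max(peak, len(live))
        live -= ins.defs              # not live before its definition
        live |= ins.uses              # operands are live before the use
        peak = max(peak, len(live))
    return peak


def independent(a, b):
    """True if a and b can be swapped (no RAW, WAR or WAW hazard)."""
    return not (a.defs & b.uses or b.defs & a.uses or a.defs & b.defs)


def hoist_loads(instrs, reg_limit):
    """Move each load upward while it stays independent and within reg_limit."""
    out = list(instrs)
    for load in [i for i in out if i.is_load]:
        pos = next(k for k, x in enumerate(out) if x is load)
        while pos > 0:
            prev = out[pos - 1]
            if prev.is_load or not independent(prev, load):
                break                                  # dependency: stop here
            out[pos - 1], out[pos] = out[pos], out[pos - 1]
            if peak_pressure(out) > reg_limit:
                out[pos - 1], out[pos] = out[pos], out[pos - 1]  # undo
                break                                  # pressure limit reached
            pos -= 1
    return out


# Toy shader: "v" is the memory load, "r" is its first use.
PROGRAM = [
    make("a", ["a"], ["x", "y"]),
    make("b", ["b"], ["a"]),
    make("v", ["v"], ["ptr"], is_load=True),
    make("r", ["r"], ["v", "b"]),
    make("st", [], ["ptr", "r"]),
]

tight = [i.name for i in hoist_loads(PROGRAM, reg_limit=3)]
loose = [i.name for i in hoist_loads(PROGRAM, reg_limit=4)]
```

In this toy program a limit of three live values lets the load move up by one instruction, while a limit of four lets it reach the top of the block, which is the trade-off between load latency hiding and register pressure that the text describes.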

GPUs are different: On most modern GPUs, the register file is shared between hardware-threads (aka waves or warps), meaning that the more registers some shader uses, the fewer instances of that shader can run in parallel and vice-versa. While using more registers lowers the occupancy, it can also improve the execution time of a single shader and reduce the likelihood of cache thrashing (which means that different shaders evict each other's data from the cache). So, how many registers should we use?
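
The relationship between register use and occupancy can be sketched with simple integer arithmetic. The function below is an illustration only: the default budget of 256 vector registers and the cap of 10 waves are assumed example values rather than figures from the article, and real hardware also rounds register allocations to a granularity that is ignored here.

```python
def waves_per_simd(vgprs_per_wave, vgpr_budget=256, max_waves=10):
    """Estimate how many waves fit on one SIMD.

    vgpr_budget and max_waves are assumed, illustrative defaults;
    allocation granularity is deliberately not modeled.
    """
    if vgprs_per_wave <= 0:
        raise ValueError("a shader must use at least one register")
    return min(max_waves, vgpr_budget // vgprs_per_wave)


# More registers per shader means fewer waves in flight.
occupancy_by_demand = {n: waves_per_simd(n) for n in (24, 32, 64, 128)}
```

Under these assumed numbers, a shader that needs 64 registers fits four times, and doubling its register demand halves the occupancy.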

ACO is unique: On CPUs, the concept of occupancy doesn’t usually exist which means that existing compilers rarely care either. When developing ACO, much emphasis was put into the ability to predetermine a desired occupancy by being able to schedule within fixed register limits and avoiding additional spilling (keyword: SSA-based register allocation). Currently, when determining the desired occupancy, the only information we take into account is the occupancy of the shader before scheduling happened. We then might allow a lower occupancy to give some room for improved scheduling.

This rewrite aims to detangle some concepts and provide more consistent results.

– wave_factor: The purpose of this value is to reflect that RDNA SIMDs can accommodate twice as many waves as GCN SIMDs.
– reg_file_multiple: This value accounts for the larger register file of wave32 and some RDNA3 families.
– wave_minimum: Below this value, we don’t sacrifice any waves. It corresponds to a register demand of 64 VGPRs in wave64.
– occupancy_factor: Depending on target_waves and wave_factor, this controls the scheduling window sizes and number of moves.

The main differences from the previous heuristic are a lower wave minimum and a slightly less aggressive reduction of waves.
It also increases SMEM_MAX_MOVES in order to mitigate some of the changes from targeting fewer waves.
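
As a rough illustration of how a wave minimum and a limited reduction of waves could interact, consider the following sketch. It is not the formula from the merge request: `max_sacrifice` is an invented parameter, the 256-register budget is an assumed example value, and `wave_factor`, `reg_file_multiple` and `occupancy_factor` are not modeled at all.

```python
def target_waves(pre_sched_waves, wave_minimum, max_sacrifice):
    """Hypothetical policy for choosing the occupancy to schedule for.

    At or below wave_minimum no waves are given up; above it, at most
    max_sacrifice waves are traded for scheduling freedom, without
    ever dropping under wave_minimum.
    """
    if pre_sched_waves <= wave_minimum:
        return pre_sched_waves
    return max(wave_minimum, pre_sched_waves - max_sacrifice)


def vgpr_limit(waves, vgpr_budget=256):
    """Registers each wave may use if `waves` of them share the budget."""
    return vgpr_budget // waves


# Example: a shader that reached 8 waves before scheduling.
chosen = target_waves(pre_sched_waves=8, wave_minimum=4, max_sacrifice=2)
limit = vgpr_limit(chosen)
```

Once a target is chosen, the scheduler can work within the matching register limit, which is the "schedule within fixed register limits" idea from the quoted description.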

This should yield at least minor real-world performance gains for Linux gamers on AMD hardware, though the impact can vary from game to game.

More details on this fundamental improvement to the ACO compiler code are available in the merge request. The change will be part of Mesa 25.3, due out in Q4, so there is still time for further optimizations to the ACO and RADV driver code.
