Copyright © All Rights Reserved. World of Software.

Open-Source “GreenBoost” Driver Aims To Augment NVIDIA GPUs vRAM With System RAM & NVMe To Handle Larger LLMs

News Room
Last updated: 2026/03/14 at 8:21 PM
News Room Published 14 March 2026

An open-source, independently developed Linux kernel module called GreenBoost aims to augment the dedicated video memory on NVIDIA discrete GPUs with system memory and NVMe storage. GreenBoost is intended to act as a CUDA caching layer that makes it easier to run larger LLMs that otherwise won’t fit in a graphics card’s dedicated vRAM alone.

GreenBoost was announced today by independent open-source developer Ferran Duarri, who is developing it as a multi-tier GPU memory extension for Linux. The GPLv2 driver doesn’t replace NVIDIA’s official Linux kernel drivers but complements them: it is a dedicated kernel module paired with a CUDA user-space shim library that transparently uses it for expanded memory access. This means CUDA user-space software doesn’t need to be modified; it transparently gains the expanded memory capacity provided by system RAM and any NVMe SSD storage.

The developer noted he wanted to run a 31.8GB model (glm-4.7-flash:q8_0) on a GeForce RTX 5070 12GB graphics card. Existing approaches like offloading layers to the CPU worked but reduced token throughput because the system memory lacks CUDA coherence. The other option, a smaller quantization, of course leads to lower quality.
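
To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the sizes quoted in this article plus the approximate 32 GB/s PCIe 4.0 x16 figure from the developer's announcement. The function names are illustrative, and the sketch assumes all 12 GB of vRAM is available for the model, which is optimistic since the KV cache and other buffers also need space.

```python
# Sizes in GB as quoted in the article; the link rate is the approximate
# PCIe 4.0 x16 figure given in the developer's announcement.
MODEL_GB = 31.8
VRAM_GB = 12.0
PCIE_GB_PER_S = 32.0


def overflow_gb(model_gb: float, vram_gb: float) -> float:
    """Portion of the model that cannot reside in dedicated vRAM."""
    return max(0.0, model_gb - vram_gb)


def min_transfer_seconds(size_gb: float, link_gb_per_s: float) -> float:
    """Lower bound on the time to move size_gb across the link once."""
    return size_gb / link_gb_per_s


spill = overflow_gb(MODEL_GB, VRAM_GB)
print(f"overflow: {spill:.1f} GB ({spill / MODEL_GB:.0%} of the model)")
print(f"one pass over the overflow: >= {min_transfer_seconds(spill, PCIE_GB_PER_S):.2f} s")
```

Under these assumptions, well over half of the model has to live outside the card's dedicated memory, which is why the speed of the path to system RAM matters so much for token throughput.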

GreenBoost memory tiers for CUDA

As for how GreenBoost works, today’s announcement in the NVIDIA Forums explains:

“1. Kernel module (`greenboost.ko`)

Allocates pinned DDR4 pages using the buddy allocator (2 MB compound pages for efficiency) and exports them as DMA-BUF file descriptors. The GPU can then import these pages as CUDA external memory via `cudaImportExternalMemory`. From CUDA’s perspective, those pages look like device-accessible memory — it doesn’t know they live in system RAM. The PCIe 4.0 x16 link handles the actual data movement (~32 GB/s). A sysfs interface (`/sys/class/greenboost/greenboost/pool_info`) lets you monitor usage live. A watchdog kernel thread monitors RAM and NVMe pressure and signals userspace before things get dangerous.

2. CUDA shim (`libgreenboost_cuda.so`, injected via `LD_PRELOAD`)

Intercepts `cudaMalloc`, `cudaMallocAsync`, `cuMemAllocAsync`, `cudaFree`, and `cuMemFree`. Small allocations (< 256 MB) pass straight through to the real CUDA runtime. Large ones (KV cache, model weights overflowing VRAM) are redirected to the kernel module and imported back as CUDA device pointers. There is one tricky part worth mentioning: Ollama resolves GPU symbols via `dlopen` + `dlsym` internally, which bypasses LD_PRELOAD on those symbols. To handle this, the shim also intercepts `dlsym` itself (using `dlvsym` with the GLIBC version tag to bootstrap a real pointer without recursion) and returns hooked versions of `cuDeviceTotalMem_v2` and `nvmlDeviceGetMemoryInfo`. Without this, Ollama sees only 12 GB and puts layers on the CPU.”
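
The size-based routing described in the announcement can be illustrated with a small model. This is only a sketch of the decision logic in Python, not the project's actual shim code: the `Router` class and the labels it returns are hypothetical names, and reading "256 MB" as 256 MiB is an assumption.

```python
from dataclasses import dataclass

MIB = 1024 * 1024
# Threshold from the announcement: allocations under 256 MB pass through.
# Interpreting "MB" as MiB here is an assumption.
LARGE_ALLOC_THRESHOLD = 256 * MIB


@dataclass
class Router:
    """Hypothetical model of the shim's allocation routing decision."""

    threshold: int = LARGE_ALLOC_THRESHOLD

    def route(self, size_bytes: int) -> str:
        if size_bytes < self.threshold:
            return "cuda"  # forwarded unchanged to the real CUDA runtime
        return "greenboost"  # redirected to the kernel module's memory pool


router = Router()
for name, size in [("scratch buffer", 64 * MIB), ("KV cache", 4096 * MIB)]:
    print(f"{name}: {size // MIB} MiB -> {router.route(size)}")
```

Keeping small allocations on the normal CUDA path presumably leaves latency-sensitive buffers in vRAM, while only the large, capacity-bound allocations go through the slower PCIe-backed tier.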

Those wanting to learn more about this GPLv2-licensed open-source GreenBoost implementation can find the experimental code via this GitLab repository.
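
For anyone trying the experimental code, the sysfs file mentioned in the announcement could be polled with a few lines of Python. This is a minimal sketch: the path is taken from the announcement, but the file's format is not documented here, so the helper (a hypothetical `read_pool_info`) returns the raw text without attempting to parse it.

```python
from pathlib import Path
from typing import Optional

# Path as given in the announcement; it is expected to exist only while
# the greenboost kernel module is loaded.
POOL_INFO = Path("/sys/class/greenboost/greenboost/pool_info")


def read_pool_info(path: Path = POOL_INFO) -> Optional[str]:
    """Return the raw contents of pool_info, or None if it is unavailable."""
    try:
        return path.read_text()
    except OSError:
        return None


info = read_pool_info()
print(info if info is not None else "GreenBoost pool_info not available")
```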
