ScreenSafe: A Technical Chronicle of On-Device AI and Privacy-First Architecture | HackerNoon

News Room · Published 12 December 2025 · Last updated 10:12 AM

Cloud-based content moderation is a privacy nightmare. Sending screenshots to a server just to check for safety? That’s a non-starter.

My hypothesis was simple: modern mobile hardware is powerful enough to support a “Guardian AI” that sees the screen and redacts sensitive info (nudity, violence, private text) in milliseconds—strictly on-device using a hybrid inference strategy.

I called it ScreenSafe.

But the journey from concept to reality revealed a chaotic ecosystem of immature tools and hostile operating system constraints.

Here is the architectural breakdown of how I built ScreenSafe, the “integration hell” I survived, and why local AI is still a frontier battleground.

Watch the full technical breakdown on YouTube.


1. The Stack: Why Cactus?

I needed three things: guaranteed privacy, zero latency (in theory), and offline capability.

I chose Cactus (specifically cactus-react-native). Unlike cloud APIs, Cactus acts as a high-performance C++ wrapper around low-level inference graphs. It utilizes the device’s NPU/GPU via a C++ core, exposed to React Native through JNI (Android) and Objective-C++ (iOS).

The goal: Capture the screen buffer @ 60FPS -> Pass to AI -> Draw redaction overlay. Zero network calls.

We implemented a Two-Stage Pipeline to solve hallucination issues:

  • Stage 1 (Vision): Uses lfm2-vl-450m to generate a descriptive security analysis of the image.
  • Stage 2 (Logic): Uses qwen3-0.6 to analyze that description and extract structured JSON data regarding PII (Credit Cards, SSNs, Faces).
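
The two stages above can be sketched as a small orchestration function. The model runners are injected stand-ins (the actual cactus-react-native call signatures aren't shown in this article), which also makes the control flow testable without the native bindings:

```typescript
// Stage 1 produces a free-text security description; Stage 2 turns it into
// structured PII data. runVision / runLogic stand in for the lfm2-vl-450m
// and qwen3-0.6 invocations respectively.
interface PiiReport {
  hasPII: boolean;
  types: string[]; // e.g. ["credit_card", "ssn", "face"]
}

type VisionModel = (imagePath: string) => Promise<string>;
type LogicModel = (description: string) => Promise<PiiReport>;

async function twoStagePipeline(
  runVision: VisionModel,
  runLogic: LogicModel,
  imagePath: string
): Promise<PiiReport> {
  // Stage 1 (Vision): describe what is on screen.
  const description = await runVision(imagePath);
  // Stage 2 (Logic): extract structured PII from the description only,
  // so the small text model never reasons directly over raw pixels.
  return runLogic(description);
}
```

Splitting vision from logic this way is what tamed the hallucination problem: the text model only has to trust a description, not interpret an image.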

2. The Build System Quagmire

The first barrier wasn’t algorithmic; it was logistical. Integrating a modern C++ library into React Native exposed the fragility of cross-platform build systems. My git history is a graveyard of “fix build” commits.

The Android NDK Deadlock

React Native relies on older NDK versions (often pinned to r21/r23) for Hermes. Cactus, running modern LLMs with complex tensor operations, requires modern C++ standards (C++20) and a newer NDK (r25+).

This created a Dependency Deadlock:

  • Choose the old NDK: Cactus fails with syntax errors.
  • Choose the new NDK: React Native fails with ABI incompatibilities.

I faced constant linker failures, specifically undefined symbols like std::__ndk1::__shared_weak_count. This is a hallmark of libc++ version mismatch. The linker was trying to merge object files compiled against different versions of the C++ standard library.

The Fix: A surgical intervention in local.properties and build.gradle to force specific side-by-side NDK installations, effectively bypassing the package manager’s safety checks. Open PR: github.com/cactus-compute/cactus-react-native/pull/13.
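
The core of that intervention is pinning a side-by-side NDK in the app's Gradle config; the version string below is illustrative and must match an NDK actually installed under `$ANDROID_HOME/ndk/`:

```groovy
// android/app/build.gradle — pin a modern side-by-side NDK for the C++ core.
// The exact version is illustrative; any r25+ release installed side-by-side
// works, as long as every native module links against the same libc++.
android {
    ndkVersion "25.2.9519653"
}
```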


3. The iOS Memory Ceiling (The 120MB Wall)

Once the app built, I hit the laws of physics on iOS. The requirement was simple: Share an image from Photos -> Redact it in ScreenSafe. This requires a Share Extension.

However, iOS enforces a hard memory limit on extensions, often cited as low as 120MB. Exceed it and Jetsam, the kernel's memory watchdog, sends a SIGKILL; the extension simply vanishes.

The Physics of LLMs

  • Model Weights (Q4): ~1.2 GB
  • React Native Overhead: ~40 MB
  • Available RAM: ~120 MB

You cannot fit a 1.2GB model into a 120MB container.

The “Courier” Pattern

I had to re-architect. The Share Extension could not perform the inference; it could only serve as a courier.

  1. Data Staging: The Extension writes the image to an App Group (a shared file container).
  2. Signal: It flags hasPendingRedaction = true in UserDefaults.
  3. Deep Link: It executes screensafe://process, forcing the OS to switch to the main app.

The main app, running in the foreground, has access to the full 6GB+ of device RAM and handles the heavy lifting.
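
The main-app side of the courier handoff can be sketched as below. The decision logic is pure so it can be tested off-device; the React Native `Linking` wiring and the App Group flag reader (`readAppGroupFlag`) are shown as comments because they are hypothetical names, not APIs from the article:

```typescript
// Main-app side of the "Courier" pattern. The Share Extension has already
// written the image into the App Group container and set the flag; the deep
// link just wakes the main app up.
function shouldProcessPending(url: string, hasPendingRedaction: boolean): boolean {
  // Only react to our own scheme, and only if the extension staged work.
  return url.startsWith("screensafe://process") && hasPendingRedaction;
}

// Illustrative React Native wiring (not runnable outside the app):
// Linking.addEventListener("url", ({ url }) => {
//   const pending = readAppGroupFlag("hasPendingRedaction"); // hypothetical helper
//   if (shouldProcessPending(url, pending)) runRedactionPipeline();
// });
```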


4. Android IPC: The 1MB Limit

While iOS struggled with static memory, Android struggled with moving data. Android uses Binder for IPC. The Binder transaction buffer is strictly limited to 1MB per process.

A standard screenshot (1080×2400) is roughly 10MB uncompressed. When I tried to pass this bitmap via an Intent, the app crashed instantly with TransactionTooLargeException.

The Solution: Stop passing data; pass references.

  1. Write the bitmap to internal storage.
  2. Pass a content:// URI string (tens of bytes, not megabytes) via the Intent.
  3. The receiving Activity streams the data from the URI.
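
The shape of that fix, sketched in TypeScript with Node's filesystem standing in for Android internal storage: persist the heavy payload, hand the receiver a short URI string. On Android the URI would be a `content://` URI issued by a FileProvider; the `file://` form here is illustrative:

```typescript
import { writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// "Pass references, not data": the bitmap goes to disk, and only a small
// URI string crosses the 1 MB Binder transaction limit.
function stageBitmapForIpc(bitmap: Buffer, name: string): string {
  const path = join(tmpdir(), name);
  writeFileSync(path, bitmap); // heavy payload written to storage
  return `file://${path}`;     // only this tiny string travels in the Intent
}

// A 1080x2400 RGBA screenshot: 1080 * 2400 * 4 bytes ≈ 10.4 MB, ten times
// the Binder budget.
const screenshot = Buffer.alloc(1080 * 2400 * 4);
const uri = stageBitmapForIpc(screenshot, "shot.bin");
```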

5. The Reality of “Real-Time”

Synchronizing Two Brains (Vision vs. Text)

Multimodal means processing pixels and text. On a server, these run in parallel. On a phone, they fight for the same NPU.

We hit a classic race condition: The vision encoder was fast (detecting an image), but the text decoder was slow.

  • Scenario: Vision says “Safe.” Text is still thinking.
  • The Risk: Do we block the screen? If we wait, the UI stutters (latency). If we don’t, we risk showing a harmful caption.

I had to engineer a complex state machine to manage these async streams, ensuring we didn’t lock the JS thread while the C++ backend was crunching numbers:

Dynamic Context Sizing: Implemented checkDeviceMemory to detect available RAM and dynamically set the model context window:

  • < 3GB RAM → 256 tokens (Safe mode)
  • 3-6GB RAM → 512 tokens (Standard mode)
  • > 6GB RAM → 1024 tokens (High-performance mode)
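
The tier selection above reduces to a small pure function. `checkDeviceMemory` itself is platform-specific native code; this is just the mapping it feeds:

```typescript
// Dynamic context sizing: map available device RAM (GB) to a model context
// window, mirroring the thresholds listed above.
function contextSizeForRam(ramGB: number): number {
  if (ramGB < 3) return 256;  // Safe mode
  if (ramGB <= 6) return 512; // Standard mode
  return 1024;                // High-performance mode
}
```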

Timeout Protection: Added a 15-second timeout to the local text model inference. If it hangs (common on emulators), it gracefully fails instead of crashing the app, showing a “Limited analysis” warning.
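
A minimal sketch of that protection, racing the inference against a deadline so the app fails soft rather than hanging. The 15-second figure matches the article; short delays are used in testing:

```typescript
// Race a piece of work against a timeout; resolve with a fallback value if
// the deadline passes first (e.g. a "Limited analysis" placeholder result).
async function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  const result = await Promise.race([work, deadline]);
  if (timer !== undefined) clearTimeout(timer); // avoid a dangling timer
  return result;
}
```

If the caller gets the fallback back, it shows the "Limited analysis" warning instead of crashing.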

Agent Manager verifying context size and memory management

PII Detection Logic

  • Logic Update: We prioritized the presence of types. If the types array is not empty, hasPII is forced to true, overriding the LLM’s boolean flag.
  • JSON Repair: The local LLM (qwen3-0.6) was returning conversational <think> blocks and sometimes malformed JSON, causing JSON.parse to fail. We enhanced the JSON cleaning regex to handle more edge cases from the model output (e.g., trailing commas, markdown blocks).
  • Cloud Fallback: We verified that the 15s timeout correctly triggers the “Enable Cloud Mode” suggestion for users on low-end devices.
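
The cleaning-and-override logic described above can be sketched as follows; the regexes are illustrative, not an exhaustive reproduction of the app's cleaner:

```typescript
// JSON repair for small local models: strip <think> blocks and markdown
// fences, drop trailing commas, then parse. The presence of detected types
// overrides the model's own boolean flag.
interface PiiReport {
  hasPII: boolean;
  types: string[];
}

function parseModelJson(raw: string): PiiReport | null {
  const cleaned = raw
    .replace(/<think>[\s\S]*?<\/think>/g, "") // conversational reasoning blocks
    .replace(/```(?:json)?/g, "")             // markdown code fences
    .replace(/,\s*([}\]])/g, "$1")            // trailing commas
    .trim();
  try {
    const parsed = JSON.parse(cleaned) as PiiReport;
    if (Array.isArray(parsed.types) && parsed.types.length > 0) {
      parsed.hasPII = true; // types present => force hasPII
    }
    return parsed;
  } catch {
    return null; // caller falls back to cloud or "Limited analysis"
  }
}
```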

Hybrid Cloud Inference Integration

We confirmed that the cloud API (https://dspy-proxy.onrender.com) is functional.

  • /configure endpoint works.

  • /register endpoint successfully registered the pii_detection signature.

  • /predict endpoint returns valid JSON with PII analysis.

Agent Manager verifying all the endpoints

Furthermore, we added the logic to catch the timeouts and automatically retry the request (waking the server). If the cloud service remains unavailable, the app gracefully falls back to the local analysis result without crashing.
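
That retry-then-degrade flow, sketched with the cloud call injected so the control flow is testable (in the app, `callCloud` would POST to the /predict endpoint):

```typescript
// Hybrid fallback: try the cloud endpoint, retry to wake a sleeping server,
// and fall back to the local analysis result if it stays unavailable.
async function analyzeWithFallback<T>(
  callCloud: () => Promise<T>,
  localResult: T,
  retries = 1
): Promise<T> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await callCloud(); // first attempt may fail while the server wakes
    } catch {
      // swallow and retry; fall through to the local result after the last try
    }
  }
  return localResult; // graceful degradation, never a crash
}
```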

It Actually Works

We solved the crashes, but we never fully solved the latency. Despite the build breaks and the memory wars, we shipped it.

  • Latency: 30–100 ms (near real-time).
  • Privacy: 100% on-device.
  • Cost: $0 (excluding the optional cloud fallback), with effectively unlimited scalability.

We proved that you can run high-fidelity AI on mobile if you’re willing to fight the memory limits and patch the build tools. However, the wait is an eternity in mobile UX. This is the “frustration” of local AI: the gap between the instant feel of cloud APIs (which hide latency behind network states) and the heavy feel of a device physically heating up as it crunches tensors.


6. The “Antigravity” Companion

Debugging a neural net is debugging a black box: you can't step through its decision-making. I relied heavily on Antigravity to iterate on system prompts and to fix hallucinations, such as the model deciding a restaurant menu was "toxic text."

I also used the dspy-proxy to help streamline some of these interactions and test model behaviors before deploying to the constrained mobile environment.

Antigravity from Google being used to generate the Mermaid diagram


Conclusion

Building ScreenSafe proved that local, private AI is possible, but it requires you to be a systems architect, a kernel hacker, and a UI designer simultaneously. Until OS vendors treat “Model Inference” as a first-class citizen with dedicated memory pools, we will continue hacking build scripts and passing files through backdoors just to keep data safe.

Resources & Code

If you want to dig into the code or the proxy architecture I used to prototype the logic:

  • ScreenSafe Repo: github.com/aryaminus/screensafe
  • DSPy Proxy: github.com/aryaminus/dspy-proxy
  • Watch the breakdown: YouTube Video

Liked this post? I’m building more things that break (and fixing them). Follow me on Twitter or check out my portfolio.
