At the beginning of May, a repository appeared on Hugging Face that disguised itself as an OpenAI model and installed an infostealer on Windows systems. The attackers used typosquatting and published the repository as Open-OSS/privacy-filter, imitating the OpenAI model openai/privacy-filter.
During the attack, the repository reached the #1 spot among trending repositories within 18 hours, with over 240,000 downloads and 667 likes. The likes came largely from automated accounts used to push the repository.
Hugging Face has since removed the repository. Anyone who previously cloned it on a Windows computer and ran either start.bat or loader.py should consider their system infected and any credentials stored in browsers and their extensions potentially compromised.
The analysis by the AI security company HiddenLayer shows which files can be affected.
At first glance, almost identical to the OpenAI repository
Apparently the attackers copied the model card that describes the model almost verbatim from OpenAI’s privacy filter, including a link to a PDF from OpenAI.
The instructions in the Readme were also largely similar, but additionally asked users to clone the repository locally and run start.bat on Windows or the Python loader loader.py on macOS or Linux.
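Because the fake repository differed from the original mainly in its namespace, a simple check against a list of trusted repository IDs can flag such lookalikes. The following is a minimal sketch, not a Hugging Face feature: the allow-list TRUSTED_REPOS and the function classify_repo are hypothetical names chosen for illustration.

```python
# Hypothetical allow-list of repository IDs the user trusts (an assumption,
# not part of any Hugging Face API).
TRUSTED_REPOS = {"openai/privacy-filter"}


def classify_repo(repo_id: str, trusted: set[str] = TRUSTED_REPOS) -> str:
    """Return 'trusted', 'lookalike', or 'unknown' for a 'namespace/name' ID."""
    if repo_id in trusted:
        return "trusted"
    # Split off the namespace and compare only the model name.
    _, _, name = repo_id.partition("/")
    trusted_names = {t.partition("/")[2].lower() for t in trusted}
    if name and name.lower() in trusted_names:
        # Same model name under a different namespace: possible typosquatting.
        return "lookalike"
    return "unknown"
```

In this sketch, classify_repo("Open-OSS/privacy-filter") yields "lookalike", since the model name matches a trusted entry while the namespace does not.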
Feigned model activity
As a distraction, the loader initially runs seemingly legitimate code with a class DummyModel, mock model training output, and a synthetic data set.
The installation of the malicious code begins with the function _verify_checksum_integrity(), which is called at the end. It launches a PowerShell command that works only on Windows systems and runs hidden in the background:
powershell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -Command
Thanks to the creation flag CREATE_NO_WINDOW, the process runs without a console window.
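Before running code from a cloned repository, its scripts can be searched for the indicators described here. The following is a minimal sketch of such a static check; the indicator list and the function name find_indicators are illustrative assumptions, and obfuscated or split strings would slip past it.

```python
import re

# Indicators taken from the behavior described above. The list is
# illustrative and far from complete.
INDICATORS = {
    "powershell": re.compile(r"powershell(\.exe)?", re.IGNORECASE),
    "execution_policy_bypass": re.compile(r"-ExecutionPolicy\s+Bypass", re.IGNORECASE),
    "hidden_window": re.compile(r"-WindowStyle\s+Hidden", re.IGNORECASE),
    "create_no_window": re.compile(r"CREATE_NO_WINDOW"),
}


def find_indicators(source: str) -> list[str]:
    """Return the names of all indicators that occur in the given source text."""
    return [name for name, pattern in INDICATORS.items() if pattern.search(source)]
```

A file could be checked with find_indicators(open("loader.py", errors="ignore").read()). A hit is only a reason to look more closely, not proof of malicious intent.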
Numerous obfuscation tactics
The script downloads and executes a file named update.bat, which prepares the actual malware infection. The file first checks for admin rights and requests them if necessary, which at least triggers a UAC prompt. It then downloads the malicious code and tries to register it as an exclusion in Microsoft Defender.
The actual infostealer is a program written in Rust that uses numerous obfuscation techniques to avoid being recognized as malicious code. Among other things, the program obfuscates its use of Windows APIs and checks whether it is being run in a virtual machine or by an anti-malware program.
Collect and upload
Finally, the infostealer collects information from browsers, Discord, wallets (including those provided by browser extensions), various configuration files, and geodata. It also takes screenshots using the Windows Graphics Device Interface (gdi32.dll).
The infostealer packs the collected data into a JSON file, which it uploads to a remote server.
Automated likes for better visibility
The likes were probably generated largely automatically in order to push the repository. According to HiddenLayer’s analysis, 504 of the account names follow the pattern “firstname-lastname###” and another 153 follow the pattern “adjectivenoun####”.
A portion of the 244,000 downloads probably did not come from victims of the infostealer attack either, but were generated automatically by the attackers themselves in order to drive the repository up in the Hugging Face ranking.
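Name patterns like these can be counted with two regular expressions. The following is a minimal sketch under the assumption that "#" stands for exactly one digit and that the name parts consist only of letters; it is not HiddenLayer's method, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: '#' means one digit, name parts are ASCII letters only.
NAME_DASH_3 = re.compile(r"[a-z]+-[a-z]+\d{3}", re.IGNORECASE)
WORD_4 = re.compile(r"[a-z]+\d{4}", re.IGNORECASE)


def classify_username(username: str) -> str:
    """Assign a username to one of the two described patterns, or 'other'."""
    if NAME_DASH_3.fullmatch(username):
        return "firstname-lastname###"
    if WORD_4.fullmatch(username):
        return "adjectivenoun####"
    return "other"


def summarize(usernames: list[str]) -> Counter:
    """Count how many usernames fall into each pattern."""
    return Counter(classify_username(name) for name in usernames)
```

Since the regular expressions cannot tell whether a word really is an adjective or a noun, the second pattern also matches any other run of letters followed by four digits. The result is therefore only a rough indication of automated accounts.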
(rme)
