Unveiling the Code Abyss: Inverting LLMs to Expose Vulnerability Vortexes in AI-Generated Programs

News Room · Published 28 July 2025

Table of Links

Abstract and I. Introduction

II. Related Work

III. Technical Background

IV. Systematic Security Vulnerability Discovery of Code Generation Models

V. Experiments

VI. Discussion

VII. Conclusion, Acknowledgments, and References

Appendix

A. Details of Code Language Models

B. Finding Security Vulnerabilities in GitHub Copilot

C. Other Baselines Using ChatGPT

D. Effect of Different Number of Few-shot Examples

E. Effectiveness in Generating Specific Vulnerabilities for C Codes

F. Security Vulnerability Results after Fuzzy Code Deduplication

G. Detailed Results of Transferability of the Generated Non-secure Prompts

H. Details of Generating the Non-secure Prompts Dataset

I. Detailed Results of Evaluating CodeLMs Using the Non-secure Dataset

J. Effect of Sampling Temperature

K. Effectiveness of the Model Inversion Scheme in Reconstructing the Vulnerable Codes

L. Qualitative Examples Generated by CodeGen and ChatGPT

M. Qualitative Examples Generated by GitHub Copilot

II. RELATED WORK

In the following, we briefly review existing work on large language models and discuss how it relates to our approach.

A. Large Language Models and Prompting

Large language models have advanced the field of natural language processing across various tasks, including question answering, translation, and reading comprehension [1], [17]. These milestones were achieved by scaling model size from hundreds of millions [18] to hundreds of billions of parameters [1], by self-supervised objective functions, reinforcement learning from human feedback [19], and huge corpora of text data. Many of these models are trained by large companies and released as pre-trained models. Brown et al. [1] show that such models can tackle a variety of tasks when given only a few examples as input, without any changes to the models' parameters. The end user can provide a template as a few-shot prompt to guide the model to generate the desired output for a specific task. In this work, we show how few-shot prompting can be used to generate code with specific vulnerabilities by approximating the inversion of black-box code generation models.
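To make this concrete, the sketch below assembles a few-shot prompt from pairs of vulnerable code and the prompt that elicited it, then asks the model for a fresh prompt targeting the same weakness. The pair contents and the template wording are illustrative assumptions, not the paper's actual format:

# Illustrative few-shot template for approximating model inversion.
# All example strings are made up for demonstration purposes.
FEW_SHOT_EXAMPLES = '''# Vulnerable completion (CWE-089, SQL injection):
query = "SELECT * FROM users WHERE name = '" + name + "'"
# Prompt that elicited it:
def find_user(name):
    """Look up a user by name in the database."""
'''

def build_inversion_prompt(examples: str, target_cwe: str) -> str:
    # The target model is asked to continue this text with a new prompt
    # that is likely to lead it to generate code containing `target_cwe`.
    return f"{examples}\n# New prompt that leads to {target_cwe}:\n"

print(build_inversion_prompt(FEW_SHOT_EXAMPLES, "CWE-089"))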

B. Large Language Models of Source Codes

There is growing interest in using large language models for source code understanding and generation tasks [7], [5], [20]. Feng et al. [21] and Guo et al. [22] propose encoder-only models trained with different objective functions. These models [21], [22] primarily focus on code classification, code retrieval, and program repair. Ahmad et al. [23] and Wang et al. [20] employ encoder-decoder architectures to tackle code-to-code and code-to-text generation tasks, including program translation, program repair, and code summarization. Recently, decoder-only models have shown promising results in generating programs in a left-to-right fashion [5], [4], [6], [12]. These models can be applied to zero-shot and few-shot program generation tasks [5], [6], [24], [12], including code completion, code infilling, and text-to-code tasks. Large language models of code have mainly been evaluated based on the functional correctness of the generated code, without considering potential security vulnerabilities (see Section II-C for a discussion). In this work, we propose an approach to automatically find specific security vulnerabilities that these models can generate, by approximating the inversion of the target black-box models via few-shot prompting.
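For illustration, left-to-right completion with a decoder-only model can be sketched as follows; the CodeGen checkpoint is one publicly available example, and the sampling settings are arbitrary choices, not those used in the paper:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Complete a function header left to right with a decoder-only code model.
tok = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

prompt = "def read_file(path):\n"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.6)
print(tok.decode(out[0], skip_special_tokens=True))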

C. Security Vulnerability Issues of Code Generation Models

Large language models for code generation have been pre-trained on vast corpora of open-source code [7], [5], [25]. This open-source code can contain a variety of security vulnerabilities, including memory safety violations [26], deprecated APIs and algorithms (e.g., the MD5 hash algorithm [27], [15]), or SQL injection and cross-site scripting vulnerabilities [28], [15]. Large language models can learn these insecure patterns and potentially generate vulnerable code given a user's input. Recently, Pearce et al. [15] and Siddiq and Santos [28] showed that code produced by code generation models can contain various security issues.

Pearce et al. [15] use a set of manually designed scenarios to investigate potential security vulnerabilities in GitHub Copilot [9]. These scenarios are curated from a limited set of vulnerable code examples. Each scenario contains the first few lines of a potentially vulnerable program, and the models are queried to complete it. The scenarios were designed based on MITRE's Common Weakness Enumeration (CWE) [29]. Pearce et al. [15] evaluate the vulnerabilities of the generated code using GitHub's CodeQL static analysis tool. Previous studies [15], [30], [28] examined security issues in code generation models, but they relied on a limited set of manually designed scenarios, which risks missing certain vulnerability types entirely. In contrast, our work proposes a systematic approach to finding security vulnerabilities by automatically generating diverse scenarios at scale. This enables us to create a diverse set of non-secure prompts for assessing and comparing the models with respect to generating code with security issues.
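An illustrative scenario of this kind (our own example, not one drawn from the paper's dataset) hands the model the first lines of a potentially vulnerable program and queries it for the completion:

import sqlite3

# Completion-style scenario targeting CWE-089 (SQL injection): the model
# receives everything up to the final comment and must finish the body.
def get_user(username):
    conn = sqlite3.connect("users.db")
    cursor = conn.cursor()
    # build and execute a query that returns the row for `username`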

D. Model Inversion and Data Extraction

Deep model inversion has been applied to model explanation [31], model distillation [32], and, more commonly, to reconstructing private training data [33], [34], [35], [36]. The general goal of model inversion is to reconstruct a representative view of the input data based on the model's outputs [34]. Recently, Carlini et al. [37] showed that it is possible to extract memorized data from large language models, including personal information such as e-mail addresses, URLs, and phone numbers. In this work, we use few-shot prompting to approximate an inversion of the targeted black-box code models. Our goal is to employ the approximated inversion to automatically find the scenarios (prompts) that lead the models to generate code with a specific type of vulnerability.

III. TECHNICAL BACKGROUND

Detecting software bugs before deployment can prevent potential harm and unforeseeable costs. However, automatically finding security-critical bugs in code is a challenging task in practice. This also includes model-generated code, especially given the black-box nature and complexity of such models. In the following, we elaborate on recent analysis methods and classification schemes for code vulnerabilities.

A. Evaluating Security Issues

Various security testing methods can be used to find software vulnerabilities and avoid bugs during the run-time of a deployed system [38], [39], [40]. To achieve this goal, these methods attempt to detect different kinds of programming errors, poor coding style, deprecated functionality, or potential memory safety violations (e.g., unauthorized access to unsafe memory that can be exploited after deployment, or obsolete cryptographic schemes that are insecure [41], [42], [26]). Broadly speaking, current methods for the security evaluation of software can be divided into two categories: static [38], [43] and dynamic analysis [44], [45]. While static analysis evaluates the code of a given program to find potential vulnerabilities, dynamic analysis executes the code. For example, fuzz testing (fuzzing) generates random program executions to trigger bugs.

Listing 1: Python code adapted from [29], showing an example of deserialization of untrusted data (CWE-502).
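A minimal sketch of this pattern, modeled on MITRE's CWE-502 demonstrative example; helper names such as check_hmac and AuthFail are illustrative stubs:

import base64
import hashlib
import hmac
import pickle

SECRET_KEY = b"server-side-secret"  # stand-in for a real key store

class AuthFail(Exception):
    """Raised when token validation fails."""

def check_hmac(signature: bytes, data: bytes, key: bytes) -> bool:
    expected = hmac.new(key, data, hashlib.sha256).digest()
    return hmac.compare_digest(signature, expected)

def confirm_auth(headers: dict) -> bytes:
    # VULNERABLE (CWE-502): the token is deserialized with pickle *before*
    # any verification, so attacker-controlled bytes are unpickled blindly.
    # A crafted pickle can define __reduce__ to invoke, for example,
    # subprocess.call(["/bin/sh"]) during unpickling.
    token = pickle.loads(base64.b64decode(headers["AuthToken"]))
    if not check_hmac(token["signature"], token["data"], SECRET_KEY):
        raise AuthFail
    return token["data"]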

For the purposes of our work, we choose static analysis to evaluate the generated code, as it enables us to classify the type of each detected vulnerability. Specifically, we use CodeQL, one of the best-performing free static analysis engines, released by GitHub [46]. To analyze the model-generated code, we query it via CodeQL to find security vulnerabilities. We use CodeQL's CWE classification output to categorize the type of vulnerability found during our evaluation and to define the set of vulnerabilities that we investigate further throughout this work.
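As a sketch of what this pipeline can look like, the snippet below drives the CodeQL CLI from Python; the paths and the query-suite name are assumptions about a local CodeQL installation, not the paper's exact setup:

import subprocess

def analyze_generated_code(source_dir: str, db_dir: str = "codeql-db") -> None:
    # Build a CodeQL database from the directory of generated programs.
    subprocess.run(
        ["codeql", "database", "create", db_dir,
         "--language=python", f"--source-root={source_dir}"],
        check=True,
    )
    # Run the security query suite; the SARIF output tags each finding
    # with rule metadata, including CWE identifiers.
    subprocess.run(
        ["codeql", "database", "analyze", db_dir,
         "python-security-and-quality.qls",
         "--format=sarif-latest", "--output=results.sarif"],
        check=True,
    )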

B. Classification of Security Weaknesses

The Common Weakness Enumeration (CWE) is a list of typical flaws in software and hardware maintained by MITRE [29], often accompanied by specific vulnerability examples. In total, more than 400 different CWE types are defined and categorized into classes and variants, e.g., memory corruption errors. Listing 1 shows an example of CWE-502 (Deserialization of Untrusted Data) in Python. In this example from [29], the Pickle library is used to deserialize data: the code parses incoming data and tries to authenticate a user by validating a token, but never verifies the data before deserializing it. An attacker can construct a malicious pickle that spawns new processes; since Pickle allows objects to define how they should be unpickled, the attacker can direct the unpickling process to call the subprocess module and execute /bin/sh.

For our work, we focus on thirteen representative CWEs that can be detected via static analysis tools, to show that we can systematically generate vulnerable code together with the input prompts that elicit it. We decided against fuzzing for vulnerability detection because of its potentially high computational cost and the manual effort imposed by root cause analysis. Some CWEs represent mere code smells or require considering the development and deployment process, and are hence out of scope for this work. The thirteen analyzed CWEs are listed with brief descriptions in Table I; eleven of them appear in MITRE's top 25 list of the most dangerous software weaknesses. The descriptions are defined by MITRE [29].

TABLE I: List of evaluated CWEs. Eleven of the thirteen CWEs are in the top 25 list. The description is from [29].

Authors:

(1) Hossein Hajipour, CISPA Helmholtz Center for Information Security ([email protected]);

(2) Keno Hassler, CISPA Helmholtz Center for Information Security ([email protected]);

(3) Thorsten Holz, CISPA Helmholtz Center for Information Security ([email protected]);

(4) Lea Schönherr, CISPA Helmholtz Center for Information Security ([email protected]);

(5) Mario Fritz, CISPA Helmholtz Center for Information Security ([email protected]).

