Software development and programming, once regarded as endeavors that demanded deep expertise, can now be done by anyone using natural language. A feature that used to take days or months to build can now be developed in minutes or hours, thanks to code generated by AI models. OpenAI Codex and Google BERT, for example, were trained on programming blogs, Stack Overflow questions and answers, and other public sources.
These models generate code from mathematical probabilities, and they are also known to hallucinate and present false information. Academic research claims that AI code generation is a leading source of the top 10 vulnerability classes, and that nearly 40% of generated code contains security bugs. Many of the leading players, along with newer SaaS providers, are leveraging AI to make their offerings smarter. SaaS developers, in turn, must become more knowledgeable about AI-based SaaS tools.
What Makes AI-Generated Code Unsafe?
Software safety is dictated by adherence to programming standards and by code quality. AI models, however, are trained on every bit of information available on the internet, so the quality, reliability, and security of their output can differ from code written by experienced developers. A model trained on web development examples may absorb poor data validation practices, for instance; that lack of validation becomes a security issue when the model generates code that reproduces the same poor practices.
5 Indicators That Suggest Code Contains Security Weaknesses
No matter their size (millions or billions of parameters), models are known to hallucinate and make incorrect predictions. When a typical developer reads code that an AI produces, they can miss subtle but serious security vulnerabilities. For a developer with thorough knowledge of design and development patterns, however, the flaws are a review away from being identified. Developers can leverage the following patterns to discover vulnerabilities and align with SaaS security best practices.
1. Type Inference and Input Validation Are Not Enforced
Modern frameworks and libraries rely heavily on interfaces and enums for type inference and validation, which helps guarantee that the code does its job accurately and enforces security at the input boundary. AI-generated code won't infer types unless we direct it to, and even after crafting a careful prompt, the type handling and validation enforcement may not match the use case. To locate and amend such mismatches, developers must be well aware of the domain and business requirements.
def reciprocal(user_input):
    # Insecure implementation with no type inference or validation
    result = 100 / user_input
    return result
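For contrast, here is a minimal sketch of the same function with explicit type and input validation; the exact checks and error messages are illustrative assumptions, not prescriptions:

def reciprocal(user_input):
    # Enforce the expected type before any arithmetic; bool is excluded
    # because it is a subclass of int in Python
    if not isinstance(user_input, (int, float)) or isinstance(user_input, bool):
        raise TypeError("user_input must be a number")
    # Validate the value to prevent a ZeroDivisionError
    if user_input == 0:
        raise ValueError("user_input must be non-zero")
    return 100 / user_input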
2. Non-Standard State and Context Sharing Between Classes/Objects
Programs share objects through public, protected, and private access levels. Higher-order functions and classes inherit object state by accessing public or protected variables directly to perform computations. If something is done incorrectly in the implementation or its use, security or performance bottlenecks can easily occur. SaaS developers must implement their state and context management logic deliberately and review it for correct and safe use.
class InsecureClass:
    def __init__(self, owner, balance, password):
        self.owner = owner  # Public attribute
        self._balance = balance  # Protected attribute
        self.__password = password  # Private attribute

    # Public method
    def get_balance(self):
        return self._balance

    # Protected method
    def _update_balance(self, amount):
        self._balance += amount

    # Private method
    def __validate_password(self, input_password):
        return self.__password == input_password

    # Insecure method exposing private data
    def insecure_password_exposure(self):
        return self.__password
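For contrast, a minimal sketch of safer state management: reads go through a read-only property and mutations go through a validated method (the class and method names here are illustrative):

class SaferClass:
    def __init__(self, owner, balance):
        self._owner = owner
        self._balance = balance

    @property
    def balance(self):
        # Read-only view of protected state; no direct external mutation
        return self._balance

    def deposit(self, amount):
        # State changes go through a validated method, not raw attribute access
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

Secrets such as passwords are deliberately absent here; they belong in a dedicated authentication layer, which indicator 4 below covers.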
3. Weak Implementation of Data Handling and Sharing Techniques
Services share and receive information over the network, and secure connectivity and data handling have become crucial to the success of cloud-based systems. When reading, processing, and sharing an organization's sensitive data across distributed networks, strong protocols and security techniques must be in place to prevent data interception. Even when using AI, the SaaS developer remains responsible for implementing every single aspect of the architecture correctly in a full-fledged application.
from flask import Flask, jsonify, request

app = Flask(__name__)
users = {}  # In-memory user store: {user_id: {"email": ..., "password": ...}}

# Insecure Data Sharing
@app.route("/user/<int:user_id>", methods=["GET"])
def get_user(user_id):
    user = users.get(user_id)
    if user:
        return jsonify(user)  # All user data exposed, including secrets

# Insecure Data Handling
@app.route("/update_email", methods=["POST"])
def update_email():
    data = request.get_json()
    user_id = data.get("user_id")
    new_email = data.get("new_email")
    if user_id in users:
        users[user_id]["email"] = new_email  # No validation of new_email
    return jsonify({"message": "Email updated successfully"})
4. Inadequate Secrets and Auth Handling
In today's cyber-sensitive world, tight secrets management and robust authentication are essential. Yet AI-generated code routinely compares credentials in plaintext and leaves secrets exposed, as the snippet below does.
# Insecure authentication
@app.route("/login", methods=["POST"])
def login():
    data = request.get_json()
    email = data.get("email")
    password = data.get("password")
    for user_id, user in users.items():
        # Plaintext password comparison against stored plaintext secrets
        if user["email"] == email and user["password"] == password:
            return jsonify({"message": "Login successful", "user_id": user_id})
    return jsonify({"message": "Invalid credentials"}), 401
5. Outdated Dependencies with Deprecated Functionality Usage
AI programming assistants are steered by the libraries and frameworks built by the community and the open-source ecosystem. Developers support promising new technology by adopting these tools and creating new ones. But the data these models were trained on has a cutoff, and because the models' capabilities are frozen, so is their knowledge. As technology develops, many features become obsolete and some libraries stop being relevant to current needs. The SaaS developer is tasked with reviewing and selecting valid dependencies to ensure functionality and security.
import md5  # Outdated library, removed in Python 3

def insecure_hash_password(password):
    # Insecure password hashing using the deprecated MD5 algorithm
    return md5.new(password).hexdigest()
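A modern standard-library replacement might look like the following sketch; the iteration count is an illustrative choice, not a mandated value:

import hashlib
import os

def hash_password(password):
    # Salted PBKDF2-SHA256 from hashlib replaces the deprecated md5 module
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return f"{salt.hex()}:{digest.hex()}"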
Tips to Make AI-Generated Code Safe to Use
The advanced coding capabilities of large language models come from extensive mathematical computation, and no fancy techniques are needed to make their output compliant with security and programming standards. We can use these simple checks to make AI-generated code safe and standards-compliant:
- Code review with security and architecture teams should be a standard part of your life cycle.
- Integrate automated security testing and validation steps in version control tools.
- Include dependency and compliance checks in testing KPIs.
- Adopt Zero-Trust architecture with static and dynamic security testing tools.
- Leverage DevSecOps practices and keep shadow AI usage in check.
Handling Unsafe AI-Generated Code with a Simple GitHub Action
No matter how carefully we review and audit code, the chance of human error is always there. Relying solely on manual audits is not enough; we need predefined checks that test and validate code as soon as it enters the version control system. What better check than a GitHub Action that automatically runs security and quality checks when a PR is raised?
name: Simple Security Checks for AI generated Code

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  security-and-quality-check:
    runs-on: ubuntu-latest
    steps:
      - name: Repository checkout
        uses: actions/checkout@v3

      - name: Python setup
        uses: actions/setup-python@v4
        with:
          python-version: ">=3.9"

      - name: Dependency installation
        run: |
          python -m pip install --upgrade pip
          pip install bandit pytest

      - name: Identifying insecure libraries and patterns
        run: |
          echo "Checking for insecure patterns..."
          if grep -r "md5.new(" .; then
            echo "ERROR: Insecure MD5 hashing detected. Use hashlib.sha256 or bcrypt instead."
            exit 1
          fi
          echo "No insecure patterns detected."

      - name: Scanning for security vulnerabilities
        run: |
          echo "Running Bandit security scanner..."
          bandit -r .

      - name: Running unit tests
        run: |
          echo "Running unit tests..."
          pytest test/unit --cmdopt=local

      - name: Notifying on failure
        if: failure()
        run: |
          # Placeholder: replace with a real notification script or Slack action
          send_slack_notification "Unsafe code merge detected, fix immediately"
Conclusion
Large language models are quite useful tools for SaaS developers, generating code and information from natural-language prompts. However, they pose security risks and sometimes deliver non-performant code that doesn't suit enterprise needs. SaaS developers must be careful when using these tools and when putting AI-generated code into real-life use cases. This practical guide covered the factors that influence security posture and showed how to overcome the associated challenges.