Meta: Understand the common code-based security vulnerabilities, from SQL injection to XSS, and how AI simplifies the detection and resolution of these security vulnerabilities for improved code security.
With the average data breach now costing companies $4.45 million, securing your code has never been more urgent.
As development cycles accelerate, security vulnerabilities like SQL injections or cross-site scripting (XSS) are still common or discovered too late. AI is changing that. By scanning large codebases, learning from actual attack patterns, and offering targeted fixes, AI tools help you address code security flaws faster than ever.
This article explores where traditional approaches fall short, how AI can fill those gaps, and the practical steps to embedding AI code security into your development workflow.
Common security vulnerabilities in your codebase
When you look at the OWASP Top 10, you’ll notice how many serious threats come from routine, everyday coding patterns. Let’s examine a few of the most prevalent vulnerabilities:
1. SQL injection
SQL injection can be especially dangerous because it exploits the mechanism you rely on to store and retrieve data. An attacker essentially “injects” malicious SQL commands into an application’s inputs, potentially gaining unauthorized access to or altering the underlying data.
Here’s an example from a simple authentication routine:
# auth_service.py
def authenticate_user(username, password):
    query = f"""
        SELECT id, role FROM users
        WHERE username = '{username}'
        AND password = '{password}'
    """
    result = database.execute(query)
    return result.fetchone()
This code appears functional but is highly vulnerable to SQL injection. The authenticate_user function builds its SQL query with string interpolation (an f-string), which allows an attacker to inject SQL of their own. If an attacker enters ' OR '1'='1' -- as the username, the query becomes:
SELECT id, role FROM users WHERE username = '' OR '1'='1' --' AND password = 'anything'
The -- comments out the password check, and '1'='1' is always true, so the query matches every user and the attacker bypasses authentication entirely. The consequences of SQL injection can be severe, including loss of confidentiality, tampering with existing data, identity spoofing, and even gaining administrative access to the database server.
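To see the risk concretely, here is a minimal, self-contained demonstration using Python’s built-in sqlite3 module. The table, columns, and data are hypothetical; the payload appends a SQL comment (--) so everything after it, including the password check, is ignored:

```python
import sqlite3

# In-memory database with a hypothetical users table
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER, role TEXT, username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES (1, 'admin', 'alice', 's3cret')")

username = "' OR '1'='1' --"   # attacker-controlled input
password = "anything"

# The same f-string pattern as the vulnerable authenticate_user function
query = f"SELECT id, role FROM users WHERE username = '{username}' AND password = '{password}'"
row = db.execute(query).fetchone()
print(row)  # (1, 'admin') -- authentication bypassed without knowing any password
```

The attacker never supplied a valid username or password, yet the query returned the admin row.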
A more secure approach uses parameterized queries:
# auth_service.py (secure version)
def authenticate_user(username, password):
    query = """
        SELECT id, role FROM users
        WHERE username = %s AND password = %s
    """
    result = database.execute(query, (username, password))
    return result.fetchone()
In this secure version, the SQL query uses placeholders (%s) for the user input, and the actual values are passed as parameters to the execute method. This approach ensures that the database driver treats the input as literal values, so it cannot be used to manipulate the structure of the SQL query.
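You can verify that parameterization defeats the same payload with the built-in sqlite3 module (SQLite uses ? placeholders; drivers like psycopg2 use %s). The table and data below are hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER, role TEXT, username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES (1, 'admin', 'alice', 's3cret')")

payload = "' OR '1'='1' --"  # the same injection attempt as before

# Parameterized query: the payload is bound as a literal string, not parsed as SQL
row = db.execute(
    "SELECT id, role FROM users WHERE username = ? AND password = ?",
    (payload, "anything"),
).fetchone()
print(row)  # None -- no user has that literal username, so the attack fails
```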
2. Cross-site scripting (XSS)
XSS happens when attackers inject malicious scripts (often JavaScript) into web pages that other users view. Any spot in your application where user input is rendered onto the page can be a gateway for XSS.
The following example of a blog post display function is a common pattern in content management systems and is vulnerable to XSS attacks.
@app.route('/blog/<post_id>')
def display_post(post_id):
    post = get_post(post_id)
    return f"""
    <article>
        <h1>{post.title}</h1>
        <div>{post.content}</div>
    </article>
    """
In this example, an attacker could inject malicious JavaScript through the post.content field, affecting every visitor. For instance, if an attacker submits a blog post with the following content:
<script>alert('XSS')</script>
This script will be executed by every visitor who views the blog post, allowing the attacker to steal session cookies, manipulate the user’s browser, or perform other malicious actions.
To counter XSS, always sanitize and escape user-generated content:
from markupsafe import escape

@app.route('/blog/<post_id>')
def display_post(post_id):
    post = get_post(post_id)
    return render_template('post.html',
        title=escape(post.title),
        content=escape(post.content)
    )
The escape function from the markupsafe library is used to ensure that any user-provided content is properly sanitized, preventing malicious scripts from being executed in the user’s browser.
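You can see the escaping step in isolation with Python’s standard library, whose html.escape behaves similarly to markupsafe.escape for this demo:

```python
from html import escape  # stdlib counterpart to markupsafe.escape for this demo

payload = "<script>alert('XSS')</script>"
safe = escape(payload)
print(safe)  # &lt;script&gt;alert(&#x27;XSS&#x27;)&lt;/script&gt;
```

The angle brackets and quotes become HTML entities, so the browser renders the payload as inert text instead of executing it.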
3. Hardcoded secrets
Hardcoding sensitive information, like API keys or database credentials, into your code is a common mistake, often done for convenience. However, attackers can use these secrets to gain unauthorized access if this code ends up in a public repository or is leaked elsewhere.
Here is a simple but risky approach:
import boto3

class StorageService:
    def __init__(self):
        self.aws_key = "AKIA1234567890ABCDEF"
        self.aws_secret = "jK8*2nP9$mB4#kL5"
        self.client = boto3.client('s3',
            aws_access_key_id=self.aws_key,
            aws_secret_access_key=self.aws_secret
        )
If these credentials appear in a publicly accessible repository, malicious actors can exploit them immediately. Instead, you can store secrets in environment variables or a secure secrets manager:
import boto3
from decouple import config

class StorageService:
    def __init__(self):
        self.client = boto3.client('s3',
            aws_access_key_id=config('AWS_ACCESS_KEY'),
            aws_secret_access_key=config('AWS_SECRET_KEY')
        )
The config function from the decouple library loads the AWS credentials from environment variables, ensuring that sensitive information is not hardcoded in the application code. This approach significantly reduces the risk of credential exposure and subsequent security breaches.
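If you prefer to avoid an extra dependency, the standard library’s os.environ supports the same pattern. The helper below is a hypothetical sketch (the variable names mirror the snippet above) that fails fast when a credential is missing:

```python
import os

def load_aws_credentials():
    """Read AWS credentials from the environment; fail fast if one is missing."""
    try:
        return {
            "aws_access_key_id": os.environ["AWS_ACCESS_KEY"],
            "aws_secret_access_key": os.environ["AWS_SECRET_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError(f"Missing required environment variable: {missing}")

# Usage (hypothetical): client = boto3.client('s3', **load_aws_credentials())
```

Failing fast at startup is usually preferable to discovering a missing credential on the first S3 call in production.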
Although these vulnerabilities may appear obvious individually, they become difficult to detect within large codebases, especially when many developers make multiple commits daily across various repositories. Manual reviews alone struggle to consistently catch security issues at scale.
Keeping these vulnerabilities in mind, let’s examine how various types of security testing can help identify and prevent them.
Choosing between static and dynamic code analysis
It’s critical to identify vulnerabilities early. Two practical approaches to scrutinizing your code are:
- Static Application Security Testing (SAST) – Analyzes your codebase without executing it, pinpointing issues like insecure coding patterns, hardcoded secrets, or known vulnerability signatures.
- Dynamic Application Security Testing (DAST) – Interacts with your running application to spot real-time issues, such as misconfigurations or runtime injection paths.
Many security teams rely on both, since SAST focuses on code structure, whereas DAST uncovers flaws that are only visible at runtime. Whichever approach you take, the goal is to catch issues well before they affect production users.
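To make the SAST idea concrete, here is a toy static analyzer built on Python’s ast module. It inspects the syntax tree for string literals assigned to suspicious variable names, without ever executing the code; the name list and sample source are illustrative, and real SAST tools apply far richer rule sets:

```python
import ast

SUSPECT_NAMES = ("key", "secret", "token", "password")

def find_hardcoded_secrets(source):
    """Flag string literals assigned to suspicious variable names.

    A toy illustration of how SAST works: it analyzes the syntax tree,
    never running the code under inspection.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant) \
                and isinstance(node.value.value, str):
            for target in node.targets:
                name = getattr(target, "id", getattr(target, "attr", ""))
                if any(word in name.lower() for word in SUSPECT_NAMES):
                    findings.append((node.lineno, name))
    return findings

sample = 'aws_secret = "jK8*2nP9"\nregion = "us-east-1"\n'
print(find_hardcoded_secrets(sample))  # [(1, 'aws_secret')]
```

The analyzer flags the hardcoded secret but ignores the harmless region setting, mirroring how SAST pinpoints risky patterns by line.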
An AI-assisted approach to code security
Even if you’re vigilant about security, modern codebases constantly evolve, and it’s easy for vulnerabilities to slip through. AI code review tools help by quickly scanning large repositories, drawing on known attack patterns, and machine learning to spot issues you might otherwise miss.
They also offer practical suggestions for fixing those issues, reinforcing your overall security posture.
To see this in action, consider a Python Flask project that handles user profiles and file uploads (two areas where security oversights often hide). AI can highlight these hidden risks and guide you in resolving them before they become real problems.
Setting up an AI tool for automated code reviews for security vulnerabilities
If you’d like to explore AI-based reviews, you can try any tool that suits your needs, such as GitHub Copilot, Claude, or Perplexity. For this tutorial, we’ll use ChatGPT to review a Python Flask application that manages user profiles and handles photo uploads, examining it for common vulnerabilities.
First, clone the Python project that handles user profile management.
git clone https://github.com/Tabintel/py-photo-lib.git
Navigate to the project directory and install dependencies.
cd py-photo-lib
python -m venv venv
venv\Scripts\activate  # on Windows; use source venv/bin/activate on macOS/Linux
pip install -r requirements.txt
Then, create a branch for the new photo upload functionality:
git checkout -b feature/profile-uploads
After cloning the repository and creating the branch, copy the code from this GitHub gist into your app.py file.
This feature adds photo upload capabilities to the application, allowing users to upload profile images.
The changes include:
- New route /upload for uploading images.
- File upload handling using Flask’s request.files.
- Database query in /profile/<username> to fetch user details.
- Configured API to handle profile image paths.
Before pushing the code to the main repository, you’ll first use an AI tool to review it for any vulnerabilities.
Give this prompt to your AI tool (we’re using ChatGPT in this case):
“Review the following code, which adds a new profile image upload feature to a Python Flask application. Identify any security vulnerabilities or potential risks in the implementation, briefly explain why, and suggest improvements where necessary.”
In the review process, you would likely receive precise, line-by-line recommendations, such as these examples:
1. Exception handling enhancement
The review flagged broad exception handling, suggesting you implement more specific error tracking to prevent the exposure of sensitive information.
This is a critical security concern, as broad exception handling can make identifying and addressing potential issues difficult. To address this, you can implement a robust exception-handling mechanism that includes specific error messages and logging.
For example, instead of using a broad exception handler like this:
try:
    save_profile_photo(file)  # hypothetical placeholder for the upload logic
except Exception as e:
    return {"error": str(e)}  # every failure is treated the same way
You can implement a more specific exception handler that includes error messages and logging:
try:
    save_profile_photo(file)  # hypothetical placeholder for the upload logic
except ValueError as e:
    # Log the specific problem and return a generic message to the client
    logging.error(f"Invalid value: {e}")
    return {"error": "Invalid value"}
except Exception as e:
    # Log anything unexpected without exposing internal details to the user
    logging.error(f"An error occurred: {e}")
    return {"error": "Internal server error"}
This approach ensures that each exception type is handled specifically, providing more detailed error logs and making diagnosing and fixing issues easier.
2. Undefined model import
Another issue identified is an undefined model import: the profile route queries a User model that is never imported. Undefined imports cause runtime errors and make it harder to enforce data integrity and access control.
To address this, import the model explicitly and look the user up by username:
from models import User

user = User.query.filter_by(username=username).first()  # look up by username, not primary key
When the User model is imported correctly, you avoid runtime errors from undefined names, and an explicit, validated lookup helps guard against data inconsistencies and attacks such as SQL injection.
3. Additional code vulnerabilities
From this particular review, you can see that the AI tool recognizes risky areas in the app.py pull request and points out the exact lines of code that need to be reviewed for security purposes. Let’s look at each of them:
4. Image format validation
The automated review flags the need to review image format validation. Currently, the application only allows png, jpg, jpeg, and gif formats for uploaded photos:
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'gif'}
MAX_FILE_SIZE = 5 * 1024 * 1024 # 5MB
While this approach is sufficient for basic use cases, it has limitations and leaves room for misuse.
For example, if the project requires support for other formats (e.g., webp, tiff), the lack of validation for these could result in errors or even security risks when unexpected file types are uploaded.
Proper validation ensures that the system explicitly handles supported formats, mitigating risks from unverified file uploads.
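One way to go beyond extension checks is to sniff the file’s leading bytes. The helper below is a sketch using well-known magic numbers for the allowed formats; it is not a replacement for validating with a real image library such as Pillow:

```python
def sniff_image_format(data):
    """Best-effort check of a file's leading bytes against known signatures.

    Returns the detected format name, or None if the bytes match nothing.
    """
    signatures = {
        b"\x89PNG\r\n\x1a\n": "png",
        b"\xff\xd8\xff": "jpeg",
        b"GIF87a": "gif",
        b"GIF89a": "gif",
    }
    for magic, fmt in signatures.items():
        if data.startswith(magic):
            return fmt
    return None

print(sniff_image_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # png
print(sniff_image_format(b"<script>alert(1)</script>"))        # None
```

A renamed script file fails the sniff even though its .png extension would pass the extension check alone.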
5. Function usage documentation
Next, there’s a suggestion to clarify function usage in a docstring. A short docstring for allowed_file would help describe its purpose and the expected parameter and return types.
def allowed_file(filename):
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
Without documentation, you may misunderstand the purpose or behavior of this function. Adding a short docstring that specifies the function’s role (validating file extensions) and details the expected parameters and return values would prevent miscommunication and misuse.
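Following that suggestion, a documented version might look like this (the docstring wording is illustrative; ALLOWED_EXTENSIONS repeats the earlier configuration so the snippet is self-contained):

```python
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'gif'}

def allowed_file(filename):
    """Return True if *filename* has an allowed image extension.

    Args:
        filename: name of the uploaded file, e.g. "avatar.png".

    Returns:
        bool: True when the extension (case-insensitive) is in
        ALLOWED_EXTENSIONS, False otherwise.
    """
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
```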
6. Image processing logging
There’s a suggestion to validate and log image processing steps. The image processing logic is good but may benefit from explicit logging when conversions or resizing occur, especially to diagnose user-upload issues.
from PIL import Image

def process_image(image_path):
    with Image.open(image_path) as img:
        # Convert to RGB if necessary
        if img.mode != 'RGB':
            img = img.convert('RGB')
        # Resize if too large
        if max(img.size) > 2000:
            img.thumbnail((2000, 2000))
        # Save optimized version
        img.save(image_path, 'JPEG', quality=85, optimize=True)
Without logs, it becomes difficult to diagnose issues when image uploads fail. Logging each step would provide a clear trail for troubleshooting and prevent vulnerabilities caused by incomplete or incorrect processing. Logs also enhance accountability and allow developers to detect unexpected behavior during uploads.
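The logging pattern can be sketched without Pillow. The function below mirrors the decision points of process_image and logs each one; the function name, parameters, and returned step list are illustrative:

```python
import logging

logger = logging.getLogger("uploads")

def process_image_logged(path, mode, size):
    """Sketch of process_image's decisions with a log line per step.

    Takes the image's mode and size directly so the example runs
    without an image library; returns the steps taken for clarity.
    """
    steps = []
    if mode != "RGB":
        logger.info("Converting %s from %s to RGB", path, mode)
        steps.append("convert")
    if max(size) > 2000:
        logger.info("Resizing %s from %s to fit 2000x2000", path, size)
        steps.append("resize")
    logger.info("Saving optimized version of %s", path)
    steps.append("save")
    return steps

print(process_image_logged("avatar.jpg", "P", (3000, 1000)))  # ['convert', 'resize', 'save']
```

With a log line per decision, a failed upload leaves a trail showing exactly which step it reached.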
Through this automated code review process, you’ve seen how AI tools comprehensively analyze the code, highlighting vulnerabilities in the photo upload feature. The review surfaced critical security gaps: unhandled exceptions that could expose system information, weak file type validation that might allow malicious uploads, and missing model imports that could compromise data integrity.
Once you’ve added the code from the review, commit and push the changes to your GitHub repository using the following commands:
git add app.py
git commit -m "feat: add profile photo upload functionality"
git push origin feature/profile-uploads
Implementing code security best practices
Security requires ongoing diligence. Here are some routines you can integrate into daily development:
1. Regular code reviews
Peer reviews remain indispensable for catching logical errors and sharing knowledge. To maximize coverage, combine them with automated scanning.
2. Secure coding standards
- Avoid storing secrets in code.
- Enforce the least privilege for database connections.
- Use parameterized queries by default.
These guidelines reduce the risk of common issues slipping through.
3. Frequent vulnerability assessments
Schedule periodic scans (e.g., monthly or quarterly) using SAST and DAST tools. Consider penetration tests for critical areas of your application.
4. Automate where possible
CI/CD pipelines can automatically run security tests before deployment, ensuring no commit merges without scrutiny. AI can assist in triaging results, making the process more efficient.
5. Embrace DevOps culture
Encourage open collaboration between development and operations, ensuring that security is baked in from the start rather than bolted on at the end.
6. Shift-Left security
Moving security checks to earlier stages of development (DevSecOps) reduces last-minute surprises. Addressing security vulnerabilities earlier in the development process significantly reduces the time and resources required for remediation.
Wrapping up: Securing your codebase
Applying thoughtful, ongoing code security practices and AI code review checks can significantly reduce your application’s vulnerability. Whether it’s preventing SQL injection, mitigating XSS, or ensuring secrets don’t leak, each layer of protection adds up.
As you refine these habits and explore AI code security solutions, you’ll discover more efficient ways to keep your applications secure, protect sensitive data, and maintain a reliable development pipeline.
