A Pentester's Guide to Source Code Review

Everyone who has achieved mastery began as a novice. This blog post guides how to manage and conduct a source code review project, particularly if you are new to the task and have to do it. It offers advice on approach and execution to help readers complete the source code review assessment successfully.

Source code review is the process of thoroughly examining and evaluating the source code of an application to identify any potential security vulnerabilities at the code level. From a security engineer’s perspective, the goal of a source code review is to identify and mitigate any issues that could be exploited by an attacker, such as SQL injection, cross-site scripting, or other types of code-level vulnerabilities.

What should I do if assigned a project to conduct a source code review assessment?

What is the process for conducting a source code review?
What information should I ask the client to conduct a source code review assessment properly?
How can a source code review be conducted?

If you are still getting familiar with conducting a source code review, this guide will assist you in preparing and conducting the assessment effectively. Before discussing the specific context of a client, let's first gain a general understanding of code review.

Understanding Secure Code Review in Secure Software Development

Secure Code Review is a process used to identify potential security vulnerabilities in software source code. It is an important part of a secure SDLC process and can be used to detect common coding flaws such as buffer overflows, SQL injection, and cross-site scripting.

Secure code review aims to produce secure and reliable software by identifying and correcting security vulnerabilities before threat actors exploit them. During the review, trained security professionals review the application source code and look for potential security flaws. They then document their findings and provide recommendations for remediation. Secure code review is essential for ensuring secure software development and should be part of any organization's security program.

Benefits of Secure Code Review in Secure Software Development

Most obvious from the pentester's perspective would be to find security flaws!

One of the primary reasons for conducting a secure code review is to identify and address security flaws at the code level. This includes identifying vulnerabilities such as SQL injection, cross-site scripting, and other types of security risks that could be exploited by attackers.

Protection of information and assets

Secure code review helps to protect the organization's sensitive information and assets by identifying and addressing vulnerabilities in the code. This can help to prevent data breaches and protect the organization's reputation. Secure code review can also help find hard-coded secrets & API Keys, saving tons of money.

Effort

Conducting a secure code review can save a lot of effort in the long run. Fixing vulnerabilities early in the development process is much easier and less expensive than fixing them after the software has been deployed in production.

Cost

Secure code review can help save costs by identifying and fixing vulnerabilities early in development. This can prevent costly security breaches and reduce the need for expensive remediation efforts.

Compliance

Many organizations are required to comply with regulations and standards, such as PCI-DSS, and SOC II, that mandate secure code review as a requirement. Conducting a secure code review can help organizations meet these compliance requirements.

Reputation

A data breach can profoundly affect an organization's standing. Secure code review can help to protect an organization's reputation by identifying and addressing vulnerabilities in the code before attackers can exploit them.

Preparing for a Source Code Review: Methods and Approach

If you have a task or project involving a source code review. To ensure the review is done well and thoroughly, there are a few things you need to do before the engagement even begins. Let's look at its methods and approach.

Identify Security Code Review Objectives:

Identifying the objectives for the security code review is the first step in the process.
These objectives will determine what specific code areas will be reviewed and what specific security vulnerabilities will be looked for.
Examples of objectives could include identifying and fixing SQL injection vulnerabilities, ensuring compliance with industry standards, or reducing the overall attack surface of the application.
Establishing a clear and mutual understanding of the goals and objectives of a source code review with the client is essential before starting the engagement. This will ensure that both parties are on the same page and have a common understanding of the purpose and expectations of the review.

Prerequisites for Code Review:

Gathering specific information from the client is essential before beginning a source code review engagement. This includes

Access: Obtain access to the source code for the module under review. This can be access on GitHub/GitLab instance or any internal code hosting service.
LOC (Lines of Code): A critical step in preparation for a source code review is determining the total number of lines of code (LOC) in the module under review. This information will aid in calculating the estimated number of work hours necessary to complete the activity.
Hardware Requirement for license tool installation: One of the prerequisites for a source code review is to confirm with the client whether the review will be conducted on a machine provided by the client or on the pentester’s own machine. This is important to plan for as it will affect any enterprise tools requiring a license.
Walk-through: Have a session with the developer team to discuss the code and any concerns. The most significant aspect of this section is the Walk-through with the development team. Meeting with the application's developer to discuss the code's goals is the key step of a secure code review.

Sr. No	Area of focus	What to ask
1	Authentication	Are application users authenticated, or are they all treated as anonymous users? What factors are used for authentication (such as passwords, tokens, and certificates)? If passwords are being used, are there any policies regarding complexity or age in place?
2	Authorization	Are there different roles that users can be given depending on the applications’ function? Is the authorization data cached checked for each incoming request? Are there any private, sensitive data files stored in the web root that is not authorized for the regular user?
3	Data Validation	Is the user-submitted data validated? Is data validated as soon as it comes in from the user or when it is used by the code? How is the data validation accomplished (whitelisting, blacklisting, min/max, etc.)? Are you using a database? If so, are you using prepared statements?
4	Exception/Error Handling	What approach(s) for error handling is being used? What kind of details about an error are displayed to the user? Are error traces ever sent back to the user? Or are they sent to logs only? Is the sensitive data in error logged to the log files? If the database throws an error, is the error message sent to the user, or is it passed to a log?
5	Session Management	Is there any way the application manages or stores session state, and if so, how? How is the session id being generated? Is the previous session deleted when a user logs into the site and creates a new session? Are tokens used for session management? If yes, what algorithm is used? Any timeouts for sessions? If cookies are used, are there path and domain restrictions in the cookie?
6	Logging	Is any type of logging being used within the code? Where are generated log messages sent? Are users who shouldn't have access to the log files able to access them (any or all employees)? Are you logging any input that is not validated first or data that has failed validation? Are log messages time-stamped? Is any sensitive data written to a log (e.g., password, API key, etc.)?
7	Encryption	Are there any encryption algorithms used within the code at all? (SSL, TLS, RSA?) Where did you get the library's implementation, and what version is it using? If using 3DES or AES (any block cipher), what encryption mode is used? Is there a central function in the code that handles encryption? Where is it?

This is just a high-level overview of the checks you can request; feel free to add or modify any other checks as needed.

Perform Preliminary Scan

But why use Tools for Source Code Analysis?

The purpose of source code review tools is to automate and streamline the process of reviewing large amounts of code, typically measured in thousands or millions of lines. This is necessary as performing a manual line-by-line analysis of such a vast amount of code is virtually impossible. Source code review tools aid in the efficient identification and resolution of potential security vulnerabilities, coding errors, and performance issues.

After the objectives and necessary information have been gathered, the next step is to perform a preliminary scan of the source code.
Based on the total number of lines of code (LOC), the approach for conducting the test will be determined.
This is often the case when the number of lines of code is high. This scan can be done using automated tools or manual review methods to identify potential vulnerabilities or areas of the code that may require further examination.
Benefits of SAST tools:

Reduction in manual effort
Time efficient
Find all the instances of vulnerabilities
Source to sink analysis
Exhaustive coverage of vulnerability patterns
Elaborate reporting format

Limitations of SAST tools

Unable to detect Business Logic Flaws
Limited Scope
Custom Validations
Design Flaws
Application Specific Recommendations

Source Code Review – Free Tools

Note: Many open-source tools are available for examining and evaluating source code during a review process. Some examples have been highlighted. However, it is crucial to remember that each tool has its specific programming language compatibility. It is recommended to thoroughly examine and assess the options before deciding which tool to utilize.

Source Code Review – Commercial Tools

Note: Many commercial tools are available for examining and evaluating source code during a review process. Selecting the correct source code review tool for your project is essential. When choosing a tool, be sure to consider its programming language compatibility and features. Consider your project's needs and evaluate the various enterprise tools available to determine which one best meets those needs.

Key Points to Consider When Reviewing Small Code Bases Manually

If you have a relatively small amount of code to review, such as several thousand lines, and you plan to conduct a manual assessment, there are several key points to remember when examining the source code.

The following are high-level findings that can be easily identified in the code: while there may be more, the points covered below provide a good starting point.

Basics: One of the most important prerequisites for conducting a code review is understanding at least one object-oriented framework, such as J2EE or .NET.

Application Details: The next critical step in code review is to comprehend the workings of the application and the settings it employs. The deployment descriptor, such as web.xml in J2EE or web.config in .NET, is the entry point for this process. In web service testing, it's essential to understand the application's purpose, business logic, and data flow. Examining the application can reveal vital information, including its APIs and third-party libraries.

Following are a few terminologies of which a code reviewer should be aware:

Source: The "source" refers to the location in the code where malicious input was introduced, such as using the "request.getParameter()" method.
Sink: The "sink" is the location in the code where a vulnerability is exploited, such as where XSS alerts are reflected.
Taint: "Taint" refers to malicious data provided by the user.
Taint propagator: The "taint propagator" function takes malicious data as input and then passes it out without any validation.

Area of focus during code review:

Input validation and output encoding
Authentication & Authorization
Cryptography
Session management
Threat Modeling Terminology

Some high-level evergreen findings can be easily discovered during a manual code review assessment. Note that this list is not exhaustive, and it's meant to provide a general idea. You can create your own checklist based on these findings.

1. Hard-Coded Secrets in source code

Description: One of the most straightforward issues to identify during a source code review is the presence of hard-coded secrets such as API Keys, tokens, passwords, etc. By using the find ("Ctrl+F") function and searching for specific keywords such as "token" or "password," it is easy to determine if the source code contains any hard-coded credentials.
Recommendation: To address the issue of hard-coded secrets in the source code, it is recommended to remove the secret from the code and instead use a secure secret management solution, aka KMS(Key Management Service), such HashiCorp Vault, to store the secrets.
The diagram below illustrates that the "config.inc" file contains a hard-coded password & database name.

2. Insecure Algorithm Used

Description: To quickly find a list of insecure algorithms that have been used, you can again use the find ("Ctrl+F") function and search for specific keywords such as SHA-1 (or sha1), md5, DES, and RC4. These keywords will help you easily locate information on algorithms considered to be insecure.
Recommendation: To ensure secure cryptography, it's important to choose the right algorithms for the task at hand. For symmetric key cryptography, use AES or Twofish, while RSA is recommended for asymmetric key encryption with a minimum key length of 2048 bits. SHA-256 is now the generally recommended hash function, and SHA-3 algorithms are preferred for new development. For applications with high-security needs, consider AES-256, RSA with larger key sizes, and SHA-512. Stay updated with best practices to ensure maximum security.
The diagram below illustrates the code $password = hash("sha1", $password, false), using the SHA-1 hashing algorithm to create a hash of the password. However, SHA-1 is no longer considered a secure hashing algorithm as it is vulnerable to collision attacks. This means that an attacker could create two different inputs that produce the same hash value, making it easier to break the hash and gain unauthorized access. Therefore, it is not safe to use SHA-1 for hashing passwords.

3. Improper Exception Handling

Description: The diagram below illustrates the catch block catches all exceptions of the Exception type, which is too general and can lead to unexpected behavior. An attacker could exploit this by intentionally throwing an exception that the application is unprepared to handle, causing the application to crash or behave unexpectedly.
Recommendation: To fix this issue, the application should catch only specific exception types that it expects and can handle appropriately. Additionally, the application should provide informative error messages and log relevant information, such as the stack trace, to help diagnose issues and improve security.

4. Application Logs in Clear Text

Description: During source code review, searching for log statements that may write sensitive information in clear text is crucial. Firstly, locate log files created by the application, which may be in various forms, such as text files, databases, or other storage mechanisms. Next, scrutinize the log messages to identify any sensitive information like passwords, usernames, credit card numbers, API keys, or other PII (personally identifiable information) that may be logged. It is important to ensure that sensitive information is not logged in clear text as it can lead to sensitive data exposure, and thus reviewing application logs is an essential security measure.
Recommendation: It is advisable not to log sensitive data, encrypt the logs and limit access to authorized users with appropriate privileges.

5. DEBUG is enabled in the application

Description: Enabling DEBUG mode in an application provides detailed information about its processes and any errors that may occur, which is helpful for developers to identify and resolve issues. However, it can be a security risk if the information is accessed by unauthorized parties. DEBUG mode should only be used during development and testing environments, not in a production where it can reveal sensitive information and potentially provide attackers with an opportunity for further exploitation.
Recommendation: If DEBUG is enabled in the source code, the recommended fix is disabling it before deploying it to a production environment.
The following diagram depicts that the source code has its debug mode set to true.

Via Enterprise License Tool

Tool Name: I will be using AppScan Source for the Analysis tool
Practical Example: By using the source code of bWAPP – buggy web application.
Programming Language: PHP
Assumption: When conducting the source code review for this engagement, it's important to consider that if client has provided a large codebase that cannot be manually reviewed due to its LOC (Lines of code), automated tools or scripts may need to be employed to assist with the analysis and ensure thorough coverage of the code.

AppScan Source Tool

When you open the tool, it looks something like this

GUI Representation

You can Start running the scan by going in

File => Add Application => Create a new application

Give the Name and the working directory of the code and click on Next

Now Select the language.

Click Next and Give Name, specify the Working directory, and click on Add Source Root Folder

Click on Finish and Go to Configuration

Right-click on your project and Scan Application

After the scan, the result looks something like this

Click Yes

After the scan, the result looks something like this

From the above findings, we will take 3 vulnerabilities - SQL Injection, XSS, and OS Command Injection (Blind).

SQL Injection

So, the tool detected SQL Injection vulnerability at line 143. Let’s first understand the code block that tool reported to us

Source of the file C:\USER\bWAPP\sqli_1.php

Code Explanation:

This code uses GET method to retrieve a movie title based on the user input, and then it searches the "movies" table in the database using the retrieved title with a SQL LIKE statement. Note: The user input $_GET["title"] is used to build a SQL query without proper input sanitization.
The code is vulnerable to SQL injection attacks since it directly includes the user input into the SQL statement.
An attacker can craft a malicious input to execute arbitrary SQL statements, bypass authentication, or access sensitive data.

Remediation:

To fix the vulnerability, the code needs to use parameterized queries, prepared statements, or escape characters to sanitize user input before using it in SQL statements.
Moreover, the code should not reveal detailed error messages to the user, as it may provide clues for attackers to exploit the vulnerability.

Fixed Code:

if(isset($_GET["title"])) {

$title = $_GET["title"];

// Create a prepared statement

$stmt = mysqli_prepare($link, "SELECT * FROM movies WHERE title LIKE ?");

// Bind the parameter

mysqli_stmt_bind_param($stmt, "s", $title);

// Execute the query

mysqli_stmt_execute($stmt);

// Get the results

$result = mysqli_stmt_get_result($stmt);

// Check if any rows were returned

if(mysqli_num_rows($result) > 0) {

while($row = mysqli_fetch_array($result)) {

// Do something with the row data

}

} else {

// No rows were returned

}

In this modified code, a prepared statement is created using mysqli_prepare(), and the query is parameterized by replacing the variable with a question mark ?. The parameter is bound using mysqli_stmt_bind_param(), which ensures that any user input is properly sanitized and escaped from any malicious payload. Then, the query is executed using mysqli_stmt_execute(), and the results are retrieved using mysqli_stmt_get_result().

Cross-Site Scripting (XSS) attack

The vulnerability in this code is that it's vulnerable to Cross-Site Scripting (XSS) attacks. The $_SERVER["SCRIPT_NAME"] variable is not properly sanitized, which means that an attacker can inject a malicious payload into the form action attribute and execute it in the victim's browser.

Remediation:

There are various ways to protect against cross-site scripting attacks within PHP.

htmlspecialchars: Converts characters &, ", ', < and > to their HTML entity representation, preventing XSS from using those characters. This will not stop any user code javascript replacements from running on the system
htmlentities: Converts all characters with an HTML entity replacement, including those with htmlspecialchars. This is more complete protection from cross-site scripting vulnerability but can lead to display bugs if certain characters are not replaced.

Example:

echo '<div>' . htmlspecialchars($_GET['input']) . '</div>';

This line of code takes a user input passed through the GET request, then wraps it with an HTML div element after encoding it using the htmlspecialchars() function.

htmlspecialchars() is a PHP function that converts special characters to their HTML entities. This helps prevent XSS attacks by preventing malicious scripts from being executed when the user input is displayed on the webpage.

OS Command Injection (Blind)

This code's vulnerability lies in using the shell_exec function to execute the ping command without proper input validation or sanitization. This can be exploited by an attacker who can control the value of the $target variable, which is passed as an argument to the ping command.

An attacker can manipulate this variable to inject arbitrary commands and execute them on the target system with the privileges of the PHP script user. For example, an attacker can use command injection to execute a reverse shell or to delete files from the target system (where the application is running).

Recommendation: To mitigate this vulnerability, the input to the $target variable should be validated and sanitized before passing to the shell_exec() to prevent any injection of malicious commands.

One way to do this is by using a whitelist of allowed characters or a regular expression to validate the input. Additionally, it's recommended to use PHP's escapeshellarg function to escape any special characters before passing the argument to the ping command. Avoid using command line calls. Disable executable os command APIs through php.ini disable_functions entry.

Conclusion

Conducting a source code review project can seem like a daunting task, especially for those who are new to it. However, with proper planning, execution, and a thorough understanding of the code being reviewed, the process can be completed successfully. By following the advice outlined in this guide, consultants and security professionals can confidently approach source code review projects, knowing they have the tools and strategies needed to complete the task effectively.

Remember always to prioritize security, accuracy, and thoroughness when reviewing code, and don't be afraid to ask for help or advice from colleagues or industry experts when needed. With these tips in mind, you can successfully conduct a source code review project and contribute to your organization's software development efforts.

In closing, don't miss the opportunity to enhance your software's security with secure code review services by Cobalt. Explore our offerings today and take a proactive step towards safeguarding your digital assets. Your journey towards robust and secure coding starts with Cobalt.

Learn how Cobalt’s Pentest as a Service (PtaaS) model makes you faster, better, and more efficient.

Learn how Cobalt’s Pentest as a Service (PtaaS) model makes you faster, better, and more efficient.

A Pentester's Guide to Source Code Review

Understanding Secure Code Review in Secure Software Development

Benefits of Secure Code Review in Secure Software Development