Showing results for tags 'codeguru'.
-
By using Generative AI, developers can leverage pre-trained foundation models to gain insights on their code’s structure, the CodeGuru Reviewer recommendation and the potential corrective actions. For example, Generative AI models can generate text content, e.g., to explain a technical concept such as SQL injection attacks or the correct use of a given library. Once the recommendation is well understood, the Generative AI model can be used to refactor the original code so that it complies with the recommendation. The possibilities opened up by Generative AI are numerous when it comes to improving code quality and security. In this post, we will show how you can use CodeGuru Reviewer and Bedrock to improve the quality and security of your code. While CodeGuru Reviewer can provide automated code analysis and recommendations, Bedrock offers a low-friction environment that enables you to gain insights on the CodeGuru recommendations and to find creative ways to remediate your code... View the full article
-
In this post, we’ll demonstrate how you can leverage Amazon CodeGuru Reviewer Command Line Interface (CLI) to integrate CodeGuru Reviewer into your Jenkins Continuous Integration & Continuous Delivery (CI/CD) pipeline. Note that the solution isn’t limited to Jenkins, and it would be equally useful with any other build automation tool. Moreover, it can be integrated at any stage of your SDLC as part of the White-box testing. For example, you can integrate the CodeGuru Reviewer CLI as part of your software development process, as well as run it on your dev machine before committing the code... View the full article
-
Amazon CodeGuru Reviewer is a developer tool that leverages automated reasoning and machine learning to detect potential code defects that are difficult to find and offers suggestions for improvements. Today, we’re announcing support for file/folder suppression in Amazon CodeGuru Reviewer, a new feature that allows customers to prevent CodeGuru Reviewer from surfacing findings on certain parts of their codebase. View the full article
-
Amazon CodeGuru Reviewer is a developer tool that leverages automated reasoning and machine learning to detect potential defects that are difficult to find in your code and offers suggestions for improvements. Today, we are excited to announce a new repository size-based pricing model with a price reduction of up to 90%, making it easier for customers to predictably scale their automated code reviews across their software development processes. View the full article
-
Today, we are excited to announce additional capabilities with Amazon CodeGuru Reviewer to help you find and remediate security issues in your code before you deploy. CodeGuru Reviewer Security Detectors help identify security risks from the top ten Open Web Application Security Project (OWASP) categories (the OWASP Top 10 is a standard awareness document for developers and web application security), security best practices for AWS APIs, and common Java crypto libraries. View the full article
-
Amazon CodeGuru is a set of developer tools powered by machine learning that provides intelligent recommendations for improving code quality and identifying an application’s most expensive lines of code. Amazon CodeGuru Profiler allows you to profile your applications in a low-impact, always-on manner. It helps you improve your application’s performance, reduce costs, and diagnose application issues through rich data visualization and proactive recommendations... View the full article
-
Amazon CodeGuru Reviewer is a developer tool that leverages automated reasoning and machine learning to detect potential defects that are difficult to find in your code and offers suggestions for improvements. Today, we are announcing the general availability of Python support for CodeGuru Reviewer to help you detect coding issues and vulnerabilities in your Python code and applications. View the full article
-
Today, we are excited to announce additional capabilities with Amazon CodeGuru Reviewer. You can now use CodeQuality Detector to identify smells early, balance between speed and technical debt, and coordinate software development and maintenance efficiently. View the full article
-
This post walks you through associating the GitHub Enterprise repository with Amazon CodeGuru Reviewer. This repository support is available for both self-hosted and cloud-hosted GitHub Enterprise options. In this post, we focus on associating CodeGuru with the repository on a self-hosted GitHub Enterprise Server.

CodeGuru Reviewer offers automated code reviews to catch difficult-to-find defects in the early stage of development. It is backed by machine learning models trained from millions of code reviews conducted within AWS and open-source projects. When the code repository is associated with CodeGuru, the creation of pull requests triggers CodeGuru to scan the code and, based on the analysis, provide you actionable recommendations.

CodeGuru Reviewer currently identifies code quality issues in the following broad categories:
* AWS best practices
* Concurrency
* Resource leaks
* Sensitive information leaks
* Code efficiency
* Refactoring
* Input validation

In short, CodeGuru equips your development team with the tools to maintain a high bar of coding standards in the software development process. For more information about configuring CodeGuru for automated code reviews and performance optimization, see Automated code reviews and application profiling with Amazon CodeGuru.
In this post, we discuss the following:
* Support for GitHub Enterprise repositories: cloud hosted and self hosted
* Associating CodeGuru with GitHub Enterprise Server:
    * Creating the repository provider association on CodeGuru
    * Setting up the host (authorizing AWS CodeStar to access and install the app on an endpoint) and creating the connection (providing access to selected repositories)
    * Completing the association of CodeGuru with the created connection
* Generating a pull request
* Cleaning up to avoid unnecessary charges

In a typical organization, an administrator would build and configure the GitHub Enterprise Server on AWS cloud or on-premises and associate Amazon CodeGuru with the desired repositories hosted on that GitHub Enterprise server in the same AWS Account region. Later, when the developers create a pull request on those repositories, CodeGuru automatically scans the code and sends actionable recommendations to those developers.

CodeGuru support for cloud-hosted and self-hosted GitHub Enterprise repositories

GitHub Enterprise offers various ways to host repositories. When configuring CodeGuru to associate with GitHub Enterprise repositories, you can select GitHub for cloud hosted or GitHub Enterprise Server for self hosted. The cloud-hosted option refers to the GitHub cloud, whereas the self-hosted option refers to on-premises or the AWS Cloud.

For this post, we walk through configuring CodeGuru to associate with the self-hosted GitHub Enterprise Server. We consider the following self-hosted options:
* With a public endpoint: The Amazon Elastic Compute Cloud (Amazon EC2) compute instance hosting GitHub Enterprise Server is accessible from the internet.
* With a private endpoint: This is a common scenario for an organization in which GitHub Enterprise Server is hosted on AWS Cloud as a private VPC endpoint and the developers securely access this endpoint from their corporate network or VPN.
You could also host it on an on-premises server accessible via a specific VPC.

We revisit these scenarios later in this post when we configure the association with CodeGuru.

Before you associate GitHub Enterprise (GHE) repositories with CodeGuru, let us review our GHE server that is set up on our EC2 instance that will be hosting our repositories. The following screenshot shows the EC2 instance launched using the GitHub Enterprise AMI:

After you instantiate GitHub Enterprise Server on Amazon EC2 using the desired AMI, confirm its reachability by choosing its DNS name. If it’s configured with a private endpoint or an on-premises server, test the reachability using the endpoint URL from the appropriate source location and logging in.

Log in to the account and check your repositories. The following screenshot shows your repositories listed in the navigation pane.

In all these use cases, we recommend integrating Certificate Authority (CA) authorized certificates on GitHub Enterprise Server to enable a proper TLS handshake when accessing from a browser. If you’re using self-signed certificates on GitHub Enterprise Server, you may have to import those certificates in your browser to enable the TLS handshake and access the service.

Associating CodeGuru with GitHub Enterprise Server

This section summarizes the high-level steps to associate the self-hosted GitHub Enterprise Server code repository with CodeGuru Reviewer. To do so, we have to use the AWS CodeStar connections service, which offers a centralized place to create the association between a third-party service and an AWS service. We first create an AWS CodeStar connection to GitHub Enterprise Server, then use this connection to list the repositories in that GitHub account and select the repository to associate with CodeGuru.

Creating the association

To start creating the association, complete the following steps:
* On the CodeGuru console, left pane, choose Reviewer.
* Choose Associated repositories.
* Select GitHub Enterprise Server.
* From the drop-down menu, check for any existing connections.
* If you don’t have any connections, choose Create a GitHub Enterprise Server connection. The Create a connection window appears.
* For Connection name, enter a name.
* For Choose a host, choose the search box. If no hosts are configured, it displays an informational box.
* Choose Create host.

The host is configured to model the self-hosted GitHub Enterprise Server; for our use case, it’s hosted on our EC2 instance. This is configured only one time and later reused to create multiple connections in the same Region as the EC2 instance.

After you choose Create host, the Create host window appears.
* For Host name, enter a name.
* For Select a provider, choose GitHub Enterprise Server.
* For Endpoint, enter your endpoint.
* For VPC configuration, choose No VPC (we don’t need to configure a VPC for this use case).

The following screenshot shows an example of configuring for a self-hosted GitHub server that is accessible over the public internet.

For a GitHub Enterprise Server that isn’t accessible over the public internet, you need to choose Use a VPC and enter the following:
* VPC ID: The VPC ID where GitHub Enterprise Server is located or where the on-premises GitHub Enterprise Server is accessible from (a VPC with reachability to the on-premises GitHub Enterprise Server).
* Subnet ID: The subnets from the preceding VPC where GitHub Enterprise Server is located or where the on-premises GitHub Enterprise Server is accessible from.
* Security group ID: The security groups that allow the CodeStar connections host to access GitHub Enterprise Server in the preceding VPC.
* TLS certificate: You don’t need to enter a TLS certificate if you’re using a certificate signed by a public CA on GitHub Enterprise Server. If you’re using a self-signed certificate or non-public certificate, enter the certificate.

To obtain your certificate, complete the following:
* Navigate to your endpoint URL in Firefox.
* Choose the lock icon in the address field.
* Choose Connection, More Information.
* On the Security tab, choose View Certificate.
* On the Details tab, choose Export.
* Save it as a local file.
* Open the file in your preferred text editor to locate the certificate and copy and paste the text.

Choose Create host. You should now see the Setup status change from VPC configuration initializing to Pending. Choose Set up host.

Setting up the host and creating the connection

When you complete the steps in the previous section, you’re redirected to the Create GitHub App page. You need administrator login credentials for GitHub Enterprise Server to allow the application installation.
* After you log in with those credentials, for GitHub App name, enter a name.
* Choose Create GitHub App. This step installs the app on GitHub Enterprise Server, and the host Setup status changes to Available.
* Choose Create connection.

If the button isn’t available, complete the following:
* In the navigation pane, choose Settings.
* Choose Connections.
* Choose Create connection.
* On the Create a connection page, select GitHub Enterprise Server.
* For Connection name, enter a name.
* For Choose a host, enter your host.
* Choose Connect to GitHub Enterprise Server.
* In the window that appears, authorize the app installation.
* On the Connect to GitHub Enterprise Server page, choose Install a new app.
* In the window that appears, select to apply All repositories or Only select repositories in that organization.
* When you return to the Connect to GitHub Enterprise Server page, choose Connect.

Completing the association of CodeGuru with the created connection

After you complete the steps in the preceding section, you return to the Associate repository page.
* For Select source provider, choose GitHub Enterprise Server.
* For Connect to GitHub Enterprise Server, choose your connection.
* For Repository location, choose your repository.
* Choose Associate.

In less than 30 seconds, the repository status shows as Associated.
Generating a pull request

You’re now ready to create a pull request for any changes to the code and trigger CodeGuru to scan the code and generate actionable recommendations. In your repository, on the Code tab, choose Create pull request. You can see an active entry for the code review on the Code reviews page with the status Pending. When it’s complete, you can see the recommendations on the Pull request tab.

Cleaning up

When you’re finished testing, you should un-provision the following resources to avoid incurring further charges:
* CodeGuru Reviewer: Remove the association of CodeGuru to the repository, so that any further pull request notifications don’t trigger CodeGuru to perform an automated code review.
* GitHub Enterprise Server: If hosted on an EC2 instance, stop the instance.

Conclusion

This post reviewed the support of self-hosted GitHub Enterprise Server repositories for CodeGuru Reviewer. You can take advantage of these features to enhance your application development workflow.

About the Author

Nikunj Vaidya is a Sr. Solutions Architect with Amazon Web Services, focusing in the area of DevOps services. He builds technical content for the field enablement and offers technical guidance to the customers on AWS DevOps solutions and services that would streamline the application development process, accelerate application delivery, and enable maintaining a high bar of software quality.

View the full article
-
This post discusses the types of concurrency bugs Amazon CodeGuru detects and how developers can fix them. CodeGuru automatically analyzes pull requests (created in supported repositories like CodeCommit, GitHub, GitHub Enterprise, and Bitbucket) and generates recommendations about how to improve your code quality. For more information, see Automating code reviews and application profiling with Amazon CodeGuru.

Why use a tool to automatically detect concurrency bugs?

Concurrency bugs are difficult to catch during unit and system testing. This is because triggering concurrency bugs is timing dependent: threads need to execute instructions in parallel in a particular order for the program to exhibit the buggy behavior (we provide examples later in this post). Additionally, triggering concurrency bugs is non-deterministic: many program executions during testing and production may not exhibit the buggy behavior, but occasionally (for example, under a slightly different execution timing or system load) an execution may trigger the bug. Even after triggering a concurrency bug, reproducing it for debugging may be difficult due to non-determinism.

Overview of concurrency bugs found by CodeGuru

The concurrency bug detectors in CodeGuru found previously unknown concurrency bugs in 22 mature and widely used Amazon projects, 6 open-source projects (JDK 8 to 14, Elasticsearch, Hadoop, NetBeans, RabbitMQ, and XMage), and 18 internal Amazon projects. We reported these bugs to the developers, who fixed them based on our reports.

In addition to fixing the bugs, developers gave us strongly positive feedback. For example, after we reported bugs in three mature projects, the project developers asked us to run the concurrency bug detectors on other modules. Developers say the bugs found by the concurrency bug detectors in CodeGuru would be difficult for a human to find through code inspection: “It’s a really neat catch.
Bugs like that are super tricky to track down.”, “…it’s pretty subtle, not something easily spottable,” and, “Looks like it just picked up a flaw in the code that I don’t think I’ll ever find out myself.” Other developers expressed enthusiasm for the concurrency bug detectors: “I’m also impressed by the inspection tool you guys are using,” “He gave us some fantastic deadlock finds that he found with CodeGuru,” “This seems like a very useful tool,” and “CodeGuru is really powerful.”

Types of concurrency bugs CodeGuru detects

CodeGuru detects four types of concurrency bugs:
* Deadlocks
* Data races on thread unsafe classes
* Atomicity violations
* Over-synchronization

In the following sections, we give real-world examples of bugs from each of the four bug types.

Deadlocks

Deadlocks are severe bugs in which the execution permanently blocks. Deadlocks can cause lost system utilization, long response time, low throughput, and negative customer experience. During a deadlock (also called a circular-wait), one thread waits for a synchronization object held by another thread, while the other thread waits for a synchronization object held by the first thread. The execution therefore can’t make forward progress, and the user needs to forcefully terminate the process. (When synchronization protects one or more code regions, those code regions cannot execute simultaneously in parallel and are forced to execute one at a time.)

Example deadlock in JDK 8 to 14

The following code shows a previously unknown deadlock (JDK-8236873) that CodeGuru found in JDK 8 to 14. The JDK developers fixed this deadlock based on our report.

    public void run() {
        ...
        synchronized (jobs) {
            ...
            isStopped();
            ...
        }
    }

    private synchronized boolean isStopped() {
        return stopped;
    }

    public synchronized void stopWorker() {
        ...
        synchronized (jobs) {
            ...
        }
    }

In the preceding code, the deadlock occurs due to the actions two threads can take.
One thread performs the following actions:
* Enters method run()
* Acquires synchronization on object jobs
* Calls method isStopped()
* Tries to acquire synchronization on object this (because isStopped() is a synchronized method) and blocks waiting for another thread (see next) to release the synchronization on object this

Another thread performs the following actions:
* Enters method stopWorker()
* Acquires synchronization on object this (because stopWorker() is a synchronized method)
* Tries to acquire synchronization on object jobs and blocks waiting for the first thread to release the synchronization on object jobs

The two thread executions are deadlocked: each thread waits for the other thread to release synchronization, and therefore neither of the two threads can continue executing.

Fixing this deadlock

To break this deadlock cycle, developers made variable stopped a volatile variable and removed the synchronization on this. A general fix strategy for deadlocks (recommended by CodeGuru in its messages for deadlock detections) is to always acquire synchronization objects in the same order, which prevents cycles.

This deadlock may be difficult to detect during testing

Triggering this deadlock requires very particular timing: the first thread acquires jobs but doesn’t yet acquire this and at exactly the same time, the second thread acquires this but doesn’t yet acquire jobs. In a regular unit or system test, such particular timing may be unlikely to occur. However, during production and heavy loads, this particular timing may happen.

Data races on thread unsafe classes

Data races on thread unsafe classes are severe bugs that can cause the execution to crash or to produce wrong results. For example, if one thread iterates over a list while another thread adds to that list at the same time, the list’s internal data structures may become corrupt, potentially causing exceptions, infinite loops, or wrong or lost results.
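As a rough illustration of the list scenario just described, the sketch below has one thread iterating over an ArrayList while another thread adds to it. Both operations lock the same object, so the iteration never observes a half-finished add. The class and method names (SharedList, add, sum, demo) are hypothetical and are not taken from any project mentioned in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around a thread-unsafe ArrayList; names are illustrative only.
final class SharedList {
    private final List<Integer> items = new ArrayList<>();

    // Writers take the lock on `items` before mutating the list.
    void add(int value) {
        synchronized (items) {
            items.add(value);
        }
    }

    // Readers hold the same lock for the whole iteration, so no add() can interleave.
    long sum() {
        synchronized (items) {
            long total = 0;
            for (int v : items) {
                total += v;
            }
            return total;
        }
    }

    // Runs one writer and one reader in parallel and returns the final sum.
    static long demo(int count) {
        SharedList list = new SharedList();
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= count; i++) {
                list.add(i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                list.sum(); // iterates while the writer may still be adding
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.sum();
    }
}
```

If either method dropped its synchronized block, the reader could iterate while the list is being resized, which is the kind of unprotected access this section is about.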
To avoid this, code typically protects accesses to a shared data structure by acquiring synchronization, which ensures thread mutual exclusion. A data race on a thread unsafe class happens when the code wrongly doesn’t acquire synchronization.

Example data race on a thread unsafe class in Hadoop

The following code shows a data race on a thread unsafe class (HDFS-14618) that CodeGuru found in Hadoop (we reported this bug anonymously because we had not yet launched CodeGuru). The Hadoop developers fixed this race based on our report.

    public void clear() {
        synchronized (pendingReconstructions) {
            pendingReconstructions.clear();
            timedOutItems.clear();
            timedOutCount = 0L;
        }
    }

The preceding code has a data race on timedOutItems. timedOutItems is a field of type ArrayList, a thread unsafe class. The Hadoop code typically (in six code locations) protects operations on timedOutItems with synchronization on timedOutItems. In these six locations, the operations on timedOutItems are add(), another clear(), toArray(), and size(). However, in the preceding code (the seventh location where timedOutItems is accessed), the timedOutItems.clear() call is protected by synchronization on a different object (pendingReconstructions).

Two threads synchronizing on different objects can still execute at the same time. Therefore, timedOutItems.clear() can modify timedOutItems while another thread calls add(), the other clear(), or toArray() on timedOutItems. The code would be buggy even if synchronized (pendingReconstructions) wasn’t present. However, the synchronized (pendingReconstructions) likely obscured for the developer the fact that timedOutItems.clear() isn’t correctly protected by synchronization.

Fixing this data race

The fix for this data race is to surround timedOutItems.clear() with a synchronized (timedOutItems) block. This is the type of fix recommended by CodeGuru in its messages for data race detections.
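The fix described above can be sketched as follows. This is a simplified stand-in and not the actual Hadoop patch: the class name PendingTracker, the helper methods addTimedOut(), timedOutSize(), and demo(), and the element types are assumptions made so the example compiles on its own. Only the field names and the shape of clear() follow the snippet above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Hadoop class; only the locking pattern matters here.
final class PendingTracker {
    private final Map<String, Long> pendingReconstructions = new HashMap<>();
    private final List<String> timedOutItems = new ArrayList<>();
    private long timedOutCount = 0L; // kept only to mirror the snippet above

    // Other code paths guard timedOutItems with its own lock.
    void addTimedOut(String item) {
        synchronized (timedOutItems) {
            timedOutItems.add(item);
        }
    }

    int timedOutSize() {
        synchronized (timedOutItems) {
            return timedOutItems.size();
        }
    }

    // clear() now takes the timedOutItems lock around the call that touches the list.
    void clear() {
        synchronized (pendingReconstructions) {
            pendingReconstructions.clear();
            synchronized (timedOutItems) {
                timedOutItems.clear();
            }
            timedOutCount = 0L;
        }
    }

    // One thread adds while another clears; a final clear() leaves the list empty.
    static int demo(int count) {
        PendingTracker tracker = new PendingTracker();
        Thread adder = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                tracker.addTimedOut("item-" + i);
            }
        });
        Thread clearer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                tracker.clear();
            }
        });
        adder.start();
        clearer.start();
        try {
            adder.join();
            clearer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        tracker.clear();
        return tracker.timedOutSize();
    }
}
```

Note that the nested block introduces a lock order (pendingReconstructions first, then timedOutItems). Following the deadlock guidance earlier in this post, any other code that needs both locks should acquire them in that same order.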
This race may be difficult to detect during testing

Triggering this data race requires very particular timing: one thread needs to execute timedOutItems.clear() and at exactly the same time, another thread needs to execute one of the other actions (add(), the other clear(), or toArray()) on timedOutItems. In a regular unit or system test, such special timing may be unlikely to happen, but during production and heavy loads, this particular timing may occur.

Atomicity violations

Atomicity violations are severe bugs that can cause the execution to crash or produce wrong results. An atomicity violation happens when two instructions are expected and assumed to execute together (atomically) but another thread executing on the same data breaks the atomicity expectation. As of this writing, CodeGuru focuses on atomicity violations involving concurrent collections.

For example, when the code checks containsKey(key) on a ConcurrentHashMap and then calls get(key), the code implicitly assumes that no other thread removes the key from the map in between the calls to containsKey() and get(). In other words, the code assumes that containsKey() and get() are executed atomically. If another thread removes the key between the execution of containsKey() and get(), get() can return null and the program can crash (if the code wrongly uses the null value).

Example atomicity violation in Amazon code

The following code (anonymized sketch of the original code) shows an atomicity violation CodeGuru found in Amazon code. The Amazon developers fixed this atomicity violation based on our report.

    public void methodA(KeyType param) {
        ...
        if (aConcurrentHashMap.containsKey(param)) {
            ValueType value = aConcurrentHashMap.get(param);
            ...
            value.callAMethod();
        }
    }

    public void methodB(KeyType param) {
        ...
        aConcurrentHashMap.remove(param);
    }

In the preceding code, aConcurrentHashMap is a field of type ConcurrentHashMap. If a thread executes methodA() and another thread executes methodB(), the following can happen:
* The first thread executes aConcurrentHashMap.containsKey(param), which returns true
* The second thread executes aConcurrentHashMap.remove(param), which removes the key param and corresponding value
* The first thread executes aConcurrentHashMap.get(param), which returns null (because now aConcurrentHashMap no longer contains the key param, assuming the two parameters for methodA() and methodB() are the same object)
* The first thread executes value.callAMethod() and crashes with NullPointerException

Fixing this atomicity violation

To fix this atomicity violation, remove the call aConcurrentHashMap.containsKey(param) and replace it with a null check on the value returned by aConcurrentHashMap.get(param). The null check effectively checks whether aConcurrentHashMap contained the key or not. After the fix, methodA() looks like the following code:

    public void methodA(KeyType param) {
        ...
        ValueType value = aConcurrentHashMap.get(param);
        if (value != null) {
            ...
            value.callAMethod();
        }
    }

This is the type of fix recommended by CodeGuru in its messages for atomicity violation detections.

This atomicity violation may be difficult to detect during testing

Triggering this atomicity violation requires very particular timing: the second thread must execute aConcurrentHashMap.remove(param) right in between the calls aConcurrentHashMap.containsKey(param) and aConcurrentHashMap.get(param) executed by the other thread. In a regular unit or system test, such special timing may be unlikely to occur, but it may happen during production and heavy loads.

Over-synchronizations

Over-synchronizations are bugs that reduce program performance unnecessarily. In an over-synchronization bug, the developer adds coarse-grained synchronization and unnecessarily serializes the parallel execution.
Over-synchronizations can exist in many scenarios, but CodeGuru focuses on over-synchronizations involving concurrent collections.

Example over-synchronization in Amazon code

The following code (anonymized sketch of the original code) shows an over-synchronization CodeGuru found in Amazon code. The Amazon developers fixed this over-synchronization based on our report.

    synchronized (aConcurrentHashMap) {
        value = aConcurrentHashMap.get(key);
        if (value == null) {
            aConcurrentHashMap.put(key, newValue);
        }
    }

In the preceding code, aConcurrentHashMap is a field of type ConcurrentHashMap. The code uses synchronized (aConcurrentHashMap) to guard against the following execution scenario (which may happen if synchronized (aConcurrentHashMap) is missing):
* Two threads execute aConcurrentHashMap.get(key) on the same key.
* Both threads get null values (because aConcurrentHashMap doesn’t contain the key).
* Both threads execute the true branch of the if (value == null) condition.
* The value put into aConcurrentHashMap by the first thread with aConcurrentHashMap.put(key, newValue) is overwritten by the value put by the second thread.
* The first thread uses the overwritten value to store data, and therefore that data is lost (the data becomes unreachable in Java) after the value is overwritten.

By using synchronized (aConcurrentHashMap), the code prevents this execution scenario, but the code is unnecessarily slow: synchronization makes execution sequential, which can become a bottleneck under heavy load. Developers can avoid this slowdown by using the putIfAbsent() method in ConcurrentHashMap. The putIfAbsent() method achieves the same behavior but uses fine-grained synchronization, which reduces the sequential bottleneck and increases performance under heavy load.

Fixing this over-synchronization

The fix for this over-synchronization is to simply call putIfAbsent(key, newValue) and remove the synchronized (aConcurrentHashMap) block.
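A minimal sketch of that replacement is shown below, under the assumption that the caller wants whichever value ends up stored for the key. The class name PutIfAbsentSketch, the method names getOrInsert() and allThreadsAgree(), and the use of StringBuilder as the value type are hypothetical; only ConcurrentHashMap.putIfAbsent() comes from the text above.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class; only putIfAbsent() itself comes from the post.
final class PutIfAbsentSketch {
    private final ConcurrentHashMap<String, StringBuilder> aConcurrentHashMap =
            new ConcurrentHashMap<>();

    // Returns the value associated with key afterwards: the existing one, or newValue.
    StringBuilder getOrInsert(String key, StringBuilder newValue) {
        // putIfAbsent returns the previous value, or null if there was none.
        StringBuilder existing = aConcurrentHashMap.putIfAbsent(key, newValue);
        return existing != null ? existing : newValue;
    }

    // Starts several threads racing on one key and checks they all got the same object.
    static boolean allThreadsAgree(int threadCount) {
        PutIfAbsentSketch sketch = new PutIfAbsentSketch();
        StringBuilder[] seen = new StringBuilder[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int idx = i;
            threads[i] = new Thread(() ->
                    seen[idx] = sketch.getOrInsert("key", new StringBuilder("from-" + idx)));
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        for (StringBuilder value : seen) {
            if (value != seen[0]) {
                return false;
            }
        }
        return true;
    }
}
```

Because every caller uses the value returned by getOrInsert(), no thread keeps writing into an object that another thread has replaced, which is the lost-data scenario the synchronized block was guarding against.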
Replacing the synchronized block with putIfAbsent() is the type of fix recommended by CodeGuru in its messages for over-synchronization detections.

This over-synchronization may be difficult to detect during testing

This over-synchronized code may get executed during unit or system testing. However, the code may appear slow only under heavy load. Additionally, many synchronized regions (that may be slow during heavy load) may be legitimately slow: the code may truly need mutual exclusion. In contrast, the preceding code may be slow without actually needing to be slow.

What CodeGuru brings to concurrency bug detection

The concurrency bug detectors in CodeGuru prioritize accuracy over coverage: we believe it’s better to miss some concurrency bugs than give developers reports that they may consider to be wrong. Consequently, the concurrency bug detectors in CodeGuru report high-confidence detections but can miss bugs in the four categories presented above.

The concurrency detectors in CodeGuru use static analysis algorithms and techniques (static analysis means analyzing code without executing it, which is the use case for CodeGuru during pull requests). For concurrency bugs, static analysis techniques are notoriously prone to false reports: a tool may report what it believes is a bug but the developer may decide the code is benign or works as intended (even though the code may be somewhat unusual).

The concurrency detectors in CodeGuru identify likely false reports and filter them out. For example, sometimes developers implement caches using ConcurrentHashMap, but such caches are typically resilient to slightly stale values and therefore to some forms of atomicity violations. In other cases, a data race may affect only logging or appear in methods that are likely sequential or known to the developer to be thread safe. In other cases, static analysis has trouble tracking objects through fields and complex heap objects.
Additionally, CodeGuru limits the types of concurrency bugs it detects (e.g., it focuses on data races involving thread unsafe classes and atomicity violations involving concurrent collections) to achieve high confidence. We’re working on expanding these bug types.

Conclusion

Concurrency bugs are difficult to detect during testing. The concurrency bug detectors in CodeGuru found previously unknown concurrency bugs in 22 mature and widely used Amazon projects, 6 open-source projects, and 18 internal Amazon projects. Thanks to our reports, developers have fixed these bugs. CodeGuru has a 90-day free trial. You can easily get your code reviewed as described in Getting started with CodeGuru Reviewer.

View the full article