Code review with the help of AI
More and more tools, systems and approaches use AI to automate work and, in some cases, even replace human participation. There are already many niches in IT where AI does well.
My name is Oleksandr, I have been working at NLT for more than 7 years, and in this article I will look at one of these niches: code review automation.
Today, there are several dozen AI-based services that help conduct code reviews. After analyzing the ratings of these services, I singled out the 12 most popular ones:
- Pluralsight
- Coderabbit
- Codacy
- DeepCode
- Codeclimate
- Deepsource
- SonarQube
- CodeBeat
- Pylint
- ESLint
- Houndci
- Chat GPT
I tested each of the services on a small project for the iOS mobile platform. I added a new pull request with a large file to it and, to keep the experiment fair, ran the same pull request through all the services. It was a 730-line ViewController file written in Swift. It had enough logic, animations and varied code to give the AI something to work with. I also added some unused variables to the file.
Note: the services could be tested for free only if my repository was public. I also decided to take this file from a public repository instead of using code from a commercial project. First, the NDA still applies, and second, I wanted to play it safe because I had no information on how these services handle or use code in free mode.
During the experiment, I paid attention to the following aspects of work:
- compatibility with code repository management systems
- cost
- programming language restrictions
- level of code analysis
I summarized the results in a table with conclusions for each service, followed by my personal rating. First, an overview of each service.
Pluralsight
This service only supports code written in Python, so it was not possible to run the experiment and evaluate it.
Coderabbit
Codacy
Codeclimate
Deepsource
CodeBeat
Houndci
File Length Violation: File should contain 400 lines or less: currently contains 739 (file_length)
Trailing Whitespace Violation: Lines should not have trailing whitespace. (trailing_whitespace)
And 14 more similar comments about trailing_whitespace
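Both rule types reported here are mechanical checks that are easy to reproduce. Below is a minimal sketch in Swift, assuming the 400-line limit quoted in the comment above. The `Violation` type and the `lint` function are hypothetical names made up for illustration; this is not how Houndci or SwiftLint are actually implemented.

```swift
// Hypothetical sketch of the two checks reported above; not SwiftLint's real code.
struct Violation: Equatable {
    let rule: String
    let line: Int  // 1-based line number; 0 for file-level violations
}

func lint(_ source: String, maxLines: Int = 400) -> [Violation] {
    var violations: [Violation] = []
    let lines = source.split(separator: "\n", omittingEmptySubsequences: false)

    // file_length: the file should contain maxLines lines or less.
    if lines.count > maxLines {
        violations.append(Violation(rule: "file_length", line: 0))
    }

    // trailing_whitespace: no line should end with a space or a tab.
    for (index, line) in lines.enumerated() {
        if let last = line.last, last == " " || last == "\t" {
            violations.append(Violation(rule: "trailing_whitespace", line: index + 1))
        }
    }
    return violations
}

// Three lines: the first ends with a space, the third ends with a tab.
let sample = "let a = 1 \nlet b = 2\n\tlet c = 3\t"
let found = lint(sample, maxLines: 2)
print(found.map { "\($0.rule) at line \($0.line)" })
```

Checks of this kind look only at the shape of the text, which is why they say nothing about logic or architecture.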
Pylint
This service only supports code written in Python, so it was not possible to run the experiment and evaluate it.
ESLint
Because this service only supports code written in JS and similar languages, it was not possible to run the experiment and evaluate it.
Chat GPT 3.5
Naming: you follow good practices in naming variables and functions. However, try to use more informative names to make the code clearer. For example, toggleArgeementState has a typo and should be toggleAgreementState.
Separation of responsibilities: your controller performs many tasks such as displaying the interface, handling user input, managing state, and even interacting with your view model. Consider separating responsibilities and applying architectural patterns such as MVVM to simplify code management and testing.
Dependencies: you have several dependencies like UIStackView, UIButton, and UIViewControllerKeyboardHelper. Whenever possible, try to inject dependencies or use inversion of control to make the class more testable and avoid hard dependencies on specific classes.
Error Handling: it looks like your code has no error handling. Consider how you can handle errors that may occur while your application is running and provide information to users about problems.
Code Duplication: the code has some duplication. For example, the handler definitions for orgNameSelectButton and orgNameClearButton are similar. You can create one method for both to reduce code duplication.
Keyboard control: your code contains a keyboard control for the input field. Consider using libraries or patterns that can make this code more generalizable and easily reusable.
Swift Versioning: make sure your project is configured to use an up-to-date version of Swift.
Gesture handling: your code contains gesture handling such as UITapGestureRecognizer. Make sure you properly manage and release the resources associated with these gestures.
RxSwift: you use RxSwift (Rx) for reactive programming. Make sure you manage subscriptions properly to avoid memory leaks.
Animations: you have animations for showing and hiding elements. Check that they are controlled and animated correctly and that they do not cause any loss of performance.
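The advice about dependencies in the list above can be illustrated with a small sketch. The reviewed controller is not reproduced in this article, so every name here (`KeyboardHelping`, `FixedKeyboardHelper`, `FormController`) is hypothetical and serves only to show the idea of passing a dependency in through the initializer.

```swift
// Hypothetical names throughout; this is not the code from the reviewed file.
protocol KeyboardHelping {
    func keyboardHeight() -> Double
}

// A stub suitable for tests. A production implementation would
// observe the real keyboard notifications instead.
struct FixedKeyboardHelper: KeyboardHelping {
    let height: Double
    func keyboardHeight() -> Double { height }
}

final class FormController {
    private let keyboard: KeyboardHelping

    // The dependency is injected rather than created inside the class,
    // so a test can supply a stub.
    init(keyboard: KeyboardHelping) {
        self.keyboard = keyboard
    }

    // Bottom inset needed to keep the form above the keyboard.
    func bottomInset(safeArea: Double) -> Double {
        max(keyboard.keyboardHeight() - safeArea, 0)
    }
}

let controller = FormController(keyboard: FixedKeyboardHelper(height: 300))
print(controller.bottomInset(safeArea: 34))
```

Because the controller depends only on the protocol, the inset calculation can be tested without UIKit or a real keyboard.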
Chat GPT 4.0
Name and code structure: FATthirdStepViewController – the class name follows general conventions, but says little about its functionality. Try to choose a name that better describes the purpose of the controller.
There are lots of private lazy vars inside the class, which is fine, but can make the code harder to read. Consider grouping related properties and their initializers into extensions to improve readability.
Constants and literals: nesting enums in Consts to group related constants is a good approach, but make sure all constants are actually needed and used. Constants that are used only once and do not have a global value can be defined locally where they are used.
Using #imageLiteral is deprecated, it is better to use UIImage(named:) to improve code compatibility and support different image resources.
UI elements: initializing UI elements as lazy var is a common practice, but make sure that lazy initialization is really required for each element.
Consider using factory methods or separating the UI creation code into separate methods/extensions for each UI element to reduce the amount of code in the controller and make it easier to read.
Multiple use of closures: using once.perform inside event handlers is an interesting idea, but can make the code flow difficult to understand. Make sure it’s absolutely necessary and doesn’t lead to unexpected behavior.
Working with the keyboard: keyboardConstraint is initialized, but its initial value is not set. Make sure that this does not cause problems when working with the layout.
Working with the view model: delegating processes to the viewModel is a great approach. Make sure that all the necessary actions are handled in the view model and that the communication with the viewModel is clean and clear.
Localizability: texts are hard-coded directly in the code. To support localization, they should be moved into separate localization files.
Accessibility: make sure UI components support accessibility, such as labels for screen readers; this is not addressed in the code.
Custom UI components: titleWithStepsView, Select, Input, DotsLoader, Buttons.Component.TextWithArrow, CheckBox – make sure all custom components have clear interfaces and are well integrated into the overall app architecture.
Network Handling and Long Operations: there is no mention of handling network requests or long operations. If there are, make sure you handle asynchronous calls correctly, show loading indicators, and handle errors.
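The localization remark from the list above is the easiest one to act on. Here is a minimal sketch, assuming the texts are moved to a `Localizable.strings` file and looked up through Foundation's `NSLocalizedString`. The `Strings` enum and both keys are hypothetical; the actual texts of the reviewed screen are not shown in this article.

```swift
import Foundation

// Hypothetical keys; the real screen's texts are not shown in the article.
enum Strings {
    // Looked up in Localizable.strings at runtime. If no translation is
    // found, NSLocalizedString falls back to returning the key itself.
    static let title = NSLocalizedString(
        "third_step.title",
        comment: "Title of the third step screen")
    static let agree = NSLocalizedString(
        "third_step.agree",
        comment: "Label of the agreement checkbox")
}

// The UI code then reads Strings.title instead of a string literal.
print(Strings.title)
```

With this layout, adding a language means adding a new strings file rather than editing the controller.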
# | Service | Cost | Comment |
1 | Pluralsight | $29/month per product | For Python only |
2 | Chat GPT | GPT-3.5: free; GPT-4.0: $20/month | Not integrated with Git hosting. Only pasted text is accepted, not files. Provides one of the best code analyses. |
3 | Coderabbit | $12/month per developer | GitHub, Gitlab. Checks existing and new PRs. One of the best results. |
4 | Codacy | $15/month per developer | GitHub, Gitlab, Bitbucket. Checks existing and new PRs. Has a large dashboard with many settings and statistics. The analysis is superficial, at the level of code style. |
5 | DeepCode | $25/month per product | GitHub, Bitbucket and CLI. The analysis is superficial, at the level of code style. |
6 | Codeclimate | $16/month per developer for private repositories | Only for GitHub. It did not find any errors. |
7 | Deepsource | From $8/month per developer | GitHub, Gitlab, Bitbucket. It can also scan the entire project, but no errors were found. |
8 | SonarQube | From $150 per year | GitHub, Gitlab, Bitbucket. Could not be tested because it requires deploying server infrastructure. Reviews on the Internet are positive. |
9 | CodeBeat | $20/month per developer | GitHub, Gitlab, Bitbucket. Can analyze the entire project at once. It has its own dashboard. The results are average. |
10 | Pylint | Free | For Python only |
11 | ESLint | $20/month per developer | Only for JavaScript and similar languages |
12 | Houndci | From $29 for 50 reviews | Integrates only with GitHub. Only checks newly created PRs. The results are weak, reminiscent of the work of SwiftLint. |
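To put the per-developer prices in perspective, here is a rough sketch that estimates the monthly bill for a team. The team size of 5 is an assumption chosen for illustration. The prices are the per-developer figures quoted in the table (the Deepsource figure is a starting price), and real pricing may differ.

```swift
// Per-developer monthly prices in USD, as quoted in the table above.
let pricePerDeveloper: [(service: String, usd: Int)] = [
    ("Coderabbit", 12),
    ("Codacy", 15),
    ("Codeclimate", 16),
    ("Deepsource", 8),
    ("CodeBeat", 20),
]

let teamSize = 5  // assumed team size, for illustration only

// Monthly cost of one service for the whole team.
func monthlyCost(usdPerDeveloper: Int, developers: Int) -> Int {
    usdPerDeveloper * developers
}

for entry in pricePerDeveloper {
    let total = monthlyCost(usdPerDeveloper: entry.usd, developers: teamSize)
    print("\(entry.service): $\(total)/month")
}
```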
Personal rating
1st place – Coderabbit
- integrates with popular code repository management systems: GitHub and Gitlab.
- can check not only new pull requests, but also existing ones.
- leaves the results of the check as comments in the pull request itself.
- produced one of the best code reviews. It pointed out unused variables; warned about a possibly redundant variable declaration; warned about a possible logic error in the use of the shared “Once” resource; recommended a better approach using a button target; pointed out a logic bug in a conditional operator; recommended using a keyboard library because of the complexity of the current implementation; and commented on the use of constants and class modifiers.
2nd place – Chat GPT
- I consider its level of analysis to be its greatest advantage. Its review turned out to be one of the best. It found naming errors; noted that the class is overloaded and recommended splitting responsibilities; gave recommendations for working with dependencies; pointed out possible code duplication; and made recommendations on keyboard handling, animations, the reactive approach, and more (see the full description above).
- the main drawback is that it cannot be integrated with code repository management systems. The code to be analyzed must be pasted in manually. It is also unclear how Chat GPT may use the code it analyzes.
3rd place – CodeBeat
- integrates with most popular code repository management systems – GitHub, Gitlab, Bitbucket.
- has its own dashboard with a lot of information.
- one of the few services that analyzes the entire project, not just individual pull requests.
- the level of analysis is average. Can evaluate both code style and code complexity.
General conclusions
- Most services can be integrated with Gitlab as a separate application.
- Only two services scanned the entire project; the others work only with PRs/MRs.
- Half of the services dropped out: they either gave no results at all or were tailored to a single language (Python or JS).
- All services have a free plan, but only for public repositories. For private repositories, the price averages $15-30 per month per developer.
- Most services only give superficial comments at the code-style level: exceeding the maximum length of a class, function or line, incorrect indentation, etc.
In conclusion, I would like to say that such services can already give development a certain boost, and as AI develops, this boost will become more and more noticeable.
Author Alexander Bondar