Custom Generative AI Plugin for IntelliJ to Improve SDLC Productivity – Piloted at a Large Healthcare Payer
Authors
1. Arun Kumar Singh - arun.kumar.singh@ibm.com
2. Srinivasan K S – sksriniv@in.ibm.com
3. Anupam Singh - singh.anupam@in.ibm.com
4. Heramb Shembekar - heramb.shembekar@in.ibm.com
5. Iyaaz Mohtisham - ihasmoh1@in.ibm.com
Contents
1.0 Introduction
2.0 Customer Pain Points
3.0 IBM Solution – Proof of Concept
4.0 The Pilot in the Customer Environment
5.0 The Architecture
6.0 Custom Plugin vs. Commercial Off-the-Shelf (COTS) Plugins
7.0 The Results
8.0 Lessons Learnt
9.0 Conclusion
1.0 Introduction
This white paper describes the customer pain points across the various phases of the Software Development Life Cycle (SDLC) and how IBM ran a pilot using Generative AI to improve productivity and address those pain points. The customer predominantly uses Java 17 as its programming language.
2.0 Customer Pain Points
a. Code Quality – Code quality is inconsistent. The customer standard for code coverage is a minimum of 70%; however, most of the code has less than 40% coverage, and some code has 0% coverage.
b. Tech Debt – Because of the poor code quality and coverage, technical debt is very high. Some code has had to be discarded and rewritten because the accumulated tech debt was unmanageable.
c. Knowledge Transfer due to Attrition – Code knowledge is not properly transferred to new joiners, putting project timelines at risk.
d. Custom Code Review – The customer has 75 custom coding rules that are not available in SonarQube, the tool used to review coding best practices. Configuring these custom rules in SonarQube is not a straightforward process, so all 75 rules are reviewed manually as part of the code review process. This is time consuming and error prone, and some coding best practices are missed during review due to human error.
3.0 IBM Solution – Proof of Concept
IBM developed a Proof of Concept (PoC) using Generative AI on AWS Bedrock for the use cases below. The PoC is an IntelliJ plugin that connects to Bedrock and uses the Llama 2 LLM to generate the output (a minimal sketch of this connection follows the use-case list).
a. JUnit Generation for Improving Code Coverage
b. Custom Coding Best Practices Review (review of the 75 custom rules)
c. Code Explanation and Summary for New Joiners
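As an illustration of the plugin-to-Bedrock connection, below is a minimal invocation sketch using the AWS SDK for Java v2 Bedrock Runtime client. The region, prompt text, inference parameters, and class name are illustrative assumptions, not the actual implementation; the request body follows Bedrock's schema for Meta Llama models (the PoC used Llama 2, as shown here; the pilot later moved to Llama 3).

import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelRequest;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelResponse;

public class BedrockInvocationSketch {

    public static void main(String[] args) {
        try (BedrockRuntimeClient client = BedrockRuntimeClient.builder()
                .region(Region.US_EAST_1) // illustrative region
                .build()) {

            // Request body in Bedrock's JSON schema for Meta Llama models;
            // prompt and parameters are illustrative only.
            String body = """
                    {
                      "prompt": "Explain what this Java method does: ...",
                      "max_gen_len": 512,
                      "temperature": 0.2
                    }
                    """;

            InvokeModelRequest request = InvokeModelRequest.builder()
                    .modelId("meta.llama2-70b-chat-v1") // Llama 2, as in the PoC
                    .body(SdkBytes.fromUtf8String(body))
                    .build();

            InvokeModelResponse response = client.invokeModel(request);
            System.out.println(response.body().asUtf8String());
        }
    }
}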
IBM demonstrated the PoC to the customer's Generative AI Committee and received approval to conduct a pilot in the customer environment.
The expected productivity improvements communicated to the customer were as follows:

Use Case | Expected Benefits
Code Coverage (JUnit Test Generation) | Savings of 40% of JUnit writing time
Custom Code Review | Savings of 30% of manual review effort
Code Explanation | Independent learning, leading to 50% savings of SME time
4.0 The Pilot in the Customer Environment
The customer's Generative AI Committee gave the following feedback on the IBM Proof of Concept and asked IBM to address these points during the pilot:
a. Code and other data cannot leave the customer network. The customer's hyperscaler is AWS.
b. The prompt should not be given to the developer; rather, it should be standardized and reused.
c. The working environment should be within the developer's IDE, IntelliJ.
d. Guardrails must be in place.
e. The solution should support multi-model deployments and usage.
IBM modified the plugin architecture to meet these requirements. The new architecture introduces an intermediary microservice, deployed on OCP (on AWS EC2), that takes care of filling in the prompt template. Llama 3 70B Instruct from AWS Bedrock replaced the Llama 2 model used in the PoC; keeping the model choice server-side made this swap transparent to the plugin, as sketched below.
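The following is a minimal sketch of how the multi-model requirement (point e above) could be served: the use-case-to-model mapping lives server-side in the wrapper service, so a model swap such as Llama 2 to Llama 3 70B Instruct requires no change to the IntelliJ plugin. The class name, action keys, and model assignments are hypothetical illustrations.

import java.util.Map;

public final class ModelRegistry {

    // Bedrock model IDs per use case; in practice this would come from
    // configuration rather than being hard-coded. Values are illustrative.
    private static final Map<String, String> MODEL_BY_USE_CASE = Map.of(
            "GENERATE_JUNIT", "meta.llama3-70b-instruct-v1:0",
            "REVIEW_CUSTOM_RULES", "meta.llama3-70b-instruct-v1:0",
            "EXPLAIN_CODE", "meta.llama3-70b-instruct-v1:0");

    public static String modelFor(String useCase) {
        String modelId = MODEL_BY_USE_CASE.get(useCase);
        if (modelId == null) {
            throw new IllegalArgumentException("No model configured for: " + useCase);
        }
        return modelId;
    }
}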
5.0 The Architecture
- Logging is enabled in Splunk to store the prompt history and the content generated by the LLM.
- The Prompt Wrapper Service is a microservice that sits between the plugin and AWS Bedrock. It is responsible for forming the prompt based on the user action. For example, the user can simply select a piece of code and click “Generate Junit”; the service picks the corresponding prompt template and forms the full prompt before sending the request to the LLM (see the sketch after this list).
- AWS PrivateLink was used to connect the customer's AWS account to AWS Bedrock.
- Llama 3 70B Instruct was used as an on-demand deployment model.
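The sketch below illustrates the template-filling idea in the Prompt Wrapper Service, assuming a Spring Boot microservice. The endpoint path, action keys, and template texts are hypothetical illustrations, not the actual implementation.

import java.util.Map;

import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class PromptWrapperController {

    // Standardized prompt templates keyed by plugin action; developers never
    // see or edit these (requirement b). Template texts are illustrative.
    private static final Map<String, String> TEMPLATES = Map.of(
            "GENERATE_JUNIT",
            "Write JUnit 5 tests for the following Java 17 code. Target at least 70%% line coverage.%n%s",
            "REVIEW_CUSTOM_RULES",
            "Review the following Java code against the custom coding rules and report any violations.%n%s",
            "EXPLAIN_CODE",
            "Explain step by step what the following Java code does, for a developer new to the codebase.%n%s");

    @PostMapping("/prompt/{action}")
    public String handle(@PathVariable String action, @RequestBody String selectedCode) {
        String template = TEMPLATES.get(action);
        if (template == null) {
            throw new IllegalArgumentException("Unknown action: " + action);
        }
        // Fill the template with the code the developer selected in IntelliJ.
        String prompt = String.format(template, selectedCode);
        // At this point the service would log the prompt to Splunk, apply
        // guardrails, and invoke the selected Bedrock model (omitted here).
        return prompt;
    }
}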
6.0 Custom Plugin vs. Commercial Off-the-Shelf (COTS) Plugins
IBM developed this custom plugin as a first-of-a-kind solution; the ICA plugin, GitHub Copilot, and similar tools were not available at that time.
The following commercially available plugins were evaluated before piloting the custom-developed plugin:
COTS Plugin | Why we did not use it
AWS Q Developer | The prompt would be given to the developers, and the tool is tied to a single model
Watson Code Assist | The prompt would be given to the developer, which the customer does not want
Tabnine | No information available about the underlying model
7.0 The Results
A four-week pilot covering nine Java/Spring Boot microservices was successfully executed. The code generated by the Gen AI required some manual rework to fit the customer's framework.
8.0 Lessons Learnt
9.0 Conclusion
This project was the first Generative AI project executed at this client. IBM not only successfully piloted the Gen AI project but also provided a Gen AI roadmap to the customer. The customer's Generative AI Committee has acknowledged the potential effort savings. Deploying the custom plugin to the larger developer community is now under discussion.