What is AskGit?
IBM developers spend a lot of time searching through hundreds of internal Git repositories to find reusable code. Valuable solutions often stay hidden across teams, leading to duplicated effort and lost productivity.
AskGit solves this problem. It’s an AI-powered internal tool that helps IBMers discover the most relevant Git repositories simply by describing their project or use case in natural language.
Instead of manually hunting for code, developers can get instant, intelligent repository recommendations, accelerating reuse, collaboration, and innovation across IBM’s engineering ecosystem.
The tool is available here:
Why AskGit?
IBM’s global developer community spans hundreds of thousands of repositories across multiple GitHub organizations. This wealth of code is an enormous advantage, but without an efficient discovery mechanism, it becomes a challenge to reuse and build upon existing assets.
AskGit was designed to tackle three recurring pain points:
- Redundant effort: Developers rebuild solutions that already exist.
- Inefficient discovery: Manually searching across repositories is time-consuming and inconsistent.
- Hidden knowledge: Projects with useful information often remain buried in lesser-known repos.
By leveraging AI-driven semantic search, AskGit ensures that no valuable repository remains undiscovered. A developer can simply describe their goal, for example “FastAPI backend with JWT authentication”, and instantly find similar, production-ready projects within IBM’s GitHub ecosystem.
How Does AskGit Work?
Ingestion Pipeline — Preparing Repositories for Search
The Ingestion Pipeline is AskGit’s backbone. It continuously processes repositories from multiple IBM GitHub organizations, transforming raw repository data into enriched, AI-searchable content.
Repository Extraction
AskGit uses the Git API to extract metadata (descriptions, commit history, READMEs, and folder structures), forming an Input JSON snapshot for each repository.
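As a rough illustration of what such a snapshot might hold, the sketch below assembles one from metadata that has already been fetched. The `InputSnapshot` class, its field names, and the `build_input_json` helper are assumptions made for this example, not AskGit's actual schema, and the repository values are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class InputSnapshot:
    """Hypothetical shape of the per-repository Input JSON."""
    repo_name: str
    url: str
    description: str
    readme: str
    recent_commits: list[str] = field(default_factory=list)
    folder_structure: list[str] = field(default_factory=list)


def build_input_json(snapshot: InputSnapshot, max_readme_chars: int = 4000) -> str:
    """Serialize a snapshot, truncating the README to keep the payload small."""
    data = asdict(snapshot)
    data["readme"] = data["readme"][:max_readme_chars]
    return json.dumps(data, indent=2, sort_keys=True)


snapshot = InputSnapshot(
    repo_name="example-repo",  # placeholder values, not a real repository
    url="https://github.example.com/org/example-repo",
    description="FastAPI backend with JWT authentication",
    readme="# Example\nSetup and usage notes...",
    recent_commits=["Add token refresh endpoint"],
    folder_structure=["app/", "app/main.py", "tests/"],
)
print(build_input_json(snapshot))
```

Truncating the README is one possible way to keep the snapshot within a model's context window during the later enrichment step; the limit shown here is arbitrary.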
LLM Enrichment
Each Input JSON is analyzed by watsonx.ai Large Language Models, which generate a Final JSON enriched with:
- Repository summary and short description
- Candidate use-cases
- Extracted technical keywords
- Core metadata (URL, repo name, last commit date, etc.)
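The enrichment step can be pictured as a function that prompts a model and merges its answer with the core metadata. The sketch below is a minimal illustration under assumptions: the prompt wording, the field names, and the `enrich` and `build_prompt` helpers are hypothetical, and the model call is replaced by a stand-in function (`fake_generate`) rather than a real watsonx.ai client.

```python
import json
from typing import Callable


def build_prompt(input_json: dict) -> str:
    """Ask the model for a JSON object holding the enrichment fields."""
    return (
        "Summarize this repository. Reply with a JSON object containing "
        '"summary", "short_description", "use_cases" and "keywords".\n\n'
        + json.dumps(input_json)
    )


def enrich(input_json: dict, generate: Callable[[str], str]) -> dict:
    """Merge the model's enrichment with core metadata into a Final JSON."""
    enrichment = json.loads(generate(build_prompt(input_json)))
    return {
        "summary": enrichment["summary"],
        "short_description": enrichment["short_description"],
        "use_cases": enrichment["use_cases"],
        "keywords": enrichment["keywords"],
        # Core metadata is copied through unchanged from the Input JSON.
        "repo_name": input_json["repo_name"],
        "url": input_json["url"],
        "last_commit_date": input_json["last_commit_date"],
    }


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; always returns a canned response."""
    return json.dumps({
        "summary": "REST backend that issues and validates JWTs.",
        "short_description": "FastAPI JWT backend",
        "use_cases": ["API authentication"],
        "keywords": ["fastapi", "jwt"],
    })


example_input = {
    "repo_name": "example-repo",  # placeholder repository
    "url": "https://github.example.com/org/example-repo",
    "last_commit_date": "2025-01-15",
    "readme": "# Example",
}
final_json = enrich(example_input, fake_generate)
print(json.dumps(final_json, indent=2))
```

Passing the model call in as a parameter keeps the merge logic testable without network access. A real pipeline would also need to handle responses that are not valid JSON.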
Chunking, Embedding, and Indexing
The Final JSON is divided into smaller chunks for fine-grained representation.
Each chunk is embedded using mixedbread-ai/mxbai-embed-large-v1, then indexed into watsonx Discovery, forming the Vector Store used during retrieval.
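One way to picture the chunking step is a word-window splitter applied to each text field of the Final JSON. This is a sketch under assumptions: the window size, the overlap, the choice of fields, and the `split_words` and `chunk_final_json` helpers are all hypothetical. In a real pipeline, each chunk's `text` would then be passed to the embedding model before indexing.

```python
def split_words(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def chunk_final_json(final_json: dict, max_words: int = 50, overlap: int = 10) -> list[dict]:
    """Produce one or more chunks per text field, tagged with the source repo."""
    chunks = []
    for field_name in ("summary", "use_cases", "keywords"):
        value = final_json.get(field_name, "")
        text = ", ".join(value) if isinstance(value, list) else value
        for index, piece in enumerate(split_words(text, max_words, overlap)):
            chunks.append({
                "repo_url": final_json["url"],
                "field": field_name,
                "chunk_index": index,
                "text": piece,
            })
    return chunks


example_final_json = {
    "url": "https://github.example.com/org/example-repo",  # placeholder
    "summary": "REST backend that issues and validates JWTs.",
    "use_cases": ["API authentication", "Token refresh"],
    "keywords": ["fastapi", "jwt"],
}
for chunk in chunk_final_json(example_final_json):
    print(chunk["field"], "->", chunk["text"])
```

Keeping the repository URL on every chunk lets a search hit on any chunk be traced back to its repository.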
Continuous Updates
Repositories are re-indexed periodically to stay current with codebase changes, ensuring AskGit’s results remain fresh and reliable.
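The article does not describe how re-indexing is scheduled. One plausible approach, sketched here purely as an assumption, is to compare each repository's last commit time with the time it was last indexed and re-process only the stale ones. The `repos_to_reindex` helper and the repository names are hypothetical.

```python
from datetime import datetime, timezone


def repos_to_reindex(
    last_commit: dict[str, datetime],
    last_indexed: dict[str, datetime],
) -> list[str]:
    """Return repos that were never indexed or changed since the last index run."""
    stale = []
    for repo, committed_at in last_commit.items():
        indexed_at = last_indexed.get(repo)
        if indexed_at is None or committed_at > indexed_at:
            stale.append(repo)
    return sorted(stale)


last_commit = {
    "org/repo-a": datetime(2025, 3, 1, tzinfo=timezone.utc),
    "org/repo-b": datetime(2025, 1, 10, tzinfo=timezone.utc),
    "org/repo-c": datetime(2025, 2, 5, tzinfo=timezone.utc),
}
last_indexed = {
    "org/repo-a": datetime(2025, 2, 1, tzinfo=timezone.utc),  # older than last commit
    "org/repo-b": datetime(2025, 2, 1, tzinfo=timezone.utc),  # already up to date
    # org/repo-c has never been indexed
}
print(repos_to_reindex(last_commit, last_indexed))
```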
Retrieval Process
1. Vector Search & Hybrid Retrieval
When a developer submits a query such as “Find Git repos related to IBM’s reverse proxy,” AskGit first converts the input text into a vector embedding — a dense numerical representation of the query’s semantic meaning.
The system then performs a hybrid search over its Vector Store (powered by watsonx Discovery), combining:
- Semantic similarity matching, where the query embedding is compared against repository description embeddings to identify conceptually similar projects.
- Keyword-based retrieval, where the query text is simultaneously matched against technical keywords and use-case tags extracted from each repository during ingestion.
By merging these two retrieval modes, AskGit ensures results are both semantically relevant and contextually precise — surfacing repositories that align with the user’s intent, even when exact keywords differ.
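To make the idea of merging the two retrieval modes concrete, the sketch below scores each repository with a weighted sum of cosine similarity and keyword overlap. This is an illustration under assumptions: the weighting scheme, the `alpha` parameter, the toy two-dimensional embeddings, and the repository names are all hypothetical, and watsonx Discovery may combine the two signals differently.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, keywords: list[str]) -> float:
    """Fraction of query terms that appear among the repository's keywords."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    return len(terms & {k.lower() for k in keywords}) / len(terms)


def hybrid_search(query: str, query_vec: list[float], repos: list[dict],
                  alpha: float = 0.7, top_k: int = 5) -> list[str]:
    """Rank repos by alpha * semantic score + (1 - alpha) * keyword score."""
    scored = []
    for repo in repos:
        semantic = cosine(query_vec, repo["embedding"])
        lexical = keyword_score(query, repo["keywords"])
        scored.append((alpha * semantic + (1 - alpha) * lexical, repo["name"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


repos = [  # toy data with made-up 2-dimensional embeddings
    {"name": "chatbot-demo", "embedding": [0.1, 0.9], "keywords": ["chatbot", "watsonx"]},
    {"name": "edge-router", "embedding": [0.8, 0.3], "keywords": ["routing"]},
    {"name": "proxy-gateway", "embedding": [0.9, 0.1], "keywords": ["reverse", "proxy", "nginx"]},
]
print(hybrid_search("reverse proxy", [1.0, 0.0], repos))
```

In this toy data, `edge-router` shares no keywords with the query but still ranks above `chatbot-demo` because its embedding is close to the query vector, which is the behavior the semantic component is meant to provide.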
2. Recommendations
Finally, the results retrieved in the previous step are reranked, and AskGit presents the top five repository recommendations, each containing:
- A concise summary of the repository’s functionality
- Technical keywords automatically extracted by LLMs
- Candidate use-cases describing how the repository can be applied
- Metadata such as repository name, URL, and last commit date
This ensures every result is contextual, meaningful, and actionable, helping developers quickly identify the most relevant assets.
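The Technical Stack section lists colbert-ir/colbertv2.0 as the reranking model. ColBERT-style models score a candidate by late interaction: each query token embedding is matched to its most similar document token embedding, and those maxima are summed (often called MaxSim). The sketch below shows that scoring rule on toy vectors. The token embeddings, the candidate names, and the `maxsim` and `rerank` helpers are assumptions for illustration, not AskGit's reranking code.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: list[list[float]], doc_tokens: list[list[float]]) -> float:
    """Sum, over query tokens, of the best-matching document token similarity.

    doc_tokens must be non-empty.
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rerank(query_tokens: list[list[float]],
           candidates: dict[str, list[list[float]]],
           top_k: int = 5) -> list[str]:
    """Order candidates by MaxSim score and keep the top_k names."""
    scored = [(maxsim(query_tokens, tokens), name) for name, tokens in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


query_tokens = [[1.0, 0.0], [0.0, 1.0]]  # toy 2-dimensional token embeddings
candidates = {
    "repo-c": [[0.6, 0.2]],
    "repo-a": [[1.0, 0.0], [0.0, 1.0]],
    "repo-b": [[1.0, 0.0], [1.0, 0.0]],
}
print(rerank(query_tokens, candidates, top_k=2))
```

Here `repo-a` ranks first because it has a strong match for both query tokens, whereas `repo-b` only matches one of them.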
Current Data that Has Been Ingested
AskGit’s vector store currently covers 1,483 repositories across IBM’s major GitHub organizations, representing a comprehensive, global snapshot of Client Engineering and Tech Garage projects.
GitHub Org | Repositories
github.ibm.com/tech-garage-canada | 231
github.ibm.com/skol-assets | 113
github.ibm.com/Client-Eng-EMEA-IN | 231
github.ibm.com/client-engineering-japan | 190
github.ibm.com/client-engineering-korea | 47
github.ibm.com/Industrial-CE-Projects | 62
github.ibm.com/Indonesia-Client-Engineering | 73
github.ibm.com/technology-garage-dach | 52
github.ibm.com/ClientEngineers-Chile | 12
github.ibm.com/technology-garage-la | 118
github.ibm.com/watsonx-apac | 123
github.ibm.com/asean-techgarage | 86
github.ibm.com/anz-tech-garage | 96
github.ibm.com/TW-Client-Engineering | 49
Total repositories indexed: 1,483
Summary
AskGit lets you explore IBM’s Git ecosystem effortlessly and intelligently. It makes your development workflow faster, smarter, and more connected.
AskGit represents a significant leap forward in AI-driven developer productivity within IBM. It connects teams, unlocks hidden code assets, and empowers every engineer to build faster and together.
Example in Action
1) Log in to CE Coach (https://coach.ce.techzone.ibm.com/ask_git?) and enter your query
2) Submit your query and wait about 15 seconds
3) View the results
Technical Stack & Future Enhancements
AskGit is powered by a robust, enterprise-grade architecture:
Core Technologies:
- watsonx.ai: Granite (ibm/granite-3-2-8b-instruct) and Mixtral (mistralai/mixtral-8x7b-instruct-v01) for analysis and for generating summaries and keywords in the ingestion pipeline
- Other models: mixedbread-ai/mxbai-embed-large-v1 for creating embeddings and colbert-ir/colbertv2.0 for reranking
- Vector DB: watsonx Discovery for storing embeddings and metadata
- Other tools: IBM Code Engine for deployment and FastAPI for creating backend APIs
Future Enhancements: