Module Summary

  • Goal: Create a custom GitHub MCP Server with Agentic RAG optimization
  • Estimated Time: 15-20 minutes
  • Prerequisites: GitHub account, Cursor IDE, xpander.ai account

🚀 In this module, you’ll learn how to build a powerful GitHub search agent with optimized Agentic RAG capabilities and expose it as a Model Context Protocol (MCP) Server to supercharge your Cursor IDE experience. By building this agent, you’ll understand how to create practical AI tools that integrate seamlessly with your development workflow, improving coding efficiency and access to real-time code examples.

🔍 Creating Your GitHub Search Agent

Step 1: Set up xpander.ai

  1. Navigate to the xpander.ai platform:

    Terminal Command
    # Open in your browser
    open https://app.xpander.ai
    
  2. Sign in with your existing credentials or sign up for a new account

  3. In the left navigation menu, go to Agents

  4. Click the + New Agent button to open the Workbench

  5. When prompted by the agent builder, click Skip (we’ll create the agent manually for this workshop)

Step 2: Add GitHub Search Tools

  1. Click on the + (plus) button in your agent canvas

  2. Select Apps from the menu

  3. Select GitHub Search Manager

  4. Click Sign in with GitHub Search Manager

  5. Give the interface a name (like “github-search-your-name” or just “github-search”)

  6. Click Save

After authorizing GitHub Search, the available operations for this App appear in the right menu. Any operation you add from this menu is exposed as a Tool in the AI Agent's runtime.
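
Under the hood, each operation you add is typically surfaced to the model as a function-style tool definition: a name, a description, and a parameter schema derived from the API. The sketch below is a generic illustration of that idea (an assumption about the general shape, not xpander's exact schema), using the repository-search operation you'll add in the next step:

Illustrative tool definition (sketch)
# Generic, function-calling-style tool definition for a repository search
# operation. Field names and structure here are illustrative only.
find_repositories_tool = {
    "name": "FindRepositoriesByCriteria",
    "description": "Search GitHub repositories by query terms (wraps GET /search/repositories).",
    "parameters": {
        "type": "object",
        "properties": {
            "q": {"type": "string", "description": "GitHub search query, e.g. 'mcp language:python'"},
            "per_page": {"type": "integer", "description": "Results per page (max 100)"},
            "page": {"type": "integer", "description": "Page number to fetch"},
        },
        "required": ["q"],
    },
}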

Step 3: Add GitHub Operations

Add the following GitHub operations to your agent. Select each one and confirm by clicking “Add to agent” at the bottom of the right selection menu; each operation wraps a GitHub REST search endpoint (a quick sketch of these calls follows the list).

  • Find Code Snippets by Query Terms - GET /search/code
  • Find Repositories by Criteria - GET /search/repositories
  • Search Topics by Criteria - GET /search/topics
  • Find Users by Criteria - GET /search/users
  • Search Commits by Criteria - GET /search/commits
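
If you're curious what these operations wrap, the underlying GitHub REST search endpoints can be called directly. The minimal sketch below uses the public api.github.com endpoints listed above; the optional GITHUB_TOKEN environment variable is an assumption for this example only (it raises the API rate limit) and is unrelated to the xpander setup:

GitHub search endpoints (illustrative sketch)
# Direct calls to the GitHub search endpoints wrapped by the operations above.
import os
import requests

BASE = "https://api.github.com"
HEADERS = {"Accept": "application/vnd.github+json"}
if os.getenv("GITHUB_TOKEN"):
    HEADERS["Authorization"] = f"Bearer {os.environ['GITHUB_TOKEN']}"

def github_search(endpoint: str, query: str, per_page: int = 5) -> dict:
    # endpoint is one of: "code", "repositories", "topics", "users", "commits"
    # (note: the "code" endpoint requires authentication)
    resp = requests.get(
        f"{BASE}/search/{endpoint}",
        headers=HEADERS,
        params={"q": query, "per_page": per_page},
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()

# Same kind of call the "Find Repositories by Criteria" operation makes:
print(github_search("repositories", "mcp in:name,description")["total_count"])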

Step 4: Configure Agent Settings

  1. In the AI Workbench, click on the Gear Icon in the top-right corner

  2. In the Agent Builder Settings panel, click on Generate Details

  3. Enter the following prompt to generate your agent description:

    Prompt for generating agent
    You are GitHub-Search-Agent, an expert at searching and analyzing GitHub repositories, code, users, and topics. You have deep knowledge of GitHub's API and can provide concise, accurate information about code patterns, repository structures, and developer activities.
    

👉 This step automatically generates the AI Agent instructions for us. Let’s further optimize them to match our exact use case.

  4. Go to the “Instructions” section and define your agent’s role, goal, and instructions:

    Role
    You are GitHub-Search-Agent, an expert at searching and analyzing GitHub repositories, code, users, and topics. You have deep knowledge of GitHub's API and can provide concise, accurate information about code patterns, repository structures, and developer activities.
    
    Goal
    Your goal is to help users find relevant code, repositories, and developers on GitHub to accelerate their development process. You should prioritize providing the most relevant results and explaining code patterns when found.
    
    Instructions
    1. When searching for code, focus on finding the most relevant implementations
    2. For repositories, highlight the most popular ones with good documentation
    3. When exploring user profiles, focus on their recent contributions and expertise
    4. Prefer repositories with higher star counts and recent activity
    5. For any search query, first try to understand what the user is looking for before running the search
    6. Always provide contextual information about why a result is relevant
    
  5. Click Save Changes, close the settings window, and then click Deploy.

Your agent should look similar to this:

🧪 Testing Your Agent with Raw API Responses

Let’s test your newly created GitHub search agent:

  1. In the left pane of the builder interface, you’ll find the testing area

  2. Type the following query to search for topics:

    Prompt
    Search for topics in GitHub related to A2A and MCP
    
  3. Look at the generated payload by clicking on it. You’ll see something like:

    Example AI Generated Payload
    {
       "bodyParams": {},
       "queryParams": {
          "q": "A2A MCP",
          "per-page": 10,
          "page": 1
       },
       "pathParams": {}
    }
    
    Example response
    {
     "total_count": 2,
     "incomplete_results": false,
     "items": [
         {
             "name": "a2a-mcp",
             "created_at": "2025-04-30T00:52:27Z",
             "updated_at": "2025-04-30T00:52:27Z",
             "featured": false,
             "curated": false,
             "score": 1
         },
         {
             "name": "a2a-vs-mcp",
             "created_at": "2025-04-30T00:52:27Z",
             "updated_at": "2025-04-30T00:52:27Z",
             "featured": false,
             "curated": false,
             "score": 1
         }
     ]
    }
    

Using the tester tab is extremely useful for seeing how the AI Agent executes tool calls, how it populates parameters, and what responses come back from the remote APIs.
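
To see how such a generated payload maps onto an actual HTTP call, here is a minimal sketch that replays a tester payload against the corresponding GitHub endpoint. The bodyParams/queryParams/pathParams handling is an assumption inferred from the example above, not xpander's internal dispatch code (note that GitHub's own API expects per_page rather than per-page):

Replaying a tool-call payload (sketch)
import requests

def replay_tool_call(endpoint_path: str, payload: dict) -> dict:
    # All five search operations are GETs, so only pathParams and queryParams matter.
    url = "https://api.github.com" + endpoint_path.format(**payload.get("pathParams", {}))
    resp = requests.get(
        url,
        headers={"Accept": "application/vnd.github+json"},
        params=payload.get("queryParams", {}),
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()

# The payload from the tester, adjusted to GitHub's own parameter name "per_page":
payload = {
    "bodyParams": {},
    "queryParams": {"q": "A2A MCP", "per_page": 10, "page": 1},
    "pathParams": {},
}
print(replay_tool_call("/search/topics", payload)["total_count"])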

  4. Now try a repositories search in the testing area:

    Prompt
    Find cool repositories related to Agent 2 Agent and MCP
    

Because the API response for this tool call is huge, it is not optimal for AI consumption and the request may fail (you might not see a response at all). Continue to the next step to see how Agentic RAG helps filter and optimize these oversized responses.

🧠 Optimizing Responses with Agentic RAG

Let’s improve the repository search results using Agentic RAG:

  1. Each tool call can be configured individually: click on the gear icon next to the Find Repositories by Criteria operation in the graph.

  2. Click on Advanced and then switch to Raw Editor (this allows the AI to send natural language queries, and the server will perform semantic search on the API response)

  3. Configure the search and return fields:

    Searchable Fields (fields the AI can search against):

    Searchable Fields Configuration
    [
      "items[].full_name",
      "items[].description",
      "items[].topics"
    ]
    

    Returnable Fields (data the AI will receive):

    Returnable Fields Configuration
    [
      "items[].html_url",
      "items[].full_name",
      "items[].description"
    ]
    

  4. Save and deploy your changes

  5. Now run the repositories query again:

    Prompt
    Find cool repos related to A2A and/or MCP
    

Notice how much cleaner and more focused the response is now! Instead of overwhelming JSON data, you get just the meaningful content that matters to your query.
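
Conceptually, the Searchable Fields tell the server which parts of the raw response the semantic search runs over, while the Returnable Fields define the slimmed-down projection the model actually receives. The sketch below illustrates that projection step; the items[].field path handling is an assumption based on the notation above, not xpander's actual Agentic RAG implementation:

Returnable-fields projection (conceptual sketch)
def project(response: dict, returnable_fields: list[str]) -> list[dict]:
    # Keep only the whitelisted fields from each item of the response.
    projected = []
    for item in response.get("items", []):
        slim = {}
        for path in returnable_fields:
            keys = path.removeprefix("items[].").split(".")
            value = item
            for key in keys:
                value = value.get(key) if isinstance(value, dict) else None
            if value is not None:
                slim[".".join(keys)] = value
        projected.append(slim)
    return projected

raw = {
    "items": [{
        "html_url": "https://github.com/example/a2a-demo",  # example data only
        "full_name": "example/a2a-demo",
        "description": "Demo repo",
        "forks": 3,
        "watchers": 7,
        "owner": {"login": "example"},
    }]
}
print(project(raw, ["items[].html_url", "items[].full_name", "items[].description"]))
# Only the three returnable fields survive; the rest of the payload is dropped.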

Configure Agentic RAG for the other operations as well

Go through the tabs in this section to configure the correct Searchable and Returnable fields for the remaining tools, so that Agentic RAG optimizes their tool calls as well.

Once done, you will have configured Agentic RAG for four more operations (code snippet search, commit search, user search, and topic search).

Continue the workshop from the WebUI

To continue testing your agent:

  1. Keep the AI Workbench tab open
  2. Open the web chat interface by clicking the Chat source node (at the top of the canvas) and clicking the URL.

  3. From your chat history, select the last conversation thread you started before configuring Agentic RAG, so the context is maintained

Now test your agent and check the generated payload and the agent’s response for each query. Below are a few prompts written to make the agent select a specific operation, so you can verify whether the returned data is optimized. You can use the same thread or create a new one.

Prompt for User Search
Find a few GitHub users related to MCP

Prompt for Commit Search
Summarize the recent commits of users related to MCP repositories

Prompt for Code Snippet Search
Find the implementation of A2A and return the code snippet

📄 Adding Code Reading Capabilities

Let’s enhance our agent by adding the ability to read code directly from GitHub URLs:

  1. In the left navigation menu, go to Cloud Functions
  2. Click the New button
  3. Replace the example code with the following snippet:
GitHub Code Reader Function
# === GitHub Code Reader ===
# Contract-name: xpander_run_action
# Purpose: Retrieve any file's raw text from GitHub (public or private) so downstream
#          agents can parse the code.

# requests is auto-imported by the xpander runtime; importing it explicitly also
# keeps the function runnable outside the platform.
import os
import requests

def xpander_run_action(repo_full_name: str, file_path: str, ref: str = "main"):
    # GitHub Code Reader
    # ------------------
    # Args:
    #     repo_full_name : str  | "owner/repo", e.g. "org/project"
    #     file_path      : str  | path within the repo, e.g. "src/app.py"
    #     ref            : str  | branch, tag, or commit SHA (default "main")
    # Optional ENV:
    #     GIT_TOKEN            | Personal Access Token for private repositories.
    #                           If present, it's sent as a Bearer token.
    # Returns:
    #     str – the file's text on success, or an error string on failure.

    raw_url = f"https://raw.githubusercontent.com/{repo_full_name}/{ref}/{file_path}"

    headers = {}
    token = os.getenv("GIT_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"

    try:
        resp = requests.get(raw_url, headers=headers, timeout=15)
        resp.raise_for_status()
    except requests.RequestException as exc:
        return f"Request failed: {exc}"

    return resp.text
  4. Name it GitHub Code Reader and click Save
  5. Return to your agent in the Agents section
  6. Click the + button
  7. Select Custom Action
  8. Add the GitHub Code Reader function to your agent canvas. It could take a minute for the Action to appear in the menu as it gets validated. Refresh the Agent page if you don't see it.
  9. Click Deploy
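
If you'd like to sanity-check the function before wiring it into the agent, you can call it locally the same way the platform would. The repository below is GitHub's public octocat/Hello-World demo repo, chosen only as a convenient example:

Local smoke test (optional)
# Quick local smoke test for the Cloud Function above (run in the same file).
# octocat/Hello-World is GitHub's public demo repository; swap in any repo you like.
print(xpander_run_action(
    repo_full_name="octocat/Hello-World",
    file_path="README",
    ref="master",
)[:200])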

Run the following prompt

Example Prompt
How can I increase throughput using cross-region inference with Amazon Bedrock and boto3? Download and provide the real python code snippet from AWS repos, not a link

The screenshot is from the agent’s WebUI. You can access yours by clicking the ‘Chat’ button on the canvas — the WebUI URL will appear.

Your GitHub search agent can currently only access information available on GitHub. Let’s add web search capabilities to make it more versatile:

  1. Return to your AI Agent workbench

  2. Click the + button to add components to the AI Agent

  3. Select Built-in Actions

  4. Add Fetch Tavily AI Insights (for comprehensive web searching)

  5. Click on the gear icon in your agent canvas to edit the agent settings

  6. Update your agent’s instructions by adding these important rules:

    Updated Instructions
    7. ALWAYS start with Tavily AI Insights to gather general information, but NEVER stop there
    8. After using Tavily, ALWAYS follow up with relevant GitHub searches to find concrete code examples and repositories
    9. For every user query, you must provide both web-based context AND GitHub-specific examples
    
  7. Save & Deploy your changes

  8. In the Tester tab, try a query that requires recent information:

    Test Prompt
    What are the latest changes to the A2A protocol specification? Show me example implementations.
    

Your agent should now provide more comprehensive answers by combining web search results with GitHub code examples:

Sometimes, our autonomous agent might choose to use GitHub before Tavily, despite the explicit instructions configured in the system prompt. This happens because the AI model sees all available tools at once and makes decisions based on its understanding of the prompt and the tool descriptions.

This reveals a fascinating design challenge in AI systems: How can we create truly autonomous agents capable of making intelligent decisions while ensuring they reliably follow critical business logic when needed? 🤔

Continue to the Dependency Graph section to find out 👉

🔄 Creating the Agent Dependency Graph

In production, to ensure your agent follows the right search strategy, you should create a dependency graph that deterministically enforces the required business logic:

  1. In your agent canvas, connect your nodes by drawing lines between operations to create the dependency graph
  2. Set up a workflow that requires an internet search before accessing GitHub operations:
    • Connect the Tavily AI Insights node as a prerequisite to the GitHub operations: simply drag a line from the top of each GitHub operation to the bottom of the Tavily operation.
    • Ensure the flow follows a logical progression

  3. Repeat the query with an explicit request not to start with Tavily:
Test Prompt
What are the latest changes to the A2A protocol specification? Show me example implementations. Use GitHub without Tavily

You should see that the AI Agent fails to follow the prompt, because it tried to invoke a tool call that violates the dependency graph: Error: The tool ‘GitHubRepoDiscoveryFindRepositoriesByCriteria’ was selected incorrectly and cannot be executed at this stage.

  4. Now run a prompt that does not conflict with the dependency graph:
Test Prompt
Using only Google's A2A protocol implementation, create an agent that responds with 'hello world'. Provide the Python code.
  5. Verify that the AI follows the correct workflow - first searching for information before accessing GitHub operations:

Congratulations! You’ve built a reliable AI Agent that balances autonomy with guardrails. By creating this structured workflow, you’ve addressed both the API response overflow challenge and the multi-step autonomy challenge.
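
The guardrail you just saw can be pictured as a simple prerequisite check on tool calls: a tool may only execute once everything it depends on has already run in the thread. The sketch below is a conceptual illustration only (the tool names are shortened for readability), not xpander's implementation:

Dependency-graph enforcement (conceptual sketch)
# Conceptual sketch of dependency-graph enforcement: a tool call is rejected
# unless all of its prerequisite tools have already run in the conversation.
DEPENDENCIES = {
    # every GitHub operation requires Tavily to have run first
    "FindRepositoriesByCriteria": {"FetchTavilyAIInsights"},
    "FindCodeSnippetsByQueryTerms": {"FetchTavilyAIInsights"},
    "FetchTavilyAIInsights": set(),
}

def can_execute(tool: str, already_called: set[str]) -> bool:
    return DEPENDENCIES.get(tool, set()) <= already_called

print(can_execute("FindRepositoriesByCriteria", set()))                      # False -> rejected
print(can_execute("FindRepositoriesByCriteria", {"FetchTavilyAIInsights"}))  # True  -> allowed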

🌐 Exposing Your Agent as an MCP server

Step 1: Access your Agent MCP URL

The final step is to expose your optimized GitHub agent as a Model Context Protocol (MCP) server:

  1. In your agent canvas, click the + button
  2. Select Sources, then MCP
  3. Click Deploy

  4. After the short deployment phase completes, you’ll receive a unique URL for your MCP server when you click on the MCP source node on the canvas.
  5. Configure this endpoint in your Cursor IDE

Go to Cursor -> Settings -> MCP and click on "+ Add new global MCP Server", or edit ~/.cursor/mcp.json directly:

mcp.json
{
  "mcpServers":{
    "xpander-github-mcp": {
      "command": "npx",
      "args": [
        "xpander-mcp-remote",
        "https://mcp.xpander.ai/your-server-url/"
      ]
    }
  }
}

Make sure to replace https://mcp.xpander.ai/your-server-url/ with your actual MCP server URL from the deployment step!
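
If you prefer to add the entry programmatically instead of through the settings UI, the sketch below merges the server into ~/.cursor/mcp.json using the file layout shown above (replace the placeholder URL with your own):

Merging the entry into ~/.cursor/mcp.json (optional sketch)
# Merge the xpander MCP server entry into ~/.cursor/mcp.json without clobbering
# any servers you already have configured.
import json
from pathlib import Path

MCP_URL = "https://mcp.xpander.ai/your-server-url/"  # replace with your MCP server URL
config_path = Path.home() / ".cursor" / "mcp.json"

config = json.loads(config_path.read_text()) if config_path.exists() else {}
config.setdefault("mcpServers", {})["xpander-github-mcp"] = {
    "command": "npx",
    "args": ["xpander-mcp-remote", MCP_URL],
}
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
print(f"Updated {config_path}")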

Step 2: Customize the Cursor Settings

  1. Add an explicit instruction to prioritize MCP commands in the “system prompt” configuration of your Cursor IDE.
Rules
If relevant to the question, always use MCP tools when available before answering

  2. Make sure the MCP server is active (Cursor Settings -> MCP)

  3. Restart Cursor IDE to apply the changes

Step 3: Ask Cursor Agent to invoke MCP

Test your MCP integration by providing this prompt in the Cursor AI Agent section:

Prompt to Agent (via MCP)
search for code samples of xpander-ai sdk

If you don’t want to manually approve each MCP tool call, you can go to Cursor Settings and enable auto-run mode (“Allow Agent to run tools without asking for confirmation, such as executing commands and writing to files”).

✅ Checkpoint

Congratulations! By completing this module, you should now be able to:

  1. Create and configure a GitHub search agent with optimized Agentic RAG
  2. Customize searchable and returnable fields for each tool call
  3. Add custom code execution to your agent
  4. Structure agent workflows using dependency graphs
  5. Deploy your agent as an MCP server for Cursor IDE

🔄 Next Steps

Now that you’ve supercharged your IDE with an optimized GitHub MCP server, you’re ready to build your first coding agent - proceed to the next module.