Build a ready-to-deploy multi-agent system to detect and broadcast outages in minutes
Build a multi-agent outage manager. Agent 1: Listen on Slack to the #outage channel for notifications about new outages. Agent 2: List all files on Google Drive, find the appropriate Google Docs template for a given input message, and reformat the message accordingly. Agent 3: Create a new incident on statuspage.io.

Once you click the “Build AI Agent” button, xpander handles all the complex backend tasks like infrastructure setup and deployment for you. After about a minute, your agent will be ready, and you’ll see a graph visualization similar to the one below:
The first agent listens to the #outage channel and passes the message to the next agent.

Search for messages about outages in the #outage channel.

Assume the #outage channel contains a message about a new outage.

The agent first looks up the #outage channel. Then it uses the “Fetch Conversation History” operation to retrieve the raw outage message.
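For readers curious what a history fetch looks like outside of xpander, here is a minimal sketch built on Slack's public `conversations.history` Web API method. It only constructs the request and parses a response; it is not how xpander implements the operation, and the token and channel ID values are placeholders you would supply yourself.

```python
import urllib.parse
import urllib.request


def build_history_request(token, channel_id, limit=10):
    """Build (but do not send) a GET request for Slack's conversations.history.

    Slack expects a channel ID here, not a name such as "#outage".
    """
    query = urllib.parse.urlencode({"channel": channel_id, "limit": limit})
    return urllib.request.Request(
        url=f"https://slack.com/api/conversations.history?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def extract_texts(response_json):
    """Return the text of each message in a conversations.history response."""
    return [m.get("text", "") for m in response_json.get("messages", [])]
```

Sending the request with `urllib.request.urlopen` and decoding the JSON body would yield the dictionary that `extract_texts` expects.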
Once you’ve verified that this agent is working as expected, we can move on to testing the second agent.
Message: “We are experiencing ‘InsufficientCapacityError’ when trying to launch a SageMaker endpoint in the eu-central-1 region with an ml.g5.12xlarge instance.”

Here’s how the agent processes this:
Find the “Outage Template” document in Google Docs and read its content. Then, reformat the outage notification above according to the template. Finally, write me the reformatted message here.
AWS SERVICE INCIDENT
Problematic AWS Service: Amazon SageMaker
Details: We are currently experiencing an issue with launching a SageMaker endpoint in the eu-central-1 region using the ml.g5.12xlarge instance due to an ‘InsufficientCapacityError’. Our team is actively working to resolve this issue as quickly as possible. We apologize for any inconvenience this may cause and appreciate your patience.
Next Step: Please monitor our status page for updates and further information.
Contact: For any urgent inquiries, please contact our support team.

Create an incident on statuspage.io based on the reformatted message above.

As shown in the visualization below, the agent first uses the “Get All Status Pages” operation to retrieve your list of status pages, then uses “Create Incident” to post the outage message as an incident:
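To make the two operations more concrete, the sketch below builds the equivalent raw requests against the public Statuspage REST API (list pages, then create an incident). It is an illustration under assumptions: xpander's operations may wrap the API differently, the function names are hypothetical, and the page ID and API key are placeholders. The requests are constructed but never sent.

```python
import json
import urllib.request

STATUSPAGE_API = "https://api.statuspage.io/v1"


def build_list_pages_request(api_key):
    """Build a GET request that lists the status pages visible to this key."""
    return urllib.request.Request(
        url=f"{STATUSPAGE_API}/pages",
        headers={"Authorization": f"OAuth {api_key}"},
        method="GET",
    )


def build_create_incident_request(page_id, api_key, name, body,
                                  status="investigating"):
    """Build a POST request that creates an incident on the given page."""
    payload = {"incident": {"name": name, "status": status, "body": body}}
    return urllib.request.Request(
        url=f"{STATUSPAGE_API}/pages/{page_id}/incidents",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"OAuth {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In this sketch the reformatted message would go into `body`, with a short title such as “AWS SERVICE INCIDENT” as `name`.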
Search for messages about outages in the #outage channel. Then, find the “Outage Template” docs in Google Docs and take a look at its content. Then, reformat the outage notification according to the template. Then, create an incident on StatusPage based on the reformatted message.

As shown in the visualization below, here’s how the agent coordinates the workflow:
#outage Slack channel.

Install the required packages with pip, such as:
Create a .env file in the same directory as your Python script and add your credentials like this:
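The exact variable names depend on your account and setup, so the names in the sketch below (`HYPOTHETICAL_API_KEY`, `HYPOTHETICAL_AGENT_ID`) are placeholders only. The loader is a small standard-library stand-in for illustration; in practice the `python-dotenv` package and its `load_dotenv()` function are the usual choice.

```python
import os
from pathlib import Path

# Example .env contents (placeholder names and values, one KEY=VALUE per line):
#   HYPOTHETICAL_API_KEY="your-api-key"
#   HYPOTHETICAL_AGENT_ID=your-agent-id


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file and return them as a dict.

    Values are also copied into os.environ unless the variable is already set.
    Blank lines, comment lines, and lines without "=" are skipped.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded
```

Keeping credentials in a file that is excluded from version control avoids hard-coding them in the script.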
Load the credentials from .env and initialize the agent.

Then run the operations in a while loop until the agent finishes its task.

The while loop:
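The general shape of such a loop can be sketched with a stand-in agent. Everything here is hypothetical: `StubAgent`, `is_finished`, and `run_next_step` are illustrative names, not part of the xpander SDK, and the `max_steps` guard is an added safety measure rather than something the original workflow is known to include.

```python
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical stand-in for an agent client; not the xpander SDK."""
    steps: list
    log: list = field(default_factory=list)

    def is_finished(self):
        # The agent is done once no pending steps remain.
        return not self.steps

    def run_next_step(self):
        # Take the next pending step, record it, and return it.
        step = self.steps.pop(0)
        self.log.append(step)
        return step


def run_until_finished(agent, max_steps=20):
    """Run the agent step by step until it reports completion.

    Raises RuntimeError if the agent is still busy after max_steps steps.
    """
    executed = 0
    while not agent.is_finished():
        if executed >= max_steps:
            raise RuntimeError("agent did not finish within max_steps")
        agent.run_next_step()
        executed += 1
    return executed
```

A step cap like this is a common precaution with agent loops, since a misbehaving agent could otherwise keep requesting operations indefinitely.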