Project Background
This project explores how AI agents can control Discord bots through dynamic tool systems and careful prompting. Instead of hard-coding every feature, the bot uses the Model Context Protocol (MCP) to load tools at runtime, so new capabilities can be added without changing the bot's core code.
Key Features
Dynamic Tool System
- Extensions loaded from any number of MCP SSE servers
- Tools automatically converted to OpenAI function schemas
- Modular architecture for easy management
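As a rough illustration of the schema conversion mentioned above: an MCP tool descriptor carries a name, a description, and a JSON Schema under `inputSchema`, which maps almost directly onto OpenAI's function-tool format. The helper name `mcp_tool_to_openai` and the `get_weather` example are hypothetical and are not taken from the project's code.

```python
from typing import Any, Dict


def mcp_tool_to_openai(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap an MCP tool descriptor in the OpenAI function-tool envelope.

    Hypothetical helper: the project's real conversion code may differ.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP publishes the argument schema as JSON Schema under "inputSchema".
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Made-up descriptor, shaped like an entry from an MCP tool listing.
example = {
    "name": "get_weather",
    "description": "Look up the weather for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
converted = mcp_tool_to_openai(example)
```

Because both sides use JSON Schema for arguments, the conversion is mostly a matter of re-wrapping fields, which is likely what makes automatic loading practical.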
Built-in Capabilities
- Web search with deep search for detailed research
- Document fetching and analysis (PDF, DOCX, HTML, etc.)
- Image and video generation via Hugging Face
- User/guild/channel information retrieval
- RAG-powered content search using embeddings
Discord Integration
- Rich component rendering (containers, buttons, galleries, sections)
- Custom markdown-to-component parser
- Interactive elements with context preservation
- Conversation history awareness
Technical Architecture
Core Technologies
- discord.py for Discord API interaction
- openai for AI model integration (supports Azure OpenAI and standard OpenAI)
- mcp for Model Context Protocol support
- huggingface_hub for image/video generation
- google-custom-search for web searching
- markitdown for document conversion
Component System
The bot uses a custom markdown format that transforms into Discord’s UI components:
- Containers for organized layouts
- Action rows with buttons (up to 5 per row)
- Media galleries for images/videos
- Select menus for multi-choice interactions
- Sections with thumbnails or buttons
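One small piece of that transformation can be sketched in isolation: since an action row holds at most 5 buttons, a parser has to split longer button lists across several rows. The function below is a hypothetical illustration using plain strings for button labels. It does not reproduce the project's parser or its markdown syntax.

```python
from typing import List

MAX_BUTTONS_PER_ROW = 5  # per-row limit noted in the list above


def chunk_into_rows(
    labels: List[str], row_size: int = MAX_BUTTONS_PER_ROW
) -> List[List[str]]:
    """Split button labels into consecutive rows of at most row_size items."""
    return [labels[i:i + row_size] for i in range(0, len(labels), row_size)]


# Seven buttons do not fit in one row, so they are split into 5 + 2.
rows = chunk_into_rows([f"Option {n}" for n in range(1, 8)])
```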
Tool Decorator
Functions are converted to OpenAI tools using a custom decorator that automatically generates JSON schemas from Python type hints:
@tool(
    query="Parameter description"
)
async def search(self, ctx: AIContext, query: str) -> Dict[str, Any]:
    """Tool description for AI"""
    # Implementation
This allows new tools to be integrated without hand-written schema definitions, which speeds up development and keeps the bot flexible.
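To make the idea concrete, here is a minimal sketch of how such a decorator could derive a schema from a function signature. It is not the project's actual `tool` implementation: the `JSON_TYPES` mapping, the `schema` attribute, and the rule that `self` and `ctx` are hidden from the model are all assumptions made for illustration.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Small, deliberately incomplete mapping from annotations to JSON Schema types.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(**param_descriptions: str) -> Callable:
    """Attach an OpenAI-style function schema built from the signature."""

    def decorator(func: Callable) -> Callable:
        hints = get_type_hints(func)
        properties: Dict[str, Any] = {}
        required = []
        for name, param in inspect.signature(func).parameters.items():
            if name in ("self", "ctx"):  # assumed to be injected by the bot
                continue
            properties[name] = {
                # Unknown annotations fall back to "string" in this sketch.
                "type": JSON_TYPES.get(hints.get(name), "string"),
                "description": param_descriptions.get(name, ""),
            }
            if param.default is inspect.Parameter.empty:
                required.append(name)
        func.schema = {
            "type": "function",
            "function": {
                "name": func.__name__,
                "description": inspect.getdoc(func) or "",
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            },
        }
        return func

    return decorator


@tool(query="Search terms", limit="Maximum number of results")
async def search(ctx: Any, query: str, limit: int = 5) -> Dict[str, Any]:
    """Search the web."""
    return {"query": query, "limit": limit}
```

In this sketch the docstring becomes the tool description and parameters without defaults become required, so the function signature stays the single source of truth for the schema.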
Usage Example
Setup
# Clone and install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.template .env
# Edit .env with your tokens
# Run the bot
python main.py
Adding Custom MCP Tools
# In .env, add MCP server endpoints
MCP_CUSTOM = "https://your-mcp-server.com/sse"
The bot automatically loads all MCP tools on startup and makes them available to the LLM.
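How the bot discovers these endpoints is not shown here. One plausible approach, sketched below, is to scan the environment for variables sharing the `MCP_` prefix seen in the example above. The prefix convention and the helper name `collect_mcp_endpoints` are assumptions, not confirmed project behavior.

```python
from typing import Dict, Mapping


def collect_mcp_endpoints(
    env: Mapping[str, str], prefix: str = "MCP_"
) -> Dict[str, str]:
    """Return {server_name: url} for every non-empty variable with the prefix."""
    endpoints: Dict[str, str] = {}
    for key, value in env.items():
        # Tolerate surrounding whitespace and quotes left over from .env files.
        url = value.strip().strip('"')
        if key.startswith(prefix) and url:
            endpoints[key[len(prefix):].lower()] = url
    return endpoints


# Sample environment; unrelated variables are ignored.
endpoints = collect_mcp_endpoints(
    {
        "MCP_CUSTOM": "https://your-mcp-server.com/sse",
        "BOT_TOKEN": "your_token",
    }
)
```

In a running bot the same function could be called with `os.environ` once the `.env` file has been loaded.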
Deep Search Implementation
The bot includes a sophisticated deep search system that:
- Executes iterative Google searches
- Fetches and processes content using embeddings
- Ranks results by similarity to target context
- Iterates until quality threshold is met (90% rating, 10+ results)
- Generates final summary from relevant context
This combines search, RAG, and LLM summarization for comprehensive research.
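The loop described above can be sketched roughly as follows. Search and embedding are replaced by a caller-supplied `fetch_round` stub, and the stopping rule is one reading of the stated threshold: at least 10 chunks scoring 0.9 or higher by cosine similarity. That reading, along with all function and parameter names, is an assumption. The real system may rate results differently.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deep_search(
    target: Vector,
    fetch_round: Callable[[int], List[Tuple[str, Vector]]],
    min_score: float = 0.9,
    min_results: int = 10,
    max_rounds: int = 5,
) -> List[Tuple[float, str]]:
    """Collect (score, text) pairs until enough high-scoring chunks exist."""
    ranked: List[Tuple[float, str]] = []
    for round_no in range(max_rounds):
        # Each round stands in for one search plus fetch-and-embed pass.
        for text, vector in fetch_round(round_no):
            ranked.append((cosine(target, vector), text))
        ranked.sort(reverse=True)
        good = [item for item in ranked if item[0] >= min_score]
        if len(good) >= min_results:
            return good
    return ranked  # round budget exhausted: return everything, best first


calls: List[int] = []


def fake_round(round_no: int) -> List[Tuple[str, Vector]]:
    """Stub source: four on-topic chunks and one off-topic chunk per round."""
    calls.append(round_no)
    hits = [(f"chunk {round_no}-{i}", [1.0, 0.0]) for i in range(4)]
    return hits + [(f"noise {round_no}", [0.0, 1.0])]


results = deep_search([1.0, 0.0], fake_round)
```

The returned chunks would then be passed to the LLM as context for the final summary. The `max_rounds` cap is an added safeguard so the loop terminates even when the threshold is never reached.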
Docker Support
# Build image
docker build -t ai-fantasy .
# Run container
docker run -d --name ai-fantasy \
  -e BOT_TOKEN="your_token" \
  -e OPENAI_API_KEY="your_key" \
  ...
  ai-fantasy
Project Impact
This bot demonstrates:
- Agents can effectively control complex systems with proper tooling
- MCP enables modular agentic applications
- Custom component systems enhance user experience beyond plain text
- RAG improves search quality significantly over basic web scraping
Acknowledgments
Special thanks to Dương Lê Giang for guidance on RAG concepts that formed the foundation of the search implementation.
Repository
Source code available on GitHub: Agent Fantasy
Note: Requires OpenAI API access and a Discord bot token. Optional features need a Hugging Face token and Google Custom Search credentials.