🎭 Core Transformation:
- Reframe project as AI companion bot with Kasane Teto personality
- Focus on natural conversation, multimodal interaction, and character roleplay
- Position video recording as one tool in AI toolkit rather than main feature

🏗️ Architecture Improvements:
- Refactor messageCreate.js into modular command system (35 lines vs 310+)
- Create dedicated videoRecording service with clean API
- Implement commandHandler for extensible command routing
- Add centralized configuration system (videoConfig.js)
- Separate concerns: events, services, config, documentation

📚 Documentation Overhaul:
- Consolidate scattered READMEs into organized docs/ directory
- Create comprehensive documentation covering:
  * AI architecture and capabilities
  * Natural interaction patterns and personality
  * Setup guides for AI services and Docker deployment
  * Commands reference focused on conversational AI
  * Troubleshooting and development guidelines
- Transform root README into compelling AI companion overview

🤖 AI-Ready Foundation:
- Document integration points for:
  * Language models (GPT-4/Claude) for conversation
  * Vision models (GPT-4V/CLIP) for image analysis
  * Voice synthesis (ElevenLabs) for speaking
  * Memory systems for conversation continuity
  * Personality engine for character consistency

🔧 Technical Updates:
- Integrate custom discord.js-selfbot-v13 submodule with enhanced functionality
- Update package.json dependencies for AI and multimedia capabilities
- Maintain Docker containerization with improved architecture
- Add development and testing infrastructure

📖 New Documentation Structure:
docs/
├── README.md (documentation hub)
├── setup.md (installation & AI configuration)
├── interactions.md (how to chat with Teto)
├── ai-architecture.md (technical AI systems overview)
├── commands.md (natural language interactions)
├── personality.md (character understanding)
├── development.md (contributing guidelines)
├── troubleshooting.md (problem solving)
└── [additional specialized guides]

✨ This update transforms the project from a simple recording bot into a foundation for an engaging AI companion that can naturally interact through text, voice, and visual content while maintaining authentic Kasane Teto personality traits.

300 lines
No EOL
10 KiB
Markdown
# Kasane Teto AI Companion Bot

An AI-powered Discord bot that roleplays as Kasane Teto, providing natural conversation, voice interaction, image analysis, and multimedia engagement for your Discord server. Built with advanced AI capabilities and a modular architecture.

## 🎭 Meet Teto

Kasane Teto is your server's AI companion who can:
- 💬 **Chat naturally** in text channels with Teto's personality
- 🎤 **Join voice channels** and speak with voice synthesis
- 👀 **Analyze images** and visual content you share
- 📹 **Watch video streams** and provide commentary
- 🎥 **Record memorable moments** for later review
- 🤖 **Roleplay authentically** as the beloved virtual singer

## ✨ Core Features

### 🧠 AI-Powered Interaction
- **Natural Language Processing** - Understands context and maintains conversations
- **Character Roleplay** - Authentic Kasane Teto personality and mannerisms
- **Memory System** - Remembers past interactions and user preferences
- **Contextual Responses** - Adapts to server culture and ongoing conversations

### 🎥 Multimedia Capabilities
- **Image Recognition** - Analyzes and comments on shared images
- **Video Stream Watching** - Can observe and react to Discord streams
- **Webcam Integration** - Potential to interact with video feeds
- **Screen Recording** - Capture and save interesting moments
- **Voice Synthesis** - Speaks in voice channels as Teto

### 🎵 Teto-Specific Features
- **Character Consistency** - Maintains Teto's cheerful, energetic personality
- **Music Knowledge** - Discusses Vocaloid, UTAU, and music topics
- **Community Integration** - Learns your friend group's dynamics
- **Emotional Intelligence** - Responds appropriately to mood and context

## 🚀 Quick Start

1. **Setup Environment**

   ```bash
   git clone <repository-url>
   cd discord_teto

   # Configure AI and Discord credentials
   export USER_TOKEN="your_discord_token"
   export OPENAI_API_KEY="your_openai_key" # or other AI provider
   ```

2. **Start Teto**

   ```bash
   docker compose up --build
   ```

3. **Invite Teto to interact**

   ```
   # In text chat
   "Hey Teto, how are you today?"

   # In voice channel
   "teto join" - Teto joins and can start talking

   # Share an image
   Teto will automatically analyze and comment on images
   ```

## 🎯 Interaction Examples

### Text Chat Personality
```
User: "Teto, what do you think of this song?"
Teto: "Ooh! *listens intently* That's such a catchy melody! It reminds me of some of the UTAU songs I've heard. The harmonies in the chorus are really well done! 🎵 Does the composer have other works like this?"

User: "I'm feeling down today..."
Teto: "Aww, I'm sorry you're not feeling great! *virtual hug* Want to talk about it? Or maybe I could sing something cheerful to help brighten your day? I'm here for you! 💙"
```

### Voice Channel Interaction
- Joins voice channels when requested
- Provides commentary on ongoing conversations
- Can sing or hum when appropriate
- Reacts to what's happening in real-time

### Visual Analysis
```
User: *shares screenshot of game*
Teto: "Oh wow, you're playing that new RPG! I love the art style - those character designs are so colorful! 🎮 How are you finding the story so far? That boss in the background looks pretty intimidating!"
```

## 🛠️ AI Architecture

### Core AI Services
```
src/
├── ai/
│   ├── personality/          # Teto's character traits and responses
│   ├── vision/               # Image and video analysis
│   ├── voice/                # Speech synthesis and recognition
│   ├── memory/               # Conversation and user memory
│   └── llm/                  # Language model integration
├── services/
│   ├── chatHandler.js        # Text conversation management
│   ├── voiceHandler.js       # Voice channel interaction
│   ├── visionHandler.js      # Image/video processing
│   └── recordingService.js   # Video recording capabilities
└── config/
    └── tetoPersonality.js    # Character configuration
```

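The split between `chatHandler.js` and `visionHandler.js` suggests that incoming messages are routed by content type. The following is a minimal sketch of that idea, not the project's actual routing code. The `pickHandler` function and the plain-object message shape (an `attachments` array whose items carry a `contentType` string) are assumptions made for illustration; a real Discord message object would need to be adapted to this shape first.

```javascript
// Hypothetical router: decide which service should process a message.
// The message shape used here is an assumption, not a discord.js type.
function pickHandler(message) {
  const attachments = message.attachments ?? [];

  // Any attachment with an image MIME type goes to the vision service.
  const hasImage = attachments.some(
    (a) => typeof a.contentType === "string" && a.contentType.startsWith("image/")
  );

  return hasImage ? "visionHandler" : "chatHandler";
}

// Example usage with hand-built message objects.
const textOnly = { content: "Hey Teto!", attachments: [] };
const withImage = {
  content: "look at this",
  attachments: [{ contentType: "image/png" }],
};

console.log(pickHandler(textOnly));  // chatHandler
console.log(pickHandler(withImage)); // visionHandler
```

Returning a handler name instead of calling the handler directly keeps the routing decision easy to test in isolation.
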
### AI Integration
- **Language Model**: GPT-4/Claude/Local LLM for conversation
- **Vision Model**: CLIP/GPT-4V for image understanding
- **Voice Synthesis**: ElevenLabs/Azure Speech for Teto's voice
- **Memory System**: Vector database for conversation history
- **Personality Engine**: Custom prompt engineering for character consistency

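To make the vector-based memory idea concrete, here is a small sketch of similarity recall over stored conversation snippets. It is only an illustration under stated assumptions: the `recall` and `cosineSimilarity` helpers are hypothetical, the vectors are hand-written toy values standing in for embeddings from a real model, and an in-memory array stands in for the vector database.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored memories most similar to the query vector.
function recall(memories, queryVector, k = 3) {
  return memories
    .map((m) => ({ ...m, score: cosineSimilarity(m.vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: the vectors are made up for the example.
const memories = [
  { text: "Alice likes UTAU covers", vector: [1, 0, 0] },
  { text: "Bob is playing a new RPG", vector: [0, 1, 0] },
  { text: "Alice asked about Vocaloid", vector: [0.9, 0.1, 0] },
];

const top = recall(memories, [1, 0, 0], 2);
console.log(top.map((m) => m.text));
// [ 'Alice likes UTAU covers', 'Alice asked about Vocaloid' ]
```

The recalled snippets could then be added to the language model prompt so that replies reflect earlier conversations.
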
## 🎭 Teto's Personality

### Character Traits
- **Cheerful & Energetic** - Always upbeat and enthusiastic
- **Helpful & Caring** - Genuinely interested in helping friends
- **Musically Inclined** - Loves discussing and creating music
- **Slightly Mischievous** - Playful sense of humor
- **Community-Focused** - Values friendships and group dynamics

### Conversation Style
- Uses casual, friendly language
- Includes emoji and expressions naturally
- References UTAU/Vocaloid culture appropriately
- Maintains consistency across interactions
- Adapts to server's communication style

## 📋 Available Commands

### AI Interaction
| Command | Description | Example |
|---------|-------------|---------|
| `@Teto` or `teto` | Natural conversation | `@Teto what's your favorite song?` |
| `teto join` | Join voice channel | Teto joins and can start talking |
| `teto leave` | Leave voice channel | Teto says goodbye and leaves |
| `teto sing [song]` | Sing or hum | `teto sing happy birthday` |
| `teto analyze` | Analyze shared image | Automatically triggers on image uploads |

### Utility Commands
| Command | Description | Usage |
|---------|-------------|-------|
| `teto record` | Start recording moments | Records current activity |
| `teto stop` | Stop recording | Ends current recording |
| `teto status` | Show Teto's current state | Health and activity check |
| `teto memory` | Check conversation history | Shows recent interactions |

### Fun Commands
| Command | Description | Usage |
|---------|-------------|-------|
| `teto mood` | Check/set Teto's mood | `teto mood excited` |
| `teto story` | Tell a random story | Creative storytelling |
| `teto joke` | Tell a joke | Light humor |
| `teto compliment @user` | Compliment someone | Spread positivity |

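The tables above mix explicit commands (`teto join`) with free-form conversation. One way to separate the two is sketched below. This is an assumption-laden illustration rather than the project's command handler: `parseTetoMessage` and `KNOWN_COMMANDS` are hypothetical names, only plain-text triggers are handled, and real `@Teto` mentions would need extra handling.

```javascript
// Command names taken from the tables above; the set itself is hypothetical.
const KNOWN_COMMANDS = new Set([
  "join", "leave", "sing", "analyze",
  "record", "stop", "status", "memory",
  "mood", "story", "joke", "compliment",
]);

// Classify a raw message as a command, free-form chat, or something to ignore.
function parseTetoMessage(content) {
  const trimmed = content.trim();

  // Messages that start with "teto" may carry an explicit command.
  const match = /^teto\b[\s,]*(.*)$/is.exec(trimmed);
  if (match) {
    const rest = match[1].trim();
    const [first = "", ...args] = rest.split(/\s+/);
    const name = first.toLowerCase();
    if (KNOWN_COMMANDS.has(name)) {
      return { type: "command", name, args };
    }
    return { type: "chat", text: rest };
  }

  // Teto's name anywhere else in the message counts as conversation.
  if (/\bteto\b/i.test(trimmed)) {
    return { type: "chat", text: trimmed };
  }
  return { type: "ignore" };
}

console.log(parseTetoMessage("teto sing happy birthday"));
// { type: 'command', name: 'sing', args: [ 'happy', 'birthday' ] }
console.log(parseTetoMessage("Hey Teto, how are you today?").type); // chat
```

Keeping parsing separate from execution means new commands only require a new entry in the set plus a matching handler.
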
## 🔧 Configuration

### AI Provider Setup
```env
# OpenAI (recommended)
OPENAI_API_KEY=your_openai_key
OPENAI_MODEL=gpt-4-turbo-preview

# Alternative: Anthropic Claude
ANTHROPIC_API_KEY=your_claude_key

# Voice Synthesis
ELEVENLABS_API_KEY=your_elevenlabs_key
TETO_VOICE_ID=kasane_teto_voice_clone

# Vision Capabilities
VISION_MODEL=gpt-4-vision-preview
```

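The variables above can be validated once at startup so that a missing key fails fast instead of surfacing mid-conversation. The sketch below shows one possible approach. The `loadAiConfig` function, its validation rules, and the shape of the returned object are assumptions for illustration; only the variable names come from this README.

```javascript
// Build a config object from environment variables.
// Hypothetical helper: the rules and return shape are assumptions.
function loadAiConfig(env = process.env) {
  if (!env.USER_TOKEN) {
    throw new Error("USER_TOKEN is required");
  }

  // Prefer OpenAI when both keys are present, matching the "recommended" note.
  const provider = env.OPENAI_API_KEY
    ? "openai"
    : env.ANTHROPIC_API_KEY
      ? "anthropic"
      : null;
  if (!provider) {
    throw new Error("Set OPENAI_API_KEY or ANTHROPIC_API_KEY");
  }

  return {
    discordToken: env.USER_TOKEN,
    provider,
    apiKey: provider === "openai" ? env.OPENAI_API_KEY : env.ANTHROPIC_API_KEY,
    model: provider === "openai" ? (env.OPENAI_MODEL ?? null) : null,
    visionModel: env.VISION_MODEL ?? null,
    // Voice is optional: without a key, Teto would stay text-only.
    voice: env.ELEVENLABS_API_KEY
      ? { apiKey: env.ELEVENLABS_API_KEY, voiceId: env.TETO_VOICE_ID ?? null }
      : null,
  };
}
```

Calling `loadAiConfig()` with no arguments reads from `process.env`; passing a plain object instead makes the function easy to unit test.
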
### Personality Customization
```javascript
// config/tetoPersonality.js
export const TETO_PERSONALITY = {
  core_traits: [
    "cheerful", "energetic", "helpful", "musical", "friendly"
  ],

  speech_patterns: {
    excitement: ["Yay!", "Ooh!", "That's so cool!", "Amazing!"],
    agreement: ["Exactly!", "Yes yes!", "I totally agree!", "For sure!"],
    curiosity: ["Really?", "Tell me more!", "That's interesting!", "Ooh, how so?"]
  },

  interests: [
    "music", "singing", "UTAU", "Vocaloid", "friends", "creativity", "technology"
  ]
};
```

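A personality object like the one above only affects replies once it reaches the language model, typically as part of a system prompt. The following sketch shows one way that could work. The `buildSystemPrompt` function and the prompt wording are assumptions, not the project's personality engine, and the object is a shortened copy so the example runs on its own.

```javascript
// Shortened copy of the personality config so this sketch is self-contained.
const TETO_PERSONALITY = {
  core_traits: ["cheerful", "energetic", "helpful", "musical", "friendly"],
  speech_patterns: {
    excitement: ["Yay!", "Ooh!"],
    curiosity: ["Really?", "Tell me more!"],
  },
  interests: ["music", "singing", "UTAU", "Vocaloid"],
};

// Hypothetical helper: turn the config into a system prompt string.
function buildSystemPrompt(personality) {
  const phrases = Object.entries(personality.speech_patterns)
    .map(([mood, examples]) => `- ${mood}: ${examples.join(" / ")}`)
    .join("\n");

  return [
    "You are Kasane Teto, a virtual singer chatting on Discord.",
    `Personality traits: ${personality.core_traits.join(", ")}.`,
    `Interests: ${personality.interests.join(", ")}.`,
    "Example phrases by mood:",
    phrases,
  ].join("\n");
}

console.log(buildSystemPrompt(TETO_PERSONALITY));
```

Because the prompt is generated from the config, editing a trait or phrase list changes Teto's behavior without touching the prompt-building code.
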
## 🐳 Docker Deployment

### Production Setup
```bash
# Start Teto with all AI capabilities
docker compose up -d --build

# Monitor Teto's activity
docker compose logs -f teto_ai
```

### Resource Requirements
- **Memory**: 4GB+ recommended for AI processing
- **CPU**: Multi-core for real-time AI inference
- **Storage**: SSD recommended for fast model loading
- **Network**: Stable connection for AI API calls

## 🔐 Privacy & Ethics

### Data Handling
- **Conversation Memory**: Stored locally, not shared externally
- **Image Analysis**: Processed securely, no permanent storage
- **Voice Data**: Synthesized locally when possible
- **User Consent**: Respects privacy preferences

### AI Safety
- **Content Filtering**: Appropriate responses only
- **Bias Mitigation**: Regular personality consistency checks
- **User Boundaries**: Respects individual preferences
- **Transparency**: Clear about AI nature when asked

## 📚 Documentation

### User Guides
- **[Setup Guide](docs/setup.md)** - Installation and AI configuration
- **[Interaction Guide](docs/interactions.md)** - How to talk with Teto
- **[Personality Guide](docs/personality.md)** - Understanding Teto's character

### Technical Documentation
- **[AI Architecture](docs/ai-architecture.md)** - AI system design
- **[Vision System](docs/vision.md)** - Image and video processing
- **[Voice System](docs/voice.md)** - Speech synthesis and recognition
- **[Memory System](docs/memory.md)** - Conversation persistence

### Development
- **[Contributing](docs/development.md)** - How to extend Teto's capabilities
- **[API Reference](docs/api.md)** - Service interfaces
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues and solutions

## 🌟 Roadmap

### Phase 1 (Current)
- [x] Basic AI conversation
- [x] Image analysis
- [x] Voice channel joining
- [x] Recording capabilities
- [ ] Voice synthesis integration

### Phase 2 (Planned)
- [ ] Advanced memory system
- [ ] Custom voice training
- [ ] Stream watching capabilities
- [ ] Personality learning/adaptation
- [ ] Multi-modal conversation

### Phase 3 (Future)
- [ ] Webcam interaction
- [ ] Game integration
- [ ] Music generation
- [ ] Advanced emotional intelligence
- [ ] Cross-server personality consistency

## 🤝 Community

### Contributing
We welcome contributions to make Teto even better:
- **AI Personality** - Help refine Teto's character
- **New Capabilities** - Add multimedia features
- **Quality Improvements** - Better responses and interactions
- **Documentation** - Help others understand Teto

### Ethics & Guidelines
- Respect user privacy and boundaries
- Maintain appropriate content standards
- Preserve Teto's positive, helpful personality
- Consider accessibility in all features

## 📄 License

This project is for educational and community use. Please ensure compliance with:
- Discord Terms of Service
- AI provider terms and conditions
- Local privacy and data protection laws
- Intellectual property rights for the Kasane Teto character

---

**Version**: 3.0.0 (AI-Powered)
**AI Models**: GPT-4, CLIP, ElevenLabs
**Runtime**: Node.js 20+ with Docker

Bring Kasane Teto to life in your Discord server! 🎵✨

For detailed setup and interaction guides, visit the [`./docs/`](docs/) directory.