
# Kasane Teto AI Companion Bot

An AI-powered Discord bot that roleplays as Kasane Teto, offering natural conversation, voice interaction, image analysis, and multimedia engagement for your Discord server. It runs on a local-first AI stack (vLLM, Piper, Whisper) and has a modular architecture.

## 🎭 Meet Teto

Kasane Teto is your server's AI companion who can:
- 💬 **Chat naturally** in text channels with Teto's personality
- 🎤 **Join voice channels** and speak with voice synthesis (synthesis integration is still in progress)
- 👀 **Analyze images** and visual content you share
- 📹 **Watch video streams** and provide commentary (planned)
- 🎥 **Record memorable moments** for later review
- 🤖 **Roleplay authentically** as the beloved virtual singer

## ✨ Core Features

### 🧠 AI-Powered Interaction
- **Natural Language Processing** - Understands context and maintains conversations
- **Character Roleplay** - Authentic Kasane Teto personality and mannerisms
- **Memory System** - Remembers past interactions and user preferences
- **Contextual Responses** - Adapts to server culture and ongoing conversations

### 🎥 Multimedia Capabilities
- **Image Recognition** - Analyzes and comments on shared images
- **Video Stream Watching** - Observes and reacts to Discord streams (planned for Phase 2)
- **Webcam Integration** - Interaction with video feeds (planned for Phase 3)
- **Screen Recording** - Captures and saves interesting moments
- **Voice Synthesis** - Speaks in voice channels as Teto (integration in progress)

### 🎵 Teto-Specific Features
- **Character Consistency** - Maintains Teto's cheerful, energetic personality
- **Music Knowledge** - Discusses Vocaloid, UTAU, and music topics
- **Community Integration** - Learns your friend group's dynamics
- **Emotional Intelligence** - Responds appropriately to mood and context

## 🚀 Quick Start

> [!IMPORTANT]
> This project is designed to run exclusively within Docker containers. Bare-metal installation is not officially supported. All instructions assume a working Docker environment.

1. **Setup Environment**
   ```bash
   git clone <repository-url>
   cd discord_teto

   # Configure Discord credentials & local AI endpoints
   export USER_TOKEN="your_discord_token"
   export VLLM_ENDPOINT="http://localhost:8000/v1" # Or your vLLM server (OpenAI-compatible base URL)
   export WYOMING_HOST="localhost" # Or your Wyoming server
   export WYOMING_PORT="10300"
   ```

2. **Start Teto**
   ```bash
   docker compose up --build
   ```

3. **Invite Teto to interact**
   ```
   # In text chat
   "Hey Teto, how are you today?"

   # In voice channel
   "teto join" - Teto joins and can start talking

   # Share an image
   Teto will automatically analyze and comment on images
   ```

## 🎯 Interaction Examples

### Text Chat Personality
```
User: "Teto, what do you think of this song?"
Teto: "Ooh! *listens intently* That's such a catchy melody! It reminds me of some of the UTAU songs I've heard. The harmonies in the chorus are really well done! 🎵 Does the composer have other works like this?"

User: "I'm feeling down today..."
Teto: "Aww, I'm sorry you're not feeling great! *virtual hug* Want to talk about it? Or maybe I could sing something cheerful to help brighten your day? I'm here for you! 💙"
```

### Voice Channel Interaction
- Joins voice channels when requested
- Provides commentary on ongoing conversations
- Can sing or hum when appropriate
- Reacts to what's happening in real-time

### Visual Analysis
```
User: *shares screenshot of game*
Teto: "Oh wow, you're playing that new RPG! I love the art style - those character designs are so colorful! 🎮 How are you finding the story so far? That boss in the background looks pretty intimidating!"
```

## 🛠️ AI Architecture

### Core AI Services
```
src/
├── ai/
│   ├── personality/          # Teto's character traits and responses
│   ├── vision/              # Image and video analysis
│   ├── voice/               # Speech synthesis and recognition
│   ├── memory/              # Conversation and user memory
│   └── llm/                 # Language model integration
├── services/
│   ├── chatHandler.js       # Text conversation management
│   ├── voiceHandler.js      # Voice channel interaction
│   ├── visionHandler.js     # Image/video processing
│   └── recordingService.js  # Video recording capabilities
└── config/
    └── tetoPersonality.js   # Character configuration
```

### AI Integration
- **Language Model**: Self-hosted LLM via `vLLM` (OpenAI-compatible endpoint)
- **Vision Model**: Multi-modal models served through `vLLM`
- **Voice Synthesis**: `Piper` TTS via `Wyoming` protocol
- **Speech Recognition**: `Whisper` STT via `Wyoming` protocol
- **Memory System**: Local vector database for conversation history
- **Personality Engine**: Custom prompt engineering for character consistency
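
As a rough illustration of the language-model integration listed above, the sketch below builds a request for vLLM's OpenAI-compatible `/chat/completions` route using Node's built-in `fetch`. The helper names (`buildChatRequest`, `askTeto`), the system prompt text, and the sampling values are assumptions for illustration, not the project's actual code. The endpoint and model defaults mirror the sample values in the Configuration section.

```javascript
// Hypothetical helper names; endpoint/model defaults mirror this README's sample config.
const VLLM_ENDPOINT = process.env.VLLM_ENDPOINT || "http://localhost:8000/v1";
const MODEL_NAME = process.env.LOCAL_MODEL_NAME || "mistralai/Mistral-7B-Instruct-v0.2";

// Build the URL and fetch options for an OpenAI-style chat completion request.
function buildChatRequest(userMessage, history = []) {
  const messages = [
    // Assumed persona text; the real prompt would come from the personality config.
    { role: "system", content: "You are Kasane Teto: cheerful, energetic, and helpful." },
    ...history,
    { role: "user", content: userMessage },
  ];
  return {
    // Strip trailing slashes so the joined URL has exactly one separator.
    url: `${VLLM_ENDPOINT.replace(/\/+$/, "")}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: MODEL_NAME, messages, max_tokens: 256, temperature: 0.8 }),
    },
  };
}

// Send the request and return the assistant's reply text.
async function askTeto(userMessage, history = []) {
  const { url, options } = buildChatRequest(userMessage, history);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`vLLM request failed: HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Splitting request construction from the network call keeps the first half easy to unit test. Some models' chat templates reject the `system` role; in that case the persona text can be folded into the first user message instead.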

## 🎭 Teto's Personality

### Character Traits
- **Cheerful & Energetic** - Always upbeat and enthusiastic
- **Helpful & Caring** - Genuinely interested in helping friends
- **Musically Inclined** - Loves discussing and creating music
- **Slightly Mischievous** - Playful sense of humor
- **Community-Focused** - Values friendships and group dynamics

### Conversation Style
- Uses casual, friendly language
- Includes emoji and expressions naturally
- References UTAU/Vocaloid culture appropriately
- Maintains consistency across interactions
- Adapts to server's communication style

## 📋 Available Commands

### AI Interaction
| Command | Description | Example |
|---------|-------------|---------|
| `@Teto` or `teto` | Natural conversation | `@Teto what's your favorite song?` |
| `teto join` | Join voice channel | Teto joins and can start talking |
| `teto leave` | Leave voice channel | Teto says goodbye and leaves |
| `teto sing [song]` | Sing or hum | `teto sing happy birthday` |
| `teto analyze` | Analyze shared image | Automatically triggers on image uploads |

### Utility Commands
| Command | Description | Usage |
|---------|-------------|-------|
| `teto record` | Start recording moments | Records current activity |
| `teto stop` | Stop recording | Ends current recording |
| `teto status` | Show Teto's current state | Health and activity check |
| `teto memory` | Check conversation history | Shows recent interactions |

### Fun Commands
| Command | Description | Usage |
|---------|-------------|-------|
| `teto mood` | Check/set Teto's mood | `teto mood excited` |
| `teto story` | Tell a random story | Creative storytelling |
| `teto joke` | Tell a joke | Light humor |
| `teto compliment @user` | Compliment someone | Spread positivity |
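
The `teto <command> [arguments]` convention in the tables above could be parsed along these lines. This is a minimal sketch, not the bot's actual parser: `parseTetoMessage` and `KNOWN_COMMANDS` are hypothetical names, and the sketch only handles messages that begin with the plain-text trigger. Real Discord mentions arrive in message content as `<@id>` and would need separate handling.

```javascript
// Hypothetical parser for the "teto <command> [args]" convention shown in the tables above.
const KNOWN_COMMANDS = new Set([
  "join", "leave", "sing", "analyze", "record", "stop",
  "status", "memory", "mood", "story", "joke", "compliment",
]);

// Returns { type: "command", name, args }, { type: "chat", text },
// or null when the message does not start with the trigger word.
function parseTetoMessage(content) {
  // \b prevents words such as "tetoism" from matching the trigger.
  const match = content.trim().match(/^(?:@teto|teto)\b[\s,:]*(.*)$/is);
  if (!match) return null;

  const rest = match[1].trim();
  const [first = "", ...args] = rest.split(/\s+/);
  const name = first.toLowerCase();

  if (KNOWN_COMMANDS.has(name)) {
    return { type: "command", name, args: args.join(" ") };
  }
  // Anything else addressed to Teto is treated as natural conversation.
  return { type: "chat", text: rest };
}

console.log(parseTetoMessage("teto sing happy birthday"));
console.log(parseTetoMessage("@Teto what's your favorite song?"));
```

Falling back to a `chat` result for unrecognized input keeps natural conversation as the default path, with commands as the special case.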

## 🔧 Configuration

### Local AI Provider Setup
```env
# Local vLLM Server (OpenAI Compatible)
VLLM_ENDPOINT="http://localhost:8000/v1"
LOCAL_MODEL_NAME="mistralai/Mistral-7B-Instruct-v0.2" # Or your preferred model

# Wyoming Protocol for Voice (Piper TTS / Whisper STT)
WYOMING_HOST="localhost"
WYOMING_PORT="10300"
PIPER_VOICE="en_US-lessac-medium"

# Vision Capabilities are enabled if the vLLM model is multi-modal
VISION_ENABLED=true
```
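
One way these variables might be read at startup is sketched below. `loadConfig` is a hypothetical helper, and the fallback values simply repeat the samples above; the project's real loader may differ.

```javascript
// Hypothetical loader: reads the variables listed above and applies the sample defaults.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.WYOMING_PORT ?? "10300", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid WYOMING_PORT: ${env.WYOMING_PORT}`);
  }
  return {
    // Trailing slashes are stripped so route paths can be appended safely.
    vllmEndpoint: (env.VLLM_ENDPOINT ?? "http://localhost:8000/v1").replace(/\/+$/, ""),
    modelName: env.LOCAL_MODEL_NAME ?? "mistralai/Mistral-7B-Instruct-v0.2",
    wyoming: { host: env.WYOMING_HOST ?? "localhost", port },
    piperVoice: env.PIPER_VOICE ?? "en_US-lessac-medium",
    // Environment variables are strings, so compare explicitly instead of relying on truthiness.
    visionEnabled: (env.VISION_ENABLED ?? "false").toLowerCase() === "true",
  };
}

console.log(loadConfig({ VISION_ENABLED: "true" }));
```

Passing the environment in as a parameter makes the loader testable without touching `process.env`, and failing fast on a bad port surfaces configuration mistakes before any service connection is attempted.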

### Personality Customization
```javascript
// src/config/tetoPersonality.js
export const TETO_PERSONALITY = {
  core_traits: [
    "cheerful", "energetic", "helpful", "musical", "friendly"
  ],

  speech_patterns: {
    excitement: ["Yay!", "Ooh!", "That's so cool!", "Amazing!"],
    agreement: ["Exactly!", "Yes yes!", "I totally agree!", "For sure!"],
    curiosity: ["Really?", "Tell me more!", "That's interesting!", "Ooh, how so?"]
  },

  interests: [
    "music", "singing", "UTAU", "Vocaloid", "friends", "creativity", "technology"
  ]
};
```
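
A configuration object like this is typically turned into a system prompt before being sent to the language model. The sketch below shows one possible approach; `buildSystemPrompt` and the prompt wording are assumptions, and `samplePersonality` is a shortened stand-in with the same shape as `TETO_PERSONALITY`.

```javascript
// Sample object with the same shape as TETO_PERSONALITY above (values shortened).
const samplePersonality = {
  core_traits: ["cheerful", "energetic", "helpful"],
  speech_patterns: {
    excitement: ["Yay!", "Ooh!"],
    curiosity: ["Really?", "Tell me more!"],
  },
  interests: ["music", "UTAU", "friends"],
};

// Hypothetical helper: flattens the personality config into a system prompt string.
function buildSystemPrompt(personality) {
  // One line per speech-pattern category, e.g. "- When expressing excitement: Yay! / Ooh!"
  const patternLines = Object.entries(personality.speech_patterns).map(
    ([mood, phrases]) => `- When expressing ${mood}: ${phrases.join(" / ")}`
  );
  return [
    "You are Kasane Teto, a virtual singer chatting with friends on Discord.",
    `Personality: ${personality.core_traits.join(", ")}.`,
    `Interests: ${personality.interests.join(", ")}.`,
    "Typical phrases:",
    ...patternLines,
  ].join("\n");
}

console.log(buildSystemPrompt(samplePersonality));
```

Because the prompt is derived from the config, adding a trait or a new speech-pattern category changes Teto's behavior without touching the prompt-building code.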

## 🐳 Docker Deployment

This project is officially supported for **Docker deployments only**. The container-first approach is critical for managing the complex local AI stack, ensuring that all services, dependencies, and configurations operate together consistently.

### Production Setup
```bash
# Start Teto with all AI capabilities
docker compose up -d --build

# Monitor Teto's activity
docker compose logs -f teto_ai
```

### Resource Requirements
- **VRAM**: 8GB+ for quantized 7B models (FP16 weights alone need about 14GB), 24GB+ for larger models
- **Memory**: 16GB+ RAM recommended
- **CPU**: Modern multi-core CPU
- **Storage**: Fast SSD for model weights (15GB+ per model)
- **Network**: Local network for inter-service communication
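
How much VRAM a model needs depends heavily on numeric precision. As a back-of-the-envelope check, weight memory is roughly the parameter count multiplied by bytes per parameter. The helper below is a hypothetical illustration of that arithmetic and deliberately ignores the KV cache, activations, and runtime overhead.

```javascript
// Rough weight-only estimate: parameters x bytes per parameter.
// Ignores KV cache, activations, and runtime overhead, which add to the total.
function estimateWeightsGB(paramsBillions, bitsPerParam) {
  return (paramsBillions * bitsPerParam) / 8;
}

console.log(estimateWeightsGB(7, 16)); // FP16 7B model: 14 GB of weights
console.log(estimateWeightsGB(7, 4));  // 4-bit quantized 7B model: 3.5 GB of weights
```

Treat the result as a lower bound when choosing a GPU, since serving a model needs headroom beyond the weights themselves.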

## 🔐 Privacy & Ethics

### Data Handling
- **Conversation Memory**: Stored locally, not shared externally
- **Image Analysis**: Processed securely, no permanent storage
- **Voice Data**: Synthesized locally when possible
- **User Consent**: Respects privacy preferences

### AI Safety
- **Content Filtering**: Appropriate responses only
- **Bias Mitigation**: Regular personality consistency checks
- **User Boundaries**: Respects individual preferences
- **Transparency**: Clear about AI nature when asked

## 📚 Documentation

### User Guides
- **[Setup Guide](docs/setup.md)** - Installation and AI configuration
- **[Interaction Guide](docs/interactions.md)** - How to talk with Teto
- **[Personality Guide](docs/personality.md)** - Understanding Teto's character

### Technical Documentation
- **[AI Architecture](docs/ai-architecture.md)** - AI system design
- **[Vision System](docs/vision.md)** - Image and video processing
- **[Voice System](docs/voice.md)** - Speech synthesis and recognition
- **[Memory System](docs/memory.md)** - Conversation persistence

### Development
- **[Contributing](docs/development.md)** - How to extend Teto's capabilities
- **[API Reference](docs/api.md)** - Service interfaces
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues and solutions

## 🌟 Roadmap

### Phase 1 (Current)
- [x] Basic AI conversation
- [x] Image analysis
- [x] Voice channel joining
- [x] Recording capabilities
- [ ] Voice synthesis integration

### Phase 2 (Planned)
- [ ] Advanced memory system
- [ ] Custom voice training
- [ ] Stream watching capabilities
- [ ] Personality learning/adaptation
- [ ] Multi-modal conversation

### Phase 3 (Future)
- [ ] Webcam interaction
- [ ] Game integration
- [ ] Music generation
- [ ] Advanced emotional intelligence
- [ ] Cross-server personality consistency

## 🤝 Community

### Contributing
We welcome contributions to make Teto even better:
- **AI Personality** - Help refine Teto's character
- **New Capabilities** - Add multimedia features
- **Quality Improvements** - Better responses and interactions
- **Documentation** - Help others understand Teto

### Ethics & Guidelines
- Respect user privacy and boundaries
- Maintain appropriate content standards
- Preserve Teto's positive, helpful personality
- Consider accessibility in all features

## 📄 License

This project is for educational and community use. Please ensure compliance with:
- Discord Terms of Service
- AI provider terms and conditions
- Local privacy and data protection laws
- Intellectual property rights for the Kasane Teto character

---

- **Version**: 3.0.0 (AI-Powered)
- **AI Stack**: Local-First (vLLM, Piper, Whisper)
- **Runtime**: Node.js 20+ with Docker

Bring Kasane Teto to life in your Discord server! 🎵✨

For detailed setup and interaction guides, visit the [`./docs/`](docs/) directory.