Mikolaj Wojciech Gorski 3f2918a7d5 feat: enhance webcam recording initialization timing - Increase initial wait time to 8s for better connection stability - Add SSRC mapping readiness check with 15s timeout - Extend camera detection retry interval to 2s - Add comprehensive SSRC mapping debug logging before recording - Better state validation before starting webcam stream		2025-07-26 16:51:23 +02:00
dashboard_mockups	Reworking the dashboard with Gemini, gotta go in myself, because Gemini can't seem to vibe out the auth.	2025-07-24 06:48:19 +02:00
discord.js-selfbot-v13@d8e4393ff5	feat: enhance webcam recording initialization timing	2025-07-26 16:51:23 +02:00
docs	feat: add auto-stop functionality for webcam and video recordings	2025-07-26 16:13:19 +02:00
public	Reworked auth, added WS for imediate communication, cleaned up architecture.	2025-07-26 05:43:38 +02:00
src	feat: enhance webcam recording initialization timing	2025-07-26 16:51:23 +02:00
test	Major refactor: Transform into AI-powered Kasane Teto companion bot	2025-07-26 13:08:47 +02:00
views	Major refactor: Transform into AI-powered Kasane Teto companion bot	2025-07-26 13:08:47 +02:00
.dockerignore	Reworked auth, added WS for imediate communication, cleaned up architecture.	2025-07-26 05:43:38 +02:00
.gitignore	Initial commit	2025-07-21 06:44:37 +02:00
.gitmodules	feat: add auto-stop functionality for webcam and video recordings	2025-07-26 16:13:19 +02:00
bot.js	Reworked auth, added WS for imediate communication, cleaned up architecture.	2025-07-26 05:43:38 +02:00
docker-compose.yaml.release	feat: add auto-stop functionality for webcam and video recordings	2025-07-26 16:13:19 +02:00
docker-compose.yml	feat: add auto-stop functionality for webcam and video recordings	2025-07-26 16:13:19 +02:00
Dockerfile	Major refactor: Transform into AI-powered Kasane Teto companion bot	2025-07-26 13:08:47 +02:00
Dockerfile.dev	feat: add auto-stop functionality for webcam and video recordings	2025-07-26 16:13:19 +02:00
entry.sh	feat: enhance webcam recording initialization timing	2025-07-26 16:51:23 +02:00
package-lock.json	Major refactor: Transform into AI-powered Kasane Teto companion bot	2025-07-26 13:08:47 +02:00
package.json	Major refactor: Transform into AI-powered Kasane Teto companion bot	2025-07-26 13:08:47 +02:00
README.md	Updated the docs to focus on a local only stack instead of one relient on services like OpenAI, Eleven labs and so on.	2025-07-26 14:26:18 +02:00

README.md

Kasane Teto AI Companion Bot

An AI-powered Discord bot that roleplays as Kasane Teto, providing natural conversation, voice interaction, image analysis, and multimedia engagement for your Discord server. Built on a local-first AI stack (vLLM, Piper, Whisper) with a modular architecture.

🎭 Meet Teto

Kasane Teto is your server's AI companion who can:

💬 Chat naturally in text channels with Teto's personality
🎤 Join voice channels and speak with voice synthesis
👀 Analyze images and visual content you share
📹 Watch video streams and provide commentary
🎥 Record memorable moments for later review
🤖 Roleplay authentically as the beloved virtual singer

✨ Core Features

🧠 AI-Powered Interaction

Natural Language Processing - Understands context and maintains conversations
Character Roleplay - Authentic Kasane Teto personality and mannerisms
Memory System - Remembers past interactions and user preferences
Contextual Responses - Adapts to server culture and ongoing conversations

🎥 Multimedia Capabilities

Image Recognition - Analyzes and comments on shared images
Video Stream Watching - Can observe and react to Discord streams
Webcam Integration - Potential to interact with video feeds
Screen Recording - Capture and save interesting moments
Voice Synthesis - Speaks in voice channels as Teto

🎵 Teto-Specific Features

Character Consistency - Maintains Teto's cheerful, energetic personality
Music Knowledge - Discusses Vocaloid, UTAU, and music topics
Community Integration - Learns your friend group's dynamics
Emotional Intelligence - Responds appropriately to mood and context

🚀 Quick Start

Important

This project is designed to run exclusively within Docker containers. Bare-metal installation is not officially supported. All instructions assume a working Docker environment.

Setup Environment

git clone <repository-url>
cd discord_teto

# Configure Discord credentials & local AI endpoints
export USER_TOKEN="your_discord_token"
export VLLM_ENDPOINT="http://localhost:8000/v1" # Or your vLLM server (OpenAI-compatible base URL)
export WYOMING_HOST="localhost" # Or your Wyoming server
export WYOMING_PORT="10300"

Start Teto
```
docker compose up --build
```

Invite Teto to interact

# In text chat
"Hey Teto, how are you today?"

# In voice channel
"teto join" - Teto joins and can start talking

# Share an image
Teto will automatically analyze and comment on images
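
As a rough illustration of the automatic image handling described above, the sketch below filters a message's attachments down to images. The function name and the `{ name, contentType }` shape are assumptions modelled on discord.js-style attachment objects, not code taken from this repository.

```javascript
// Sketch: pick image attachments out of a message so they can be sent for analysis.
// The { name, contentType } shape is an assumption modelled on discord.js attachments.
const IMAGE_EXTENSIONS = /\.(png|jpe?g|gif|webp)$/i;

function findImageAttachments(attachments) {
  return attachments.filter((file) => {
    if (typeof file.contentType === "string") {
      return file.contentType.startsWith("image/");
    }
    // Fall back to the file extension when no MIME type is reported.
    return IMAGE_EXTENSIONS.test(file.name ?? "");
  });
}
```

Checking the MIME type first and the extension second keeps the filter working for uploads that arrive without a reported content type.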

🎯 Interaction Examples

Text Chat Personality

User: "Teto, what do you think of this song?"
Teto: "Ooh! *listens intently* That's such a catchy melody! It reminds me of some of the UTAU songs I've heard. The harmonies in the chorus are really well done! 🎵 Does the composer have other works like this?"

User: "I'm feeling down today..."
Teto: "Aww, I'm sorry you're not feeling great! *virtual hug* Want to talk about it? Or maybe I could sing something cheerful to help brighten your day? I'm here for you! 💙"

Voice Channel Interaction

Joins voice channels when requested
Provides commentary on ongoing conversations
Can sing or hum when appropriate
Reacts to what's happening in real-time

Visual Analysis

User: *shares screenshot of game*
Teto: "Oh wow, you're playing that new RPG! I love the art style - those character designs are so colorful! 🎮 How are you finding the story so far? That boss in the background looks pretty intimidating!"

🛠️ AI Architecture

Core AI Services

src/
├── ai/
│   ├── personality/          # Teto's character traits and responses
│   ├── vision/              # Image and video analysis
│   ├── voice/               # Speech synthesis and recognition  
│   ├── memory/              # Conversation and user memory
│   └── llm/                 # Language model integration
├── services/
│   ├── chatHandler.js       # Text conversation management
│   ├── voiceHandler.js      # Voice channel interaction
│   ├── visionHandler.js     # Image/video processing
│   └── recordingService.js  # Video recording capabilities
└── config/
    └── tetoPersonality.js   # Character configuration

AI Integration

Language Model: Self-hosted LLM via vLLM (OpenAI-compatible endpoint)
Vision Model: Multi-modal models served through vLLM
Voice Synthesis: Piper TTS via Wyoming protocol
Speech Recognition: Whisper STT via Wyoming protocol
Memory System: Local vector database for conversation history
Personality Engine: Custom prompt engineering for character consistency
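
Because vLLM exposes an OpenAI-compatible API, a chat turn can be sent as a standard chat completion request. The sketch below shows one way to do that; `buildChatRequest` and `askTeto` are hypothetical helper names, and the endpoint is assumed to already include the `/v1` prefix, as in the Configuration section.

```javascript
// Sketch: build an OpenAI-compatible chat completion request for a vLLM server.
// buildChatRequest and askTeto are hypothetical helpers, not part of this repository.
function buildChatRequest(endpoint, model, systemPrompt, userMessage) {
  // Strip trailing slashes so ".../v1/" and ".../v1" behave the same.
  const base = endpoint.replace(/\/+$/, "");
  return {
    url: `${base}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    },
  };
}

// Sends the request with the global fetch available in Node.js 18+.
async function askTeto(endpoint, model, systemPrompt, userMessage) {
  const { url, options } = buildChatRequest(endpoint, model, systemPrompt, userMessage);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`LLM request failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running model server.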

🎭 Teto's Personality

Character Traits

Cheerful & Energetic - Always upbeat and enthusiastic
Helpful & Caring - Genuinely interested in helping friends
Musically Inclined - Loves discussing and creating music
Slightly Mischievous - Playful sense of humor
Community-Focused - Values friendships and group dynamics

Conversation Style

Uses casual, friendly language
Includes emoji and expressions naturally
References UTAU/Vocaloid culture appropriately
Maintains consistency across interactions
Adapts to server's communication style

📋 Available Commands

AI Interaction

Command	Description	Example
`@Teto` or `teto`	Natural conversation	`@Teto what's your favorite song?`
`teto join`	Join voice channel	Teto joins and can start talking
`teto leave`	Leave voice channel	Teto says goodbye and leaves
`teto sing [song]`	Sing or hum	`teto sing happy birthday`
`teto analyze`	Analyze shared image	Automatically triggers on image uploads

Utility Commands

Command	Description	Usage
`teto record`	Start recording moments	Records current activity
`teto stop`	Stop recording	Ends current recording
`teto status`	Show Teto's current state	Health and activity check
`teto memory`	Check conversation history	Shows recent interactions

Fun Commands

Command	Description	Usage
`teto mood`	Check/set Teto's mood	`teto mood excited`
`teto story`	Tell a random story	Creative storytelling
`teto joke`	Tell a joke	Light humor
`teto compliment @user`	Compliment someone	Spread positivity
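
The command tables above could be handled by a small parser along the following lines. This is a sketch under assumptions: `parseTetoCommand` is a hypothetical name, only messages that start with `teto` or `@Teto` are recognised, and a real Discord mention would arrive as a user ID token rather than the literal text `@Teto`.

```javascript
// Hypothetical parser for the "teto <command> [args]" messages listed in the tables above.
const KNOWN_COMMANDS = new Set([
  "join", "leave", "sing", "analyze",
  "record", "stop", "status", "memory",
  "mood", "story", "joke", "compliment",
]);

function parseTetoCommand(text) {
  // Accept "teto ..." or "@Teto ...", case-insensitively, at the start of the message.
  const match = /^\s*(?:@teto|teto)\b[\s,:]*(.*)$/is.exec(text);
  if (!match) return null; // Message is not addressed to Teto.

  const rest = match[1].trim();
  const [first = "", ...args] = rest.split(/\s+/);
  const name = first.toLowerCase();

  if (KNOWN_COMMANDS.has(name)) {
    return { type: "command", name, args: args.join(" ") };
  }
  // Anything else addressed to Teto is treated as natural conversation.
  return { type: "chat", text: rest };
}
```

For example, `parseTetoCommand("teto sing happy birthday")` yields a `sing` command with the arguments `happy birthday`, while `@Teto what's your favorite song?` falls through to the chat path.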

🔧 Configuration

Local AI Provider Setup

# Local vLLM Server (OpenAI Compatible)
VLLM_ENDPOINT="http://localhost:8000/v1"
LOCAL_MODEL_NAME="mistralai/Mistral-7B-Instruct-v0.2" # Or your preferred model

# Wyoming Protocol for Voice (Piper TTS / Whisper STT)
WYOMING_HOST="localhost"
WYOMING_PORT="10300"
PIPER_VOICE="en_US-lessac-medium"

# Vision Capabilities are enabled if the vLLM model is multi-modal
VISION_ENABLED=true
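
One way to consume these settings is to validate them once at startup and fail fast when something is missing. The loader below is only a sketch: `loadConfig` is a hypothetical helper, and treating `PIPER_VOICE` as optional with the default shown above is an assumption.

```javascript
// Sketch: read the settings above from an environment-like object and fail fast on gaps.
// loadConfig is a hypothetical helper; the variable names come from the listing above.
function loadConfig(env) {
  const required = ["VLLM_ENDPOINT", "LOCAL_MODEL_NAME", "WYOMING_HOST", "WYOMING_PORT"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing configuration: ${missing.join(", ")}`);
  }

  const port = Number.parseInt(env.WYOMING_PORT, 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`WYOMING_PORT is not a valid port: ${env.WYOMING_PORT}`);
  }

  return {
    vllmEndpoint: env.VLLM_ENDPOINT,
    modelName: env.LOCAL_MODEL_NAME,
    wyoming: { host: env.WYOMING_HOST, port },
    // Assumption: fall back to the voice from the example when none is set.
    piperVoice: env.PIPER_VOICE ?? "en_US-lessac-medium",
    // Environment values are strings, so "true" must be compared explicitly.
    visionEnabled: env.VISION_ENABLED === "true",
  };
}
```

In the bot this would typically be called as `loadConfig(process.env)` before any service connects.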

Personality Customization

// src/config/tetoPersonality.js
export const TETO_PERSONALITY = {
  core_traits: [
    "cheerful", "energetic", "helpful", "musical", "friendly"
  ],
  
  speech_patterns: {
    excitement: ["Yay!", "Ooh!", "That's so cool!", "Amazing!"],
    agreement: ["Exactly!", "Yes yes!", "I totally agree!", "For sure!"],
    curiosity: ["Really?", "Tell me more!", "That's interesting!", "Ooh, how so?"]
  },
  
  interests: [
    "music", "singing", "UTAU", "Vocaloid", "friends", "creativity", "technology"
  ]
};
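
To show how such a configuration might feed the prompt-based personality engine, here is a minimal sketch that renders a personality object into a system prompt. `buildSystemPrompt` is a hypothetical helper and the prompt wording is illustrative only; the project's actual prompts may differ.

```javascript
// Sketch: turn a personality object shaped like TETO_PERSONALITY into a system prompt.
// buildSystemPrompt is a hypothetical helper, and the prompt wording is illustrative only.
function buildSystemPrompt(personality) {
  const phrases = Object.entries(personality.speech_patterns)
    .map(([mood, examples]) => `- ${mood}: ${examples.join(" ")}`)
    .join("\n");

  return [
    "You are Kasane Teto, chatting with friends on a Discord server.",
    `Personality traits: ${personality.core_traits.join(", ")}.`,
    `Interests: ${personality.interests.join(", ")}.`,
    "Typical phrases by mood:",
    phrases,
  ].join("\n");
}

// Trimmed-down example input with the same shape as the configuration above.
const examplePersonality = {
  core_traits: ["cheerful", "energetic"],
  speech_patterns: { excitement: ["Yay!", "Ooh!"] },
  interests: ["music", "UTAU"],
};

console.log(buildSystemPrompt(examplePersonality));
```

Generating the prompt from data rather than hard-coding it means a trait or phrase can be changed in one place without touching the chat logic.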

🐳 Docker Deployment

This project is officially supported for Docker deployments only. The container-first approach is critical for managing the complex local AI stack, ensuring that all services, dependencies, and configurations operate together consistently.

Production Setup

# Start Teto with all AI capabilities
docker compose up -d --build

# Monitor Teto's activity
docker compose logs -f teto_ai

Resource Requirements

VRAM: 8GB+ for quantized 7B models (FP16 weights alone take about 14GB), 24GB+ for larger models
Memory: 16GB+ RAM recommended
CPU: Modern multi-core CPU
Storage: Fast SSD for model weights (15GB+ per model)
Network: Local network for inter-service communication
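
The VRAM figures can be sanity-checked with simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. The helper below is a back-of-the-envelope sketch (the function name is made up for illustration) and ignores KV cache, activations, and runtime overhead.

```javascript
// Rough estimate of the memory needed just to hold model weights.
// KV cache, activations and runtime overhead come on top of this figure.
function estimateWeightsGB(billionsOfParams, bytesPerParam) {
  // 1 billion parameters at 1 byte each is 1 GB (decimal), so the product is already in GB.
  return billionsOfParams * bytesPerParam;
}

console.log(estimateWeightsGB(7, 2));   // 7B model at FP16 (2 bytes per parameter): 14
console.log(estimateWeightsGB(7, 0.5)); // 7B model at 4-bit quantization: 3.5
```

By this estimate an 8GB card fits a 7B model only when it is quantized, which is worth keeping in mind when picking `LOCAL_MODEL_NAME`.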

🔐 Privacy & Ethics

Data Handling

Conversation Memory: Stored locally, not shared externally
Image Analysis: Processed by the self-hosted vision model, with no permanent storage
Voice Data: Synthesized locally when possible
User Consent: Respects privacy preferences

AI Safety

Content Filtering: Appropriate responses only
Bias Mitigation: Regular personality consistency checks
User Boundaries: Respects individual preferences
Transparency: Clear about AI nature when asked

📚 Documentation

User Guides

Setup Guide - Installation and AI configuration
Interaction Guide - How to talk with Teto
Personality Guide - Understanding Teto's character

Technical Documentation

AI Architecture - AI system design
Vision System - Image and video processing
Voice System - Speech synthesis and recognition
Memory System - Conversation persistence

Development

Contributing - How to extend Teto's capabilities
API Reference - Service interfaces
Troubleshooting - Common issues and solutions

🌟 Roadmap

Phase 1 (Current)

Basic AI conversation
Image analysis
Voice channel joining
Recording capabilities
Voice synthesis integration

Phase 2 (Planned)

Advanced memory system
Custom voice training
Stream watching capabilities
Personality learning/adaptation
Multi-modal conversation

Phase 3 (Future)

Webcam interaction
Game integration
Music generation
Advanced emotional intelligence
Cross-server personality consistency

🤝 Community

Contributing

We welcome contributions to make Teto even better:

AI Personality - Help refine Teto's character
New Capabilities - Add multimedia features
Quality Improvements - Better responses and interactions
Documentation - Help others understand Teto

Ethics & Guidelines

Respect user privacy and boundaries
Maintain appropriate content standards
Preserve Teto's positive, helpful personality
Consider accessibility in all features

📄 License

This project is for educational and community use. Please ensure compliance with:

Discord Terms of Service
AI provider terms and conditions
Local privacy and data protection laws
Intellectual property rights for Kasane Teto character

Version: 3.0.0 (AI-Powered)
AI Stack: Local-First (vLLM, Piper, Whisper)
Runtime: Node.js 20+ with Docker

Bring Kasane Teto to life in your Discord server! 🎵✨

For detailed setup and interaction guides, visit the ./docs/ directory.