Hello Google Cloud Build with AI community!
I'm excited to share Flame Audio AI, an open-source, full-stack voice platform powered by Google Generative AI, designed to bring speech-to-text, text-to-speech, and speaker diarization to your applications.
Quick Install & Setup
Follow these steps to get up and running locally.
git clone https://github.com/Bag-zy/flame-audio.git
cd flame-audio
npm install
npm run dev
- Before running npm run dev, create a .env.local in the project root that sets MONGODB_URI, NEXTAUTH_SECRET, and GOOGLE_API_KEY.
- Open localhost:3000 in your browser to see the live demo.
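If you want a clearer failure when one of the three variables from the .env.local step is missing, a small startup check can help. The sketch below is only an illustration: the variable names come from the setup step above, while missingEnvVars and assertEnv are hypothetical helpers and are not part of the Flame Audio AI codebase.

```typescript
// Variable names are from the setup step; the helpers are hypothetical.
const REQUIRED_ENV_VARS = ["MONGODB_URI", "NEXTAUTH_SECRET", "GOOGLE_API_KEY"] as const;

type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one readable error that lists every missing variable.
function assertEnv(env: EnvSource): void {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js or Next.js server context you would typically call assertEnv(process.env) once at startup, so a missing key shows up immediately instead of as a failed request later.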
Key Features
Speech-to-Text: Real-time transcription with multi-speaker support
Text-to-Speech: Natural, human-like voice synthesis
Speaker Diarization: Automatically label who's speaking when
Multi-Format & Multi-Language: Supports MP3, WAV, and M4A files and 50+ languages
Responsive UI: Light/dark mode toggle, mobile-friendly design
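To make the diarization feature a bit more concrete, here is a rough sketch of how labeled segments could be turned into a readable transcript. The DiarizedSegment shape and both function names are assumptions made for illustration; the actual data model in the repo may look different.

```typescript
// Hypothetical shape for one diarized segment (not taken from the repo).
interface DiarizedSegment {
  speaker: string; // e.g. "Speaker 1"
  startSec: number;
  endSec: number;
  text: string;
}

// Formats a number of seconds as mm:ss.
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.floor(totalSeconds);
  const minutes = Math.floor(whole / 60);
  const seconds = whole % 60;
  return `${String(minutes).padStart(2, "0")}:${String(seconds).padStart(2, "0")}`;
}

// Merges consecutive segments from the same speaker into one turn,
// then renders one line per turn. The input array is not modified.
function renderTranscript(segments: DiarizedSegment[]): string {
  const turns: DiarizedSegment[] = [];
  for (const seg of segments) {
    const last = turns[turns.length - 1];
    if (last && last.speaker === seg.speaker) {
      last.endSec = seg.endSec;
      last.text = `${last.text} ${seg.text}`;
    } else {
      turns.push({ ...seg });
    }
  }
  return turns
    .map((t) => `[${formatTimestamp(t.startSec)}] ${t.speaker}: ${t.text}`)
    .join("\n");
}
```

Merging adjacent segments from the same speaker keeps the output short when the model splits one utterance into several pieces.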
Under the Hood (Tech Stack)
Next.js 15 for frontend & API routes
React + TypeScript for UI components
Tailwind CSS, Radix UI, Lucide React Icons for styling
NextAuth.js for authentication, MongoDB + Mongoose for persistence
Google Generative AI powering all speech features
How You Can Help
- Test & Report Issues: Run the demo and let me know about any bugs or performance quirks.
- Feature Requests: What additional formats, languages, or AI capabilities would you like?
- Performance Tips: Any suggestions on scaling speech workloads in Vertex AI or Cloud Functions?
Looking forward to your feedback and collaboration. Thanks in advance!
Repo & Demo: https://github.com/Bag-zy/flame-audio