Hi TypingMind Team,
My name is Robert, and I am an AI Engineer and CTO at a biotech startup. I have been using TypingMind and really appreciate the solid tool you’ve built! I am writing to share a comprehensive feature proposal that I believe could elevate your platform to the next level, transforming it into a complete, multimodal AI hub.
While I love the current integrations with OpenAI and ElevenLabs, I highly recommend expanding the platform's capabilities by integrating Google's latest preview models. Adding this suite of features would be a massive differentiator for power users, developers, and researchers.
Here is my proposed suite of additions:
1. Google's TTS and STT Models
Integrating Google's new audio models as options for both Text-to-Speech and Speech-to-Text would give users more choice alongside the existing OpenAI and ElevenLabs integrations. I suggest looking into:
gemini-2.5-pro-preview-tts
gemini-3.1-flash-tts-preview
2. Video and Music Generation Models
To make the platform a unified creative powerhouse, incorporating Google's media generation capabilities would be incredible. Specifically:
Video Generation: veo-3.1-fast-generate-preview and veo-3.1-generate-preview
Music Generation: lyria-3-clip-preview and lyria-3-pro-preview
3. Real-Time Conversational Mode (Voice/Video)
I highly recommend adding a real-time, low-latency conversational mode (similar to the live modes in the ChatGPT and native Gemini apps). Using Google's gemini-3.1-flash-live-preview model would vastly improve the user experience for brainstorming, accessibility, and natural, fluid interactions.
4. Computer Use for UI Interactions
Integrating agentic capabilities such as Google's Computer Use model (gemini-2.5-computer-use-preview-10-2025) would allow TypingMind to assist users with browser automation (which could be implemented via a Chrome extension, for example), multi-step software tasks, and precise UI control directly within their workflows.
5. Deep Research Capabilities
Finally, adding the Deep Research model (deep-research-pro-preview-12-2025) as a tool or plugin for Gemini models would be a game-changer for academic and professional users. It would allow the platform to conduct autonomous, multi-step investigations and synthesize complex information into comprehensive, properly cited reports based on web sources and workspace data.
Thank you for your time and for continuously improving the platform! I’d love to hear your thoughts on these potential additions. I look forward to seeing how TypingMind evolves.
Best regards,
Robert Sousa Santos
Open
Feature Request
9 days ago