Introduction
VoiceForm is a modern, voice-first feedback collection platform built through iterative development with Emergent. Starting from a simple concept, we evolved it into a production-ready application with voice recording, AI transcription, conditional logic, and a polished UI inspired by modern SaaS design.
Final Tech Stack:
Frontend: React 18, Tailwind CSS, shadcn/ui components
Backend: FastAPI (Python), Motor (async MongoDB driver)
Database: MongoDB
AI Integration: OpenAI Whisper via Emergent LLM key
Voice Recording: MediaRecorder API with real-time audio visualization
Drag & Drop: @hello-pangea/dnd (after troubleshooting React 18 compatibility)
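The post doesn't reproduce the backend code, but the Whisper piece of this stack boils down to one HTTP call. Here's a sketch of building (not sending) a request to OpenAI's public `/v1/audio/transcriptions` endpoint; the Emergent LLM key integration wraps this for you, so treat the endpoint, model name, and field names as the standard OpenAI shape rather than Emergent's exact internals:

```python
import urllib.request

# Sketch of the Whisper transcription call the backend ultimately makes.
# The Emergent playbook manages the key; this follows OpenAI's public API.

def build_transcription_request(audio_bytes: bytes, api_key: str,
                                filename: str = "answer.webm") -> urllib.request.Request:
    """Build (but do not send) a multipart request to /v1/audio/transcriptions."""
    boundary = "----voiceform-boundary"
    parts = [
        # "model" form field
        (f"--{boundary}\r\nContent-Disposition: form-data; "
         f"name=\"model\"\r\n\r\nwhisper-1\r\n").encode(),
        # "file" form field carrying the recorded audio blob
        (f"--{boundary}\r\nContent-Disposition: form-data; name=\"file\"; "
         f"filename=\"{filename}\"\r\nContent-Type: audio/webm\r\n\r\n").encode()
        + audio_bytes + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ]
    return urllib.request.Request(
        "https://api.openai.com/v1/audio/transcriptions",
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Sending the request (and parsing the JSON `text` field in the response) is left out, since in this app that round trip happens inside the Emergent integration.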
Problem Statement and Proposed Solution
Our Vision
We wanted to solve a common problem with traditional feedback forms:
Problems with Current Forms:
Typing long responses is tedious and time-consuming
Users abandon forms due to fatigue
Mobile typing is especially frustrating
Rich context and tone are lost in text
The Proposed Solution:
A Typeform-like platform where users simply speak their answers instead of typing. Key requirements:
Seamless voice recording interface
Interactive visualizer showing audio input
Landing page highlighting ease of use
No more typing long answers - just speak and move on
Learning Outcomes
What Was Actually Built Through This Process:
Requirement Gathering: Learning to ask the right clarifying questions before coding
AI Service Integration: Working with OpenAI Whisper through integration playbooks via Emergent
UI/UX Design: Translating design references into production code
Edge Case Handling: Dealing with transcription quirks and boundary conditions
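The post mentions "transcription quirks and boundary conditions" without listing them, so the exact rules below are illustrative guesses at the kind of guardrails a voice form needs: trimming whitespace, dropping filler words, and treating filler-only recordings as empty answers:

```python
import re

# Hypothetical transcript cleanup; the app's real rules aren't shown in the post.
FILLERS = {"um", "uh", "erm"}

def clean_transcript(raw: str) -> str:
    """Drop filler words and collapse whitespace in a Whisper transcript."""
    kept = [w for w in raw.split() if w.strip(".,!?").lower() not in FILLERS]
    return re.sub(r"\s+", " ", " ".join(kept)).strip()

def is_empty_answer(raw: str) -> bool:
    """Treat whitespace-only or filler-only transcripts as no answer."""
    return clean_transcript(raw) == ""
```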
Step 0: Prerequisites
What We Had
Access to Emergent platform (FastAPI + React + MongoDB stack pre-configured)
Basic project structure with problem statement idea
What Was Needed
Emergent LLM key (for OpenAI Whisper - agent managed this)
Understanding of desired features
Design reference for UI improvements
Step 1: Initial Prompt and Requirement Gathering
Our First Message
Agent's Response: Clarification Questions
"Before I start building, I need to clarify a few details:
Voice-to-Text Integration - Which provider would you prefer?
• OpenAI Whisper (latest model)
• Google Speech-to-Text
• Other preference?
Core Features - What functionality do you need?
• Form builder (create questions, configure forms)
• Response collection & viewing
• Form sharing (unique links)
• Analytics/insights dashboard
Voice Experience - What should happen with voice recordings?
• Convert to text only (transcription)
• Store audio + text
• Play back audio recordings
Authentication - Do you need user accounts?
• Yes, for form creators
• No authentication needed (simple/open)
API Keys - Should I use the Emergent LLM key for OpenAI integrations?"

Our Answers
OpenAI Whisper
All of them, as well as a form link to try out as a responder
Play back audio and show transcript to form creator
Yes for form creators
Use Emergent LLM
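Answer 3 above ("play back audio and show transcript") pins down the response document shape: each answer has to carry both the audio and its transcript. Here's a sketch of that record; the field names are illustrative, not the app's actual MongoDB schema, and base64 is used so the audio survives a JSON round trip:

```python
import base64
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative answer record storing both audio and transcript,
# as implied by the answers above. Not the app's actual schema.

@dataclass
class VoiceAnswer:
    question_id: str
    transcript: str
    audio_b64: str  # base64-encoded webm blob from the browser
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    @classmethod
    def from_recording(cls, question_id: str, transcript: str,
                       audio: bytes) -> "VoiceAnswer":
        return cls(question_id, transcript,
                   base64.b64encode(audio).decode("ascii"))

    def audio_bytes(self) -> bytes:
        """Decode the stored audio for playback or download."""
        return base64.b64decode(self.audio_b64)
```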
Step 2: UI Improvement Request
Our Request
We provided:

Photo Credit: Dribbble
Agent's Analysis of Reference
Agent analyzed the reference image and identified:
Light pastel backgrounds with vibrant accent
Larger, bolder headlines with yellow highlights
Generous whitespace
Bright pill-shaped CTAs (not purple gradients)
Before/After comparison sections
Card-based feature layouts
Professional yet playful tone
Implementation: Complete UI Redesign
Landing Page Changes
New Sections Added
Dashboard Improvements
Step 3: First Major Bug - Audio Playback
Our Report
During testing in the preview, we noticed that audio playback did not work. Here's what we got as an error popup:
Agent's Response
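The error popup and the agent's exact fix aren't reproduced here, but one frequent cause of broken playback in apps like this is handing the browser's audio element raw base64 with no MIME prefix. Purely as a hedged illustration (the helper name and approach are hypothetical, not what the agent actually did), wrapping the stored audio in a data URL with an explicit type looks like this:

```python
import base64

# Hypothetical helper, shown for illustration only; the post does not
# reveal the actual fix. A data URL with an explicit MIME type tells the
# browser's <audio> element which codec the base64 payload uses.

def audio_data_url(audio_b64: str, mime: str = "audio/webm") -> str:
    """Wrap base64 audio in a data URL the browser can play."""
    base64.b64decode(audio_b64, validate=True)  # reject malformed input early
    return f"data:{mime};base64,{audio_b64}"
```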
Step 4: Feature Request - Enhanced Question Types
Our Request
Agent's Response
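The request and response for this step aren't reproduced here, but the wrap-up notes that the finished app supports conditional branching of questions. A branching evaluator can be sketched roughly like this; the rule schema (keyword match jumping to a `next_id`, with a `default_next` fallthrough) is assumed for illustration, not taken from the app:

```python
from typing import Optional

# Illustrative branching evaluator; the app's real rule schema isn't shown.
# Each rule says: if the answer contains a keyword, jump to next_id;
# otherwise fall through to the question's default successor.

def next_question_id(question: dict, answer_text: str) -> Optional[str]:
    for rule in question.get("rules", []):
        if rule["keyword"].lower() in answer_text.lower():
            return rule["next_id"]
    return question.get("default_next")  # None ends the form
```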
Common Issues
This app may run into issues like the ones outlined above during testing, which is why previewing the app before you launch is so important. For every issue, try to:
Describe the issue to the agent as you see it: which part of the app does it break?
Take a screenshot of the issue and highlight the section in question
Press F12 to open the browser's developer tools and check the Console and Network tabs; if you see any errors marked in red, flag them to the agent
Deploying Your App
Now that your app is ready, you should deploy it so it stays always accessible via our dedicated hosting. Here are the steps in brief:
Preview Your App
Click Preview in the Emergent interface.
Test UI, responsiveness, interactivity, and functionality.
Fix any errors using the agent if needed.
Run Pre-Deployment Health Check
Use Emergent’s automated readiness checker.
Review and resolve any flagged issues before deployment.

Deploy the Application
Click Deploy, then Deploy Now.
Wait for deployment to finish (typically ~15 minutes).
Emergent provides a publicly accessible production URL once complete.

Wrapping Up:
And that's it! You now have an AI-enabled, voice-based feedback collection tool with a detailed form builder and conditional branching of questions. There's still more you can do, though. Here are some enhancements to take this app to the next level:
Dashboard with charts (completion rates, average time)
Question-level analytics (skip rates, response length)
Export responses to CSV/PDF
Webhook integration for real-time notifications
Form templates library
Team collaboration (share forms with team members)
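Of the enhancements above, the CSV export is the easiest to start on. A minimal sketch, assuming a response shape of `id` plus a list of `question_id`/`transcript` answers (not the app's actual schema):

```python
import csv
import io

# Sketch of the "export responses to CSV" enhancement.
# The response dict shape is assumed for illustration.

def responses_to_csv(responses: list[dict]) -> str:
    """Flatten form responses into CSV, one row per answer."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["response_id", "question_id", "transcript"])
    for resp in responses:
        for ans in resp.get("answers", []):
            writer.writerow([resp["id"], ans["question_id"], ans["transcript"]])
    return buf.getvalue()
```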
Check out a preview of our finished app down below:
We can't wait to see what you build. Share your creations with the community on X or Discord.

