Professional Portfolio
Posted on May 20, 2025 by David H Sells
I built a hands-free AI assistant that lets you talk to an LLM without touching your keyboard. Speak, wait for silence, and let the AI respond with its synthesized voice. All using JavaScript, WebSockets, and the Web Speech API. Code included!
Ever had that moment when you're elbow-deep in cookie dough and suddenly need to convert tablespoons to milliliters? Or maybe you're changing a tire and need to remember the proper torque settings?
I found myself constantly wanting to talk to AI assistants without having to touch anything. Sure, there are commercial solutions like Alexa and Google Assistant, but I wanted something:
So I built this hands-free LLM interface that uses speech recognition to understand you, sends your question to any LLM, and then speaks the response back to you.
Our speech-powered AI assistant requires three main components:
Let's dive into how we built each part!
There are two main parts to our application:
Our client code (in index.html) does two critical things:
The challenge with speech recognition isn't getting the words—it's knowing when you're done talking! Our solution uses a silence detection approach that automatically stops listening after you've been quiet for a few seconds.
```javascript
function handleSpeechResult(event) {
  // Get the text you've spoken so far
  const result = event.results[event.results.length - 1][0].transcript;

  // Reset our silence timer
  if (silenceTimer) clearTimeout(silenceTimer);

  // Start a new silence timer - if you stop talking, this will trigger
  silenceTimer = setTimeout(() => {
    // You've been quiet long enough, stop listening
    this.stop();
    processFinalSpeech(result);
  }, CONFIG.silenceTimeout);
}
```
This is genius in its simplicity. Every time you say something, we reset the timer. When you stop talking, the timer counts down and then triggers our processing function.
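The same timer trick can be pulled out into a small standalone helper. Here's a hypothetical sketch (the function name and shape are mine, not from the project): every call to `heard()` resets the countdown, and only sustained silence lets the callback fire.

```javascript
// Hypothetical standalone version of the silence-timer pattern.
// Each heard() call resets the countdown; onSilence only fires once
// the caller has been quiet for timeoutMs milliseconds.
function createSilenceDetector(onSilence, timeoutMs) {
  let timer = null;
  return {
    heard(transcript) {
      if (timer) clearTimeout(timer);
      timer = setTimeout(() => onSilence(transcript), timeoutMs);
    },
    cancel() {
      if (timer) clearTimeout(timer);
    }
  };
}
```

In the real handler, `heard()` would be called from the recognition result event with the latest transcript, so the countdown restarts on every new word.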
Once we get the AI's response, we use the browser's built-in speech synthesis to read it aloud:
```javascript
function speakText(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  speechSynthesis.speak(utterance);
}
```
Browser speech synthesis might not sound like Morgan Freeman, but it's surprisingly good these days. And unlike recorded audio, it can say literally anything our AI responds with!
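One practical caveat: some browser engines cut off very long utterances. A hedged workaround, not part of the project's code, is to split the reply into sentence-sized chunks and queue each one; `chunkForSpeech` below is a hypothetical helper.

```javascript
// Hypothetical helper: split a long reply into sentence-sized chunks,
// since some speechSynthesis implementations truncate long utterances.
function chunkForSpeech(text, maxLen = 200) {
  const sentences = text.match(/[^.!?]+[.!?]*/g) || [text];
  const chunks = [];
  let current = '';
  for (const s of sentences) {
    if (current && (current + s).length > maxLen) {
      chunks.push(current.trim());
      current = '';
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// In the browser, each chunk would then be queued in order:
// chunkForSpeech(reply).forEach(c =>
//   speechSynthesis.speak(new SpeechSynthesisUtterance(c)));
```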
The server part (in index.js) is the bridge between your voice and the AI's brain. It:
The most interesting part is how we talk to the LLM:
```javascript
async function queryLLM(ws, message) {
  try {
    const response = await got(LLM_API_ENDPOINT, {
      method: 'POST',
      headers: {
        'content-type': 'application/json',
        'Authorization': `Bearer ${API_KEY}`,
      },
      json: {
        model: LLM_MODEL,
        messages: [
          { role: 'user', content: message }
        ],
        max_tokens: 1000
      }
    });

    const data = JSON.parse(response.body);
    const content = data.choices[0].message.content;
    ws.send(content);
  } catch (error) {
    console.error('Error querying LLM:', error);
    ws.send('Sorry, I encountered an error processing your request.');
  }
}
```
This function takes what you said, packages it for the LLM API, and sends the response back to your browser via WebSockets. The beauty of this approach is that it works with virtually any LLM API that follows the standard chat completions format.
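For reference, here is roughly what that chat completions format looks like on the wire. The model name is a placeholder, and the response is trimmed down to only the fields queryLLM actually reads:

```javascript
// Request body sent to the LLM endpoint (model name is a placeholder):
const requestBody = {
  model: 'example-model',
  messages: [
    { role: 'user', content: 'How many milliliters in a tablespoon?' }
  ],
  max_tokens: 1000
};

// A typical response, trimmed to the fields the server reads:
const responseBody = {
  choices: [
    { message: { role: 'assistant', content: 'About 15 ml.' } }
  ]
};

// The server extracts the reply the same way queryLLM does:
const content = responseBody.choices[0].message.content;
```

Any provider that accepts this request shape and returns a `choices[0].message.content` field should drop in with nothing more than a new endpoint, key, and model name.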
Here's what happens when you use this application:
It's like a digital game of telephone, except nothing gets lost in translation!
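The whole round trip can be sketched end to end with stubs standing in for the browser APIs and the LLM (everything here is illustrative, not the project's actual code):

```javascript
// Illustrative pipeline: recognize -> ask the model -> speak the reply.
// All three stages are stubs standing in for the real browser/LLM pieces.
const recognize = () => 'what is 2 plus 2';          // Web Speech API stub
const askModel = async (q) => `You asked: "${q}"`;   // LLM round trip stub
const speak = (text) => `speaking: ${text}`;         // speechSynthesis stub

async function pipeline() {
  const transcript = recognize();           // 1. silence detector finalizes your words
  const reply = await askModel(transcript); // 2. server forwards them to the LLM
  return speak(reply);                      // 3. browser reads the answer aloud
}
```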
You can easily adapt this code for your needs:
Building this wasn't all sunshine and JavaScript. Here are some hurdles I overcame:
Voice interfaces are becoming increasingly important. They're not just convenient—they're essential for:
This project demonstrates how relatively simple web technologies can create a powerful hands-free AI assistant. The combination of speech recognition, LLMs, and speech synthesis opens up entirely new ways to interact with artificial intelligence.
By understanding when you've stopped talking, sending your words to an LLM, and speaking the response, we've created a system that feels almost like talking to a real person—except this person knows practically everything (or at least pretends to!).
So next time you're up to your elbows in engine grease, bread dough, or finger paint, remember that your AI assistant is just a few spoken words away!
Full code is available on my GitHub: https://home.davidhsells.ca/Public/voicechat.git
Have you built something similar or have ideas for improvements? Let me know in the comments below!
Tags: #JavaScript #AI #SpeechRecognition #LLM #WebDevelopment #Accessibility