Buzzy

  • Experimental AI bot to talk with my kids and answer questions.
  • The ecosystem around voice-based chats has improved a lot since I started this project. On hold for now, but I'll probably start this up again later in 2024 and just use more services instead of trying to do it all on CPU.
  • https://github.com/dimfeld/buzzy
  • Task List

    • Up Next

    • Soon

      • Basic intent detection
      • Run web searches and generate an answer from the results.
      • Read system message from a file
      • Record LLM pipeline actions for later analysis
    • Later

      • Check out whisper-turbo for in-browser voice recognition
      • Optional configuration to use better models that require GPU
    • Done

      • Set up basic ChatGPT workflow
      • Voice recognition
      • Basic TTS
      • Websocket-based communication
      • Stream results back to client
  • System Prompt Example

    • You are Buzzy, an AI bot that answers questions for children. Your answers should be appropriate for a smart six year old boy, but also don't dumb your answers down too much.
  • Ideas

  • Intent Detection

    • DeBERTa-v3-base-mnli-fever-anli seems to work well for this on a first try, though I haven't really exercised it much yet. The creator of that model now also has deberta-v3-large-zeroshot-v1.
    • Tasks
      • Figure out if something is a question that can be answered by searching the web and/or Wikipedia
      • How many days until...
      • Show me pictures of...
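    • A minimal sketch of zero-shot intent detection for the tasks above, assuming the Hugging Face transformers zero-shot-classification pipeline. The label wording, the 0.5 threshold, the Hub id MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli, and the helper names pick_intent and detect_intent are assumptions for illustration, not necessarily what the Buzzy repo does.

```python
from typing import List, Optional, Sequence

# Hypothetical label phrasings derived from the task list; tune these by experiment.
INTENT_LABELS: List[str] = [
    "a question that can be answered by searching the web",
    "counting the days until a date",
    "a request to show pictures",
]


def pick_intent(
    labels: Sequence[str], scores: Sequence[float], threshold: float = 0.5
) -> Optional[str]:
    """Return the highest-scoring label, or None if no score reaches the threshold."""
    if not labels:
        return None
    best_label, best_score = max(zip(labels, scores), key=lambda pair: pair[1])
    return best_label if best_score >= threshold else None


def detect_intent(
    text: str,
    model_name: str = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",  # assumed Hub id
    threshold: float = 0.5,  # assumed cutoff, not a tested value
) -> Optional[str]:
    """Classify text against INTENT_LABELS with an NLI-based zero-shot model."""
    # Imported lazily so the pure helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)
    result = classifier(text, INTENT_LABELS, multi_label=False)
    return pick_intent(result["labels"], result["scores"], threshold)
```

      • Returning None when nothing clears the threshold leaves room for a fallback, such as passing the message straight to the chat model.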
    • When doing web search and intent detection, also need to detect whether a query builds on the previous queries or not.
      • "No, the blue one" means nothing on its own, but the context is probably in the previous message.
      • Small models don't seem to do great on this but gpt-3.5-turbo-instruct does well.
        • Assistant: {assistant's last message}
          
          User: {user's question}
          
          Does the user's question ask for clarification on the assistant's statement? Only answer yes or no.
          
      • For this we can probably just pair up the last assistant message with the latest query, since the assistant's messages tend to restate all the necessary info every time. Then use GPT to create a proper search for it.
      • Do we even need to do the detection? Will it work to just ask GPT to make a search for the query?
        • Takes some tweaking of the prompt but this seems to work well.
        • This is an excerpt of a chat with an assistant and a user:
          
          Assistant: {assistant's last message}
          
          User: {user's question}
          
          What would be a good web search to answer the user's question? If the question is asking for clarification on the assistant's statement, then the web search should account for that. If it is a new line of questioning, then ignore the assistant's statement. Respond only with the web search and nothing else.
          
          Web search:
          
  • Web Search

    • Use Brave Search API to do web searches to answer questions
      • Should searches be an intent, or should we run searches for anything that doesn't return another intent?
      • Maybe also Wikipedia/Wikidata?
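      • A sketch of what a Brave Search call might look like using only the Python standard library. The endpoint, the X-Subscription-Token header, and the web.results response shape are written from memory of Brave's documentation and should be checked against it. The function names and the count default are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

# Assumed endpoint; verify against the current Brave Search API documentation.
BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"


def build_search_request(query: str, api_key: str, count: int = 5) -> urllib.request.Request:
    """Build a GET request for a web search without sending it."""
    params = urllib.parse.urlencode({"q": query, "count": count})
    return urllib.request.Request(
        f"{BRAVE_ENDPOINT}?{params}",
        headers={"Accept": "application/json", "X-Subscription-Token": api_key},
    )


def extract_results(payload: dict) -> List[Dict[str, str]]:
    """Pull title, URL, and description from the assumed web.results list."""
    results = payload.get("web", {}).get("results", [])
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "description": item.get("description", ""),
        }
        for item in results
    ]


def brave_search(query: str, api_key: str, count: int = 5) -> List[Dict[str, str]]:
    """Send the request and return the simplified results."""
    request = build_search_request(query, api_key, count)
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_results(json.load(response))
```

      • The trimmed-down results (title, URL, description) are the sort of thing that could be pasted into a follow-up prompt so the chat model can generate an answer from them.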
