Build an Alexa- or Siri-Equivalent Bot in Python Using OpenAI
A step-by-step guide to building an AI assistant
Table of Contents
Introduction
APIs and Design
Code Breakdown
Conclusion
Introduction
It’s not as difficult as you might think to build an AI program that listens to speech and answers questions. By leveraging a few Python packages and APIs, we can make the magic happen in an afternoon.
Here are some back-and-forth questions and answers between me and the bot:
Me: What is the population of Brazil?
Bot: The population of Brazil is estimated to be over 209 million people
Me: What does ice cream taste like?
Bot: Most ice cream tastes sweet and creamy
Me: How do I use chopsticks?
Bot: To use chopsticks, you must hold one in each hand. Place the chopstick in your dominant hand between your thumb and index finger, and hold it in place with your middle finger. Place the chopstick in your non-dominant hand between your thumb and index finger, and hold it in place with your ring finger and pinky finger. To pick up food, use the chopstick in your dominant hand to hold the food in place, and then use the chopstick
Granted, these aren’t the most enlightening answers. And that chopstick answer at the end is a bit strange, lol. However, the fact that this application can interpret speech and answer questions, however limited, is amazing in my opinion. And unlike the mainstream AI assistant bots, we can see what’s under the hood here and play around with it.
What This Program Does
- The user runs the file from the command prompt when ready to ask a question
- PyAudio enables the computer mic to pick up speech data
- Audio data is read from a variable called ‘stream,’ then encoded and wrapped in JSON
- The JSON data is sent to the AssemblyAI API to be converted to text, and the transcribed text is sent back
- The text is sent to the OpenAI API, where the text-davinci-002 engine processes it and returns an answer
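The steps above can be sketched in a few short functions. This is a minimal outline, not the article's full program: the sample rate, buffer size, and JSON message shape are assumptions based on AssemblyAI's real-time transcription format, and the OpenAI call uses the older `openai.Completion` interface that the text-davinci-002 engine was accessed through. You would need your own API keys for both services.

```python
import base64
import json

SAMPLE_RATE = 16000          # assumed: 16 kHz mono PCM for real-time transcription
FRAMES_PER_BUFFER = 3200     # assumed: ~0.2 s of audio per chunk

def encode_chunk(chunk: bytes) -> str:
    """Encode a raw audio chunk as the JSON message sent to AssemblyAI
    (the 'encoded and wrapped in JSON' step above)."""
    return json.dumps({"audio_data": base64.b64encode(chunk).decode("utf-8")})

def record_chunks(seconds: float):
    """Open the default mic with PyAudio and yield raw audio chunks.
    The 'stream' variable here matches the one named in the steps above."""
    import pyaudio  # requires PortAudio: pip install pyaudio
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1,
                     rate=SAMPLE_RATE, input=True,
                     frames_per_buffer=FRAMES_PER_BUFFER)
    try:
        for _ in range(int(seconds * SAMPLE_RATE / FRAMES_PER_BUFFER)):
            yield stream.read(FRAMES_PER_BUFFER)
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()

def answer(question: str) -> str:
    """Send the transcribed question to the text-davinci-002 engine
    via the pre-1.0 openai library (openai.api_key set elsewhere)."""
    import openai
    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt=question,
        max_tokens=100,
    )
    return response["choices"][0]["text"].strip()
```

The mic capture and OpenAI pieces only run when you have the hardware and keys, but `encode_chunk` shows the core of the hand-off: raw bytes can't travel in JSON directly, so each chunk is base64-encoded first.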