Build a Support Bot From Your Company’s Knowledge Base With Python and OpenAI

Introduction
In today’s fast-paced digital world, customer support has become an essential element of any successful business. As a result, organizations are consistently looking for ways to improve their customer service offerings and maximize customer satisfaction. One of the most effective strategies for achieving this is by leveraging artificial intelligence (AI) to create support bots that can quickly and accurately address customer inquiries.
In this blog, we will show you how to build a powerful support bot using your company’s knowledge base and ChatGPT by OpenAI. This combination will transform your customer support experience, providing immediate, accurate, and engaging responses to your customers’ questions and concerns.
We’ve made setting up your support bot quick and easy with our pre-built implementation on GitHub (Github Repo). Simply edit the company name and add your documents to the documents folder. Get started effortlessly and enhance your customer support experience with this powerful AI-driven solution.
Get Started
Install Requirements
Create a new directory qna-app; we will run our app there. Run the line below in your terminal to install the requirements:
pip3 install flask openai python-dotenv glob2 numpy
Adding OpenAI Key
Create a .env file and replace YOUR_OPENAI_KEY_HERE with your OpenAI API key:
OPENAI_KEY=YOUR_OPENAI_KEY_HERE
Adding Documents to Search From
Now, create a directory named documents and add the files you want to search from. Files should be in .txt format.
For now, you can download this documents file and unzip it in your project root folder. The folder contains 3 files: an About Us page and 2 blog pages from dreamboat.ai.
Create Embeddings
GPT has a token limit, so you cannot pass all of your data to it with every request. To tackle this problem, we create embeddings.
Think of an embedding as a list of numbers (a vector) assigned to a piece of text, where the numbers capture the meaning of that text. Later, we will create an embedding for the question asked by the user, find the text blocks whose embeddings are closest in meaning (i.e. most relevant to the question), and pass only those to GPT.
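To make this concrete, here is a minimal sketch of the cosine-similarity idea used later in this tutorial. The three-dimensional vectors are made up purely for illustration (real text-embedding-ada-002 embeddings have 1536 dimensions); the point is that texts with similar meanings get vectors pointing in similar directions, so their similarity score is close to 1.

```python
def cosine_similarity(vec1, vec2):
    # Dot product divided by the product of the vector magnitudes
    dot_product = sum(a * b for a, b in zip(vec1, vec2))
    magnitude1 = sum(a * a for a in vec1) ** 0.5
    magnitude2 = sum(b * b for b in vec2) ** 0.5
    return dot_product / (magnitude1 * magnitude2)

# Made-up toy vectors: "dog" and "puppy" point in similar directions,
# while "invoice" points somewhere else entirely
dog = [0.9, 0.1, 0.0]
puppy = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(dog, puppy))    # close to 1.0
print(cosine_similarity(dog, invoice))  # close to 0.0
```

This is exactly the comparison our support bot will make between the user's question and each stored document.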
Create an embed_text.py file and paste the contents below into it:
import openai
import os
import csv
import glob
from dotenv import load_dotenv

load_dotenv()

api_key = os.environ.get('OPENAI_KEY')
openai.api_key = api_key

if api_key is None or api_key == "YOUR_OPENAI_KEY_HERE":
    print("Invalid API key")
    exit()

dir_path = os.path.join(os.getcwd(), 'documents')
dir_full_path = os.path.join(dir_path, '*.txt')
embeddings_filename = "embeddings.csv"

# Loop through all .txt files in the documents folder
text_array = []
for file in glob.glob(dir_full_path):
    # Read each file, replacing newlines with spaces so every
    # document becomes a single block of text
    with open(file, 'r') as f:
        text = f.read().replace('\n', ' ')
    text_array.append(text)

# This array is used to store the embeddings
embedding_array = []

# Loop through each element of the array
for text in text_array:
    # Pass the text to the embeddings API, which returns a vector
    response = openai.Embedding.create(
        input=text,
        model="text-embedding-ada-002"
    )
    # Extract the embedding from the response object
    embedding = response['data'][0]["embedding"]
    # Pair the vector with its original text and store it in the list
    embedding_array.append({'embedding': embedding, 'text': text})

with open(embeddings_filename, 'w', newline='') as f:
    # Set the CSV headers
    fieldnames = ['embedding', 'text']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for obj in embedding_array:
        # Store the embedding vector as a string so the commas between
        # its values don't break the CSV columns
        writer.writerow({'embedding': str(obj['embedding']), 'text': obj['text']})

print("Embeddings saved to:", embeddings_filename)
Now that we have a script that creates the embeddings, let's run it.
Create the embeddings by running the command below in your terminal:
python embed_text.py
If you have done everything right, you will see an "Embeddings saved to: embeddings.csv" message on your screen, and a file embeddings.csv will be created in your project root folder.
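If you want to sanity-check how the data is stored, the sketch below reproduces the round-trip that answer.py relies on: each vector is written to the CSV as a string and later parsed back into a list with json.loads. The filename sample_embeddings.csv and the three-number vector are made up for illustration.

```python
import csv
import json

# Write one row the same way embed_text.py does: the vector as a string
sample = {'embedding': [0.1, 0.2, 0.3], 'text': 'hello world'}
with open('sample_embeddings.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['embedding', 'text'])
    writer.writeheader()
    writer.writerow({'embedding': str(sample['embedding']),
                     'text': sample['text']})

# Read it back and parse the string into a list of floats
with open('sample_embeddings.csv') as f:
    row = next(csv.DictReader(f))
restored = json.loads(row['embedding'])
print(restored)  # [0.1, 0.2, 0.3]
```

This works because Python's string form of a list of floats happens to be valid JSON.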
We now have the embeddings; let's create a function that takes in a question and answers it using the information we provided.
Answer Question
Create a file answer.py and paste the contents below into it. Replace the company name at the top with the name of the company whose knowledge base you are using:
import json
import openai
import csv
import os
from dotenv import load_dotenv

load_dotenv()

embeddings_filename = "embeddings.csv"
company_name = "Dreamboats.ai"

def calculate_similarity(vec1, vec2):
    # Calculates the cosine similarity between two vectors
    dot_product = sum(vec1[i] * vec2[i] for i in range(len(vec1)))
    magnitude1 = sum(v ** 2 for v in vec1) ** 0.5
    magnitude2 = sum(v ** 2 for v in vec2) ** 0.5
    return dot_product / (magnitude1 * magnitude2)

def chat():
    openai.api_key = os.environ.get('OPENAI_KEY')
    start_chat = True
    while True:
        if start_chat:
            print("Welcome to the", company_name, "Knowledge Base. How can I help you?")
            print("Type 'quit' to exit.")
            start_chat = False
        else:
            print("Any other questions?")
        question = input("> ")
        # Exit the loop if the user types 'quit' or presses enter
        # without typing anything
        if question == "quit" or not question:
            break
        # Create an embedding for the question
        response = openai.Embedding.create(
            model="text-embedding-ada-002",
            input=[question]
        )
        try:
            question_embedding = response['data'][0]["embedding"]
        except Exception as e:
            print(e)
            continue
        # Store the similarity scores and texts as the code loops
        # through the CSV
        similarity_array = []
        text_array = []
        # Loop through the CSV and calculate the cosine similarity between
        # the question vector and each stored text embedding
        with open(embeddings_filename) as f:
            reader = csv.DictReader(f)
            for row in reader:
                # Parse the embedding column back into a list of floats
                text_embedding = json.loads(row['embedding'])
                similarity_array.append(calculate_similarity(question_embedding, text_embedding))
                text_array.append(row['text'])
        # Find the text with the highest similarity score
        index_of_max = similarity_array.index(max(similarity_array))
        original_text = text_array[index_of_max]
        system_prompt = f"""
You are an AI assistant. You work for {company_name}. You will be asked
questions from a customer and will answer in a helpful and friendly manner.
You will be provided company information from {company_name} under the
[Article] section. The customer question will be provided under the
[Question] section. Answer the customer's question based only on the
article. Only provide the answer to the query; do not repeat part of the
question back. Answer in points, not in long paragraphs.
If the article does not answer the user's question, respond with
"I'm sorry, I don't know."
"""
        question_prompt = f"""
[Article]
{original_text}

[Question]
{question}
"""
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "system",
                    "content": system_prompt
                },
                {
                    "role": "user",
                    "content": question_prompt
                }
            ],
            temperature=0.2,
            max_tokens=2000,
        )
        try:
            answer = response['choices'][0]['message']['content']
        except Exception as e:
            print(e)
            continue
        # Print the answer in green
        print("\n\033[32mSupport:\033[0m")
        print("\033[32m{}\033[0m".format(answer.lstrip()))
    print("Goodbye! Come back if you have any more questions. :)")

chat()
That’s it. Run the command below in your terminal:
python answer.py
You can now ask the AI any questions about your company, and it will reply with the relevant information you provided. We have also restricted it from answering very general or irrelevant questions not covered by the documents, such as "What is the capital of the USA?"

Adding Company Knowledge Base
You can add documents to the knowledge base by adding them to the documents folder. Files should be in .txt format.
After adding new documents, re-run the embed_text.py script to create embeddings for them, and change the company_name variable in answer.py.