Fine-tuning GPT-3 Using Python to Create a Virtual Mental Health Assistant Bot

Build your own ChatGPT therapist by learning to fine-tune GPT-3

Amogh Agastya
Better Programming

Hello again! In my previous article, I outlined the steps required to integrate GPT-3 and Dialogflow by creating a Virtual Mental Health Assistant. In this one, we'll refine the Mental Health Chatbot we created by learning how to fine-tune our GPT-3 model.

But first, what is fine-tuning?

Fine-tuning is the process of training a Large Language Model (LLM) to recognize a specific pattern of input and output that can be applied to any custom NLP task. 🥥 In a nutshell, fine-tuning lets us fit a custom dataset onto an LLM, so that the model adapts its output to our particular task.

Why fine-tune?

According to the official docs, fine-tuning lets you get more out of the GPT-3 models by providing:

  • Higher quality results than prompt design
  • Ability to train on more examples than can fit in a prompt
  • Token savings due to shorter prompts
  • Lower latency requests

Fine-tuning clearly outperforms prompt design alone

At a high level, fine-tuning involves the following steps:

1. Preparing the training data

GPT-3 expects your finetune dataset to be in a specific JSONL file format, which looks like this -

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

It's pretty simple: each line consists of a 'prompt' and a 'completion', which is the ideal text generated for that particular prompt.

Before preparing and uploading our data for fine-tuning, the first step is to collect our dataset. For our Mental Health Assistant, I found a handy dataset on Kaggle: Depression Data for Chatbot. As per the author, the data can be used to train a bot to help people suffering from depression, which is exactly our use case!

The data is a YAML file with roughly 50 conversations about depression. It looks like the first line with the double dash (- -) is the user’s question, and the following lines are the bot’s answers to the user. Let’s explore it further in a Colab notebook.

The exploratory notebook can be found here.

Download the dataset from Kaggle and copy it to your Google Drive folder. Mount your Drive in the Colab notebook, import the yaml package to read our .yml file, and use yaml.safe_load() to parse it into a Python dictionary.

import yaml
from google.colab import drive

drive.mount('/content/gdrive')  # mount Drive so the dataset file is accessible

with open('/content/gdrive/MyDrive/depression-data/depression.yml', 'r') as file:
    data = yaml.safe_load(file)  # parse the YAML into a Python dict
print(data)

Next, let’s prepare our dataset in the required format.
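
Here's a minimal sketch of that preparation step, assuming the YAML file stores its Q&A pairs under a 'conversations' key (as this Kaggle dataset does) and that a simple regex is enough to clean the text:

import re

output = []
for conv in data['conversations']:
    prompt = str(conv[0])                            # first line: the user's question
    completion = ' '.join(str(c) for c in conv[1:])  # remaining lines: the bot's answers
    # remove unwanted special characters, keeping basic punctuation
    prompt = re.sub(r"[^A-Za-z0-9 .,?!'/-]", '', prompt).strip()
    completion = re.sub(r"[^A-Za-z0-9 .,?!'/-]", '', completion).strip()
    output.append({'prompt': prompt, 'completion': completion})

print(output[:2])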

The code above iterates through our depression data, takes the first line of each conversation as the prompt, and joins the remaining lines as the completion. It also removes unwanted special characters from the prompts and completions. This gives us the right data format for fine-tuning GPT-3.

{'prompt': 'What Is Depression?', 'completion': 'Depression is a common and serious medical illness that negatively affects how you feel, the way you think and how you act. Fortunately,it is also treatable. Depression causes feelings of sadness and/or a loss of interest in activities you once enjoyed. It can lead to a variety of emotional and physical problems and can decrease your ability to function at work and at home.'},
{'prompt': 'I feel i have let my parents down', 'completion': 'No matter what,your parents will always be proud of you and will love you. You will feel much better if you share your feelings with them.'}

Looking at a few lines, we can see that the completions offer helpful responses to the user's query, which is exactly the pattern we need our model to pick up in order to give practical replies to users in need.

Finally, let’s download our data as a JSONL file and move it to our Colab notebook project directory.

import json
from google.colab import files

with open('depression.jsonl', 'w') as outfile:
    for i in output:
        json.dump(i, outfile)   # write one JSON object per line
        outfile.write('\n')
files.download('depression.jsonl')  # download the file from the Colab VM

Now that we have our fine-tune dataset, let's prepare the file so that it can be uploaded for fine-tuning. Note: OpenAI provides a helpful CLI (command line interface) tool to prepare our data. It can easily convert CSV, TSV, XLSX, and JSON to the right JSONL format.

To get started, in a notebook cell or from your project terminal, just install the OpenAI Python package with the command -

!pip install openai

Next, to prepare the fine-tune data, run the command -

!openai tools fine_tunes.prepare_data -f '/content/gdrive/MyDrive/depression-data/depression.jsonl'

Make sure to replace the file path after '-f' with your own. This command will prepare your final fine-tune dataset after applying some recommended changes. For now, you can answer 'Yes' to all the recommended actions, as they help improve model performance.

Important Note: accepting 'Yes' to the first recommended action requires you to add the separator '->' after your prompt at inference time; otherwise, the model will simply continue the prompt instead of predicting the completion.

The tool writes a new file suffixed with '_prepared', which can then be uploaded and used to fine-tune the model.
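
For illustration, a prepared line should look roughly like this, with the '->' separator appended to the prompt and a leading space plus a '\n' stop sequence added to the completion (the exact changes depend on which recommendations you accepted):

{"prompt": "What Is Depression? ->", "completion": " Depression is a common and serious medical illness that negatively affects how you feel, the way you think and how you act...\n"}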

Do note that we only have 51 samples for our fine-tune task. To get better accuracy, OpenAI recommends having at least 150–200 fine-tune examples, as model accuracy increases roughly linearly with the number of training samples.

Reference — Fine Tune GPT-3 For Quality Results by Albarqawi

2. Training a new fine-tuned model

Now that we have our data ready, it’s time to fine-tune GPT-3! ⚙️ There are 3 main ways we can go about fine-tuning the model —

(i) Manually using OpenAI CLI, (ii) Programmatically using the OpenAI package, and (iii) via the finetune API endpoint.
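
If you'd rather go the programmatic route, here's a minimal sketch using the openai Python package (v0.x, the version current at the time of writing), mirroring what the CLI does under the hood:

import openai

# Upload the prepared training file, then create the fine-tune job
upload = openai.File.create(
    file=open('depression_prepared.jsonl', 'rb'),
    purpose='fine-tune'
)
job = openai.FineTune.create(training_file=upload['id'], model='davinci')
print(job['id'])  # note the job ID to check on progress later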

For this tutorial, I'll use the OpenAI CLI as it's the easiest to get started with. To fine-tune a model, run the command below, where the value after '-m' specifies which GPT-3 model to train. Since I wanted the most capable model, I chose 'davinci', which is also the most expensive to use.

!openai api fine_tunes.create -t "/content/gdrive/MyDrive/depression-data/depression_prepared.jsonl" -m davinci

You might get an error like 'No API key provided' if you tried running that. That's because we haven't linked our API key yet. To do that, just set an environment variable to your secret API key. From your terminal, run (on Windows, use 'set' instead of 'export') -

export OPENAI_API_KEY=<YOUR-API-KEY>

or by running the below cell in a notebook -

import os
os.environ['OPENAI_API_KEY'] = "<YOUR-API-KEY>"

Now run the previous fine-tune command again, and voilà! Your GPT-3 fine-tune job has been created.

The CLI will also show you the associated fine-tune costs, your queue position, and the training epochs. Once done, your very own fine-tuned GPT-3 model is now ready for inference!
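
Fine-tune jobs are queued and can take a while. If the event stream gets interrupted, you can re-attach to the job at any time using its ID:

!openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>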

3. Using your fine-tuned model

To use the newly created fine-tuned model, let's first test it in the OpenAI Playground. Head over to the Playground and choose your fine-tuned model from the Model dropdown, where it's listed under 'Fine-tunes'.

From here, we can test out some prompts to see how our finetuned model performs.

The Prompt

In my previous post, we gave the following prompt to the model, which in itself performed quite well —

The following is a conversation with an AI assistant that can have meaningful conversations with users. The assistant is helpful, empathic, and friendly. Its objective is to make the user feel better by feeling heard. With each response, the AI assistant prompts the user to continue the conversation in a natural way.

It worked fine, but after some additional prompt engineering, I decided to add a few more bits of information. First, I gave my AI therapy assistant a persona: JOY. (Apt, eh? 😏)

I then added some personality traits to JOY, and finally specified JOY’s core objective function — to help the user feel better by feeling heard. The final prompt I ended up using is —

The following is a conversation with a therapist and a user. The therapist is JOY, who uses compassionate listening to have helpful and meaningful conversations with users. JOY is empathic and friendly. JOY's objective is to help the user feel better by feeling heard. With each response, JOY offers follow-up questions to encourage openness and continues the conversation in a natural way.

The Result

As you can see above, the result was a success. JOY is able to employ human-like traits such as compassionate listening and empathy, and is also able to correctly offer helpful solutions to the user’s issue!

Note: JOY still struggles to hold long conversations and needs to be fine-tuned on longer dialogues with users for better context and results.

Once you're satisfied with the outputs, you can deploy the model in your app. All that needs to change is the model name, which you replace with your newly trained model's name. You can then use the fine-tuned model programmatically. Just hit "Show Code" in the OpenAI Playground to copy the code.
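
The copied code will look roughly like the snippet below. The model name here is a placeholder; yours will follow the 'davinci:ft-...' naming pattern shown in your fine-tune results:

import os
import openai

openai.api_key = os.environ['OPENAI_API_KEY']

response = openai.Completion.create(
    model='davinci:ft-personal-2023-01-01-00-00-00',  # placeholder: use your own fine-tuned model name
    prompt='I feel i have let my parents down ->',    # remember the '->' separator from data preparation
    temperature=0.7,
    max_tokens=150,
    stop=['\n']  # assumption: '\n' was added as the stop sequence during data prep
)
print(response['choices'][0]['text'].strip())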

Putting it All Together

Now that our fine-tuned model is ready to use, let’s integrate it with our Dialogflow Agent. To do that, we’ll need to update our previous webhook fulfillment code and deploy it.

Last time, we used Node.js and Repl.it for our web server, but to mix things up a bit, let's use a Python Flask server this time 🐍. Once updated, our webhook endpoint should look something like this:
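
Here's a minimal sketch of what that Flask webhook might look like, assuming a Dialogflow ES agent (which sends the user's message in queryResult.queryText and reads the reply from fulfillmentText); the model name is again a placeholder:

from flask import Flask, request, jsonify
import openai
import os

app = Flask(__name__)
openai.api_key = os.environ['OPENAI_API_KEY']

# The JOY persona prompt from earlier (truncated here for brevity)
PROMPT = "The following is a conversation with a therapist and a user. ..."

@app.route('/webhook', methods=['POST'])
def webhook():
    req = request.get_json()
    user_text = req['queryResult']['queryText']  # the user's message from Dialogflow
    response = openai.Completion.create(
        model='davinci:ft-personal-2023-01-01-00-00-00',  # placeholder: your fine-tuned model
        prompt=f"{PROMPT}\n\nUser: {user_text} ->",
        temperature=0.7,
        max_tokens=150,
        stop=['User:']
    )
    reply = response['choices'][0]['text'].strip()
    return jsonify({'fulfillmentText': reply})  # Dialogflow shows this as the bot's reply

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))

Now, onto deploying our webhook fulfillment server.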

Before we proceed, let’s take a moment of silence for the death of Heroku’s free tier💀. Heroku was an easy, go-to platform to deploy any of your web apps. Now that it no longer offers a free plan, we’ll need an alternative to quickly deploy our apps for free — Enter Railway.App!

Railway is a cloud platform where you can easily host and deploy your applications. It offers a generous free tier and boasts blazing-fast deployments. Make sure to push your local repo to GitHub first so that it can be deployed. Log into your Railway account, and from the dashboard, add a new project, choose 'Deploy from GitHub repo', and select your repo. That's it! Your app is now live. (Make sure to update the Webhook URL in the Dialogflow agent.)
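
One caveat: Railway needs to know how to start your server. Assuming your Flask app lives in app.py and gunicorn is listed in your requirements.txt (both assumptions about your project layout), a one-line Procfile at the repo root tells it how to launch the app:

web: gunicorn app:app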

Output

The model exhibits compassionate listening learned from the training data.
Note that it has also picked up the data's biases, such as punctuation errors and inconsistencies.

You can test the fine-tuned Mental Health Assistant here — amagastya.com/gpt3

Special shout-out to David Shapiro! Make sure to check out his awesome public GPT-3 Finetunes Repo for more readily usable data — https://github.com/daveshap/GPT3_Finetunes

Conclusion

Fine-tuning GPT-3 and other LLMs has proven to be an excellent solution for domain-specific tasks, often improving performance many-fold. The model is extremely effective at picking up the patterns in its training data and performs well even with minimal training examples.

As LLMs become larger, more accessible, and increasingly open source, we can expect fine-tuning to become ubiquitous in natural language processing, with the potential to tackle almost any NLP task under the sun.

Hope you enjoyed learning how to fine-tune GPT-3 and improve your Virtual Assistant. See you in the next one!

Follow me on LinkedIn at — https://www.linkedin.com/in/amoghagastya/ and feel free to connect at https://amagastya.com
