Build a Spam Checker With OpenAI and GPT-3

A simple tutorial to create a spam filter using the fine-tuning API

Paulo Taylor

Published in Better Programming

3 min read · Nov 7, 2022

By default, the OpenAI API gives you several AI models (or engines), each suited to different use cases.

The fine-tuning feature lets you take one of OpenAI’s models, supply it with new training data, and build a new “fine-tuned” model.
It’s this fine-tuning feature that we’ll use to build our spam checker.

But first, to build our fine-tuned model, we’ll need some training data.
At Call Assistant, we can use the data from existing robocalls that our users have screened.

We’ll need examples of telemarketers and robocalls as well as legitimate calls, so that the model can better classify text as spam or not spam.

Here’s an example of a telemarketer trying to sell a user some kind of credit solution:

Hi there, this is Sarah again with the credit Pros. I’ve tried calling you a few times with no luck on connecting. We’ve helped hundreds of thousands of people improve their credit, and we’d love to help you as well. So give me a call back at this number as soon as you can. Looking forward to chatting. Thanks.

And here’s an example of a legitimate call:

Hi, I'm Spencer from a children's dentist Dr. Porter's office regarding Mr. Smith. He's gonna be due for his hygiene visit next month. Do you like to schedule that appointment? Please? Give us a call at xxx–xxx–xxx. Once again, our phone number is xxx-xxx-xxx. Have a wonderful day. Bye.

We’ll need to add a separator between the prompt and the completion (the expected result). For this example, we’ll use \n\n###\n\n as suggested by OpenAI in its tutorials. The training data must then be uploaded as a JSONL file: one JSON object per line, each with a prompt field (the transcript followed by the separator) and a completion field (the label). You should add as many examples as possible.
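As a minimal sketch of what such a file could look like, the following Python turns labeled transcripts into JSONL. It assumes the prompt/completion layout of OpenAI's fine-tuning data format at the time of writing. The helper names, the "ham" label for legitimate calls, and the shortened transcripts are illustrative assumptions, not part of the original data.

```python
import json

SEPARATOR = "\n\n###\n\n"  # prompt/completion separator named in the article


def to_record(transcript: str, label: str) -> dict:
    """Build one training example in the prompt/completion layout."""
    return {
        "prompt": transcript.strip() + SEPARATOR,
        # The leading space follows OpenAI's data-preparation guidance for
        # completions; the label names themselves are assumptions.
        "completion": " " + label,
    }


def to_jsonl(examples) -> str:
    """Serialize (transcript, label) pairs as JSON Lines, one object per line."""
    return "".join(
        json.dumps(to_record(text, label)) + "\n" for text, label in examples
    )


# Shortened versions of the two transcripts quoted above.
EXAMPLES = [
    ("Hi there, this is Sarah again with the credit Pros. We'd love to "
     "help you improve your credit. Give me a call back.", "spam"),
    ("Hi, I'm Spencer from Dr. Porter's office regarding Mr. Smith. He's "
     "due for his hygiene visit next month.", "ham"),
]

if __name__ == "__main__":
    # Print the JSONL so it can be redirected into a file.
    print(to_jsonl(EXAMPLES), end="")
```

Redirecting the output to file.jsonl would give a file with one training example per line. A real training set would need far more than two examples.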

After compiling the file, we need to start the fine-tuning process. You’ll need a lot of data, the OpenAI CLI, and an API key:

openai api -k sk-YOUR_KEY fine_tunes.create -t file.jsonl -m ada

Depending on the model and the amount of training data, it may take some time until the fine-tuning is complete. To track the progress, you can use this command:

openai api -k sk-YOUR_KEY fine_tunes.follow -i ft-aBcDeFgHiJkLmNoP
...
[18:26:40] Fine-tune enqueued. Queue number: 0
[18:26:40] Fine-tune is in the queue. Queue number: 0
[18:29:14] Fine-tune started
[18:30:52] Completed epoch 1/4
[18:32:11] Completed epoch 2/4
[18:33:30] Completed epoch 3/4
[18:34:49] Completed epoch 4/4
[18:35:08] Uploaded model: ada:ft-x-xxxx-xx-xx
[18:35:09] Uploaded result file: file-aBcDeFgHiJ
[18:35:09] Fine-tune succeeded

We now have our new model ready to use. In the following example, I’m using a sentence similar to the telemarketer transcript above, and the model classifies the text as spam:

openai api -k sk-YOUR_KEY completions.create -m ada:ft-x-xxxx-xx-xx -M 4 -p "Hello, this is John from Finance Plus. I've called before,  We've helped other individuals like you improve their credit. Please give me a call later.###"

The reply would be something like this:

Hello, this is John from Finance Plus. I've called before,  We've helped other individuals like you improve their credit. Please give me a call later.###spam
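The same check could be sketched in Python roughly as follows. This is an illustration under assumptions, not the code used at Call Assistant. It posts to the completions endpoint that was available at the time of writing and uses only the standard library. The function names are hypothetical, and the separator mirrors the one in the command above (it should match whatever the training prompts ended with). Unlike the CLI output, the API response is expected to contain only the generated text, without the prompt.

```python
import json
import urllib.request

SEPARATOR = "###"  # mirrors the end of the prompt in the command above
API_URL = "https://api.openai.com/v1/completions"  # legacy completions endpoint


def build_prompt(transcript: str) -> str:
    """Append the separator so the model knows where the label starts."""
    return transcript.strip() + SEPARATOR


def openai_complete(prompt: str, model: str, api_key: str) -> str:
    """Call the completions endpoint and return only the generated text."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": 4,   # same limit as -M 4 in the CLI call
        "temperature": 0,  # assumption: deterministic output for classification
    }).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["choices"][0]["text"]


def is_spam(transcript: str, complete) -> bool:
    """Classify a transcript; `complete` maps a prompt to completion text."""
    text = complete(build_prompt(transcript))
    return text.strip().lower().startswith("spam")
```

A call such as is_spam(text, lambda p: openai_complete(p, "ada:ft-x-xxxx-xx-xx", "sk-YOUR_KEY")) would then return True for a spam transcript. Passing the completion function as an argument keeps the classification logic testable without network access.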

If you use Java you can try something like this

Performance-wise, a call to the completion API seems to take about 500–900 milliseconds, but in my experience, the more you use it, the faster it becomes.

Using this approach with GPT-3, we’re able to scan messages for spam while screening calls and notify our Call Assistant users in real time that they’re dealing with a spam call.

Thanks for reading.


Responses (2)

Morgan

Mar 16, 2023

How does it do out of the box?

Rodavitt

Feb 28, 2023

how much data do you recommend for such an implementation?
