Better Programming

Advice for programmers.

Build an Article Recommendation Engine With AI/ML

Tyler Hawkins
Published in Better Programming
6 min read · Aug 19, 2021
Photo by Markus Winkler on Unsplash

Content platforms thrive on suggesting related content to their users. The more relevant items the platform can provide, the longer the user will stay on the site, which often translates to increased ad revenue for the company.

If you’ve ever visited a news website, online publication, or blogging platform, you’ve likely been exposed to a recommendation engine. Each of these takes input based on your reading history and then suggests more content you might like.

As a simple solution, a platform might implement a tag-based recommendation engine — you read a “Business” article, so here are five more articles tagged “Business.” However, an even better approach to building a recommendation engine is to use similarity search and a machine learning algorithm.
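To make the tag-based baseline concrete, here is a minimal sketch of what such an engine might look like. The article IDs, titles, tags, and the `recommend_by_tag` helper are all hypothetical and exist only for illustration; they are not taken from the demo app.

```python
# Hypothetical toy catalogue: IDs, titles, and tags are made up for illustration.
ARTICLES = [
    {"id": 1, "title": "Markets rally on earnings", "tags": {"Business"}},
    {"id": 2, "title": "Startup funding slows", "tags": {"Business"}},
    {"id": 3, "title": "Local team wins final", "tags": {"Sports"}},
    {"id": 4, "title": "Central bank holds rates", "tags": {"Business", "Politics"}},
    {"id": 5, "title": "Election results are in", "tags": {"Politics"}},
]


def recommend_by_tag(read_ids, articles, limit=5):
    """Return up to `limit` unread articles that share a tag with the reading history."""
    read = set(read_ids)

    # Collect every tag that appears on an article the user has already read.
    history_tags = set()
    for article in articles:
        if article["id"] in read:
            history_tags |= article["tags"]

    # Keep unread articles whose tags overlap the history tags, in catalogue order.
    matches = [
        article
        for article in articles
        if article["id"] not in read and article["tags"] & history_tags
    ]
    return matches[:limit]


if __name__ == "__main__":
    for article in recommend_by_tag([1], ARTICLES):
        print(article["id"], article["title"])
```

This approach is easy to build, but it only knows about exact tag matches. Two articles on the same topic with different tags never get connected, which is the gap that similarity search aims to close.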

In this article, we’ll build a Python Flask app that uses Pinecone — a similarity search service — to create our very own article recommendation engine.

Demo App Overview

Below, you can see a brief animation of how our demo app works. Ten articles are initially displayed on the page. The user can choose any combination of those ten articles to represent their reading history. When the user clicks the Submit button, the reading history is used as input to query the article database, and then ten more related articles are displayed to the user.
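The flow described above (reading history in, related articles out) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the embeddings are made-up three-dimensional vectors, averaging the history into a single query vector is one plausible choice rather than the app's confirmed method, and the brute-force scan stands in for a call to a similarity search service such as Pinecone.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def recommend(history_ids, embeddings, top_k=10):
    """Rank unread articles by similarity to the averaged reading history."""
    if not history_ids:
        return []

    # Assumption: the reading history is collapsed into one query vector by averaging.
    query = mean_vector([embeddings[article_id] for article_id in history_ids])

    # Brute-force scan over every unread article; a vector index would replace this.
    read = set(history_ids)
    scored = [
        (article_id, cosine_similarity(query, vector))
        for article_id, vector in embeddings.items()
        if article_id not in read
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [article_id for article_id, _ in scored[:top_k]]


# Hypothetical embeddings: real ones would come from a trained model.
EMBEDDINGS = {
    "biz-1": [0.9, 0.1, 0.0],
    "biz-2": [0.8, 0.2, 0.1],
    "sport-1": [0.1, 0.9, 0.0],
    "sport-2": [0.0, 0.8, 0.2],
    "tech-1": [0.1, 0.1, 0.9],
}

if __name__ == "__main__":
    print(recommend(["biz-1"], EMBEDDINGS, top_k=2))
```

A linear scan like this is fine for a handful of vectors but gets slow as the catalogue grows, which is why a dedicated similarity search index is attractive at the scale of thousands of articles.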

Demo app — article recommendation engine

As you can see, the related articles returned closely match the selected reading history! Each of the ten articles is either selected or not, so there are 2^10 = 1,024 possible combinations of reading history that can be used as input in this example (counting the empty selection), and every combination produces meaningful results.

So, how did we do it?

In building the app, we first found a dataset of news articles from Kaggle. This dataset contains 143,000 news articles from 15 major publications, but we’re just using the first 20,000. (The full dataset that this one is derived from contains over two million articles!)
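Taking only the first 20,000 rows of a large CSV file can be done without reading the whole file into memory. The sketch below uses only the standard library; the `load_first_rows` helper and the `articles.csv` file name are hypothetical, and the actual file name and column names of the Kaggle dataset are not assumed here.

```python
import csv
from itertools import islice


def load_first_rows(csv_path, limit=20_000):
    """Read at most `limit` data rows from a CSV file with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        reader = csv.DictReader(csv_file)
        # islice stops after `limit` rows, so the rest of the file is never parsed.
        return list(islice(reader, limit))


if __name__ == "__main__":
    # "articles.csv" is a placeholder path, not the dataset's real file name.
    rows = load_first_rows("articles.csv")
    print(f"Loaded {len(rows)} articles")
```

Each row comes back as a dictionary keyed by the header names. If a dataset contains very long text fields, the `csv` module's default field size limit may need to be raised with `csv.field_size_limit`.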

Written by Tyler Hawkins

Staff software engineer. Continuous learner. Educator. http://tylerhawkins.info