How To Use BigQuery ML on Google Cloud’s Vertex AI
Entertain the idea of BigQuery ML for image classification
Vertex AI Tutorial Series
- A Step-by-Step Guide to Training a Model on Google Cloud’s Vertex AI
- A Step-by-Step Guide to Tuning a Model on Google Cloud’s Vertex AI
- How To Operationalize a Model on Google Cloud’s Vertex AI
- How To Use AutoML on Google Cloud’s Vertex AI
- How To Use BigQuery ML on Google Cloud’s Vertex AI (this article)
- How to Use Pipeline on Google Cloud’s Vertex AI
Background
We’ve covered many Vertex AI services so far — notebook, custom training, hypertune, experiment, dataset, AutoML, model, endpoint, etc. — in previous articles. We’re going to try something new in this article: BigQuery ML. The basic premise of BigQuery ML is very simple. BigQuery is often the centerpiece of many data analytics workflows, and people like to use it as their data warehouse. Since machine learning is all about data, why not bring ML to where the data is?
Concretely, BigQuery ML allows us to create and train models using SQL. We write a simple SQL statement to tell BigQuery which column is the label, which columns are the input features, and what kind of ML task we want to perform. BigQuery takes care of the model training and optimization. So it’s sort of like AutoML, but without having to explicitly load the data from somewhere else.
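To make the idea concrete, here is a minimal sketch of what such a statement could look like, built as a string in Python. The model name `my_dataset.cifar10_model`, the table name `my_dataset.cifar10_train`, the column name `label`, and the helper `build_create_model_sql` are all hypothetical placeholders, and `LOGISTIC_REG` is just one of the model types BigQuery ML offers, not necessarily the one used later in this article.

```python
def build_create_model_sql(model_name, table_name, label_col,
                           model_type="LOGISTIC_REG"):
    """Return a BigQuery ML CREATE MODEL statement as a string.

    All names passed in are illustrative; replace them with your own
    dataset, table, and label column.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        "OPTIONS (\n"
        # model_type tells BigQuery which kind of ML task to perform.
        f"  model_type = '{model_type}',\n"
        # input_label_cols names the label column.
        f"  input_label_cols = ['{label_col}']\n"
        ") AS\n"
        # Every selected column other than the label is used as a feature.
        "SELECT *\n"
        f"FROM `{table_name}`"
    )


# Hypothetical names, for illustration only.
sql = build_create_model_sql(
    model_name="my_dataset.cifar10_model",
    table_name="my_dataset.cifar10_train",
    label_col="label",
)
print(sql)
```

The generated statement could then be pasted into the BigQuery console or submitted through a BigQuery client. The label, the features, and the task type are all declared in a single query, which is the whole appeal of the approach.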
We’ll continue our story of image classification on the CIFAR10 dataset, which contains 60,000 32x32 images in ten classes. BigQuery ML is better suited to tabular data, but let’s entertain the idea of using it for image classification anyway. The point of this article is the process, not the result.
Data Preparation
Obviously, to use BigQuery ML, we’ll have to load the data into BigQuery first. Let’s do a very simple and dumb thing here: We flatten the images and store the pixel values one per column. The images are of 32x32 resolution and have three color channels (RGB). So the total number of columns per image is 32x32x3=3,072. In addition, we’ll need to store the labels as a…