Run S3 Locally With MinIO for the DVC Machine Learning Pipeline
The cheapest and fastest way to start working with object storage.
Object storage is an abstraction layer above the file system that lets you work with data through an API. The best-known service is AWS S3, but there are many other solutions that can run on a private network. In this article, we will cover one of them: MinIO.
In my opinion, MinIO is the cheapest and fastest way to start working with object storage. It is S3-compatible, easy to deploy, simple to manage locally, and can scale up if needed. If your project is at an early stage, MinIO may come in handy.
Install MinIO Server
In this article, we will deploy MinIO and perform some simple tasks with it. There are many different ways to run it, but we will build a Docker container and run MinIO inside it.
To build the container we will use docker-compose. Here I was inspired by the excellent kafka-to-s3 repo, where MinIO was used to mock S3 as a sink for data from Kafka Connect. The docker-compose.yml file is defined as follows:
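The original manifest was embedded in the source article; a minimal sketch of an equivalent file is shown below. The release tag, the credentials, and the bucket name demo-bucket are placeholder assumptions, so adjust them to your setup.

```yaml
version: "3.7"

services:
  # MinIO server: serves the S3-compatible API and the web UI on port 9000
  minio:
    image: minio/minio:RELEASE.2021-02-19T04-38-02Z   # pin a release version
    ports:
      - "9000:9000"
    environment:
      MINIO_ACCESS_KEY: minio_access_key              # login for the UI (placeholder)
      MINIO_SECRET_KEY: minio_secret_key              # password for the UI (placeholder)
    command: server /data

  # One-shot AWS CLI container: waits for MinIO, then creates a bucket
  aws:
    image: amazon/aws-cli
    depends_on:
      - minio
    environment:
      AWS_ACCESS_KEY_ID: minio_access_key
      AWS_SECRET_ACCESS_KEY: minio_secret_key
      AWS_DEFAULT_REGION: eu-west-1
    entrypoint: >
      /bin/sh -c "
      until (aws --endpoint-url http://minio:9000 s3 ls) do echo 'waiting for minio...' && sleep 1; done;
      aws --endpoint-url http://minio:9000 s3 mb s3://demo-bucket;
      exit 0;
      "
```

The aws service here is a one-shot container: it polls the MinIO endpoint until the server answers, creates the bucket, and exits.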
Here we define two services: minio and aws. In minio we pin the release version to install and set the command that runs the server, which lets us access the MinIO UI through the exposed port 9000. We also use the aws-cli image to build the S3 storage locally: it reaches MinIO at the URL http://minio:9000 and sets the region eu-west-1. Finally, we define the access and secret keys used to log in to the UI after the server starts.
After saving the manifest, we run docker-compose up in the same directory. Once the images are downloaded and built, we will see log messages confirming that minio is up and running and that the local S3 storage is ready. By going to http://localhost:9000 and using the login and password defined earlier, we can access the UI.
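As a quick sanity check from the host, you can also talk to MinIO with the plain AWS CLI by pointing it at the local endpoint. This is a sketch that assumes the placeholder credentials and the demo-bucket bucket from the manifest above:

```sh
# Use the same credentials and region as in docker-compose.yml (placeholders)
export AWS_ACCESS_KEY_ID=minio_access_key
export AWS_SECRET_ACCESS_KEY=minio_secret_key
export AWS_DEFAULT_REGION=eu-west-1

# List buckets served by MinIO on localhost:9000
aws --endpoint-url http://localhost:9000 s3 ls

# Copy a test file into the bucket and list its contents
echo "hello" > test.txt
aws --endpoint-url http://localhost:9000 s3 cp test.txt s3://demo-bucket/
aws --endpoint-url http://localhost:9000 s3 ls s3://demo-bucket/
```

The only difference from talking to real S3 is the --endpoint-url flag; the same endpoint can later be given to tools like DVC when configuring an S3 remote.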