Unzip and Gzip Incoming S3 Files With AWS Lambda
Easier, faster, and better

A while back, I encountered a scenario where the files arriving in S3 were zipped. Each zip archive contained five text or CSV files. For further processing, however, I needed to extract the contents and convert each file to gzip format. Given the large influx of files, unzipping and gzipping them manually was not feasible.
Automation
The best way to automate the process seemed to be AWS Lambda functions. If you head to the Properties tab of your S3 bucket, you can set up an Event Notification for all object “create” events (or just PutObject events). As the destination, you can select the Lambda function where you will write your code to unzip and gzip files.

Now, every time a new .zip file is added to your S3 bucket, the Lambda function will be triggered. You can also add a prefix or suffix filter to your event notification settings: a prefix if you only want to run the Lambda function when files are uploaded to a specific folder within the S3 bucket, or a .zip suffix so that other file types do not trigger it.
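If you prefer to script this setup rather than click through the console, the same notification can be expressed with boto3. The snippet below is only a sketch under a few assumptions: the helper names build_notification_config and apply_notification_config are hypothetical, and the bucket name and Lambda ARN are values you would supply yourself.

```python
def build_notification_config(lambda_arn, prefix="", suffix=".zip"):
    """Build the notification settings for a single Lambda destination."""
    rules = []
    if prefix:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix:
        rules.append({"Name": "suffix", "Value": suffix})

    config = {
        "LambdaFunctionArn": lambda_arn,
        # Fire on every object "create" event (PutObject, multipart upload, copy).
        "Events": ["s3:ObjectCreated:*"],
    }
    if rules:
        config["Filter"] = {"Key": {"FilterRules": rules}}
    return {"LambdaFunctionConfigurations": [config]}


def apply_notification_config(bucket, lambda_arn, prefix="", suffix=".zip"):
    """Apply the settings to a bucket (hypothetical helper, needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    s3 = boto3.client("s3")
    # Note: this call replaces the bucket's existing notification configuration.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn, prefix, suffix),
    )
```

Unlike the console, which grants the permission for you, a scripted setup also requires that S3 is allowed to invoke the Lambda function before the notification can be saved.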
The lambda function can then look like this:
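The listing below is a minimal sketch of such a function rather than a definitive implementation. It assumes Python with boto3 (bundled with the Lambda Python runtime), placeholder bucket names, and a hypothetical helper called unzip_and_gzip. The names sourcebucketname, destination_bucket, and final_file_path follow the parameters discussed in this article. It also assumes each archive is small enough to fit in the function's memory.

```python
import gzip
import io
import urllib.parse
import zipfile

# Placeholder bucket names; replace them with your own.
sourcebucketname = "my-source-bucket"
destination_bucket = "my-destination-bucket"


def unzip_and_gzip(s3, bucket, key):
    """Read one zip object from S3 and upload each member as a gzip object."""
    # Load the whole archive into an in-memory buffer.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    uploaded = []
    with zipfile.ZipFile(io.BytesIO(body)) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # skip directory entries inside the archive
            compressed = gzip.compress(archive.read(member))
            # Prepend a folder name here to upload into a specific folder.
            final_file_path = member.filename + ".gz"
            s3.put_object(
                Bucket=destination_bucket, Key=final_file_path, Body=compressed
            )
            uploaded.append(final_file_path)
    return uploaded


def lambda_handler(event, context):
    import boto3  # imported here so the helper can be tested without AWS access

    s3 = boto3.client("s3")
    uploaded = []
    for record in event["Records"]:
        # Object keys in S3 event notifications are URL-encoded.
        file_key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        uploaded.extend(unzip_and_gzip(s3, sourcebucketname, file_key))
    return {"uploaded": uploaded}
```

Keeping the zip-to-gzip logic in a helper that takes the S3 client as an argument makes it possible to try the code locally with a stand-in client before deploying it.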
When an event triggers this Lambda function, the function extracts the key of the file that caused the trigger. Using the file key, we then load the incoming zip file into a buffer, unzip it, and read each file individually.
Within the loop, each file inside the zip archive is compressed separately into gzip format and then uploaded to the destination S3 bucket.
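You can update the final_file_path parameter if you want to upload the files to a specific folder. Similarly, you can update parameters like sourcebucketname and destination_bucket according to your requirements. You can also upload the gzipped files to the same source bucket. In that case, make sure the event notification is restricted (for example, with a .zip suffix filter) so that the newly written gzip files do not trigger the function again.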