Better Programming

3 Pandas Functions To Group and Aggregate Data

Import data and do both simple and multiple aggregations

Published in Better Programming · 7 min read · May 3, 2021

Flowers — Photo by John-Mark Smith on Unsplash.

When you work with data in Python, one library will almost never leave your side: pandas. It is a powerful and intuitive open source library that provides data structures for working with labeled, tabular datasets.

There are two principal data structures:

Series for one-dimensional arrays.
DataFrame for two-dimensional tables that contain rows and columns.
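
As a quick illustration of the two structures, here is a minimal sketch. The variable names and values are made up for this example and are not from the article's dataset.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series([24.0, 21.6, 34.7], name="price")

# A DataFrame is a two-dimensional table; each column is itself a Series.
houses = pd.DataFrame({"rooms": [6, 6, 7], "price": [24.0, 21.6, 34.7]})

print(prices.ndim, prices.shape)        # 1 (3,)
print(houses.ndim, houses.shape)        # 2 (3, 2)
print(type(houses["price"]).__name__)   # Series
```

Selecting a single column from a DataFrame gives back a Series, which is why most of the grouping code later operates on one column at a time.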

In this article, I will focus on the most useful functions for splitting a dataset into groups. Once the data is grouped, you can compute statistics for each group, such as the average, standard deviation, maximum, and minimum.

You’ll learn to use the apply, cut, groupby, and agg functions. They help you gain new insights into the data, especially when you pair the results with graphical representations.
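
To give a rough idea of what grouping and aggregating look like, here is a minimal sketch on a toy DataFrame. The column names `size` and `price` and their values are hypothetical, and this is not the author's own code.

```python
import pandas as pd

# Hypothetical toy data: one categorical column and one numeric column.
df = pd.DataFrame({
    "size": ["small", "small", "large", "large"],
    "price": [10.0, 20.0, 30.0, 50.0],
})

# groupby splits the rows by category; a single statistic is a simple aggregation.
mean_price = df.groupby("size")["price"].mean()

# agg accepts a list of statistics and returns one column per statistic.
summary = df.groupby("size")["price"].agg(["mean", "min", "max"])

print(mean_price)
print(summary)
```

The result of `agg` is indexed by the group labels, so a single value can be read with `summary.loc["large", "max"]`.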

Table of contents:

1. Import data
2. Simple aggregations
3. Multiple aggregations

1. Import Data

Let’s import the libraries and the dataset. We’ll use the Boston house prices dataset that ships with the scikit-learn (sklearn) library.
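
The article's original import code is not reproduced here. Note also that `sklearn.datasets.load_boston` was removed in scikit-learn 1.2, so it is no longer available in recent versions. The sketch below therefore builds a stand-in DataFrame instead of loading the real data: it assumes a few Boston-style column names (`CRIM`, `RM`, `AGE`, `MEDV`), and the values are random placeholders, not real measurements.

```python
import numpy as np
import pandas as pd

# NOTE: placeholder data only. The column names mirror the Boston dataset,
# but the values are random and carry no real meaning.
rng = np.random.default_rng(0)
n_rows = 100

df = pd.DataFrame({
    "CRIM": rng.uniform(0.0, 10.0, n_rows),   # per-capita crime rate
    "RM": rng.uniform(4.0, 9.0, n_rows),      # average rooms per dwelling
    "AGE": rng.uniform(0.0, 100.0, n_rows),   # share of older units (%)
    "MEDV": rng.uniform(5.0, 50.0, n_rows),   # median home value ($1000s)
})

print(df.head())
print(df.dtypes)
```

Every column is numeric, which matches the situation the article describes next.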

Table with data

This DataFrame contains only numeric features, but we need categorical variables to split the dataset into groups. We’ll therefore create these categorical variables from the descriptive statistics of the dataset.
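
One way to derive categorical variables from descriptive statistics is sketched below. It is an assumption about the approach, not the author's exact code: the `rooms` column and its values are hypothetical, and the choice of the median as a bin edge and the mean as a threshold is only for illustration.

```python
import pandas as pd

# Hypothetical values for a single numeric feature (not the real dataset).
df = pd.DataFrame({"rooms": [4.0, 5.0, 6.0, 7.0, 8.0]})

# describe() returns count, mean, std, min, the quartiles, and max.
stats = df["rooms"].describe()

# cut bins the values; here the edges are the minimum, median, and maximum.
# include_lowest=True keeps the minimum value inside the first bin.
df["rooms_cat"] = pd.cut(
    df["rooms"],
    bins=[stats["min"], stats["50%"], stats["max"]],
    labels=["low", "high"],
    include_lowest=True,
)

# apply can build a label too, here by comparing each value with the mean.
df["above_mean"] = df["rooms"].apply(
    lambda x: "yes" if x > stats["mean"] else "no"
)

print(df)
```

Because the bins are closed on the right by default, a value equal to the median falls into the lower bin.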


Written by Eugenia Anello

Data Scientist | Top 1500 Writer on Medium | Love to share Data Science articles| https://www.linkedin.com/in/eugenia-anello
