Transforming DataFrames with Apply and Lambda Functions

Create a DataFrame with a "numbers" column from 10,000 to 10,010 and transform it using apply with functions (like add_five and add_sets), then chain them or replace with lambda functions for concise transformations.

Learn how to efficiently transform data in a Pandas DataFrame using the apply function to run custom transformations on each item in a column. This article walks through practical examples using numeric ranges and both named and anonymous functions to streamline data manipulation tasks.

Key Insights

  • Use the Pandas apply method to apply a function to each value in a DataFrame column, returning a newly transformed column.
  • Chain multiple apply calls (e.g., .apply(add_five).apply(add_sets)) to perform sequential transformations on a single column in a clean and readable way.
  • Noble Desktop demonstrates how to simplify code further by using lambda (anonymous) functions within apply, reducing the need for multiple named functions.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Let's create a simple DataFrame. Actually, the very first thing I want to do is remove these prints. Leftover prints from earlier steps make your output very hard to read.

Okay, let's create a DataFrame with a column of numbers in the range 10,000 to 10,010. The reason is simply so that we can practice with apply. All right, so I'm just going to call it df.

There's nothing special here. There's no real data here. We call pandas.DataFrame, and we're going to pass in a simple dictionary to define our one column.

Our one column is numbers, and its value is a list made from range(10000, 10011). Of course, range excludes its end value, so this gives us 10,000 to 10,010. Then we can just evaluate df and see what it is. Great. It's a DataFrame with a numbers column that goes from 10,000 to 10,010.

Now we're going to use apply to run both functions on this one column, df["numbers"]. We're going to say df["numbers"] = df["numbers"].apply(...). What apply does is run a function you give it on every single value in that column, so you call .apply on a column.
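The DataFrame described above can be sketched as follows. This is a minimal version that assumes pandas is imported under the common alias pd, which this section does not show.

```python
import pandas as pd  # the alias "pd" is an assumption

# range's end value is exclusive, so 10011 is needed to include 10010
df = pd.DataFrame({"numbers": list(range(10000, 10011))})

print(df)
```

The result has 11 rows, one for each value from 10,000 through 10,010.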

It returns a new column in which that function has been called on every single value. It builds up this whole new column for you, and then we say, hey, df["numbers"] is now that column you've built up. So we can pass in add_five here. If I run this and then examine df again, here it is. Every single number in the numbers column has had the add_five function run on it. apply takes the return value from calling add_five on every item in this column and builds up a new column, and then we assign that column back to df["numbers"].
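Here is a small sketch of that step. The add_five function is defined earlier in the lesson and is not shown in this section, so the definition below is an assumption based on its name: it simply returns its argument plus five.

```python
import pandas as pd

def add_five(n):
    # Assumed definition; the original is not shown in this section
    return n + 5

df = pd.DataFrame({"numbers": list(range(10000, 10011))})

# apply calls add_five on each value and returns a new column,
# which we assign back over the original one
df["numbers"] = df["numbers"].apply(add_five)

print(df)
```

Note that the function is passed by name, without parentheses. apply is the one that calls it, once per value.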

We can also do that with the next one: df["numbers"] = df["numbers"].apply(add_sets). Now yours will be a little different from mine because of the randomization here, but here we go.
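The same pattern works for the second function. The real add_sets is defined earlier and is not shown here. All this section tells us is that it involves randomization, so the version below is a hypothetical stand-in that adds a random fraction, purely for illustration.

```python
import random

import pandas as pd

def add_sets(n):
    # Hypothetical stand-in: the real add_sets is not shown in this section.
    # The only known detail is that its result is randomized.
    return n + random.random()

df = pd.DataFrame({"numbers": list(range(10000, 10011))})

# Each run produces different values because of the random component
df["numbers"] = df["numbers"].apply(add_sets)

print(df)
```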

Now we have run the add_five function on each value and the add_sets function on each value. It has gone through and taken the return value from calling add_five, built up a new column, and then taken that column and applied add_sets to it. If you have multiple steps like this, you can also do the nice thing of saying .apply(add_five).apply(add_sets). Because the first call evaluates to a new column, that's a column we can call .apply on, and that's a nice way to do it.
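A sketch of the chained form follows. Both function bodies are assumptions, since neither definition appears in this section, and add_sets in particular is a hypothetical stand-in for a function that uses randomization.

```python
import random

import pandas as pd

def add_five(n):
    # Assumed definition, based on the function's name
    return n + 5

def add_sets(n):
    # Hypothetical stand-in for the randomized function in the lesson
    return n + random.random()

df = pd.DataFrame({"numbers": list(range(10000, 10011))})

# The first apply returns a new column, so the second apply
# can be called directly on its result
df["numbers"] = df["numbers"].apply(add_five).apply(add_sets)

print(df)
```

The calls run left to right, so add_five is applied to every value before add_sets sees any of them.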

And if you're applying several sets of predefined transformations to your data, this makes it very easy to do that. Okay, now that we've done that, our next step will be to use anonymous lambdas inside apply, so we don't have to come up with function names at all or spend lines of code on them.
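As a preview of that next step, here is one way the add-five transformation might look with a lambda. This is a sketch of the general idea rather than the exact code from the lesson.

```python
import pandas as pd

df = pd.DataFrame({"numbers": list(range(10000, 10011))})

# The lambda is an unnamed function defined inline; it does the same
# work as a named add-five function without a separate def
df["numbers"] = df["numbers"].apply(lambda n: n + 5)

print(df["numbers"].tolist())
```

A lambda suits short, one-off transformations. A named function is still the better choice when the logic is reused or needs more than a single expression.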
