Finding interesting bits of data in a DataFrame is often easier if you change the rows’ order. You can sort the rows by passing a column name to .sort_values().

In cases where rows have the same value (this is more common if you sort on a categorical variable), you may want to break the ties by sorting on another column. You can sort on multiple columns in this way by passing a list of column names.

Modifying the Order of Rows

You can change the rows’ order by sorting them so that the most interesting data is at the top of the DataFrame.

For example, when we call sort_values() on the weight_kg column of the dogs DataFrame, the rows are sorted in ascending order by default. The lightest dog, Stella the Chihuahua, ends up at the top, & the heaviest dog, Bernie the Saint Bernard, at the bottom.
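As a minimal sketch of this sort, the snippet below builds a small stand-in for the dogs DataFrame. Only the names, the two breeds mentioned above, and the weight_kg column come from the text; the other breeds, the height_cm column name, and every numeric value are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the dogs DataFrame; all values are assumed
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "Saint Bernard"],
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

# Sort by a single column; ascending order is the default
lightest_first = dogs.sort_values("weight_kg")
print(lightest_first)
```

Note that sort_values() returns a new DataFrame and leaves dogs itself unchanged.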

Setting the ascending argument to False will sort the data the other way round, from heaviest to lightest dog.
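A short sketch of the descending sort, using a reduced, hypothetical version of the dogs DataFrame (the weights are assumed values chosen only for illustration):

```python
import pandas as pd

# Hypothetical sample rows; the weights are assumed
dogs = pd.DataFrame({
    "name": ["Bella", "Cooper", "Stella", "Bernie"],
    "weight_kg": [24, 17, 2, 74],
})

# ascending=False reverses the order: heaviest dog first
heaviest_first = dogs.sort_values("weight_kg", ascending=False)
print(heaviest_first)
```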

Sorting by Multiple Values

We can sort by multiple variables by passing a list of column names to sort_values. Here, we sort first by weight, then by height. Now, Charlie, Lucy, & Bella are ordered from shortest to tallest, even though they all weigh the same.
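The tie-breaking behaviour can be sketched as follows. The height_cm column name and all numbers are assumptions; the only detail taken from the text is that Charlie, Lucy, & Bella share the same weight.

```python
import pandas as pd

# Hypothetical sample rows; three dogs share the same (assumed) weight
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Stella"],
    "height_cm": [56, 43, 46, 49, 18],
    "weight_kg": [24, 24, 24, 17, 2],
})

# Sort by weight first; ties on weight are broken by height
by_weight_then_height = dogs.sort_values(["weight_kg", "height_cm"])
print(by_weight_then_height)

# Names of the dogs tied on weight, in their sorted order
tied_names = by_weight_then_height.loc[
    by_weight_then_height["weight_kg"] == 24, "name"
].tolist()
```

The order of the list matters: the first column is the primary sort key, and later columns are consulted only where earlier ones tie.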

To change the sort direction for individual columns, pass a list of booleans to the ascending argument, one per column & in the same order as the column names. With height sorted in descending order, Charlie, Lucy, & Bella are now ordered from tallest to shortest.
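A sketch of mixed sort directions, again on hypothetical data. Sorting weight in ascending order here is an assumption; the text only establishes that height is sorted from tallest to shortest.

```python
import pandas as pd

# Hypothetical sample rows; three dogs share the same (assumed) weight
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Stella"],
    "height_cm": [56, 43, 46, 49, 18],
    "weight_kg": [24, 24, 24, 17, 2],
})

# One boolean per column: weight ascending, height descending
mixed_order = dogs.sort_values(
    ["weight_kg", "height_cm"], ascending=[True, False]
)
print(mixed_order)

# Names of the dogs tied on weight, now tallest first
tied_names = mixed_order.loc[mixed_order["weight_kg"] == 24, "name"].tolist()
```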

Example

In the following example, we will sort homelessness by the number of homeless individuals, from smallest to largest, & save this as homelessness_ind. Then we will print the head of the sorted DataFrame.
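The steps described here might look like the sketch below. The real homelessness dataset is not shown in this article, so the DataFrame is a placeholder: the column names state and individuals are assumptions, and the rows are made-up values rather than real figures.

```python
import pandas as pd

# Placeholder data; column names and all values are made up for illustration
homelessness = pd.DataFrame({
    "state": ["State A", "State B", "State C", "State D", "State E", "State F"],
    "individuals": [2570, 434, 7259, 1016, 109008, 3250],
})

# Sort by the number of homeless individuals, smallest to largest
homelessness_ind = homelessness.sort_values("individuals")

# head() returns the first five rows of the sorted DataFrame
print(homelessness_ind.head())
```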
