Finding interesting bits of data in a DataFrame is often easier if you change the order of the rows. You can sort the rows by passing a column name to sort_values().
In cases where rows have the same value (this is more common if you sort on a categorical variable), you may want to break the ties by sorting on another column. You can sort on multiple columns in this way by passing a list of column names.
Modifying the Order of Rows
You can change the rows’ order by sorting them so that the most interesting data is at the top of the DataFrame.
For example, when we apply sort_values() on the weight_kg column of the dogs DataFrame, we get the lightest dog at the top, Stella the Chihuahua, and the heaviest dog at the bottom, Bernie the Saint Bernard.
Setting the ascending argument to False sorts the data the other way round, from heaviest to lightest dog.
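As a minimal sketch of the two sorts described above, we can build a small stand-in for the dogs DataFrame. The names and breeds of Stella and Bernie come from the text; the other breeds and all of the weights are made-up values chosen only to match the ordering described, not the original data.

```python
import pandas as pd

# Illustrative stand-in for the dogs DataFrame (values are assumptions)
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Chihuahua", "Saint Bernard"],
    "weight_kg": [24, 24, 24, 2, 74],
})

# Default is ascending order: lightest dog first
lightest_first = dogs.sort_values("weight_kg")
print(lightest_first)

# ascending=False reverses the order: heaviest dog first
heaviest_first = dogs.sort_values("weight_kg", ascending=False)
print(heaviest_first)
```

Note that sort_values() returns a new DataFrame and leaves dogs itself unchanged.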
Sorting by Multiple Values
We can sort by multiple variables by passing a list of column names to sort_values(). Here, we sort first by weight, then by height. Now, Charlie, Lucy, and Bella are ordered from shortest to tallest, even though they all weigh the same.
To change the direction in which values are sorted, pass a list to the ascending argument to specify the direction for each variable. With weight ascending and height descending, Charlie, Lucy, and Bella are now ordered from tallest to shortest.
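The multi-column sorts can be sketched as follows. The column name height_cm and all of the numbers are assumptions made for illustration; they are chosen only so that Charlie, Lucy, and Bella tie on weight, as the text describes.

```python
import pandas as pd

# Illustrative stand-in for the dogs DataFrame (values are assumptions)
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Stella", "Bernie"],
    "height_cm": [56, 43, 46, 18, 77],
    "weight_kg": [24, 24, 24, 2, 74],
})

# Sort by weight first; ties on weight are broken by height (both ascending)
by_weight_then_height = dogs.sort_values(["weight_kg", "height_cm"])
print(by_weight_then_height)

# One direction per column: weight ascending, height descending
mixed = dogs.sort_values(["weight_kg", "height_cm"], ascending=[True, False])
print(mixed)
```

The list passed to ascending must have the same length as the list of column names, with each entry applying to the column in the same position.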
In the following example, we will sort homelessness by the number of homeless individuals, from smallest to largest, and save this as homelessness_ind. Finally, we will print the head of the sorted DataFrame.
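A minimal sketch of this step might look like the following. The column name individuals is an assumption, and the rows are placeholders rather than real figures; with the actual homelessness DataFrame, only the last two statements would be needed.

```python
import pandas as pd

# Placeholder data: the "individuals" column name and all values are assumptions
homelessness = pd.DataFrame({
    "state": ["State A", "State B", "State C", "State D"],
    "individuals": [2570, 434, 7259, 1434],
})

# Sort from smallest to largest number of homeless individuals
homelessness_ind = homelessness.sort_values("individuals")

# Show the first rows of the sorted DataFrame
print(homelessness_ind.head())
```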
When we run the above code, it outputs the following result: