Python Select Columns

If we have a DataFrame & would like to select specific rows or columns from it, we can use square brackets or more advanced indexers such as loc & iloc.

Selecting Columns Using Square Brackets

Selecting a Column
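
As a minimal sketch, assuming a small brics DataFrame like the one below (its row labels, the area column, and all values are illustrative stand-ins, not the original data), a single column can be selected by putting its label in square brackets:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Single square brackets with one column label select that column.
country = brics["country"]
print(country)
```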

Checking the Type of the Object

Let’s check the type of the object that gets returned with the type function.
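
A short sketch of that check, again using an assumed stand-in for brics (the row labels and values are illustrative):

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# type() reports the class of the object returned by single brackets.
column_type = type(brics["country"])
print(column_type)  # <class 'pandas.core.series.Series'>
```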

The returned object is a pandas Series! A Series can be thought of as a one-dimensional array that is labeled, just like a DataFrame.

If we want to select data & keep it in a DataFrame, we will need to use double square brackets.
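
The double-bracket form could look like this; the brics DataFrame here is an assumed stand-in with illustrative labels and values:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# The inner brackets build a list of labels; the outer brackets do the selection.
country_df = brics[["country"]]
print(country_df)
print(type(country_df))  # <class 'pandas.core.frame.DataFrame'>
```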

If we check the type of this output, it’s a DataFrame, although one with only a single column.

Selecting Multiple Columns

We can extend this call to select two columns. Let’s try to select country & capital.

If we look at this closely, we are actually putting a list of column labels inside another set of square brackets, & we end up with a sub-DataFrame containing only the country & capital columns.
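
A minimal sketch of the two-column selection, assuming a stand-in brics DataFrame whose area column, row labels, and values are illustrative:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# A list with two labels keeps both columns, in the order given.
subset = brics[["country", "capital"]]
print(subset)
```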

Selecting Rows Using Square Brackets

Example

With square brackets, we can only select rows by specifying a slice, like 0:4. Note that the slice here uses the integer positions of the rows, not the row labels!

To get the second, third, & fourth rows of the brics DataFrame, we use the slice 1:4. Remember that the end of the slice is exclusive & that the index starts at zero.
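
The slice described above might be written as follows; brics is an assumed stand-in with illustrative labels and values:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Positions 1, 2 and 3 are returned; position 4 is excluded.
middle_rows = brics[1:4]
print(middle_rows)
```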

These square brackets work, but they offer only limited functionality. Ideally, we would want something similar to 2D NumPy arrays, where we also use square brackets: the index or slice before the comma refers to the rows, & the index or slice after the comma refers to the columns.

For example, in a 2D NumPy array called arr, arr[1:3, 0:2] selects the second & third rows and the first two columns.
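
A small sketch of NumPy's row-comma-column indexing; the array name arr and its contents are made up for illustration:

```python
import numpy as np

# A hypothetical 3x3 array used only to illustrate the indexing syntax.
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Before the comma: rows 1 and 2. After the comma: columns 0 and 1.
block = arr[1:3, 0:2]
print(block)

# A bare colon means "everything" along that axis: all rows, first column.
first_column = arr[:, 0]
print(first_column)
```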

If we want to do something similar with pandas, we need to look at the loc & iloc indexers.

  • loc: label-based
  • iloc: integer position-based

loc Indexer

To achieve this, we will put the label of interest in square brackets after loc.

Selecting Rows

Selecting a single row label this way returns a pandas Series containing all of the row’s information; inconveniently, though, it is displayed vertically, with one value per line. To get a DataFrame, we have to put the "RU" string inside another pair of square brackets. We can also select multiple rows at the same time: to include India & China as well, simply add their row labels to the list.

The difference between using loc & basic square brackets is that with loc we can extend the selection with a comma & a specification of the columns of interest.
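
These three row selections could be sketched as follows. Only the RU label appears in the text; the other row labels and all values in this stand-in brics DataFrame are assumptions:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# A single label returns the row as a Series.
russia_series = brics.loc["RU"]
print(russia_series)

# A one-element list of labels returns a one-row DataFrame.
russia_df = brics.loc[["RU"]]
print(russia_df)

# A longer list of labels returns several rows.
three_rows = brics.loc[["RU", "IN", "CH"]]
print(three_rows)
```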

Selecting Rows & Columns

Let’s extend the previous call to only include the country & capital columns. We add a comma & list the column labels we want to keep. The intersection gets returned.

We can also use loc to select all rows but only specific columns. Simply replace the first list, which specifies the row labels, with a colon: a slice going from beginning to end. This time, we get back all of the rows but only two columns.
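
A sketch of selecting rows & columns together with loc, assuming a stand-in brics DataFrame (row labels other than RU, the area column, and all values are illustrative):

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Row labels before the comma, column labels after it.
selection = brics.loc[["RU", "IN", "CH"], ["country", "capital"]]
print(selection)
```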

Selecting All Rows & Specific Columns
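
A minimal sketch of the colon form with loc, using an assumed stand-in for brics with illustrative labels and values:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# The colon keeps every row; the list keeps only the named columns.
all_rows = brics.loc[:, ["country", "capital"]]
print(all_rows)
```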

iloc Indexer

Selecting Rows

Let’s use the same data & similar examples as we did for loc. Let’s start by getting the row for Russia, which sits at integer position 1.

To get the rows for Russia, India, & China, we can use a list of the integer positions 1, 2, & 3.
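
With iloc, the Russia row might be fetched like this. The stand-in brics DataFrame below is an assumption; it places Russia at position 1, consistent with the positions 1, 2, 3 used for Russia, India, & China:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# An integer position returns the row as a Series.
russia_series = brics.iloc[1]
print(russia_series)

# A one-element list of positions returns a one-row DataFrame.
russia_df = brics.iloc[[1]]
print(russia_df)
```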
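
A sketch of the multi-row version, again on an assumed stand-in for brics with illustrative labels and values:

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# A list of integer positions selects the rows at those positions.
three_rows = brics.iloc[[1, 2, 3]]
print(three_rows)
```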

Selecting Rows & Columns

Similar to loc, we can also select both rows & columns using iloc. Here, we will select rows for Russia, India, & China and columns country & capital.
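
This could look as follows, assuming a stand-in brics DataFrame in which country & capital are the first two columns (the area column, row labels, and values are illustrative):

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Row positions before the comma, column positions after it.
selection = brics.iloc[[1, 2, 3], [0, 1]]
print(selection)
```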

Selecting All Rows & Specific Columns

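Finally, if we want to select all rows but keep only the country & capital columns, we can use a colon for the rows & a list of integer positions for the columns.

loc & iloc are pretty similar. The only difference is how we refer to rows & columns: loc uses labels, while iloc uses integer positions.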
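
A minimal sketch of that call, assuming a stand-in brics DataFrame where country & capital sit at column positions 0 & 1 (labels and values are illustrative):

```python
import pandas as pd

# Illustrative stand-in for the brics DataFrame; labels and values are assumptions.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# The colon keeps every row; [0, 1] keeps the first two columns.
all_rows = brics.iloc[:, [0, 1]]
print(all_rows)
```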

Interactive Example on Selecting a Subset of Data

The single bracket version gives a Pandas Series; the double bracket version gives a Pandas DataFrame.

  • We will use single square brackets to print out the country column of cars as a Pandas Series.
  • Then use double square brackets to print out the country column of cars as a Pandas DataFrame.
  • Finally, use the double square brackets to print out a DataFrame with both the country & drives_right columns of cars, in this order.
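
The three steps above can be sketched as follows. The cars DataFrame is a hypothetical stand-in: only the country & drives_right column names come from the text, while the cars_per_cap column, the row labels, and all values are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-in for cars; cars_per_cap, labels, and values are assumptions.
cars = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588],
        "country": ["United States", "Australia", "Japan"],
        "drives_right": [True, False, False],
    },
    index=["US", "AUS", "JPN"],
)

# Single brackets: the country column as a Series.
country_series = cars["country"]
print(country_series)

# Double brackets: the country column as a DataFrame.
country_df = cars[["country"]]
print(country_df)

# Double brackets with two labels: both columns, in the order listed.
country_drives = cars[["country", "drives_right"]]
print(country_drives)
```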

