pip is the standard package manager used to install & maintain packages for Python. The Python standard library comes with a collection of built-in functions & built-in packages.

Data science packages like scikit-learn & statsmodels are NOT part of the Python standard library. They can be installed from the command line with pip.

Pip Documentation

Pip has a variety of commands & option flags designed to manage Python packages.

We can print the pip version with pip --version, the same way we print the Python version with python --version. It is important that the pip version is compatible with the Python version. In the example used here, pip 19.1.1 is compatible with Python 3.5.2.
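The same check can also be made from inside Python. The snippet below is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name pip_and_python_versions is made up for illustration.

```python
import sys
from importlib import metadata


def pip_and_python_versions():
    """Return (pip_version, python_version) as strings.

    pip_version is None when pip is not installed in this environment.
    """
    try:
        pip_version = metadata.version("pip")
    except metadata.PackageNotFoundError:
        pip_version = None
    # sys.version_info holds (major, minor, micro, ...); keep the first three.
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    return pip_version, python_version


pip_version, python_version = pip_and_python_versions()
print(f"pip: {pip_version}, Python: {python_version}")
```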

Upgrading Pip

If pip is giving us an upgrade warning, we can upgrade pip using pip itself by running python -m pip install --upgrade pip.
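If the upgrade has to be triggered from a script rather than typed by hand, pip's documentation recommends calling pip through subprocess with the current interpreter instead of importing it. Here is a rough sketch of that approach. The function name upgrade_pip and its dry_run parameter are assumptions for illustration, and by default the command is only built, not executed.

```python
import subprocess
import sys


def upgrade_pip(dry_run=True):
    """Build (and optionally run) the command that upgrades pip itself."""
    # sys.executable points at the running interpreter, so the pip that
    # gets upgraded is the one belonging to this Python installation.
    command = [sys.executable, "-m", "pip", "install", "--upgrade", "pip"]
    if not dry_run:
        subprocess.check_call(command)  # raises CalledProcessError on failure
    return command


print(upgrade_pip())
```

Calling upgrade_pip(dry_run=False) would actually run the upgrade, which needs network access.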

Viewing a Pip List

Before we make any installs, it is a good idea to see what is already installed. We can use pip list in the command line, & it will display the Python packages in our current working environment in alphabetical order.
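For a similar view from inside Python, the standard library can enumerate installed distributions. This is only a sketch of the idea, assuming Python 3.8 or later; it is not how pip list is implemented, and the function name installed_packages is hypothetical.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            found[name] = dist.metadata.get("Version")
    # Sort case-insensitively so "numpy" and "NumPy"-style names interleave.
    return sorted(found.items(), key=lambda item: item[0].lower())


for name, version in installed_packages():
    print(f"{name:<30} {version}")
```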

Installing the scikit-learn Package

To install the scikit-learn package, we run pip install scikit-learn in the command line. This installs scikit-learn along with its necessary dependencies.

We may notice from the logs that more than the scikit-learn package is being installed. This is because pip will install any other packages that scikit-learn depends on. These other packages are called dependencies.
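Once a package is installed, we can inspect the dependencies it declares. The sketch below assumes Python 3.8 or later, and the helper name declared_dependencies is made up for illustration. The returned strings may include optional requirements that are only installed for certain extras.

```python
from importlib import metadata


def declared_dependencies(package_name):
    """Return the requirement strings a package declares.

    Returns None if the package is not installed in this environment.
    """
    try:
        requirements = metadata.requires(package_name)
    except metadata.PackageNotFoundError:
        return None
    # requires() itself returns None when the package declares nothing.
    return list(requirements or [])


deps = declared_dependencies("scikit-learn")
if deps is None:
    print("scikit-learn is not installed in this environment")
else:
    for requirement in deps:
        print(requirement)
```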

Installing a Specific Package Version

By default, pip installs the latest version that is compatible with our Python version, so if we wish to install an older version of scikit-learn, all we need to do is specify it in the installation statement using a double equal sign, as in pip install scikit-learn==<version>.
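As a small illustration of the double equal sign syntax, the sketch below builds a pinned install command without running it. The function name pinned_install_command is hypothetical, and the version number 0.20.3 is a placeholder chosen only for the example, not the version used in the original article.

```python
import sys


def pinned_install_command(package, version):
    """Build a pip command that installs exactly `version` of `package`."""
    requirement = f"{package}=={version}"  # "==" pins one exact version
    return [sys.executable, "-m", "pip", "install", requirement]


# "0.20.3" is a placeholder version used only for illustration.
command = pinned_install_command("scikit-learn", "0.20.3")
print(" ".join(command[1:]))
```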

Upgrading Packages

If the package we are looking to use is already installed but simply out of date, we can upgrade it in much the same way we upgraded pip above, by passing the --upgrade flag to pip install along with the package name.

This will also upgrade, automatically, any dependencies that no longer satisfy the new version's requirements.

Installing & Upgrading the scikit-learn & statsmodels Packages

To pip install more than one Python package, the packages can be listed in line with the same pip install command as long as they are separated with spaces. Here we are installing both the scikit-learn & statsmodels packages in one line of code.

We can also upgrade multiple packages in one line of code.
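The two cases differ only by the --upgrade flag. As a minimal sketch, the helper below (install_command is a hypothetical name) builds both command lines without executing them.

```python
import sys


def install_command(*packages, upgrade=False):
    """Build a pip install command for one or more packages."""
    if not packages:
        raise ValueError("at least one package name is required")
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    # Each package becomes its own argument, like space-separated names
    # typed on the command line.
    command.extend(packages)
    return command


print(install_command("scikit-learn", "statsmodels"))
print(install_command("scikit-learn", "statsmodels", upgrade=True))
```

Passing either list to subprocess.check_call would run the command for real.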

Installing Packages With requirements.txt

If we want to install many packages at once, we can save them one package per line in a text file called requirements.txt. For our example, the file contains two lines: scikit-learn on the first & statsmodels on the second.

It’s conventional for Python package developers to include a requirements.txt file in their GitHub repositories listing all dependencies for pip to find & install.

The -r option flag in pip allows pip install to install packages from the file specified after the option flag. Keep in mind that naming this file requirements.txt is conventional but not required.

Using our examples, pip install -r requirements.txt will have the same effect as pip install scikit-learn statsmodels. Typing out each package could get messy if we needed to install ten packages. Using the requirements.txt file is much cleaner.
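To make the equivalence concrete, the sketch below writes a two-line requirements.txt to a temporary folder, reads it back, and builds both forms of the install command without running either. The read_requirements helper is an illustrative simplification: it only skips blank lines and full-line comments, and does not handle the other syntax pip accepts in requirements files.

```python
import sys
import tempfile
from pathlib import Path


def read_requirements(path):
    """Return requirement lines from a file, skipping blanks and # comments."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


with tempfile.TemporaryDirectory() as folder:
    requirements_file = Path(folder) / "requirements.txt"
    # One package per line, as in the example file described in the text.
    requirements_file.write_text("scikit-learn\nstatsmodels\n", encoding="utf-8")

    packages = read_requirements(requirements_file)
    # Installing from the file...
    from_file = [sys.executable, "-m", "pip", "install", "-r", str(requirements_file)]
    # ...asks pip for the same packages as listing them by hand.
    by_hand = [sys.executable, "-m", "pip", "install", *packages]

print(packages)
```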
