Python package installation with pip

1. Python package installation with pip

In this lesson, let's learn about pip, the standard package manager for Python.

2. Python standard library

Python comes by default with a collection of built-in functions, such as print, and built-in modules, such as math and os. However, because Python is a general-purpose language, packages for specific use cases need to be installed separately by the user. This includes data science packages such as scikit-learn and statsmodels. The installation is handled through pip, the standard package manager for Python.

3. Using pip documentation

First things first, read the documentation by running pip -h. pip has a variety of commands and option flags designed to manage Python packages.

4. Using pip documentation

Secondly, note that you can print the pip version the same way you print the Python version, with the --version flag. It's important that the pip version is compatible with the Python version. Here, we see that pip 19.1.1 is compatible with Python 3.5.
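As a rough sketch of the same check from inside Python rather than the command line, the snippet below reports the interpreter version and the pip version installed for that interpreter. It assumes Python 3.8 or later for the standard-library importlib.metadata module, and the helper name installed_pip_version is made up for this illustration.

```python
import sys
from importlib import metadata


def installed_pip_version():
    """Return the version of pip installed for this interpreter, or None."""
    try:
        return metadata.version("pip")
    except metadata.PackageNotFoundError:
        # pip is not installed in this environment.
        return None


python_version = ".".join(str(part) for part in sys.version_info[:3])
pip_version = installed_pip_version()

print(f"Python {python_version}")
print(f"pip {pip_version if pip_version else 'not found'}")
```

On the command line, python --version and pip --version give the same information.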

5. Upgrading pip

If pip is giving you an upgrade warning, you can upgrade pip using itself via pip install --upgrade pip.

6. pip list

Before we do any installing, it's a good idea to see what is already installed. Using pip list on the command line prints an alphabetical list of all the Python packages in your current Python environment.
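For comparison, here is a minimal sketch that produces a similar alphabetical listing from inside Python. It is an approximation built on the standard-library importlib.metadata module (Python 3.8+), not how pip list is implemented, and the function name installed_packages is an assumption for this example.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            found.add((name, dist.version))
    # Sort case-insensitively so the order reads alphabetically.
    return sorted(found, key=lambda item: item[0].lower())


for name, version in installed_packages():
    print(f"{name:<30} {version}")
```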

7. pip install one package

To install a library that is not currently installed, as indicated by pip list, the syntax is pip install followed by the library name. In this case, we are installing scikit-learn. You might notice from the logs, though, that more than one Python package is being installed in this single pip install call. This is because pip automatically installs scikit-learn along with any other Python packages it depends on to operate. We call these other packages dependencies.
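To make the idea of dependencies a little more concrete, the sketch below asks an already-installed package which requirements it declares. It assumes Python 3.8+ for importlib.metadata, and declared_dependencies is a hypothetical helper name; it returns an empty list when the package is not installed, so it runs either way.

```python
from importlib import metadata


def declared_dependencies(package):
    """Return the requirement strings a package declares, or [] if none are found."""
    try:
        return metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return []


# "scikit-learn" is the package from the example above; this prints
# nothing until it has been installed.
for requirement in declared_dependencies("scikit-learn"):
    print(requirement)
```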

8. pip install a specific version

By default, pip install installs the latest version of the library. Here, we have installed scikit-learn version 0.21.3.

9. pip install a specific version

If we wish to install an older version of scikit-learn, we simply specify it in the install command, with a double equals sign between the package name and the version number, like so.
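A small sketch of the double-equals syntax: the snippet builds the install command as a list and prints it without running it. The version number 0.20.0 is only an illustrative choice, not one taken from the lesson, and pinned_install_command is a hypothetical helper. Invoking pip as sys.executable -m pip is an assumption that keeps the command tied to the current interpreter.

```python
import shlex
import sys


def pinned_install_command(package, version):
    """Build (without running) a pip command that installs an exact version."""
    requirement = f"{package}=={version}"
    # sys.executable -m pip targets the pip that belongs to this interpreter.
    return [sys.executable, "-m", "pip", "install", requirement]


# The version number here is illustrative, not taken from the lesson.
command = pinned_install_command("scikit-learn", "0.20.0")
print(shlex.join(command))
# Shell equivalent: pip install scikit-learn==0.20.0
# To actually run it: subprocess.run(command, check=True)
```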

10. Upgrading packages using pip

If your package is already installed but simply out of date, upgrade it the same way you upgraded pip earlier. pip will automatically take care of any dependency upgrades as well.

11. pip install multiple packages

To install more than one Python package, list the packages in the same pip install command, separated by spaces. Here, we are installing both scikit-learn and statsmodels in one go. We can also upgrade multiple packages in one command.
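The two commands described here can be sketched as follows. The snippet only builds and prints the commands; install_command is a hypothetical helper, and calling pip through sys.executable -m pip is an assumption made so the command matches the current interpreter.

```python
import shlex
import sys


def install_command(packages, upgrade=False):
    """Build (without running) a pip install command for several packages."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    # Each package becomes its own space-separated argument.
    command.extend(packages)
    return command


packages = ["scikit-learn", "statsmodels"]
print(shlex.join(install_command(packages)))
print(shlex.join(install_command(packages, upgrade=True)))
# Shell equivalents:
#   pip install scikit-learn statsmodels
#   pip install --upgrade scikit-learn statsmodels
```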

12. pip install with requirements.txt

If you want to install many packages at once, you can save them, one package per line, in a text file called requirements.txt. If we preview the file, it looks like this. It's conventional for Python package developers to create a requirements file listing all dependencies for pip to find and install.

13. pip install with requirements.txt

The -r option flag tells pip install to install packages from the file specified after the flag. Keep in mind that naming this file requirements.txt is a convention, not a requirement. In our example, pip install -r requirements.txt has the same effect as pip install scikit-learn statsmodels. However, imagine how much messier this would be if we had, say, 10 packages to install. Using the requirements file is much cleaner.
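To illustrate the equivalence, the sketch below writes a two-line requirements file into a temporary folder, reads it back, and prints both forms of the command without running either. The line-by-line reader is a simplification for this example; real requirements files allow more syntax than it handles, and pip does its own parsing.

```python
import shlex
import sys
import tempfile
from pathlib import Path

# Write a small requirements file into a temporary directory; in a real
# project the file would sit in the project folder.
workdir = Path(tempfile.mkdtemp())
requirements_file = workdir / "requirements.txt"
requirements_file.write_text("scikit-learn\nstatsmodels\n")

# Read it back, one package per line, ignoring blank lines and comments.
packages = [
    line.strip()
    for line in requirements_file.read_text().splitlines()
    if line.strip() and not line.strip().startswith("#")
]

# The two commands below ask pip for the same set of packages.
from_file = [sys.executable, "-m", "pip", "install", "-r", str(requirements_file)]
inline = [sys.executable, "-m", "pip", "install", *packages]

print(shlex.join(from_file))
print(shlex.join(inline))
```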

14. Let's practice!

In this lesson, we learned about pip, the Python package manager, a crucial building block in understanding data science pipelines using Python. Let's put this into practice!