1. Starting a package
Welcome to this course on building packages in Python. I'm James, and I will be your foreman as you build your own complete package during this course.
2. Why build a package anyway?
Building packages is a tremendously useful skill for anyone who codes, whether you are running analysis or simulations, or building machine learning models, data pipelines, or applications.
If you have ever copied and pasted functions, classes or any code from one project to another, packaging is for you.
Packaging makes your code easier to reuse by bundling it up and making it importable, just like your other favorite Python packages such as numpy, pandas, and scikit-learn.
Packaging stops you having to copy or rewrite lots of functions, and also stops you having multiple versions of the same code spread across many files.
We write code for ourselves and others, and packaging makes it easier to share your code.
3. Course content
In this course, you are going to build a full package from scratch.
You will learn about file layout within your package; how to structure your import statements to expose functions to users; how to make your package installable; how to include licenses and READMEs; how to maintain your package quality using style convention and unit testing tools; how to build distributable versions of your package that you can share publicly; and how to use package templates to speed up development.
4. Scripts, modules, and packages
We'll be using the terms 'script', 'package' and 'module' in this course.
A script is a Python file which is made to be run directly. It is designed to do one set of tasks.
A package is a directory of Python files which you import functions from. All the code in this directory is related and works together.
Sometimes packages are layered. They can contain smaller packages inside them. We call the inside ones subpackages.
A module is one of the Python files inside a directory which you import code from. Each module stores some of the package code. You'll see an example module later.
In the wild, you will also hear people use the term 'library'. This is sometimes used to mean package, but sometimes people will use it to refer to a group of related packages. One example is the Python standard library, which includes lots of basic Python modules such as math, os and datetime.
5. Directory tree of a package
Because this course is about organizing Python code into a package, we will need to show the structure of directories. We'll be using directory trees for this.
In this example, we are looking inside the directory called my-simple-package. Inside this directory, there are two files, __init__.py and simple-module.py.
This directory is the simplest Python package you could make. The simple-module.py file contains the package code, and the __init__.py file is a special file which tells Python that this directory is a package.
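As a rough sketch of the layout just described, the snippet below creates the two files in a temporary directory and prints them as a directory tree. The helper names build_simple_package and format_tree are made up for this illustration, and the placeholder comment written into simple-module.py is an assumption, since the transcript does not show the file's contents.

```python
import tempfile
from pathlib import Path


def build_simple_package(root: Path) -> Path:
    """Create the two-file layout described above under `root` (hypothetical helper)."""
    package_dir = root / "my-simple-package"
    package_dir.mkdir()
    (package_dir / "__init__.py").touch()  # empty file that marks the directory as a package
    # Placeholder content; the real module code is not shown in the transcript.
    (package_dir / "simple-module.py").write_text("# package code goes here\n")
    return package_dir


def format_tree(directory: Path, prefix: str = "") -> list:
    """Return the lines of a simple directory tree, with entries sorted by name."""
    lines = []
    entries = sorted(directory.iterdir(), key=lambda p: p.name)
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "└── " if is_last else "├── "
        lines.append(prefix + connector + entry.name)
        if entry.is_dir():
            # Indent the children of a subdirectory one level deeper.
            extension = "    " if is_last else "│   "
            lines.extend(format_tree(entry, prefix + extension))
    return lines


with tempfile.TemporaryDirectory() as tmp:
    package_dir = build_simple_package(Path(tmp))
    tree_lines = [package_dir.name + "/"] + format_tree(package_dir)

print("\n".join(tree_lines))
```

The printed tree shows the package directory on the first line, with __init__.py and simple-module.py listed beneath it.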
6. Contents of simple package
Initially, the __init__.py file will be completely empty. But it is an important file, and we'll use this to structure the package imports later.
The simple-module.py file has all the code for our package.
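To make the roles of the two files concrete, here is a minimal sketch that writes an empty __init__.py and a one-function module, then imports the function. Two things here are assumptions rather than part of the course material: the function add_one is a made-up example, and the names use underscores (my_simple_package, simple_module) because names used in an import statement must be valid Python identifiers, which rules out hyphens.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents; the course's real module code is not shown here.
MODULE_SOURCE = '''\
def add_one(x):
    """Return x plus one."""
    return x + 1
'''

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "my_simple_package"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # completely empty, as in the text
    (package_dir / "simple_module.py").write_text(MODULE_SOURCE)

    # Temporarily put the parent directory on the import path so the
    # package can be found; installing the package would normally handle this.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        simple_module = importlib.import_module("my_simple_package.simple_module")
        result = simple_module.add_one(41)
    finally:
        sys.path.remove(tmp)

print(result)  # 42
```

Editing sys.path is only a shortcut for this demonstration. Making the package installable, which the course covers later, removes the need for it.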
7. Subpackages
The package directory can contain subdirectories.
Here, preprocessing and regression are subdirectories of my-sk-learn. Each of these directories is a subpackage, and has its own __init__.py file.
Using subpackages helps to organize your code, just as using subdirectories helps to organize your documents. You should place closely related functions and classes in the same module, and related modules in the same subpackage.
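The subpackage layout can be sketched the same way. In the snippet below, only the directory names preprocessing and regression come from the description above. The underscore package name my_sk_learn, the module files scaling.py and linear.py, and the functions center and predict are all hypothetical, chosen just to show that related code sits together and is reached through dotted import paths.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Relative path -> file contents. Every directory gets its own __init__.py.
# Module names and functions are made up for illustration.
FILES = {
    "my_sk_learn/__init__.py": "",
    "my_sk_learn/preprocessing/__init__.py": "",
    "my_sk_learn/preprocessing/scaling.py": (
        "def center(values):\n"
        "    mean = sum(values) / len(values)\n"
        "    return [v - mean for v in values]\n"
    ),
    "my_sk_learn/regression/__init__.py": "",
    "my_sk_learn/regression/linear.py": (
        "def predict(x, slope, intercept):\n"
        "    return slope * x + intercept\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for relative_path, source in FILES.items():
        path = Path(tmp) / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(source)

    sys.path.insert(0, tmp)  # demonstration shortcut so the package can be found
    try:
        importlib.invalidate_caches()
        # Dotted names follow the directory structure: package.subpackage.module
        scaling = importlib.import_module("my_sk_learn.preprocessing.scaling")
        linear = importlib.import_module("my_sk_learn.regression.linear")
        centered = scaling.center([1, 2, 3])
        prediction = linear.predict(2, slope=3, intercept=1)
    finally:
        sys.path.remove(tmp)

print(centered)    # [-1.0, 0.0, 1.0]
print(prediction)  # 7
```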
8. Let's practice!
In the following exercises, you'll begin creating your first package. Let's practice.