Python#
This section has some materials for getting started with Python. Python is an interpreted programming language. To use it you need a Python interpreter - the one available from Python.org is most widely used. Depending on your operating system your system package manager may provide this interpreter for you.
On Mac there is a system installation of Python; however, you may want to use a newer or different version from brew:
brew install python
Managing Dependencies#
One of the main strengths of Python is its vast ecosystem of third-party packages. Package installation is reasonably straightforward with the built-in package manager pip. You can call it as a Python module with:
python -m pip ARGS
or, on some systems, you will have an independent pip executable for convenience:
pip ARGS
When you install a package with pip, by default it will be added to your system Python installation. If you are using another package manager, like brew or apt, the package you install with pip can conflict with the system package manager’s version and introduce incompatibilities with other packages.
For this reason, and several others, you should avoid using pip to install Python packages directly on your system and always install them into a per-project Python ‘virtual environment’.
You can create a per-project environment inside your project source directory, SOURCE, as follows:
cd $SOURCE
python -m venv --prompt $PROJECT_NAME .venv
By convention the name .venv is used. The leading dot makes it a hidden directory, and some IDEs will automatically recognize the environment and find all your dependencies for code completion etc.
If it is not already there, add a .venv entry to the .gitignore file in your project (creating that file if needed) to make sure that your virtual environment binaries don’t end up in your version control.
The above command will have created the virtual environment directory structure, but to actually use it in the shell you need to ‘activate’ it as follows:
source .venv/bin/activate
After this you can install a third-party package with:
pip install package_name
Some projects will gather all the packages they use in a single file, requirements.txt, for convenience. You can install them all at once with:
pip install -r requirements.txt
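To confirm that you are really running inside the activated environment, and that a package was installed into it, you can run a short check with the interpreter. The following is a minimal sketch using only the standard library; the helper names in_virtual_environment and installed_version are illustrative and not part of any tool.

```python
import sys
from importlib import metadata


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


def installed_version(package_name: str):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("pip version:", installed_version("pip"))
```

If the first line reports False after activation, the shell is most likely still picking up a different python executable.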
The above notes are sufficient to work with most Python projects. However there are several other package management tools in the Python ecosystem, Conda being one of the more popular.
Conda is a more feature-rich package manager than pip - giving finer control of versions of packages that are installed and providing a more feature-rich build environment for package maintainers, allowing more C/C++ wrapped packages to be easily deployed for example. As a result, Conda will often provide pre-built or more optimized packages than pip for scientific applications.
With the added features offered by Conda there is also more maintenance effort needed for a project to use it. Once you add a dependency on Conda to your project, every other contributor and all of your users will also need to install it, rather than simply using pip. Conda is not always available in environments where pip is. So, it is best to use Conda only if it is really needed and to isolate elements of your project that need Conda from those that don’t. You can always include elements of a non-Conda project in a Conda one - e.g. via a requirements.txt file or by structuring your project as a Python package. You should also be aware of Conda’s licensing requirements when using it. Using packages from the default repository can require a paid license, depending on the size and type of your organization.
Packaging#
Unless you are doing basic prototyping or scripting you will likely want to structure your code as a Python package. This has advantages in terms of automated testing and sharing your project with others. Even if you don’t end up making your own Python packages - since a lot of Python development involves interacting with third-party packages it is useful to understand how Python packaging works.
A typical package structure is shown in Fig. 26.

Fig. 26 Sketch of a standard Python project layout.#
The pyproject.toml file holds most configuration for the project - including dependencies and test configurations. When you include one, most IDEs will be able to understand your project structure and you will be able to ‘install’ your project via pip:
pip install -e SOURCE_DIR
The -e here means ‘editable mode’ - meaning changes in your project source will be picked up immediately, without having to reinstall the package.
If you do build a package of your own and want to share it with others you can build it and upload it to a repository like PyPI. To do the build you install dependencies:
pip install --upgrade build
and build with:
python -m build
in the project directory. If you are uploading it you need to make an account on PyPI and get an upload token under your ‘Account settings’ with scope to your project. To do the upload you can install the dependency:
pip install --upgrade twine
and run it with:
twine upload dist/*
Testing and Static Analysis#
Since Python is dynamically typed, type errors only surface at runtime, and as a codebase grows it can become difficult to maintain and collaborate on. Automated testing is particularly important for dynamically typed languages like Python. It is good practice to think about automated testing in every code contribution you make - it helps to design higher quality code that is easier for others to trust and work with after you. This section covers some typical tools involved in testing and static analysis.
unittest is the built-in Python testing library - since it is foundational it may be worth quickly familiarizing yourself with it. Tests written with it can often be run by more advanced libraries. pytest is a modern testing framework for Python projects. Compared with unittest, it offers automated test discovery and configuration on a project level. It can also run unittest tests. pytest-sugar and pytest-cov are useful extensions.
You can run pytest on a directory and it will discover tests in files and functions with the test_ prefix. You can also get it to run individual tests with the -k test_func_name flag.
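A minimal sketch of a test module that pytest would discover is shown below. The function add and the test names are hypothetical; in a real project the function under test would be imported from your package rather than defined in the test file.

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


# pytest collects plain functions whose names start with "test_".
def test_add_integers():
    assert add(2, 3) == 5


def test_add_is_commutative():
    assert add(1.5, 2) == add(2, 1.5)


# pytest also collects unittest.TestCase subclasses.
class TestAddStrings(unittest.TestCase):
    def test_concatenates(self):
        self.assertEqual(add("ab", "cd"), "abcd")
```

Saved as, say, tests/test_add.py, running pytest tests would collect all three tests, while pytest -k test_add_integers would run only the first.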
Python coding style is defined in PEP 8. Following a standard coding guide is highly beneficial. The flake8 package can be used to check that your code follows the guidelines; it is also useful for catching mistakes. The black package can perform similar formatting checks, but is primarily a code reformatter, which will modify the code to meet its style guidelines automatically. pylint is another commonly used tool, which is more opinionated and strict than the others by default. It is very useful for finding bugs. You can run any of these tools from the shell; however, most IDEs have packages that will run them automatically and highlight issues as you edit source files.
Since version 3.5 Python has supported type hints, written as annotations. They are not enforced by the interpreter at runtime, but external tooling can understand them and use them to run checks. mypy is a popular tool for this purpose.
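The following small sketch illustrates the difference between what the interpreter does with annotations and what a static checker does. The function double is a hypothetical example.

```python
def double(x: int) -> int:
    """Return twice the input; the annotations document the intended types."""
    return x * 2


# The annotations are stored on the function but not checked at runtime.
print(double.__annotations__)

# Runs without error even though a str is passed where an int is annotated:
# the interpreter simply repeats the string. A static checker such as mypy
# would be expected to report an incompatible argument type on this line.
print(double("ab"))
```

Running a type checker over the file is what surfaces the mismatch, before the code is ever executed.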
On a package level, the settings for most of these tools can be included in the pyproject.toml and, once tox is configured for the project, tests with each of the tools can be launched with:
tox
Coding Style#
When importing modules, group the imports in this order, as recommended by PEP 8:
standard library modules
site-packages, i.e., any installed package
local, self-contained files
See import-order for a pip package CLI that checks this ordering.
Debugging and Profiling#
pdb#
pdb is the most common debugger for Python and comes built-in.
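One common way to enter pdb is the built-in breakpoint() function, which by default pauses execution and opens a pdb prompt at that line. The sketch below guards the call behind a debug flag so the code still runs unattended; the function normalise and the debug parameter are hypothetical.

```python
def normalise(values, debug=False):
    """Scale values so they sum to 1 (hypothetical example function)."""
    total = sum(values)
    if debug:
        # Pauses execution here and opens an interactive pdb prompt.
        breakpoint()
    return [v / total for v in values]


print(normalise([1, 1, 2]))
```

At the pdb prompt, n steps to the next line, s steps into a call, p expression prints a value, c continues and q quits. A whole script can also be run under the debugger with python -m pdb SCRIPT.py.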
cProfile#
cProfile is the built-in Python profiler. For basic use it can be added to code as follows:
import cProfile
profiler = cProfile.Profile()
Then, around the code you wish to profile:
profiler.enable()
YOUR_CODE
profiler.disable()
profiler.dump_stats("profile.prof")
There are several tools that can be used to view the resulting output, including snakeviz:
$ snakeviz -s "FILENAME"
and gprof2dot, which produces a nice graph with graphviz. To use it, install graphviz:
brew install graphviz
and do:
gprof2dot -f pstats profile.prof -o callgraph.dot
dot -Tpng callgraph.dot > output.png
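If you only need a quick text summary rather than a graphical view, the standard library's pstats module can read the profile directly. The sketch below is self-contained; slow_sum is a hypothetical workload standing in for your own code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload to profile."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Write the summary to a string buffer, sorted by cumulative time,
# and limit the listing to the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

pstats.Stats also accepts the name of a file written by dump_stats, so the same summary can be produced from a saved profile.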
Viztracer#
viztracer is another profiling tool, providing an easily read timeline.
To run viztracer:
$ viztracer SCRIPT.py
Then to view the resulting json file:
$ vizviewer result.json
Further Reading#
Documentation#
Sphinx is the standard documentation generator for Python packages. You can install it with pip install sphinx.
In terms of project layout it is useful to have a top-level docs folder. In the project docs folder you can run:
sphinx-quickstart
Note that you will want to add .gitkeep files to the generated _static and _templates directories if you are not populating them with anything else, since Git does not track empty directories.
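The quickstart generates a conf.py file, which is itself a Python module. A trimmed-down sketch of what it might contain is shown below; the project name and author are hypothetical, and the two extensions are optional additions that ship with Sphinx rather than something the quickstart necessarily enables.

```python
# docs/conf.py - illustrative values only
project = "my_project"           # hypothetical project name
author = "Project Contributors"  # hypothetical author string

extensions = [
    "sphinx.ext.autodoc",   # pull documentation from docstrings
    "sphinx.ext.napoleon",  # understand Google and NumPy style docstrings
]

# These refer to the directories created by sphinx-quickstart.
templates_path = ["_templates"]
html_static_path = ["_static"]
exclude_patterns = ["_build"]

html_theme = "alabaster"  # the theme Sphinx uses by default
```

From the docs folder, sphinx-build -b html . _build/html would then build the HTML pages into _build/html.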