Documenting code for your users and your future self.
Intuition
Code tells you how, comments tell you why. -- Jeff Atwood
Another way to organize our code is to document it. We want to do this so that others (and our future selves) can easily navigate the code base and build on it. We know our code base best the moment we finish writing it, but fortunately documenting it will allow us to quickly get back to that state time and time again. Documentation means many different things to developers, so let's define the most common (and required) components:
- `comments`: Terse descriptions of why a piece of code exists.
- `typing`: Specification of the data types of a function's inputs and outputs, providing insight into what a function consumes and produces at a quick glance.
- `docstrings`: Meaningful descriptions for functions and classes that describe overall utility as well as arguments, returns, etc.
- `documentation`: A rendered webpage that summarizes all the functions, classes, API calls, workflows, examples, etc. so we can view and traverse the code base without actually having to look at the code just yet.
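To make the "why, not how" distinction for comments concrete, here is a minimal sketch. The function `normalize_tag` and its scenario are hypothetical and not part of the application; they only illustrate the kind of comment worth keeping.

```python
def normalize_tag(tag: str) -> str:
    # A "how" comment would just restate the code: "lowercase and join with hyphens".
    # The "why" comment records intent: "Computer Vision" and "computer-vision"
    # should count as the same label, otherwise duplicates inflate the label space.
    return "-".join(tag.lower().split())
```

The code already shows how the tag is transformed; the comment is only worth its space because it explains the reason for the transformation.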
Application
Let's look at what documentation looks like for our application and be sure to check out the auto-generated documentation page for it as well.
Typing
It's important to be as explicit as possible with our code. We've already discussed choosing explicit names for variables, functions, etc., but another way we can be explicit is by defining the types of our functions' inputs and outputs. We want to do this so we can quickly see what data types a function expects and how we can use its outputs in downstream processes.
So far, our functions have looked like this:
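As a stand-in for an untyped function, here is a minimal sketch. The name `pad_sequences` and its body are assumptions chosen to match the arguments described in this section (`sequences`, `max_seq_len`); it is not necessarily the application's own code.

```python
import numpy as np


# Hypothetical example: nothing in the signature says what `sequences`
# or `max_seq_len` should be, or what type comes back.
def pad_sequences(sequences, max_seq_len=0):
    # Pad to max_seq_len, or to the existing row length if that is larger
    max_seq_len = max(max_seq_len, sequences.shape[1])
    padded_sequences = np.zeros((len(sequences), max_seq_len))
    padded_sequences[:, : sequences.shape[1]] = sequences
    return padded_sequences
```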
But we can incorporate so much more information using typing:
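A minimal sketch of a typed signature, again using a hypothetical `pad_sequences` function whose name and body are assumptions; only the annotations are the point here.

```python
import numpy as np


# Hypothetical example: the annotations state that `sequences` is a NumPy
# array, `max_seq_len` is an int defaulting to 0, and the result is a NumPy array.
def pad_sequences(sequences: np.ndarray, max_seq_len: int = 0) -> np.ndarray:
    # Pad to max_seq_len, or to the existing row length if that is larger
    max_seq_len = max(max_seq_len, sequences.shape[1])
    padded_sequences = np.zeros((len(sequences), max_seq_len))
    padded_sequences[:, : sequences.shape[1]] = sequences
    return padded_sequences
```

Python does not enforce these annotations at runtime; they serve readers, editors and static type checkers.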
Here we're defining that our input argument `sequences` is a NumPy array, `max_seq_len` is an integer with a default value of 0, and our output is also a NumPy array. There are many data types that we can work with, including but not limited to `List`, `Set`, `Dict`, `Tuple`, `Sequence` and more, as well as built-in types such as `int`, `float`, etc. You can also use classes as types, whether they come from a library or are your own (ex. `nn.Module`, `LabelEncoder`).
Note
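Starting with Python 3.9, the built-in collection types (`list`, `set`, `dict`, `tuple`) can be used directly as generic type hints, so we no longer need `from typing import List, Set, Dict, Tuple`. `Sequence` can be imported from `collections.abc` instead of `typing`.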
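A small sketch of what this looks like on Python 3.9 or newer. The function `count_tags` is a hypothetical example, not part of the application.

```python
from collections.abc import Sequence


# Hypothetical helper: `dict[str, int]` uses the built-in dict as a generic,
# so nothing needs to be imported from `typing`.
def count_tags(tags: Sequence[str]) -> dict[str, int]:
    counts: dict[str, int] = {}
    for tag in tags:
        counts[tag] = counts.get(tag, 0) + 1
    return counts
```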
Docstrings
We can make our code even more explicit by adding docstrings to functions and classes to describe overall utility, arguments, returns, exceptions and more. Let's take a look at an example:
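Here is a minimal sketch of a docstring containing a summary, a usage example, a note, arguments, exceptions and returns, in that order. The function name, body and wording are assumptions, and the line ranges quoted in the breakdown of the docstring refer to the lesson's original listing, so they will not match this sketch exactly.

```python
import numpy as np


def pad_sequences(sequences: np.ndarray, max_seq_len: int = 0) -> np.ndarray:
    """Zero-pad sequences to a specified `max_seq_len`
    or to the length of the longest row in `sequences`.

    Example:
        >>> seq = np.array([[1, 2, 3], [4, 5, 6]])
        >>> pad_sequences(sequences=seq, max_seq_len=5).tolist()
        [[1.0, 2.0, 3.0, 0.0, 0.0], [4.0, 5.0, 6.0, 0.0, 0.0]]

    Note:
        Input `sequences` must be 2D.

    Args:
        sequences (np.ndarray): 2D array of data to be padded.
        max_seq_len (int, optional): Length to pad sequences to. Defaults to 0.

    Raises:
        ValueError: Input sequences are not two-dimensional.

    Returns:
        np.ndarray: An array with the zero-padded sequences.
    """
    # Fail early: the slicing below assumes rows and columns
    if sequences.ndim != 2:
        raise ValueError("Input sequences are not two-dimensional.")

    # Never truncate: pad to the larger of max_seq_len and the row length
    max_seq_len = max(max_seq_len, sequences.shape[1])

    padded_sequences = np.zeros((len(sequences), max_seq_len))
    padded_sequences[:, : sequences.shape[1]] = sequences
    return padded_sequences
```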
Let's unpack the different parts of this function's docstring:
- [Lines 2-3]: Summary of the overall utility of the function.
- [Lines 5-16]: Example of how to use our function.
- [Lines 18-19]: Insertion of a `Note` or other types of admonitions.
- [Lines 21-23]: Description of the function's input arguments.
- [Lines 25-26]: Any exceptions that may be raised in the function.
- [Lines 28-29]: Description of the function's output(s).
Tip
If you're using Visual Studio Code (highly recommended), you should get the free Python Docstring Generator extension so you can type `"""` under a function and then press Enter to generate a template docstring. It will autofill parts of the docstring using the typing information and even the exceptions raised in your code!
MkDocs
So we're going through all this effort to include typing and docstrings in our functions, but it's all tucked away inside our scripts. What if we could collect all this effort and automatically surface it as documentation? That's exactly what we'll do with the following open-source packages:
- mkdocs (generates project documentation)
- mkdocs-macros-plugin (required plugins)
- mkdocs-material (styling to beautifully render documentation)
- mkdocstrings (fetch documentation automatically from docstrings)
Here are the steps we'll follow to automatically generate our documentation and serve it. You can find all the files we're talking about in our repository.
- Create `mkdocs.yml` in the root directory.
  ```bash
  touch mkdocs.yml
  ```
- Fill in metadata, config, extensions and plugins (more setup options like custom styling, overrides, etc. are available). I add some custom CSS inside `docs/static/css` to make things look a little bit nicer :)
  ```yaml
  # Project information
  site_name: TagifAI
  site_url: https://madewithml.com/#mlops
  site_description: Tag suggestions for projects on Made With ML.
  site_author: Goku Mohandas

  # Repository
  repo_url: https://github.com/GokuMohandas/mlops
  repo_name: GokuMohandas/mlops
  edit_uri: "" # disables edit button
  ...
  ```
- Add logo image and favicon to `static/images`.
  ```yaml
  # Configuration
  theme:
    name: material
    logo: static/images/logo.png
    favicon: static/images/favicon.ico
  ```
- Fill in navigation in `mkdocs.yml`.
  ```yaml
  # Page tree
  nav:
    - Home:
        - TagIfAI: index.md
    - Getting started:
        - Workflow: workflows.md
    - Reference:
        - CLI: tagifai/main.md
        - Configuration: tagifai/config.md
        - Data: tagifai/data.md
        - Models: tagifai/models.md
        - Training: tagifai/train.md
        - Inference: tagifai/predict.md
        - Utilities: tagifai/utils.md
    - API: api.md
  ```
- Fill in `mkdocstrings` plugin information inside `mkdocs.yml`.
- Rerun `make install-dev` to make sure you have the required packages for documentation.
- Add `::: tagifai.data` to the corresponding Markdown file (`tagifai/data.md` in the navigation) to populate it with the information from the function and class docstrings in `tagifai/data.py`. Repeat for other scripts as well. We can add our own text directly to the Markdown file as well, like we do in `tagifai/config.md`.
- Run `python -m mkdocs serve` to serve your docs to `http://localhost:8000/`.
```bash
# Serve documentation
$ python -m mkdocs serve
INFO    -  Building documentation...
INFO    -  Cleaning site directory
INFO    -  Serving on http://127.0.0.1:8000
```
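The "repeat for other scripts" step can be scripted. The sketch below writes one Markdown stub per module, each containing a `::: package.module` directive. It assumes the docs live in a `docs/` directory and that every Reference page in the navigation maps to a module of the same name; `write_stubs` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Assumption: module names mirror the Reference pages listed in the navigation.
MODULES = ["main", "config", "data", "models", "train", "predict", "utils"]


def write_stubs(docs_dir: Path, package: str = "tagifai", modules=MODULES) -> list[Path]:
    """Create one Markdown stub per module with a mkdocstrings directive."""
    package_dir = docs_dir / package
    package_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for module in modules:
        stub = package_dir / f"{module}.md"
        # Skip existing pages so hand-written text is never overwritten
        if not stub.exists():
            stub.write_text(f"::: {package}.{module}\n")
            written.append(stub)
    return written
```

Calling `write_stubs(Path("docs"))` from the project root would create any missing pages and return the paths it wrote. Existing pages, such as one with custom text, are left untouched.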
Our rendered documentation is served via GitHub Pages.
Note
We can easily serve our documentation for free using GitHub Pages and even host it on a custom domain. All we had to do was add the file `.github/workflows/documentation.yml`, which GitHub Actions will use to build and deploy our documentation every time we push to the `main` branch (we'll learn about GitHub Actions in our CI/CD lesson soon).