- The overall structure of the codebase and how Docker can be used to interact with it is shown below.

- Instead of specifying only the required Python packages, the developer specifies a recipe to build what is known as an image, which contains everything required to run the app, including an operating system and all dependencies. The recipe is called the Dockerfile. It is a bit like the requirements.txt file, but for the entire runtime environment rather than just the Python packages.
- You can see how the Dockerfile can be thought of as a recipe for building a particular runtime environment.
- Let’s walk through an example Dockerfile line by line to understand it better.
FROM ubuntu:20.04 #<1>
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update #<2>
RUN apt-get install -y unzip graphviz curl musescore3 python3-pip
RUN pip install --upgrade pip #<3>
WORKDIR /app #<4>
COPY ./requirements.txt /app #<5>
RUN pip install -r /app/requirements.txt
# Hack to get around tensorflow-io issue - <https://github.com/tensorflow/io/issues/1755>
RUN pip install tensorflow-io
RUN pip uninstall -y tensorflow-io
COPY /notebooks/. /app/notebooks #<6>
COPY /scripts/. /app/scripts
ENV PYTHONPATH="${PYTHONPATH}:/app" #<7>
- The first line defines the base image. Our base image is an Ubuntu 20.04 (Linux) operating system, pulled from Docker Hub, the online registry of publicly available images (https://hub.docker.com/_/ubuntu).
- Refresh the package index with apt-get, the Linux package manager, and install the relevant system packages.
- Upgrade pip, the Python package manager.
- Change the working directory to /app.
- Copy the requirements.txt file into the image and use pip to install all relevant Python packages.
- Copy the relevant folders (notebooks and scripts) into the image.
- Update the PYTHONPATH so that we can import functions that we write from our /app directory.
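The effect of the final PYTHONPATH step can be checked outside Docker with a minimal sketch. It assumes a temporary directory standing in for /app, and a hypothetical module scripts/helpers.py with a hypothetical function greet; neither name comes from the codebase itself. A fresh Python interpreter is started with and without PYTHONPATH pointing at that directory to show when the import succeeds.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def make_fake_app() -> Path:
    """Create a throwaway directory that stands in for /app in the image."""
    app_dir = Path(tempfile.mkdtemp()) / "app"
    scripts = app_dir / "scripts"
    scripts.mkdir(parents=True)
    # Hypothetical module and function, used only for this illustration.
    (scripts / "__init__.py").write_text("")
    (scripts / "helpers.py").write_text(
        "def greet():\n    return 'hello from app'\n"
    )
    return app_dir


def can_import_helpers(app_dir: Path, use_pythonpath: bool) -> bool:
    """Return True if a fresh interpreter can import scripts.helpers."""
    env = os.environ.copy()
    env.pop("PYTHONPATH", None)
    if use_pythonpath:
        # Same idea as ENV PYTHONPATH="${PYTHONPATH}:/app" in the Dockerfile.
        env["PYTHONPATH"] = str(app_dir)
    # Run from an empty directory so the import cannot succeed by accident.
    with tempfile.TemporaryDirectory() as empty_cwd:
        result = subprocess.run(
            [sys.executable, "-c", "from scripts.helpers import greet; greet()"],
            env=env,
            cwd=empty_cwd,
            capture_output=True,
            text=True,
        )
    return result.returncode == 0


if __name__ == "__main__":
    app = make_fake_app()
    print("with PYTHONPATH:   ", can_import_helpers(app, use_pythonpath=True))
    print("without PYTHONPATH:", can_import_helpers(app, use_pythonpath=False))
```

Inside the container the same mechanism applies: because /app is on PYTHONPATH, code in /app/notebooks can import from /app/scripts regardless of the directory it is run from.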