To name just a few examples, here are three major issues I've run into when setting up Python development environments:
1. Applications that depend on environment variables may need these variables to be set before the app can run.
2. Applications that use auth certificates for communication between different services may require those certificates to be generated locally before running the application.
3. Dependency versioning clashes can occur between different microservices within the same project.

Things can get especially gnarly when working with multiple microservices that depend on each other, and, frankly, as a developer, I don't really want to be managing all of this overhead just to get up and running. This is especially true if I'm just onboarding to a new project.

One common solution I've seen used when developing Python apps is virtual environments, which are isolated environments that contain a Python installation and the required packages. However, managing multiple virtual environments and other environment-related configurations can still be time-consuming and cumbersome, as the virtual environment only provides isolation at the Python interpreter level. This means that other environment-related setup, such as environment variables and port allocation, is still shared globally across all project components.

The solution I'll demonstrate in this article is containerization, which is a method of packaging an application and its dependencies into a self-contained unit that can be easily deployed and run on any platform. Docker is a popular platform for developing, deploying, and running containerized applications, and Docker Compose is a tool that makes it easy to define and run multi-container Docker applications using a single YAML file (which is typically named
docker-compose.yml
). Although there are alternative solutions, for simplicity's sake I'll stick to using Docker and Docker Compose in this example.

I'll demonstrate how to set up and use a containerized development environment using Docker and Docker Compose. I'll also discuss some of the challenges of using a containerized development environment, and how to overcome them by configuring Docker and Docker Compose to meet the following key requirements for an effective development environment:
1. Run - Running end-to-end scenarios that simulate execution on the target production environment.
2. Deploy - Making code changes and redeploying quickly, as with a non-containerized application runtime stack.
3. Debug - Setting breakpoints and using a debugger to step through code, as with a non-containerized application runtime stack, to identify and fix errors.

To illustrate this by example, I'll define a simple Python application that uses the lightweight Python web framework, Flask, to create a RESTful API for querying information about authors and their posts. The API has a single endpoint,
/authors/{author_id}
, which can be used to retrieve information about a particular author by specifying their ID as a path parameter. The application then uses the requests module to make HTTP requests to a separate posts service, which is expected to provide a list of posts by that author. To keep the code concise, all data will be randomly generated on the fly using the Faker library.

To start off, I'll create and then open an empty directory for the project. Next, I'll create two sub-directories: the first will be called
authors_service
, and the second posts_service
. Inside each of these directories, I'll create three files:
1.
app.py
: The main entry point for the Flask app, which defines the app, sets up routes, and specifies the functions to be called when a request is made to those routes.
2.
requirements.txt
: A plain text file that specifies the Python packages that are required for the application to run.
3.
Dockerfile
: A text file containing instructions for building a Docker image, which, as mentioned above, is a lightweight, stand-alone, and executable package that includes everything needed to run the application, including the code, a runtime, libraries, environment variables, and pretty much anything else.

In each
app.py
file, I'll implement a Flask microservice with the desired logic. For
authors_service
, the app.py
file looks as follows:

import os
import flask
import requests
from faker import Faker
app = flask.Flask(__name__)
@app.route("/authors/<string:author_id>", methods=["GET"])
def get_author_by_id(author_id: str):
author = {
"id": author_id,
"name": Faker().name(),
"email": Faker().email(),
"posts": _get_authors_posts(author_id)
}
return flask.jsonify(author)
def _get_authors_posts(author_id: str):
response = requests.get(
f'{os.environ["POSTS_SERVICE_URL"]}/{author_id}'
)
return response.json()
if __name__ == "__main__":
app.run(
host=os.environ['SERVICE_HOST'],
port=int(os.environ['SERVICE_PORT'])
)
This code sets up a Flask app and defines a route to handle GET requests to the endpoint
/authors/{author_id}
. When this endpoint is accessed, it generates mock data for an author with the provided ID and retrieves a list of posts for that author from the separate posts service. It then runs the Flask app, listening on the hostname and port specified in the corresponding environment variables.

Next, I'll add the required packages to the authors service's requirements.txt file, as follows:

flask==2.2.2
requests==2.28.1
Faker==15.3.4
Note that there are no specific package versioning requirements for any of the dependencies referenced throughout this guide. The versions used were the latest available at the time of writing.
For the
posts_service
, app.py
looks as follows:

import os
import uuid
from random import randint
import flask
from faker import Faker
app = flask.Flask(__name__)
@app.route('/posts/<string:author_id>', methods=['GET'])
def get_posts_by_author_id(author_id: str):
posts = [
{
"id:": str(uuid.uuid4()),
"author_id": author_id,
"title": Faker().sentence(),
"body": Faker().paragraph()
}
for _ in range(randint(1, 5))
]
return flask.jsonify(posts)
if __name__ == '__main__':
app.run(
host=os.environ['SERVICE_HOST'],
port=int(os.environ['SERVICE_PORT'])
)
In this code, when a client (i.e.
authors_service
) sends a GET request to the route /posts/{author_id}
, the function get_posts_by_author_id
is called with the specified author_id
as a parameter. The function generates mock data for between 1 and 5 posts written by the author using the Faker library, and returns the list of posts as a JSON response to the client.

I'll also need to add the flask and Faker packages to the posts service's
requirements.txt
file, as follows:

flask==2.2.2
Faker==15.3.4
Both services use the environment variables
SERVICE_HOST
and SERVICE_PORT
to define the socket on which the Flask server will be launched. While SERVICE_HOST
is not an issue (multiple services can listen on the same host), SERVICE_PORT
can cause problems. If I were to install all dependencies in a local Python environment and run both services, the first service to start would use the specified port, causing the second service to crash because it couldn't use the same port. One simple solution is to use separate environment variables (e.g., AUTHORS_SERVICE_PORT
and POSTS_SERVICE_PORT
) instead. However, modifying the source code to adapt to environmental constraints can become complex when scaling up. Containerization helps to avoid issues like this by setting up the environment to be adapted for the application, rather than the other way around. In this case, I can set the
SERVICE_PORT
environment variable to a different value for each service, and each service will be able to use its own port without interference from other services.

Next, I'll create a Dockerfile in each service's directory. The contents of this file (for both services) are as follows:

FROM python:3.8
RUN mkdir /app
WORKDIR /app
COPY requirements.txt /app/
RUN pip install -r requirements.txt
COPY . /app/
CMD ["python", "app.py"]
This
Dockerfile
builds off of the Python 3.8 base image and sets up a directory for the application in the container. It then copies the requirements.txt file from the host machine to the container and installs the dependencies listed in that file. Finally, it copies the rest of the application code from the host machine to the container and runs the main application script when the container is started.

Next, I'll create a file named
docker-compose.yml
in the root project directory. As briefly mentioned above, this file is used to define and run multi-container Docker applications. In the docker-compose.yml
file, I can define the services that make up the application, specify the dependencies between them, and configure how they should be built and run. In this case, it looks as follows:

---
# Specify the version of the Docker Compose file format
version: '3.9'
services:
# Define the authors_service service
authors_service:
    # This service relies on, and is therefore dependent on, the `posts_service` service below
depends_on:
- posts_service
# Specify the path to the Dockerfile for this service
build:
context: ./authors_service
dockerfile: Dockerfile
# Define environment variables for this service
environment:
- SERVICE_HOST=0.0.0.0
- PYTHONPATH=/app
- SERVICE_PORT=5000
      - POSTS_SERVICE_URL=http://posts_service:6000/posts
# Map port 5000 on the host machine to port 5000 on the container
ports:
- "5000:5000"
# Mount the authors_service source code directory on the host to the working directory on the container
volumes:
- ./authors_service:/app
# Define the posts_service service
posts_service:
# Specify the path to the Dockerfile for this service
build:
context: ./posts_service
dockerfile: Dockerfile
# Define environment variables for this service
environment:
- PYTHONPATH=/app
- SERVICE_HOST=0.0.0.0
- SERVICE_PORT=6000
# Mount the posts_service source code directory on the host to the working directory on the container
volumes:
- ./posts_service:/app
The containers can be started with the
docker-compose up
command. The first time this command is run, the Docker images will be built automatically. This satisfies the first core requirement above, "Run".
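For a quick end-to-end check, the authors service can be called from the host machine once the containers are up. Within the Compose network, each service is reachable by its service name (which is why POSTS_SERVICE_URL points at posts_service:6000), while the authors service is additionally published on port 5000 of the host. A minimal sketch of such a check, using a hypothetical author ID:

import requests

# Hit the authors service from the host; it will in turn call the posts service internally
response = requests.get("http://localhost:5000/authors/42")
print(response.status_code)  # expected: 200
print(response.json())       # a randomly generated author record with nested posts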
Note that in the docker-compose.yml file, volume mounts are used to share the source code directories for the authors_service
and posts_service
services between the host machine and the containers. This allows for code to be edited on the host machine with the changes automatically reflected in the containers (and vice versa). For example, the following line mounts the
./authors_service
directory on the host machine to the /app
directory in the authors_service
container:

volumes:
- ./authors_service:/app
Changes made on the host machine are immediately available on the container, and changes made in the container are immediately persisted to the host machine's source code directory. This allows for quickly redeploying changes by restarting the relevant container without rebuilding the image, effectively satisfying the second core requirement of "Deploy."
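As a further convenience, which goes beyond the setup described here and is only appropriate for development, Flask's built-in reloader can pick up such changes without even restarting the container. A minimal sketch, assuming it's acceptable to run the Flask development server in debug mode:

# At the bottom of app.py, replacing the existing app.run call (development use only):
if __name__ == "__main__":
    app.run(
        host=os.environ['SERVICE_HOST'],
        port=int(os.environ['SERVICE_PORT']),
        # debug=True enables Werkzeug's auto-reloader (and interactive debugger)
        debug=True
    )

Because the source directory is mounted as a volume, the reloader inside the container notices edits made on the host and restarts the server automatically.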
Debugging the code from within the container itself
To develop and debug code from within a running Docker container using VSCode, I will:
1. Ensure the Docker extension for VSCode is installed and enabled.
2. Ensure the container I want to attach to is up and running.
3. Open the Docker extension's explorer view by clicking on the Docker icon in the left sidebar.
4. In the explorer view, expand the "Running Containers" section and select the container I want to attach to.
5. Right-click on the container and select the "Attach Visual Studio Code" option from the context menu.
In order to avoid having to reinstall VSCode extensions (for example, the Python extension) every time the container restarts, I can mount a volume inside the container that stores the VSCode extensions. This way, when the container is restarted, the extensions will still be available because they are stored on the host machine. To do this using Docker Compose in this demo project, the
docker-compose.yml
file can be modified as follows:

---
# Specify the version of the Docker Compose file format
version: '3.9'
services:
# Define the authors_service service
authors_service:
...
# Mount the authors_service source code directory on the host to the working directory on the container
volumes:
- ./authors_service:/app
# Mount the vscode extension directory on the host to the vscode extension directory on the container
- /path/to/host/extensions:/root/.vscode/extensions
# Define the posts_service service
posts_service:
...
Note that the VSCode extensions can typically be found at
~/.vscode/extensions
on Linux and macOS, or %USERPROFILE%\.vscode\extensions
on Windows.

Using a remote Python debug server
The above method of debugging works well for standalone scripts or for writing, running, and debugging tests. However, debugging a logical flow involving multiple services running in different containers is more complex.

When a container is started, the service it contains is typically launched immediately. In this case, the Flask servers on both services are already running by the time VSCode is attached, so clicking "Run and debug" and launching another instance of the Flask server is not practical, as it would result in multiple instances of the same service running on the same container and competing with each other, which is usually not a reliable debugging flow.
This brings me to option number two: using a remote Python debug server. A remote Python debug server is a Python interpreter that is running on a remote host and is configured to accept connections from a debugger. This allows a locally running debugger to examine and control a Python process running in a remote environment.
To get started, the first step is to add the debugpy package to the
requirements.txt
files for both services. debugpy is a high-level, open-source Python debugger that can be used to debug Python programs locally or remotely. I'll add the following line to both requirements.txt
files:

debugpy==1.6.4
Now I need to rebuild the images so that debugpy is installed in the Docker image for each service. I'll run the
docker-compose build
command to do this. Then I'll run docker-compose up
to launch the containers.

Next, I'll attach VSCode to the running container containing the process I want to debug, as I did above. In order to attach a debugger to the running Python application, I'll need to add the following snippet to the code at the point from which I wish to begin debugging:

import debugpy; debugpy.listen(5678)
This snippet imports the debugpy module and calls the
listen
function, which starts a debugpy server that listens for connections from a debugger on the specified port number (in this case, 5678).

If I wanted to debug the
authors_service
, I could place the above snippet just before the get_author_by_id
function declaration within the app.py
file, as follows:

import os
import flask
import requests
from faker import Faker
app = flask.Flask(__name__)
import debugpy; debugpy.listen(5678)
@app.route("/authors/<string:author_id>", methods=["GET"])
def get_author_by_id(author_id: str):
...
This would start a debugpy server on application startup as the
app.py
script is executed.

The next step is to create a VSCode launch configuration for debugging the application. In the root directory for the service whose container I've attached to (and on which I'm running the VSCode window), I'll create a folder named
.vscode
. Then, within this folder, I'll create a file named launch.json
, with the following contents:

{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Remote Attach",
"type": "python",
"request": "attach",
"connect": {
"host": "localhost",
"port": 5678
}
}
]
}
This configuration specifies that VSCode should attach to a Python debugger running on the local machine (i.e. the current container) on port 5678 - which, importantly, was the port specified when calling the
debugpy.listen
function above.

I will then save all changes. In the Docker extension's explorer view, I will right-click the container I'm currently attached to and select "Restart container" from the context menu (done in the local VSCode instance). After restarting the container, the VSCode window within the container will display a dialog asking if I want to reload the window - the correct answer is yes.

Now all that remains is to see it in action! To start debugging, within the VSCode instance running on the container, I'll open the script I want to debug and press F5 to start the debugger. The debugger will attach to the running script; I can then set breakpoints and use the debugger controls in the Debug tab to step through the code and inspect variables.

This satisfies the above "Debug" requirement.
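One caveat worth noting: debugpy.listen only starts the debug server; it does not pause the program by itself. If the code of interest runs before there's a chance to attach and set breakpoints (for example, code executed at import time), debugpy can be told to block until a debugger connects. A sketch of that variation, using debugpy's wait_for_client and breakpoint helpers:

import debugpy

debugpy.listen(5678)
# Block until a debugger attaches, so no early code runs unobserved
debugpy.wait_for_client()
# Optional programmatic breakpoint - execution pauses here once a debugger is attached
debugpy.breakpoint()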
Setting up a remote interpreter
To set up a Docker Compose-based remote interpreter in PyCharm, I will:
1. Open the project's Python interpreter settings (for example, via the interpreter selector in the status bar).
2. Click Add new interpreter, and then select On Docker Compose... from the pop-up menu.
3. In the next pop-up window, select the relevant Docker Compose file, and then select the relevant service from the dropdown. PyCharm will now attempt to connect to the Docker image and retrieve the available Python interpreters.
4. In the next window, select the Python interpreter I wish to use (e.g.
/usr/local/bin/python
). Once the interpreter has been selected, click "Create". PyCharm will then index the new interpreter, after which I can run or debug code as usual - PyCharm will orchestrate Docker Compose behind the scenes for me whenever I wish to do so.

Setting up a remote debug server configuration
In order to set up a remote debug server configuration, I first need to add two dependencies to the relevant
requirements.txt
file(s): pydevd and pydevd-pycharm. These are similar in function to the debugpy package demonstrated above, but, as its name suggests, pydevd_pycharm is specifically designed for debugging with PyCharm. In the context of this demo project, I'll add the following two lines to both requirements.txt files:

pydevd~=2.9.1
pydevd-pycharm==223.8214.17
After rebuilding the images so that these packages are installed, I'll add the following snippet to the code at the point from which I wish to begin debugging:

import pydevd_pycharm; pydevd_pycharm.settrace('host.docker.internal', port=5678)
Note that unlike with debugpy, here I specified a hostname address with the value "host.docker.internal", which is a DNS name that resolves to the internal IP address of the host machine from within a Docker container. This is because I'm not running PyCharm on the container; instead, I'm effectively configuring the debug server to listen on port 5678 of the host machine.
This option also exists with debugpy, but since in that case I was running an instance of VSCode on the container itself, it simplified things to just let the hostname address default to "localhost" (i.e. the loopback interface of the container itself, not the host machine).
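As with the debugpy example, the settrace call goes wherever debugging should begin. For instance, to debug the authors_service, it could be placed just before the route declaration in its app.py, roughly as follows:

import os
import flask
import requests
from faker import Faker

app = flask.Flask(__name__)

# Connect back to the PyCharm debug server listening on the host machine
import pydevd_pycharm; pydevd_pycharm.settrace('host.docker.internal', port=5678)

@app.route("/authors/<string:author_id>", methods=["GET"])
def get_author_by_id(author_id: str):
    ...

The remaining piece is a PyCharm run/debug configuration that listens for this connection: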
1. Open the Run/Debug Configuration dialog by selecting Run > Edit Configurations from the main menu.
2. Click the + button in the top-left corner of the dialog and select Python Remote Debug from the drop-down menu.
3. In the Name field, enter a name for the run configuration.
4. In the Script path field, specify the path to the script I want to debug.
5. In the Host field, enter the IP address of the host machine where the debugger server will run. In this example, it's "localhost".
6. In the Port field, enter the port number that the debugger server will listen on. In this example, it's 5678.
7. In the Path mappings section, I can specify how the paths on the host machine map to paths within the container. This is useful if I'm debugging code that is mounted into the container from the host, as the paths may not be the same in both environments. In this example, I'll want to map
path/to/project/on/host/authors_service
on the host machine, to /app
in the container for debugging authors_service, or path/to/project/on/host/posts_service
to /app
on the container for debugging posts_service (these would need to be two separate run configurations).
8. Click OK to save the run configuration.
To start debugging, I'll select the above run configuration from the Run drop-down menu and click the Debug button, and then spin up the relevant container(s) with the
docker-compose up
command. The PyCharm debugger will attach to the script and pause execution at the line where the pydevd_pycharm.settrace
function is called, allowing me to begin smashing those bugs.