Nat Grady

How to Set Up FastAPI with Postgres, Uvicorn, and Docker

In this tutorial, we'll look at how to set up FastAPI with Postgres, Uvicorn, and Docker. For production environments, we'll add on Gunicorn, Traefik, and Let's Encrypt.

Project Setup

Start by creating a project directory:

$ mkdir fastapi-docker-traefik && cd fastapi-docker-traefik
$ python3.9 -m venv venv
$ source venv/bin/activate

Feel free to swap out virtualenv and Pip for Poetry or Pipenv. For more, review Modern Python Environments.

Then, create the following files and folders:

├── app
│   ├── __init__.py
│   └── main.py
└── requirements.txt

Add FastAPI and Uvicorn, an ASGI server, to requirements.txt:

fastapi==0.63.0
uvicorn==0.13.4

Install them:

(venv)$ pip install -r requirements.txt

Next, let's create a simple FastAPI application in app/main.py:

# app/main.py

from fastapi import FastAPI

app = FastAPI(title="FastAPI, Docker, and Traefik")


@app.get("/")
def read_root():
    return {"hello": "world"}

Run the application:

(venv)$ uvicorn app.main:app

Navigate to 127.0.0.1:8000. You should see:

{
    "hello": "world"
}

Kill the server once done. Then exit and remove the virtual environment as well; from here on out, we'll run everything through Docker.

Docker

Install Docker, if you don't already have it, then add a Dockerfile to the project root:

# Dockerfile

# pull the official docker image
FROM python:3.9.4-slim

# set work directory
WORKDIR /app

# set env variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy project
COPY . .

So, we started with a slim Docker image for Python 3.9.4. We then set up a working directory along with two environment variables:

  1. PYTHONDONTWRITEBYTECODE: Prevents Python from writing pyc files to disc (equivalent to python -B option)
  2. PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Finally, we copied over the requirements.txt file, installed the dependencies, and copied over the project.

Review Docker for Python Developers for more on structuring Dockerfiles as well as some best practices for configuring Docker for Python-based development.

Next, add a docker-compose.yml file to the project root:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0
    volumes:
      - .:/app
    ports:
      - 8008:8000

Review the Compose file reference for info on how this file works.

Build the image:

$ docker-compose build

Once the image is built, run the container:

$ docker-compose up -d

Navigate to http://localhost:8008 to again view the hello world sanity check.

If this doesn't work, check the logs for errors via docker-compose logs -f.

Postgres

To configure Postgres, we need to add a new service to the docker-compose.yml file, set up an ORM, and install asyncpg.

First, add a new service called db to docker-compose.yml:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: .
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; uvicorn app.main:app --host 0.0.0.0'
    volumes:
      - .:/app
    ports:
      - 8008:8000
    environment:
      - DATABASE_URL=postgresql://fastapi_traefik:fastapi_traefik@db:5432/fastapi_traefik
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=fastapi_traefik
      - POSTGRES_PASSWORD=fastapi_traefik
      - POSTGRES_DB=fastapi_traefik

volumes:
  postgres_data:

To persist the data beyond the life of the container we configured a volume. This config will bind postgres_data to the "/var/lib/postgresql/data/" directory in the container.

We also added an environment key to define a name for the default database and set a username and password.

Review the "Environment Variables" section of the Postgres Docker Hub page for more info.

Take note of the new command in the web service:

bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; uvicorn app.main:app --host 0.0.0.0'

while !</dev/tcp/db/5432; do sleep 1 will continue until Postgres is up. Once up, uvicorn app.main:app --host 0.0.0.0 runs.
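If the bash TCP redirection looks cryptic, here's a rough Python equivalent of that wait loop, for illustration only (the Compose file keeps the bash version):

# rough Python equivalent of the bash wait loop (illustration only)
import socket
import time

while True:
    try:
        # attempt a TCP connection to the "db" host on port 5432
        socket.create_connection(("db", 5432), timeout=1).close()
        break  # Postgres is accepting connections
    except OSError:
        time.sleep(1)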

Next, add a new file called config.py to the "app" directory, where we'll define environment-specific configuration variables:

# app/config.py

from pydantic import BaseSettings, Field


class Settings(BaseSettings):
    db_url: str = Field(..., env='DATABASE_URL')

settings = Settings()

Here, we defined a Settings class with a db_url attribute. BaseSettings, from pydantic, validates the data so that when we create an instance of Settings, db_url will be automatically loaded from the environment variable.

We could have used os.getenv(), but as the number of environment variables increases, that becomes very repetitive. With a BaseSettings subclass, you declare the environment variable name once and the value is loaded and validated automatically.
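For comparison, loading the same setting with os.getenv() might look like the following sketch (not used in this project):

# app/config.py (os.getenv alternative -- a sketch for comparison only)
import os


class Settings:
    # no validation: returns None (rather than failing fast) if the variable is unset
    db_url: str = os.getenv("DATABASE_URL")


settings = Settings()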

You can learn more about pydantic settings management here.

We'll use ormar for communicating with the database.

Add ormar, an async mini ORM for Python, to requirements.txt along with asyncpg and psycopg2:

asyncpg==0.22.0
fastapi==0.63.0
ormar==0.10.5
psycopg2-binary==2.8.6
uvicorn==0.13.4

Feel free to swap ormar for the ORM of your choice. Looking for some async options? Check out the Awesome FastAPI repo and this Twitter thread.

Next, create an app/db.py file to set up a model:

# app/db.py

import databases
import ormar
import sqlalchemy

from .config import settings

database = databases.Database(settings.db_url)
metadata = sqlalchemy.MetaData()


class BaseMeta(ormar.ModelMeta):
    metadata = metadata
    database = database


class User(ormar.Model):
    class Meta(BaseMeta):
        tablename = "users"

    id: int = ormar.Integer(primary_key=True)
    email: str = ormar.String(max_length=128, unique=True, nullable=False)
    active: bool = ormar.Boolean(default=True, nullable=False)


engine = sqlalchemy.create_engine(settings.db_url)
metadata.create_all(engine)

This will create a pydantic model and a SQLAlchemy table.

ormar uses SQLAlchemy for creating databases/tables and constructing database queries, databases for executing the queries asynchronously, and pydantic for data validation. Note that each ormar.Model is also a pydantic.BaseModel, so all pydantic methods are also available on a model. Since the tables are created using SQLAlchemy (under the hood), database migration is possible via Alembic.
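For example, since each model doubles as a pydantic model, the usual pydantic helpers work on instances. A quick sketch, assuming the User model above:

# sketch: pydantic helpers on an ormar model instance
user = User(id=1, email="test@test.com", active=True)
print(user.dict())   # {'id': 1, 'email': 'test@test.com', 'active': True}
print(user.json())   # '{"id": 1, "email": "test@test.com", "active": true}'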

Check out Alembic usage, from the official ormar documentation, for more on using Alembic with ormar.

Next, update app/main.py to connect to the database and add a dummy user:

# app/main.py

from fastapi import FastAPI

from app.db import database, User


app = FastAPI(title="FastAPI, Docker, and Traefik")


@app.get("/")
async def read_root():
    return await User.objects.all()


@app.on_event("startup")
async def startup():
    if not database.is_connected:
        await database.connect()
    # create a dummy entry
    await User.objects.get_or_create(email="test@test.com")


@app.on_event("shutdown")
async def shutdown():
    if database.is_connected:
        await database.disconnect()

Here, we used FastAPI's event handlers to create a database connection. @app.on_event("startup") creates a database connection pool before the app starts up.

await User.objects.get_or_create(email="test@test.com")

The above line in the startup event adds a dummy entry to our table once the connection has been established. get_or_create makes sure that the entry is created only if it doesn't already exist.
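In other words, re-running the startup handler is safe. A minimal sketch of that idempotency, assuming it's run inside the web container and that ormar 0.10's get_or_create returns the model instance:

# sketch: get_or_create won't insert a duplicate row
import asyncio

from app.db import database, User


async def demo():
    await database.connect()
    first = await User.objects.get_or_create(email="test@test.com")   # created
    second = await User.objects.get_or_create(email="test@test.com")  # fetched, not created
    assert first.id == second.id
    await database.disconnect()

asyncio.run(demo())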

The shutdown event closes all connections to the database. We also added a route to display all the entries in the users table.

Build the new image and spin up the two containers:

$ docker-compose up -d --build

Ensure the users table was created:

$ docker-compose exec db psql --username=fastapi_traefik --dbname=fastapi_traefik

psql (13.2)
Type "help" for help.

fastapi_traefik=# \l
                                              List of databases
      Name       |      Owner      | Encoding |  Collate   |   Ctype    |          Access privileges
-----------------+-----------------+----------+------------+------------+-------------------------------------
 fastapi_traefik | fastapi_traefik | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres        | fastapi_traefik | UTF8     | en_US.utf8 | en_US.utf8 |
 template0       | fastapi_traefik | UTF8     | en_US.utf8 | en_US.utf8 | =c/fastapi_traefik                 +
                 |                 |          |            |            | fastapi_traefik=CTc/fastapi_traefik
 template1       | fastapi_traefik | UTF8     | en_US.utf8 | en_US.utf8 | =c/fastapi_traefik                 +
                 |                 |          |            |            | fastapi_traefik=CTc/fastapi_traefik
(4 rows)


fastapi_traefik=# \c fastapi_traefik
You are now connected to database "fastapi_traefik" as user "fastapi_traefik".

fastapi_traefik=# \dt
            List of relations
 Schema | Name  | Type  |      Owner
--------+-------+-------+-----------------
 public | users | table | fastapi_traefik
(1 row)

fastapi_traefik=# \q

You can check that the volume was created as well by running:

$ docker volume inspect fastapi-docker-traefik_postgres_data

You should see something similar to:

[
    {
        "CreatedAt": "2021-04-29T12:41:19Z",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "fastapi-docker-traefik",
            "com.docker.compose.version": "1.29.0",
            "com.docker.compose.volume": "postgres_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/fastapi-docker-traefik_postgres_data/_data",
        "Name": "fastapi-docker-traefik_postgres_data",
        "Options": null,
        "Scope": "local"
    }
]

Navigate to 127.0.0.1:8008. You should see:

[
    {
        "id": 1,
        "email": "test@test.com",
        "active": true
    }
]

Production Dockerfile

For deployment of our application, we need to add Gunicorn, a WSGI server, to spawn instances of Uvicorn. Rather than writing our own production Dockerfile, we can leverage uvicorn-gunicorn, a pre-built Docker image with Uvicorn and Gunicorn for high-performance web applications maintained by the core FastAPI author.
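Conceptually, the image runs Gunicorn with Uvicorn worker processes. If you were wiring that up yourself, the Gunicorn config (a plain Python file) might look like the following sketch; the actual image computes its settings dynamically from environment variables, so treat the values here as assumptions:

# gunicorn_conf.py -- a hand-rolled sketch, not the image's actual config
bind = "0.0.0.0:80"
worker_class = "uvicorn.workers.UvicornWorker"  # run async Uvicorn workers under Gunicorn
workers = 4  # rule of thumb: (2 x CPU cores) + 1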

Create a new Dockerfile called Dockerfile.prod for use with production builds:

# Dockerfile.prod

FROM tiangolo/uvicorn-gunicorn:python3.8-slim

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

That's it. The tiangolo/uvicorn-gunicorn:python3.8-slim image does much of the work for us. We just copied over the requirements.txt file, installed the dependencies, and then copied over all the project files.

Next, create a new compose file called docker-compose.prod.yml for production:

# docker-compose.prod.yml

version: '3.8'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.prod
    ports:
      - 8009:80
    environment:
      - DATABASE_URL=postgresql://fastapi_traefik_prod:fastapi_traefik_prod@db:5432/fastapi_traefik_prod
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=fastapi_traefik_prod
      - POSTGRES_PASSWORD=fastapi_traefik_prod
      - POSTGRES_DB=fastapi_traefik_prod

volumes:
  postgres_data_prod:

Compare this file to docker-compose.yml. What's different?

The uvicorn-gunicorn Docker image runs a prestart.sh script, if one exists in the project, before the app starts. We can use this to wait for Postgres.

Modify Dockerfile.prod like so:

# Dockerfile.prod

FROM tiangolo/uvicorn-gunicorn:python3.8-slim

RUN apt-get update && apt-get install -y netcat

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

Then, add a prestart.sh file to the root of the project:

# prestart.sh

echo "Waiting for postgres connection"

while ! nc -z db 5432; do
    sleep 0.1
done

echo "PostgreSQL started"

exec "$@"

Update the file permissions locally:

$ chmod +x prestart.sh

Bring down the development containers (and the associated volumes with the -v flag):

$ docker-compose down -v

Then, build the production images and spin up the containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Test that 127.0.0.1:8009 works.

Traefik

Next, let's add Traefik, a reverse proxy, into the mix.

New to Traefik? Check out the official Getting Started guide.

Traefik vs Nginx: Traefik is a modern HTTP reverse proxy and load balancer. It's often compared to Nginx, which is primarily a web server but can also act as a reverse proxy and load balancer. In general, Traefik is simpler to get up and running, while Nginx is more versatile.

Traefik:

  1. Reverse proxy and load balancer
  2. Automatically issues and renews SSL certificates, via Let's Encrypt, out-of-the-box
  3. Use Traefik for simple, Docker-based microservices

Nginx:

  1. Web server, reverse proxy, and load balancer
  2. Slightly faster than Traefik
  3. Use Nginx for complex services

Add a new file called traefik.dev.toml:

# traefik.dev.toml

# listen on port 80
[entryPoints]
  [entryPoints.web]
    address = ":80"

# Traefik dashboard over http
[api]
insecure = true

[log]
level = "DEBUG"

[accessLog]

# containers are not discovered automatically
[providers]
  [providers.docker]
    exposedByDefault = false

Here, since we don't want to expose the db service, we set exposedByDefault to false. To manually expose a service we can add the "traefik.enable=true" label to the Docker Compose file.

Next, update the docker-compose.yml file so that our web service is discovered by Traefik and add a new traefik service:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: .
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; uvicorn app.main:app --host 0.0.0.0'
    volumes:
      - .:/app
    expose:  # new
      - 8000
    environment:
      - DATABASE_URL=postgresql://fastapi_traefik:fastapi_traefik@db:5432/fastapi_traefik
    depends_on:
      - db
    labels: # new
      - "traefik.enable=true"
      - "traefik.http.routers.fastapi.rule=Host(`fastapi.localhost`)"
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=fastapi_traefik
      - POSTGRES_PASSWORD=fastapi_traefik
      - POSTGRES_DB=fastapi_traefik
  traefik: # new
    image: traefik:v2.2
    ports:
      - 8008:80
      - 8081:8080
    volumes:
      - "./traefik.dev.toml:/etc/traefik/traefik.toml"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

volumes:
  postgres_data:

First, the web service is only exposed to other containers on port 8000. We also added the following labels to the web service:

  1. traefik.enable=true enables Traefik to discover the service
  2. traefik.http.routers.fastapi.rule=Host(`fastapi.localhost`) routes requests whose Host header is fastapi.localhost to this service

Take note of the volumes within the traefik service:

  1. ./traefik.dev.toml:/etc/traefik/traefik.toml maps the local config file to the config file in the container so that the settings are kept in sync
  2. /var/run/docker.sock:/var/run/docker.sock:ro enables Traefik to discover other containers

To test, first bring down any existing containers:

$ docker-compose down -v
$ docker-compose -f docker-compose.prod.yml down -v

Build the new development images and spin up the containers:

$ docker-compose up -d --build

Navigate to http://fastapi.localhost:8008/. You should see:

[
    {
        "id": 1,
        "email": "test@test.com",
        "active": true
    }
]

You can test via cURL as well:

$ curl -H Host:fastapi.localhost http://0.0.0.0:8008

Next, check out the dashboard at fastapi.localhost:8081:

traefik dashboard

Bring the containers and volumes down once done:

$ docker-compose down -v

Let's Encrypt

We've successfully created a working example of FastAPI, Docker, and Traefik in development mode. For production, you'll want to configure Traefik to manage TLS certificates via Let's Encrypt. In short, Traefik will automatically contact the certificate authority to issue and renew certificates.

Since Let's Encrypt won't issue certificates for localhost, you'll need to spin up your production containers on a cloud compute instance (like a DigitalOcean droplet or an AWS EC2 instance). You'll also need a valid domain name. If you don't have one, you can create a free domain at Freenom.

We used a DigitalOcean droplet along with Docker Machine to quickly provision a compute instance with Docker and deployed the production containers to test out the Traefik config. Check out the DigitalOcean example from the Docker docs for more on using Docker Machine to provision a droplet.

Assuming you configured a compute instance and set up a free domain, you're now ready to set up Traefik in production mode.

Start by adding a production version of the Traefik config to a file called traefik.prod.toml:

# traefik.prod.toml

[entryPoints]
  [entryPoints.web]
    address = ":80"
  [entryPoints.web.http]
    [entryPoints.web.http.redirections]
      [entryPoints.web.http.redirections.entryPoint]
        to = "websecure"
        scheme = "https"

  [entryPoints.websecure]
    address = ":443"

[accessLog]

[api]
dashboard = true

[providers]
  [providers.docker]
    exposedByDefault = false

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

Make sure to replace your@email.com with your actual email address.

What's happening here:

  1. entryPoints.web sets the entry point for our insecure HTTP application to port 80
  2. entryPoints.websecure sets the entry point for our secure HTTPS application to port 443
  3. entryPoints.web.http.redirections.entryPoint redirects all insecure requests to the secure port
  4. exposedByDefault = false keeps services hidden from Traefik unless explicitly enabled
  5. dashboard = true enables the monitoring dashboard

Finally, take note of:

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

This is where the Let's Encrypt config lives. We defined where the certificates will be stored along with the verification type, which is an HTTP Challenge.

Next, in your domain's DNS settings, create two new A records that both point at your compute instance's public IP:

  1. fastapi-traefik.your-domain.com - for the web service
  2. dashboard-fastapi-traefik.your-domain.com - for the Traefik dashboard

Make sure to replace your-domain.com with your actual domain.

Next, update docker-compose.prod.yml like so:

# docker-compose.prod.yml

version: '3.8'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.prod
    expose:  # new
      - 80
    environment:
      - DATABASE_URL=postgresql://fastapi_traefik_prod:fastapi_traefik_prod@db:5432/fastapi_traefik_prod
    depends_on:
      - db
    labels:  # new
      - "traefik.enable=true"
      - "traefik.http.routers.fastapi.rule=Host(`fastapi-traefik.your-domain.com`)"
      - "traefik.http.routers.fastapi.tls=true"
      - "traefik.http.routers.fastapi.tls.certresolver=letsencrypt"
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=fastapi_traefik_prod
      - POSTGRES_PASSWORD=fastapi_traefik_prod
      - POSTGRES_DB=fastapi_traefik_prod
  traefik:  # new
    build:
      context: .
      dockerfile: Dockerfile.traefik
    ports:
      - 80:80
      - 443:443
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./traefik-public-certificates:/certificates"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`dashboard-fastapi-traefik.your-domain.com`) && (PathPrefix(`/`)"
      - "traefik.http.routers.dashboard.tls=true"
      - "traefik.http.routers.dashboard.tls.certresolver=letsencrypt"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.middlewares=auth"
      - "traefik.http.middlewares.auth.basicauth.users=testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1"

volumes:
  postgres_data_prod:
  traefik-public-certificates:

Again, make sure to replace your-domain.com with your actual domain.

What's new here?

In the web service, we added the following labels:

  1. traefik.http.routers.fastapi.rule=Host(`fastapi-traefik.your-domain.com`) changes the host to the actual domain
  2. traefik.http.routers.fastapi.tls=true enables HTTPS
  3. traefik.http.routers.fastapi.tls.certresolver=letsencrypt sets the certificate issuer as Let's Encrypt

Next, for the traefik service, we added the appropriate ports and a volume for the certificates directory. The volume ensures that the certificates persist even if the container is brought down.

As for the labels:

  1. traefik.http.routers.dashboard.rule=Host(`dashboard-fastapi-traefik.your-domain.com`) defines the dashboard host, so it can be accessed at $Host/dashboard/
  2. traefik.http.routers.dashboard.tls=true enables HTTPS
  3. traefik.http.routers.dashboard.tls.certresolver=letsencrypt sets the certificate resolver to Let's Encrypt
  4. traefik.http.routers.dashboard.middlewares=auth enables HTTP BasicAuth middleware
  5. traefik.http.middlewares.auth.basicauth.users defines the username and hashed password for logging in

You can create a new password hash using the htpasswd utility:

# username: testuser
# password: password

$ echo $(htpasswd -nb testuser password) | sed -e s/\\$/\\$\\$/g
testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Feel free to use an env_file to store the username and password as environment variables:

USERNAME=testuser
HASHED_PASSWORD=$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Finally, add a new Dockerfile called Dockerfile.traefik:

# Dockerfile.traefik

FROM traefik:v2.2

COPY ./traefik.prod.toml ./etc/traefik/traefik.toml

Next, spin up the new containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Ensure the two URLs work:

  1. https://fastapi-traefik.your-domain.com
  2. https://dashboard-fastapi-traefik.your-domain.com/dashboard/

Also, make sure that when you access the HTTP versions of the above URLs, you're redirected to the HTTPS versions.

Finally, Let's Encrypt certificates have a validity of 90 days. Traefik will automatically handle renewing the certificates for you behind the scenes, so that's one less thing you'll have to worry about!

Conclusion

In this tutorial, we walked through how to containerize a FastAPI application with Postgres for development. We also created a production-ready Docker Compose file, set up Traefik and Let's Encrypt to serve the application via HTTPS, and enabled a secure dashboard to monitor our services.

In terms of actual deployment to a production environment, you'll probably want to use a:

  1. Fully-managed database service -- like RDS or Cloud SQL -- rather than managing your own Postgres instance within a container.
  2. Non-root user for the services

You can find the code in the fastapi-docker-traefik repo.

Original article source at: https://testdriven.io/

How to Set Up Django with Postgres and Docker

In this tutorial, we'll look at how to set up Django with Postgres and Docker. For production environments, we'll add on Gunicorn, Traefik, and Let's Encrypt.

Project Setup

Start by creating a project directory:

$ mkdir django-docker-traefik && cd django-docker-traefik
$ mkdir app && cd app
$ python3.9 -m venv venv
$ source venv/bin/activate

Feel free to swap out virtualenv and Pip for Poetry or Pipenv. For more, review Modern Python Environments.

Next, let's install Django and create a simple Django application:

(venv)$ pip install django==3.2.3
(venv)$ django-admin startproject config .
(venv)$ python manage.py migrate

Run the application:

(venv)$ python manage.py runserver

Navigate to http://localhost:8000/ to view the Django welcome screen. Once done, kill the server, exit the virtual environment, and delete it. We now have a simple Django project to work with.

Create a requirements.txt file in the "app" directory and add Django as a dependency:

Django==3.2.3

Since we'll be moving to Postgres, go ahead and remove the db.sqlite3 file from the "app" directory.

Your project directory should look like:

└── app
    ├── config
    │   ├── __init__.py
    │   ├── asgi.py
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── manage.py
    └── requirements.txt

Docker

Install Docker, if you don't already have it, then add a Dockerfile to the "app" directory:

# app/Dockerfile

# pull the official docker image
FROM python:3.9.5-slim

# set work directory
WORKDIR /app

# set env variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy project
COPY . .

So, we started with a slim Docker image for Python 3.9.5. We then set up a working directory along with two environment variables:

  1. PYTHONDONTWRITEBYTECODE: Prevents Python from writing pyc files to disc (equivalent to python -B option)
  2. PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Finally, we copied over the requirements.txt file, installed the dependencies, and copied over the project.

Review Docker for Python Developers for more on structuring Dockerfiles as well as some best practices for configuring Docker for Python-based development.

Next, add a docker-compose.yml file to the project root:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: ./app
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - ./app:/app
    ports:
      - 8008:8000
    environment:
      - DEBUG=1

Review the Compose file reference for info on how this file works.

Build the image:

$ docker-compose build

Once the image is built, run the container:

$ docker-compose up -d

Navigate to http://localhost:8008 to again view the welcome page.

If this doesn't work, check the logs for errors via docker-compose logs -f.

Postgres

To configure Postgres, we'll need to add a new service to the docker-compose.yml file, update the Django settings, and install Psycopg2.

First, add a new service called db to docker-compose.yml:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: ./app
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py runserver 0.0.0.0:8000'
    volumes:
      - ./app:/app
    ports:
      - 8008:8000
    environment:
      - DEBUG=1
      - DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=django_traefik
      - POSTGRES_PASSWORD=django_traefik
      - POSTGRES_DB=django_traefik

volumes:
  postgres_data:

To persist the data beyond the life of the container we configured a volume. This config will bind postgres_data to the "/var/lib/postgresql/data/" directory in the container.

We also added an environment key to define a name for the default database and set a username and password.

Review the "Environment Variables" section of the Postgres Docker Hub page for more info.

Take note of the new command in the web service:

bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py runserver 0.0.0.0:8000'

while !</dev/tcp/db/5432; do sleep 1 will continue until Postgres is up. Once up, python manage.py runserver 0.0.0.0:8000 runs.

To configure Postgres, add django-environ, to load/read environment variables, and Psycopg2 to requirements.txt:

Django==3.2.3
django-environ==0.4.5
psycopg2-binary==2.8.6

Initialize environ at the top of config/settings.py:

# config/settings.py

import environ

env = environ.Env()

Then, update the DATABASES dict:

# config/settings.py

DATABASES = {
    'default': env.db(),
}

django-environ will automatically parse the database connection URL string that we added to docker-compose.yml:

DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
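Behind the scenes, env.db() expands that URL into a standard Django database config, roughly like the following sketch (the exact ENGINE string depends on your django-environ version):

# roughly what env.db() produces from DATABASE_URL (sketch)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'django_traefik',
        'USER': 'django_traefik',
        'PASSWORD': 'django_traefik',
        'HOST': 'db',
        'PORT': 5432,
    }
}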

Update the DEBUG variable as well, casting it to a boolean (otherwise the raw string '0' would evaluate as truthy):

# config/settings.py

DEBUG = env.bool('DEBUG', default=False)

Build the new image and spin up the two containers:

$ docker-compose up -d --build

Run the initial migration:

$ docker-compose exec web python manage.py migrate --noinput

Ensure the default Django tables were created:

$ docker-compose exec db psql --username=django_traefik --dbname=django_traefik

psql (13.2)
Type "help" for help.

django_traefik=# \l
                                            List of databases
      Name      |     Owner      | Encoding |  Collate   |   Ctype    |         Access privileges
----------------+----------------+----------+------------+------------+-----------------------------------
 django_traefik | django_traefik | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres       | django_traefik | UTF8     | en_US.utf8 | en_US.utf8 |
 template0      | django_traefik | UTF8     | en_US.utf8 | en_US.utf8 | =c/django_traefik                +
                |                |          |            |            | django_traefik=CTc/django_traefik
 template1      | django_traefik | UTF8     | en_US.utf8 | en_US.utf8 | =c/django_traefik                +
                |                |          |            |            | django_traefik=CTc/django_traefik
(4 rows)

django_traefik=# \c django_traefik
You are now connected to database "django_traefik" as user "django_traefik".

django_traefik=# \dt
                      List of relations
 Schema |            Name            | Type  |     Owner
--------+----------------------------+-------+----------------
 public | auth_group                 | table | django_traefik
 public | auth_group_permissions     | table | django_traefik
 public | auth_permission            | table | django_traefik
 public | auth_user                  | table | django_traefik
 public | auth_user_groups           | table | django_traefik
 public | auth_user_user_permissions | table | django_traefik
 public | django_admin_log           | table | django_traefik
 public | django_content_type        | table | django_traefik
 public | django_migrations          | table | django_traefik
 public | django_session             | table | django_traefik
(10 rows)

django_traefik=# \q

You can check that the volume was created as well by running:

$ docker volume inspect django-docker-traefik_postgres_data

You should see something similar to:

[
    {
        "CreatedAt": "2021-05-20T01:01:34Z",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "django-docker-traefik",
            "com.docker.compose.version": "1.29.1",
            "com.docker.compose.volume": "postgres_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/django-docker-traefik_postgres_data/_data",
        "Name": "django-docker-traefik_postgres_data",
        "Options": null,
        "Scope": "local"
    }
]

Gunicorn

Moving along, for production environments, let's add Gunicorn, a production-grade WSGI server, to the requirements file:

Django==3.2.3
django-environ==0.4.5
gunicorn==20.1.0
psycopg2-binary==2.8.6

Since we still want to use Django's built-in server in development, create a new compose file called docker-compose.prod.yml for production:

# docker-compose.prod.yml

version: '3.8'

services:
  web:
    build: ./app
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:8000 config.wsgi'
    ports:
      - 8008:8000
    environment:
      - DEBUG=0
      - DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=django_traefik
      - POSTGRES_PASSWORD=django_traefik
      - POSTGRES_DB=django_traefik

volumes:
  postgres_data_prod:

If you have multiple environments, you may want to look at using a docker-compose.override.yml configuration file. With this approach, you'd add your base config to a docker-compose.yml file and then use a docker-compose.override.yml file to override those config settings based on the environment.

Take note of the default command. We're running Gunicorn rather than the Django development server. We also removed the volume from the web service since we don't need it in production.

Bring down the development containers (and the associated volumes with the -v flag):

$ docker-compose down -v

Then, build the production images and spin up the containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Run the migrations:

$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput

Verify that the django_traefik database was created along with the default Django tables. Test out the admin page at http://localhost:8008/admin. The static files aren't being loaded correctly yet; this is expected, and we'll fix it shortly.

Again, if the container fails to start, check for errors in the logs via docker-compose -f docker-compose.prod.yml logs -f.

Production Dockerfile

Create a new Dockerfile called Dockerfile.prod for use with production builds:

# app/Dockerfile.prod

###########
# BUILDER #
###########

# pull official base image
FROM python:3.9.5-slim as builder

# set work directory
WORKDIR /app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc

# lint
RUN pip install --upgrade pip
RUN pip install flake8==3.9.1
COPY . .
RUN flake8 --ignore=E501,F401 .

# install python dependencies
COPY requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /usr/src/app/wheels -r requirements.txt


#########
# FINAL #
#########

# pull official base image
FROM python:3.9.5-slim

# create directory for the app user
RUN mkdir -p /home/app

# create the app user
RUN addgroup --system app && adduser --system --group app

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
WORKDIR $APP_HOME

# install dependencies
COPY --from=builder /usr/src/app/wheels /wheels
COPY --from=builder /app/requirements.txt .
RUN pip install --upgrade pip
RUN pip install --no-cache /wheels/*

# copy project
COPY . $APP_HOME

# chown all the files to the app user
RUN chown -R app:app $APP_HOME

# change to the app user
USER app

Here, we used a Docker multi-stage build to reduce the final image size. Essentially, builder is a temporary image that's used for building the Python wheels. The wheels are then copied over to the final production image and the builder image is discarded.

You could take the multi-stage build approach a step further and use a single Dockerfile instead of creating two Dockerfiles. Think of the pros and cons of using this approach over two different files.

Did you notice that we created a non-root user? By default, Docker runs container processes as root inside of a container. This is a bad practice since attackers can gain root access to the Docker host if they manage to break out of the container. If you're root in the container, you'll be root on the host.

Update the web service within the docker-compose.prod.yml file to build with Dockerfile.prod:

web:
  build:
    context: ./app
    dockerfile: Dockerfile.prod
  command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:8000 config.wsgi'
  ports:
    - 8008:8000
  environment:
    - DEBUG=0
    - DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
  depends_on:
    - db

Try it out:

$ docker-compose -f docker-compose.prod.yml down -v
$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput

Traefik

Next, let's add Traefik, a reverse proxy, into the mix.

New to Traefik? Check out the official Getting Started guide.

Traefik vs Nginx: Traefik is a modern HTTP reverse proxy and load balancer. It's often compared to Nginx, which is primarily a web server but can also act as a reverse proxy and load balancer. In general, Traefik is simpler to get up and running, while Nginx is more versatile.

Traefik:

  1. Reverse proxy and load balancer
  2. Automatically issues and renews SSL certificates, via Let's Encrypt, out-of-the-box
  3. Use Traefik for simple, Docker-based microservices

Nginx:

  1. Web server, reverse proxy, and load balancer
  2. Slightly faster than Traefik
  3. Use Nginx for complex services

Add a new file called traefik.dev.toml:

# traefik.dev.toml

# listen on port 80
[entryPoints]
  [entryPoints.web]
    address = ":80"

# Traefik dashboard over http
[api]
insecure = true

[log]
level = "DEBUG"

[accessLog]

# containers are not discovered automatically
[providers]
  [providers.docker]
    exposedByDefault = false

Here, since we don't want to expose the db service, we set exposedByDefault to false. To manually expose a service we can add the "traefik.enable=true" label to the Docker Compose file.

Next, update the docker-compose.yml file so that our web service is discovered by Traefik and add a new traefik service:

# docker-compose.yml

version: '3.8'

services:
  web:
    build: ./app
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py runserver 0.0.0.0:8000'
    volumes:
      - ./app:/app
    expose:  # new
      - 8000
    environment:
      - DEBUG=1
      - DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
    depends_on:
      - db
    labels: # new
      - "traefik.enable=true"
      - "traefik.http.routers.django.rule=Host(`django.localhost`)"
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=django_traefik
      - POSTGRES_PASSWORD=django_traefik
      - POSTGRES_DB=django_traefik
  traefik: # new
    image: traefik:v2.2
    ports:
      - 8008:80
      - 8081:8080
    volumes:
      - "$PWD/traefik.dev.toml:/etc/traefik/traefik.toml"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

volumes:
  postgres_data:

First, the web service is only exposed to other containers on port 8000. We also added the following labels to the web service:

  1. traefik.enable=true enables Traefik to discover the service
  2. traefik.http.routers.django.rule=Host(`django.localhost`) routes requests whose Host header is django.localhost to this service

Take note of the volumes within the traefik service:

  1. "$PWD/traefik.dev.toml:/etc/traefik/traefik.toml" maps the local config file to the config file in the container so that the settings are kept in sync
  2. "/var/run/docker.sock:/var/run/docker.sock:ro" enables Traefik to discover other containers

To test, first bring down any existing containers:

$ docker-compose down -v
$ docker-compose -f docker-compose.prod.yml down -v

Build the new development images and spin up the containers:

$ docker-compose up -d --build

Navigate to http://django.localhost:8008/. You should see the Django welcome page.

Next, check out the dashboard at django.localhost:8081:

traefik dashboard

Bring the containers and volumes down once done:

$ docker-compose down -v

Let's Encrypt

We've successfully created a working example of Django, Docker, and Traefik in development mode. For production, you'll want to configure Traefik to manage TLS certificates via Let's Encrypt. In short, Traefik will automatically contact the certificate authority to issue and renew certificates.

Since Let's Encrypt won't issue certificates for localhost, you'll need to spin up your production containers on a cloud compute instance (like a DigitalOcean droplet or an AWS EC2 instance). You'll also need a valid domain name. If you don't have one, you can create a free domain at Freenom.

We used a DigitalOcean droplet along with Docker Machine to quickly provision a compute instance with Docker and deployed the production containers to test out the Traefik config. Check out the DigitalOcean example from the Docker docs for more on using Docker Machine to provision a droplet.

Assuming you configured a compute instance and set up a free domain, you're now ready to set up Traefik in production mode.

Start by adding a production version of the Traefik config to a file called traefik.prod.toml:

# traefik.prod.toml

[entryPoints]
  [entryPoints.web]
    address = ":80"
  [entryPoints.web.http]
    [entryPoints.web.http.redirections]
      [entryPoints.web.http.redirections.entryPoint]
        to = "websecure"
        scheme = "https"

  [entryPoints.websecure]
    address = ":443"

[accessLog]

[api]
dashboard = true

[providers]
  [providers.docker]
    exposedByDefault = false

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

Make sure to replace your@email.com with your actual email address.

What's happening here:

  1. entryPoints.web sets the entry point for our insecure HTTP application to port 80
  2. entryPoints.websecure sets the entry point for our secure HTTPS application to port 443
  3. entryPoints.web.http.redirections.entryPoint redirects all insecure requests to the secure port
  4. exposedByDefault = false keeps services hidden from Traefik unless explicitly enabled
  5. dashboard = true enables the monitoring dashboard

Finally, take note of:

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

This is where the Let's Encrypt config lives. We defined where the certificates will be stored along with the verification type, which is an HTTP Challenge.

Next, in your domain's DNS settings, create two new A records that both point at your compute instance's public IP:

  1. django-traefik.your-domain.com - for the web service
  2. dashboard-django-traefik.your-domain.com - for the Traefik dashboard

Make sure to replace your-domain.com with your actual domain.

Next, update docker-compose.prod.yml like so:

# docker-compose.prod.yml

version: '3.8'

services:
  web:
    build:
      context: ./app
      dockerfile: Dockerfile.prod
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:8000 config.wsgi'
    expose:  # new
      - 8000
    environment:
      - DEBUG=0
      - DATABASE_URL=postgresql://django_traefik:django_traefik@db:5432/django_traefik
      - DJANGO_ALLOWED_HOSTS=.your-domain.com
    depends_on:
      - db
    labels:  # new
      - "traefik.enable=true"
      - "traefik.http.routers.django.rule=Host(`django-traefik.your-domain.com`)"
      - "traefik.http.routers.django.tls=true"
      - "traefik.http.routers.django.tls.certresolver=letsencrypt"
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    expose:
      - 5432
    environment:
      - POSTGRES_USER=django_traefik
      - POSTGRES_PASSWORD=django_traefik
      - POSTGRES_DB=django_traefik
  traefik:  # new
    build:
      context: .
      dockerfile: Dockerfile.traefik
    ports:
      - 80:80
      - 443:443
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./traefik-public-certificates:/certificates"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`dashboard-django-traefik.your-domain.com`)"
      - "traefik.http.routers.dashboard.tls=true"
      - "traefik.http.routers.dashboard.tls.certresolver=letsencrypt"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.middlewares=auth"
      - "traefik.http.middlewares.auth.basicauth.users=testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1"

volumes:
  postgres_data_prod:
  traefik-public-certificates:

Again, make sure to replace your-domain.com with your actual domain.

What's new here?

In the web service, we added the following labels:

  1. traefik.http.routers.django.rule=Host(`django-traefik.your-domain.com`) changes the host to the actual domain
  2. traefik.http.routers.django.tls=true enables HTTPS
  3. traefik.http.routers.django.tls.certresolver=letsencrypt sets the certificate issuer as Let's Encrypt

Next, for the traefik service, we added the appropriate ports and a volume for the certificates directory. The volume ensures that the certificates persist even if the container is brought down.

As for the labels:

  1. traefik.http.routers.dashboard.rule=Host(`dashboard-django-traefik.your-domain.com`) defines the dashboard host, so it can be accessed at $Host/dashboard/
  2. traefik.http.routers.dashboard.tls=true enables HTTPS
  3. traefik.http.routers.dashboard.tls.certresolver=letsencrypt sets the certificate resolver to Let's Encrypt
  4. traefik.http.routers.dashboard.middlewares=auth enables HTTP BasicAuth middleware
  5. traefik.http.middlewares.auth.basicauth.users defines the username and hashed password for logging in

You can create a new password hash using the htpasswd utility:

# username: testuser
# password: password

$ echo $(htpasswd -nb testuser password) | sed -e s/\\$/\\$\\$/g
testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Feel free to use an env_file to store the username and password as environment variables:

USERNAME=testuser
HASHED_PASSWORD=$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Next, update the ALLOWED_HOSTS setting in config/settings.py to read from the DJANGO_ALLOWED_HOSTS environment variable:

# config/settings.py

ALLOWED_HOSTS = env.list('DJANGO_ALLOWED_HOSTS', default=[])

Finally, add a new Dockerfile called Dockerfile.traefik:

# Dockerfile.traefik

FROM traefik:v2.2

COPY ./traefik.prod.toml ./etc/traefik/traefik.toml

Next, spin up the new containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Ensure the two URLs work:

  1. https://django-traefik.your-domain.com
  2. https://dashboard-django-traefik.your-domain.com/dashboard/

Also, make sure that when you access the HTTP versions of the above URLs, you're redirected to the HTTPS versions.

Finally, Let's Encrypt certificates have a validity of 90 days. Traefik will automatically handle renewing the certificates for you behind the scenes, so that's one less thing you'll have to worry about!

Static Files

Since Traefik doesn't serve static files, we'll use WhiteNoise to manage the static assets.

First, add the package to the requirements.txt file:

Django==3.2.3
django-environ==0.4.5
gunicorn==20.1.0
psycopg2-binary==2.8.6
whitenoise==5.2.0

Update the middleware in config/settings.py like so:

# config/settings.py

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',  # new
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

Then, configure the handling of static files with STATIC_ROOT:

# config/settings.py

STATIC_ROOT = BASE_DIR / 'staticfiles'

Finally, add compression and caching support:

# config/settings.py

STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'

To test, update the images and containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Collect the static files:

$ docker-compose -f docker-compose.prod.yml exec web python manage.py collectstatic

Ensure the static files are being served correctly at https://django-traefik.your-domain.com/admin.

Conclusion

In this tutorial, we walked through how to containerize a Django application with Postgres for development. We also created a production-ready Docker Compose file, set up Traefik and Let's Encrypt to serve the application via HTTPS, and enabled a secure dashboard to monitor our services.

In terms of actual deployment to a production environment, you'll probably want to use a:

  1. Fully-managed database service -- like RDS or Cloud SQL -- rather than managing your own Postgres instance within a container.
  2. Non-root user for the services

You can find the code in the django-docker-traefik repo.

Original article source at: https://testdriven.io/

Desmond Gerber

Dockerizing Flask with Postgres, Gunicorn, and Traefik

In this tutorial, we'll look at how to set up Flask with Postgres and Docker. For production environments, we'll add on Gunicorn, Traefik, and Let's Encrypt.

Project Setup

Start by creating a project directory:

$ mkdir flask-docker-traefik && cd flask-docker-traefik
$ python3.9 -m venv venv
$ source venv/bin/activate
(venv)$

Feel free to swap out virtualenv and Pip for Poetry or Pipenv. For more, review Modern Python Environments.

Then, create the following files and folders:

└── services
    └── web
        ├── manage.py
        ├── project
        │   └── __init__.py
        └── requirements.txt

Add Flask to requirements.txt:

Flask==2.0.1

Install the requirements from the "services/web" directory:

(venv)$ pip install -r requirements.txt

Next, let's create a simple Flask application in project/__init__.py:

from flask import Flask, jsonify

app = Flask(__name__)


@app.get("/")
def read_root():
    return jsonify(hello="world")

Then, to configure the Flask CLI tool to run and manage the app from the command line, add the following to services/web/manage.py:

from flask.cli import FlaskGroup

from project import app

cli = FlaskGroup(app)

if __name__ == "__main__":
    cli()

Here, we created a new FlaskGroup instance to extend the normal CLI with commands related to the Flask app.

Run the server from the "web" directory:

(venv)$ export FLASK_APP=project/__init__.py
(venv)$ python manage.py run

Navigate to 127.0.0.1:5000. You should see:

{
  "hello": "world"
}

Kill the server once done. Exit from the virtual environment, and remove it as well.

Docker

Install Docker, if you don't already have it, then add a Dockerfile to the "web" directory:

# pull the official docker image
FROM python:3.9.5-slim

# set work directory
WORKDIR /app

# set env variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy project
COPY . .

So, we started with a slim-based Docker image for Python 3.9.5. We then set a working directory along with two environment variables:

  1. PYTHONDONTWRITEBYTECODE: Prevents Python from writing pyc files to disc (equivalent to python -B option)
  2. PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Finally, we copied over the requirements.txt file, installed the dependencies, and copied over the Flask app itself.

Review Docker for Python Developers for more on structuring Dockerfiles as well as some best practices for configuring Docker for Python-based development.

Next, add a docker-compose.yml file to the project root:

version: '3.8'

services:
  web:
    build: ./services/web
    command: python manage.py run -h 0.0.0.0
    volumes:
      - ./services/web/:/app
    ports:
      - 5000:5000
    environment:
      - FLASK_APP=project/__init__.py
      - FLASK_ENV=development

Review the Compose file reference for info on how this file works.

Build the image:

$ docker-compose build

Once the image is built, run the container:

$ docker-compose up -d

Navigate to http://127.0.0.1:5000/ to again view the hello world sanity check.

If this doesn't work, check the logs for errors via docker-compose logs -f.

Postgres

To configure Postgres, we need to add a new service to the docker-compose.yml file, set up Flask-SQLAlchemy, and install Psycopg2.

First, add a new service called db to docker-compose.yml:

version: '3.8'

services:
  web:
    build: ./services/web
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py run -h 0.0.0.0'
    volumes:
      - ./services/web/:/app
    ports:
      - 5000:5000
    environment:
      - FLASK_APP=project/__init__.py
      - FLASK_ENV=development
      - DATABASE_URL=postgresql://hello_flask:hello_flask@db:5432/hello_flask_dev
    depends_on:
      - db

  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=hello_flask
      - POSTGRES_PASSWORD=hello_flask
      - POSTGRES_DB=hello_flask_dev

volumes:
  postgres_data:

To persist the data beyond the life of the container we configured a volume. This config will bind postgres_data to the "/var/lib/postgresql/data/" directory in the container.

We also added an environment key to define a name for the default database and set a username and password.

Review the "Environment Variables" section of the Postgres Docker Hub page for more info.

Take note of the new command in the web service:

bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py run -h 0.0.0.0'

while !</dev/tcp/db/5432; do sleep 1 will continue until Postgres is up. Once up, python manage.py run -h 0.0.0.0 runs.

Then, add a new file called config.py to the "project" directory, where we'll define environment-specific configuration variables:

import os


class Config(object):
    SQLALCHEMY_DATABASE_URI = os.getenv("DATABASE_URL", "sqlite://")
    SQLALCHEMY_TRACK_MODIFICATIONS = False

Here, the database is configured based on the DATABASE_URL environment variable that we just defined. Take note of the default value.
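The second argument to os.getenv is a fallback, so the app can still boot without Postgres. A quick sketch:

# sketch: os.getenv falls back to its second argument when the variable is unset
import os

os.getenv("DATABASE_URL", "sqlite://")  # -> "sqlite://" if DATABASE_URL isn't set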

Update __init__.py to pull in the config on init:

from flask import Flask, jsonify

app = Flask(__name__)
app.config.from_object("project.config.Config")


@app.get("/")
def read_root():
    return jsonify(hello="world")

Add Flask-SQLAlchemy and Psycopg2 to requirements.txt:

Flask==2.0.1
Flask-SQLAlchemy==2.5.1
psycopg2-binary==2.8.6

Update __init__.py again to create a new SQLAlchemy instance and define a database model:

from dataclasses import dataclass

from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.from_object("project.config.Config")
db = SQLAlchemy(app)


@dataclass
class User(db.Model):
    id: int = db.Column(db.Integer, primary_key=True)
    email: str = db.Column(db.String(120), unique=True, nullable=False)
    active: bool = db.Column(db.Boolean(), default=True, nullable=False)

    def __init__(self, email: str) -> None:
        self.email = email


@app.get("/")
def read_root():
    users = User.query.all()
    return jsonify(users)

Using the dataclass decorator on the database model helps us serialize the database objects.

Finally, update manage.py:

from flask.cli import FlaskGroup

from project import app, db

cli = FlaskGroup(app)


@cli.command("create_db")
def create_db():
    db.drop_all()
    db.create_all()
    db.session.commit()


if __name__ == "__main__":
    cli()

This registers a new command, create_db, to the CLI so that we can run it from the command line, which we'll use shortly to apply the model to the database.

Build the new image and spin up the two containers:

$ docker-compose up -d --build

Create the table:

$ docker-compose exec web python manage.py create_db

Get the following error?

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL:  database "hello_flask_dev" does not exist

Run docker-compose down -v to remove the volumes along with the containers. Then, re-build the images, run the containers, and re-run the create_db command.

Ensure the users table was created:

$ docker-compose exec db psql --username=hello_flask --dbname=hello_flask_dev

psql (13.3)
Type "help" for help.

hello_flask_dev=# \l
                                        List of databases
      Name       |    Owner    | Encoding |  Collate   |   Ctype    |      Access privileges
-----------------+-------------+----------+------------+------------+-----------------------------
 hello_flask_dev | hello_flask | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres        | hello_flask | UTF8     | en_US.utf8 | en_US.utf8 |
 template0       | hello_flask | UTF8     | en_US.utf8 | en_US.utf8 | =c/hello_flask             +
                 |             |          |            |            | hello_flask=CTc/hello_flask
 template1       | hello_flask | UTF8     | en_US.utf8 | en_US.utf8 | =c/hello_flask             +
                 |             |          |            |            | hello_flask=CTc/hello_flask
(4 rows)

hello_flask_dev=# \c hello_flask_dev
You are now connected to database "hello_flask_dev" as user "hello_flask".

hello_flask_dev=# \dt
          List of relations
 Schema | Name | Type  |    Owner
--------+------+-------+-------------
 public | user | table | hello_flask
(1 row)

hello_flask_dev=# \q

You can check that the volume was created as well by running:

$ docker volume inspect flask-docker-traefik_postgres_data

You should see something similar to:

[
    {
        "CreatedAt": "2021-06-05T14:12:52Z",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "flask-docker-traefik",
            "com.docker.compose.version": "1.29.1",
            "com.docker.compose.volume": "postgres_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/flask-docker-traefik_postgres_data/_data",
        "Name": "flask-docker-traefik_postgres_data",
        "Options": null,
        "Scope": "local"
    }
]

Navigate to http://127.0.0.1:5000. The sanity check shows an empty list. That's because we haven't populated the users table. Let's add a CLI seed command for adding sample users to the users table in manage.py:

from flask.cli import FlaskGroup

from project import User, app, db

cli = FlaskGroup(app)


@cli.command("create_db")
def create_db():
    db.drop_all()
    db.create_all()
    db.session.commit()


@cli.command("seed_db") # new
def seed_db():
    db.session.add(User(email="michael@mherman.org"))
    db.session.add(User(email="test@example.com"))
    db.session.commit()


if __name__ == "__main__":
    cli()

Try it out:

$ docker-compose exec web python manage.py seed_db

Navigate to http://127.0.0.1:5000 again. You should now see:

[
  {
    "active": true,
    "email": "michael@mherman.org",
    "id": 1
  },
  {
    "active": true,
    "email": "test@example.com",
    "id": 2
  }
]

Gunicorn

Moving along, for production environments, let's add Gunicorn, a production-grade WSGI server, to the requirements file:

Flask==2.0.1
Flask-SQLAlchemy==2.5.1
gunicorn==20.1.0
psycopg2-binary==2.8.6

Since we still want to use Flask's built-in server in development, create a new compose file in the project root called docker-compose.prod.yml for production:

version: '3.8'

services:
  web:
    build: ./services/web
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:5000 manage:app'
    ports:
      - 5000:5000
    environment:
      - FLASK_APP=project/__init__.py
      - FLASK_ENV=production
      - DATABASE_URL=postgresql://hello_flask:hello_flask@db:5432/hello_flask_prod
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=hello_flask
      - POSTGRES_PASSWORD=hello_flask
      - POSTGRES_DB=hello_flask_prod

volumes:
  postgres_data_prod:

If you have multiple environments, you may want to look at using a docker-compose.override.yml configuration file. With this approach, you'd add your base config to a docker-compose.yml file and then use a docker-compose.override.yml file to override those config settings based on the environment.

Take note of the default command. We're running Gunicorn rather than the Flask development server. We also removed the volume from the web service since we don't need it in production.

Bring down the development containers (and the associated volumes with the -v flag):

$ docker-compose down -v

Then, build the production images and spin up the containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Create the table and apply the seed:

$ docker-compose -f docker-compose.prod.yml exec web python manage.py create_db
$ docker-compose -f docker-compose.prod.yml exec web python manage.py seed_db

Verify that the hello_flask_prod database was created along with the users table. Test out http://127.0.0.1:5000/.

Again, if the container fails to start, check for errors in the logs via docker-compose -f docker-compose.prod.yml logs -f.

Production Dockerfile

Create a new Dockerfile in the "web" directory called Dockerfile.prod for use with production builds:

###########
# BUILDER #
###########

# pull official base image
FROM python:3.9.5-slim as builder

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install system dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc

# lint
RUN pip install --upgrade pip
RUN pip install flake8==3.9.1
COPY . .
RUN flake8 --ignore=E501,F401 .

# install python dependencies
COPY ./requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /usr/src/app/wheels -r requirements.txt


#########
# FINAL #
#########

# pull official base image
FROM python:3.9.5-slim

# create directory for the app user
RUN mkdir -p /home/app

# create the app user
RUN addgroup --system app && adduser --system --group app

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
WORKDIR $APP_HOME

# install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends netcat
COPY --from=builder /usr/src/app/wheels /wheels
COPY --from=builder /usr/src/app/requirements.txt .
RUN pip install --upgrade pip
RUN pip install --no-cache /wheels/*

# copy project
COPY . $APP_HOME

# chown all the files to the app user
RUN chown -R app:app $APP_HOME

# change to the app user
USER app

Here, we used a Docker multi-stage build to reduce the final image size. Essentially, builder is a temporary image that's used for building the Python wheels. The wheels are then copied over to the final production image and the builder image is discarded.

You could take the multi-stage build approach a step further and use a single Dockerfile instead of creating two Dockerfiles. Think of the pros and cons of using this approach over two different files.

Did you notice that we created a non-root user? By default, Docker runs container processes as root inside of a container. This is a bad practice since attackers can gain root access to the Docker host if they manage to break out of the container. If you're root in the container, you'll be root on the host.

Update the web service within the docker-compose.prod.yml file to build with Dockerfile.prod:

web:
  build:
    context: ./services/web
    dockerfile: Dockerfile.prod
  command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:5000 manage:app'
  ports:
    - 5000:5000
  environment:
    - FLASK_APP=project/__init__.py
    - FLASK_ENV=production
    - DATABASE_URL=postgresql://hello_flask:hello_flask@db:5432/hello_flask_prod
  depends_on:
    - db

Try it out:

$ docker-compose -f docker-compose.prod.yml down -v
$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py create_db
$ docker-compose -f docker-compose.prod.yml exec web python manage.py seed_db

Traefik

Next, let's add Traefik, a reverse proxy, into the mix.

New to Traefik? Check out the official Getting Started guide.

Traefik vs Nginx: Traefik is a modern, HTTP reverse proxy and load balancer. It's often compared to Nginx, a web server and reverse proxy. Since Nginx is primarily a web server, it can be used to serve up a webpage as well as serve as a reverse proxy and load balancer. In general, Traefik is simpler to get up and running while Nginx is more versatile.

Traefik:

  1. Reverse proxy and load balancer
  2. Automatically issues and renews SSL certificates, via Let's Encrypt, out-of-the-box
  3. Use Traefik for simple, Docker-based microservices

Nginx:

  1. Web server, reverse proxy, and load balancer
  2. Slightly faster than Traefik
  3. Use Nginx for complex services

Add a new folder called "traefik" to the "services" directory along with the following files:

traefik
├── Dockerfile.traefik
├── traefik.dev.toml
└── traefik.prod.toml

Your project structure should now look like this:

├── docker-compose.prod.yml
├── docker-compose.yml
└── services
    ├── traefik
    │   ├── Dockerfile.traefik
    │   ├── traefik.dev.toml
    │   └── traefik.prod.toml
    └── web
        ├── Dockerfile
        ├── Dockerfile.prod
        ├── manage.py
        ├── project
        │   ├── __init__.py
        │   └── config.py
        └── requirements.txt

Add the following to traefik.dev.toml:

# listen on port 80
[entryPoints]
  [entryPoints.web]
    address = ":80"

# Traefik dashboard over http
[api]
insecure = true

[log]
level = "DEBUG"

[accessLog]

# containers are not discovered automatically
[providers]
  [providers.docker]
    exposedByDefault = false

Here, since we don't want to expose the db service, we set exposedByDefault to false. To manually expose a service we can add the "traefik.enable=true" label to the Docker Compose file.

Next, update the docker-compose.yml file so that our web service is discovered by Traefik and add a new traefik service:

version: '3.8'

services:
  web:
    build: ./services/web
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; python manage.py run -h 0.0.0.0'
    volumes:
      - ./services/web/:/app
    expose:  # new
      - 5000
    environment:
      - FLASK_APP=project/__init__.py
      - FLASK_ENV=development
      - DATABASE_URL=postgresql://hello_flask:hello_flask@db:5432/hello_flask_dev
    depends_on:
      - db
    labels:  # new
      - "traefik.enable=true"
      - "traefik.http.routers.flask.rule=Host(`flask.localhost`)"

  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=hello_flask
      - POSTGRES_PASSWORD=hello_flask
      - POSTGRES_DB=hello_flask_dev

  traefik:  # new
    image: traefik:v2.2
    ports:
      - 80:80
      - 8081:8080
    volumes:
      - "./services/traefik/traefik.dev.toml:/etc/traefik/traefik.toml"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

volumes:
  postgres_data:

First, the web service is only exposed to other containers on port 5000. We also added the following labels to the web service:

  1. traefik.enable=true enables Traefik to discover the service
  2. traefik.http.routers.flask.rule=Host(`flask.localhost`): when the request has Host=flask.localhost, it's routed to this service

Take note of the volumes within the traefik service:

  1. ./services/traefik/traefik.dev.toml:/etc/traefik/traefik.toml maps the local config file to the config file in the container so that the settings are kept in sync
  2. /var/run/docker.sock:/var/run/docker.sock:ro enables Traefik to discover other containers

To test, first bring down any existing containers:

$ docker-compose down -v
$ docker-compose -f docker-compose.prod.yml down -v

Build the new development images and spin up the containers:

$ docker-compose up -d --build

Create the table and apply the seed:

$ docker-compose exec web python manage.py create_db
$ docker-compose exec web python manage.py seed_db

Navigate to http://flask.localhost. You should see:

[
  {
    "active": true,
    "email": "michael@mherman.org",
    "id": 1
  },
  {
    "active": true,
    "email": "test@example.com",
    "id": 2
  }
]

You can test via cURL as well:

$ curl -H Host:flask.localhost http://0.0.0.0

Next, check out the dashboard at http://flask.localhost:8081:

traefik dashboard

Bring the containers and volumes down once done:

$ docker-compose down -v

Let's Encrypt

We've successfully created a working example of Flask, Docker, and Traefik in development mode. For production, you'll want to configure Traefik to manage TLS certificates via Let's Encrypt. In short, Traefik will automatically contact the certificate authority to issue and renew certificates.

Since Let's Encrypt won't issue certificates for localhost, you'll need to spin up your production containers on a cloud compute instance (like a DigitalOcean droplet or an AWS EC2 instance). You'll also need a valid domain name. If you don't have one, you can create a free domain at Freenom.

We used a DigitalOcean droplet along with Docker Machine to quickly provision a compute instance with Docker and deployed the production containers to test out the Traefik config. Check out the DigitalOcean example from the Docker docs for more on using Docker Machine to provision a droplet.

Assuming you configured a compute instance and set up a free domain, you're now ready to set up Traefik in production mode.

Start by adding a production version of the Traefik config to traefik.prod.toml:

[entryPoints]
  [entryPoints.web]
    address = ":80"
  [entryPoints.web.http]
    [entryPoints.web.http.redirections]
      [entryPoints.web.http.redirections.entryPoint]
        to = "websecure"
        scheme = "https"

  [entryPoints.websecure]
    address = ":443"

[accessLog]

[api]
dashboard = true

[providers]
  [providers.docker]
    exposedByDefault = false

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

Make sure to replace your@email.com with your actual email address.

What's happening here:

  1. entryPoints.web sets the entry point for our insecure HTTP application to port 80
  2. entryPoints.websecure sets the entry point for our secure HTTPS application to port 443
  3. entryPoints.web.http.redirections.entryPoint redirects all insecure requests to the secure port
  4. exposedByDefault = false prevents Traefik from exposing services unless they're explicitly enabled
  5. dashboard = true enables the monitoring dashboard

Finally, take note of:

[certificatesResolvers.letsencrypt.acme]
  email = "your@email.com"
  storage = "/certificates/acme.json"
  [certificatesResolvers.letsencrypt.acme.httpChallenge]
    entryPoint = "web"

This is where the Let's Encrypt config lives. We defined where the certificates will be stored along with the verification type, which is an HTTP Challenge.

Next, assuming you updated your domain name's DNS records, create two new A records that both point at your compute instance's public IP:

  1. flask-traefik.your-domain.com - for the web service
  2. dashboard-flask-traefik.your-domain.com - for the Traefik dashboard

Make sure to replace your-domain.com with your actual domain.

Next, update docker-compose.prod.yml like so:

version: '3.8'

services:
  web:
    build:
      context: ./services/web
      dockerfile: Dockerfile.prod
    command: bash -c 'while !</dev/tcp/db/5432; do sleep 1; done; gunicorn --bind 0.0.0.0:5000 manage:app'
    expose:  # new
      - 5000
    environment:
      - FLASK_APP=project/__init__.py
      - FLASK_ENV=production
      - DATABASE_URL=postgresql://hello_flask:hello_flask@db:5432/hello_flask_prod
    depends_on:
      - db
    labels:  # new
      - "traefik.enable=true"
      - "traefik.http.routers.flask.rule=Host(`flask-traefik.your-domain.com`)"
      - "traefik.http.routers.flask.tls=true"
      - "traefik.http.routers.flask.tls.certresolver=letsencrypt"
  db:
    image: postgres:13-alpine
    volumes:
      - postgres_data_prod:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=hello_flask
      - POSTGRES_PASSWORD=hello_flask
      - POSTGRES_DB=hello_flask_prod
  traefik:  # new
    build:
      context: ./services/traefik
      dockerfile: Dockerfile.traefik
    ports:
      - 80:80
      - 443:443
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./traefik-public-certificates:/certificates"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`dashboard-flask-traefik.your-domain.com`)"
      - "traefik.http.routers.dashboard.tls=true"
      - "traefik.http.routers.dashboard.tls.certresolver=letsencrypt"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.middlewares=auth"
      - "traefik.http.middlewares.auth.basicauth.users=testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1"

volumes:
  postgres_data_prod:
  traefik-public-certificates:

Again, make sure to replace your-domain.com with your actual domain.

What's new here?

In the web service, we added the following labels:

  1. traefik.http.routers.flask.rule=Host(`flask-traefik.your-domain.com`) changes the host to the actual domain
  2. traefik.http.routers.flask.tls=true enables HTTPS
  3. traefik.http.routers.flask.tls.certresolver=letsencrypt sets the certificate issuer as Let's Encrypt

Next, for the traefik service, we added the appropriate ports and a volume for the certificates directory. The volume ensures that the certificates persist even if the container is brought down.

As for the labels:

  1. traefik.http.routers.dashboard.rule=Host(`dashboard-flask-traefik.your-domain.com`) defines the dashboard host, so it can be accessed at $Host/dashboard/
  2. traefik.http.routers.dashboard.tls=true enables HTTPS
  3. traefik.http.routers.dashboard.tls.certresolver=letsencrypt sets the certificate resolver to Let's Encrypt
  4. traefik.http.routers.dashboard.middlewares=auth enables HTTP BasicAuth middleware
  5. traefik.http.middlewares.auth.basicauth.users defines the username and hashed password for logging in

You can create a new password hash using the htpasswd utility:

# username: testuser
# password: password

$ echo $(htpasswd -nb testuser password) | sed -e s/\\$/\\$\\$/g
testuser:$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Feel free to use an env_file to store the username and password as environment variables:

USERNAME=testuser
HASHED_PASSWORD=$$apr1$$jIKW.bdS$$eKXe4Lxjgy/rH65wP1iQe1

Update Dockerfile.traefik:

FROM traefik:v2.2

COPY ./traefik.prod.toml ./etc/traefik/traefik.toml

Next, spin up the new containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Create the table and apply the seed:

$ docker-compose -f docker-compose.prod.yml exec web python manage.py create_db
$ docker-compose -f docker-compose.prod.yml exec web python manage.py seed_db

Ensure the two URLs work:

  1. https://flask-traefik.your-domain.com
  2. https://dashboard-flask-traefik.your-domain.com/dashboard/

Also, make sure that when you access the HTTP versions of the above URLs, you're redirected to the HTTPS versions.

Finally, Let's Encrypt certificates have a validity of 90 days. Traefik will automatically handle renewing the certificates for you behind the scenes, so that's one less thing you'll have to worry about!

Conclusion

In this tutorial, we walked through how to containerize a Flask application with Postgres for development. We also created a production-ready Docker Compose file, set up Traefik and Let's Encrypt to serve the application via HTTPS, and enabled a secure dashboard to monitor our services.

In terms of actual deployment to a production environment, you'll probably want to use a:

  1. Fully-managed database service -- like RDS or Cloud SQL -- rather than managing your own Postgres instance within a container.
  2. Non-root user for the services

You can find the code in the flask-docker-traefik repo.

Original article source at: https://testdriven.io/

#flask #postgres #gunicorn 

Dockerizing Flask with Postgres, Gunicorn, and Traefik

How to Configure Django To Run on Docker with Postgres

This is a step-by-step tutorial that details how to configure Django to run on Docker with Postgres. For production environments, we'll add on Nginx and Gunicorn. We'll also take a look at how to serve Django static and media files via Nginx.

Dependencies:

  1. Django v3.2.6
  2. Docker v20.10.8
  3. Python v3.9.6

Project Setup

Create a new project directory along with a new Django project:

$ mkdir django-on-docker && cd django-on-docker
$ mkdir app && cd app
$ python3.9 -m venv env
$ source env/bin/activate
(env)$

(env)$ pip install django==3.2.6
(env)$ django-admin startproject hello_django .
(env)$ python manage.py migrate
(env)$ python manage.py runserver

Feel free to swap out virtualenv and Pip for Poetry or Pipenv. For more, review Modern Python Environments.

Navigate to http://localhost:8000/ to view the Django welcome screen. Kill the server once done. Then, exit and remove the virtual environment. We now have a simple Django project to work with.

Create a requirements.txt file in the "app" directory and add Django as a dependency:

Django==3.2.6

Since we'll be moving to Postgres, go ahead and remove the db.sqlite3 file from the "app" directory.

Your project directory should look like:

└── app
    ├── hello_django
    │   ├── __init__.py
    │   ├── asgi.py
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── manage.py
    └── requirements.txt

Docker

Install Docker, if you don't already have it, then add a Dockerfile to the "app" directory:

# pull official base image
FROM python:3.9.6-alpine

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install dependencies
RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

# copy project
COPY . .

So, we started with an Alpine-based Docker image for Python 3.9.6. We then set a working directory along with two environment variables:

  1. PYTHONDONTWRITEBYTECODE: Prevents Python from writing pyc files to disc (equivalent to python -B option)
  2. PYTHONUNBUFFERED: Prevents Python from buffering stdout and stderr (equivalent to python -u option)

Finally, we updated Pip, copied over the requirements.txt file, installed the dependencies, and copied over the Django project itself.

Review Docker for Python Developers for more on structuring Dockerfiles as well as some best practices for configuring Docker for Python-based development.

Next, add a docker-compose.yml file to the project root:

version: '3.8'

services:
  web:
    build: ./app
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - ./app/:/usr/src/app/
    ports:
      - 8000:8000
    env_file:
      - ./.env.dev

Review the Compose file reference for info on how this file works.

Update the SECRET_KEY, DEBUG, and ALLOWED_HOSTS variables in settings.py:

SECRET_KEY = os.environ.get("SECRET_KEY")

DEBUG = int(os.environ.get("DEBUG", default=0))

# 'DJANGO_ALLOWED_HOSTS' should be a single string of hosts with a space between each.
# For example: 'DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]'
ALLOWED_HOSTS = os.environ.get("DJANGO_ALLOWED_HOSTS").split(" ")

Make sure to add the import to the top:

import os

Then, create a .env.dev file in the project root to store environment variables for development:

DEBUG=1
SECRET_KEY=foo
DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]

Build the image:

$ docker-compose build

Once the image is built, run the container:

$ docker-compose up -d

Navigate to http://localhost:8000/ to again view the welcome screen.

Check for errors in the logs if this doesn't work via docker-compose logs -f.

Postgres

To configure Postgres, we'll need to add a new service to the docker-compose.yml file, update the Django settings, and install Psycopg2.

First, add a new service called db to docker-compose.yml:

version: '3.8'

services:
  web:
    build: ./app
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - ./app/:/usr/src/app/
    ports:
      - 8000:8000
    env_file:
      - ./.env.dev
    depends_on:
      - db
  db:
    image: postgres:13.0-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    environment:
      - POSTGRES_USER=hello_django
      - POSTGRES_PASSWORD=hello_django
      - POSTGRES_DB=hello_django_dev

volumes:
  postgres_data:

To persist the data beyond the life of the container we configured a volume. This config will bind postgres_data to the "/var/lib/postgresql/data/" directory in the container.

We also added an environment key to define a name for the default database and set a username and password.

Review the "Environment Variables" section of the Postgres Docker Hub page for more info.

We'll need some new environment variables for the web service as well, so update .env.dev like so:

DEBUG=1
SECRET_KEY=foo
DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]
SQL_ENGINE=django.db.backends.postgresql
SQL_DATABASE=hello_django_dev
SQL_USER=hello_django
SQL_PASSWORD=hello_django
SQL_HOST=db
SQL_PORT=5432

Update the DATABASES dict in settings.py:

DATABASES = {
    "default": {
        "ENGINE": os.environ.get("SQL_ENGINE", "django.db.backends.sqlite3"),
        "NAME": os.environ.get("SQL_DATABASE", BASE_DIR / "db.sqlite3"),
        "USER": os.environ.get("SQL_USER", "user"),
        "PASSWORD": os.environ.get("SQL_PASSWORD", "password"),
        "HOST": os.environ.get("SQL_HOST", "localhost"),
        "PORT": os.environ.get("SQL_PORT", "5432"),
    }
}

Here, the database is configured based on the environment variables that we just defined. Take note of the default values.

Update the Dockerfile to install the appropriate packages required for Psycopg2:

# pull official base image
FROM python:3.9.6-alpine

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install psycopg2 dependencies
RUN apk update \
    && apk add postgresql-dev gcc python3-dev musl-dev

# install dependencies
RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

# copy project
COPY . .

Add Psycopg2 to requirements.txt:

Django==3.2.6
psycopg2-binary==2.9.1

Review this GitHub Issue for more info on installing Psycopg2 in an Alpine-based Docker Image.

Build the new image and spin up the two containers:

$ docker-compose up -d --build

Run the migrations:

$ docker-compose exec web python manage.py migrate --noinput

Get the following error?

django.db.utils.OperationalError: FATAL:  database "hello_django_dev" does not exist

Run docker-compose down -v to remove the volumes along with the containers. Then, re-build the images, run the containers, and apply the migrations.

Ensure the default Django tables were created:

$ docker-compose exec db psql --username=hello_django --dbname=hello_django_dev

psql (13.0)
Type "help" for help.

hello_django_dev=# \l
                                          List of databases
       Name       |    Owner     | Encoding |  Collate   |   Ctype    |       Access privileges
------------------+--------------+----------+------------+------------+-------------------------------
 hello_django_dev | hello_django | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres         | hello_django | UTF8     | en_US.utf8 | en_US.utf8 |
 template0        | hello_django | UTF8     | en_US.utf8 | en_US.utf8 | =c/hello_django              +
                  |              |          |            |            | hello_django=CTc/hello_django
 template1        | hello_django | UTF8     | en_US.utf8 | en_US.utf8 | =c/hello_django              +
                  |              |          |            |            | hello_django=CTc/hello_django
(4 rows)

hello_django_dev=# \c hello_django_dev
You are now connected to database "hello_django_dev" as user "hello_django".

hello_django_dev=# \dt
                     List of relations
 Schema |            Name            | Type  |    Owner
--------+----------------------------+-------+--------------
 public | auth_group                 | table | hello_django
 public | auth_group_permissions     | table | hello_django
 public | auth_permission            | table | hello_django
 public | auth_user                  | table | hello_django
 public | auth_user_groups           | table | hello_django
 public | auth_user_user_permissions | table | hello_django
 public | django_admin_log           | table | hello_django
 public | django_content_type        | table | hello_django
 public | django_migrations          | table | hello_django
 public | django_session             | table | hello_django
(10 rows)

hello_django_dev=# \q

You can check that the volume was created as well by running:

$ docker volume inspect django-on-docker_postgres_data

You should see something similar to:

[
    {
        "CreatedAt": "2021-08-23T15:49:08Z",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "django-on-docker",
            "com.docker.compose.version": "1.29.2",
            "com.docker.compose.volume": "postgres_data"
        },
        "Mountpoint": "/var/lib/docker/volumes/django-on-docker_postgres_data/_data",
        "Name": "django-on-docker_postgres_data",
        "Options": null,
        "Scope": "local"
    }
]

Next, add an entrypoint.sh file to the "app" directory to verify that Postgres is healthy before applying the migrations and running the Django development server:

#!/bin/sh

if [ "$DATABASE" = "postgres" ]
then
    echo "Waiting for postgres..."

    while ! nc -z $SQL_HOST $SQL_PORT; do
      sleep 0.1
    done

    echo "PostgreSQL started"
fi

python manage.py flush --no-input
python manage.py migrate

exec "$@"

Update the file permissions locally:

$ chmod +x app/entrypoint.sh

Then, update the Dockerfile to copy over the entrypoint.sh file and run it as the Docker entrypoint command:

# pull official base image
FROM python:3.9.6-alpine

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install psycopg2 dependencies
RUN apk update \
    && apk add postgresql-dev gcc python3-dev musl-dev

# install dependencies
RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

# copy entrypoint.sh
COPY ./entrypoint.sh .
RUN sed -i 's/\r$//g' /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh

# copy project
COPY . .

# run entrypoint.sh
ENTRYPOINT ["/usr/src/app/entrypoint.sh"]

Add the DATABASE environment variable to .env.dev:

DEBUG=1
SECRET_KEY=foo
DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]
SQL_ENGINE=django.db.backends.postgresql
SQL_DATABASE=hello_django_dev
SQL_USER=hello_django
SQL_PASSWORD=hello_django
SQL_HOST=db
SQL_PORT=5432
DATABASE=postgres

Test it out again:

  1. Re-build the images
  2. Run the containers
  3. Try http://localhost:8000/

Notes

First, despite adding Postgres, we can still create an independent Docker image for Django as long as the DATABASE environment variable is not set to postgres. To test, build a new image and then run a new container:

$ docker build -f ./app/Dockerfile -t hello_django:latest ./app
$ docker run -d \
    -p 8006:8000 \
    -e "SECRET_KEY=please_change_me" -e "DEBUG=1" -e "DJANGO_ALLOWED_HOSTS=*" \
    hello_django python /usr/src/app/manage.py runserver 0.0.0.0:8000

You should be able to view the welcome page at http://localhost:8006.

Second, you may want to comment out the database flush and migrate commands in the entrypoint.sh script so they don't run on every container start or re-start:

#!/bin/sh

if [ "$DATABASE" = "postgres" ]
then
    echo "Waiting for postgres..."

    while ! nc -z $SQL_HOST $SQL_PORT; do
      sleep 0.1
    done

    echo "PostgreSQL started"
fi

# python manage.py flush --no-input
# python manage.py migrate

exec "$@"

Instead, you can run them manually, after the containers spin up, like so:

$ docker-compose exec web python manage.py flush --no-input
$ docker-compose exec web python manage.py migrate

Gunicorn

Moving along, for production environments, let's add Gunicorn, a production-grade WSGI server, to the requirements file:

Django==3.2.6
gunicorn==20.1.0
psycopg2-binary==2.9.1

Curious about WSGI and Gunicorn? Review the WSGI chapter from the Building Your Own Python Web Framework course.

Since we still want to use Django's built-in server in development, create a new compose file called docker-compose.prod.yml for production:

version: '3.8'

services:
  web:
    build: ./app
    command: gunicorn hello_django.wsgi:application --bind 0.0.0.0:8000
    ports:
      - 8000:8000
    env_file:
      - ./.env.prod
    depends_on:
      - db
  db:
    image: postgres:13.0-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    env_file:
      - ./.env.prod.db

volumes:
  postgres_data:

If you have multiple environments, you may want to look at using a docker-compose.override.yml configuration file. With this approach, you'd add your base config to a docker-compose.yml file and then use a docker-compose.override.yml file to override those config settings based on the environment.

Take note of the default command. We're running Gunicorn rather than the Django development server. We also removed the volume from the web service since we don't need it in production. Finally, we're using separate environment variable files to define environment variables for both services that will be passed to the container at runtime.

.env.prod:

DEBUG=0
SECRET_KEY=change_me
DJANGO_ALLOWED_HOSTS=localhost 127.0.0.1 [::1]
SQL_ENGINE=django.db.backends.postgresql
SQL_DATABASE=hello_django_prod
SQL_USER=hello_django
SQL_PASSWORD=hello_django
SQL_HOST=db
SQL_PORT=5432
DATABASE=postgres

.env.prod.db:

POSTGRES_USER=hello_django
POSTGRES_PASSWORD=hello_django
POSTGRES_DB=hello_django_prod

Add the two files to the project root. You'll probably want to keep them out of version control, so add them to a .gitignore file.

Bring down the development containers (and the associated volumes with the -v flag):

$ docker-compose down -v

Then, build the production images and spin up the containers:

$ docker-compose -f docker-compose.prod.yml up -d --build

Verify that the hello_django_prod database was created along with the default Django tables. Test out the admin page at http://localhost:8000/admin. The static files are not being loaded anymore. This is expected since Debug mode is off. We'll fix this shortly.

Again, if the container fails to start, check for errors in the logs via docker-compose -f docker-compose.prod.yml logs -f.

Production Dockerfile

Did you notice that we're still running the database flush (which clears out the database) and migrate commands every time the container is run? This is fine in development, but let's create a new entrypoint file for production.

entrypoint.prod.sh:

#!/bin/sh

if [ "$DATABASE" = "postgres" ]
then
    echo "Waiting for postgres..."

    while ! nc -z $SQL_HOST $SQL_PORT; do
      sleep 0.1
    done

    echo "PostgreSQL started"
fi

exec "$@"

Update the file permissions locally:

$ chmod +x app/entrypoint.prod.sh

To use this file, create a new Dockerfile called Dockerfile.prod for use with production builds:

###########
# BUILDER #
###########

# pull official base image
FROM python:3.9.6-alpine as builder

# set work directory
WORKDIR /usr/src/app

# set environment variables
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1

# install psycopg2 dependencies
RUN apk update \
    && apk add postgresql-dev gcc python3-dev musl-dev

# lint
RUN pip install --upgrade pip
RUN pip install flake8==3.9.2
COPY . .
RUN flake8 --ignore=E501,F401 .

# install dependencies
COPY ./requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /usr/src/app/wheels -r requirements.txt


#########
# FINAL #
#########

# pull official base image
FROM python:3.9.6-alpine

# create directory for the app user
RUN mkdir -p /home/app

# create the app user
RUN addgroup -S app && adduser -S app -G app

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
WORKDIR $APP_HOME

# install dependencies
RUN apk update && apk add libpq
COPY --from=builder /usr/src/app/wheels /wheels
COPY --from=builder /usr/src/app/requirements.txt .
RUN pip install --no-cache /wheels/*

# copy entrypoint.prod.sh
COPY ./entrypoint.prod.sh .
RUN sed -i 's/\r$//g'  $APP_HOME/entrypoint.prod.sh
RUN chmod +x  $APP_HOME/entrypoint.prod.sh

# copy project
COPY . $APP_HOME

# chown all the files to the app user
RUN chown -R app:app $APP_HOME

# change to the app user
USER app

# run entrypoint.prod.sh
ENTRYPOINT ["/home/app/web/entrypoint.prod.sh"]

Here, we used a Docker multi-stage build to reduce the final image size. Essentially, builder is a temporary image that's used for building the Python wheels. The wheels are then copied over to the final production image and the builder image is discarded.

You could take the multi-stage build approach a step further and use a single Dockerfile instead of creating two Dockerfiles. Think of the pros and cons of using this approach over two different files.

Did you notice that we created a non-root user? By default, Docker runs container processes as root inside of a container. This is a bad practice since attackers can gain root access to the Docker host if they manage to break out of the container. If you're root in the container, you'll be root on the host.

Update the web service within the docker-compose.prod.yml file to build with Dockerfile.prod:

web:
  build:
    context: ./app
    dockerfile: Dockerfile.prod
  command: gunicorn hello_django.wsgi:application --bind 0.0.0.0:8000
  ports:
    - 8000:8000
  env_file:
    - ./.env.prod
  depends_on:
    - db

Try it out:

$ docker-compose -f docker-compose.prod.yml down -v
$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput

Nginx

Next, let's add Nginx into the mix to act as a reverse proxy for Gunicorn to handle client requests as well as serve up static files.

Add the service to docker-compose.prod.yml:

nginx:
  build: ./nginx
  ports:
    - 1337:80
  depends_on:
    - web

Then, in the local project root, create the following files and folders:

└── nginx
    ├── Dockerfile
    └── nginx.conf

Dockerfile:

FROM nginx:1.21-alpine

RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d

nginx.conf:

upstream hello_django {
    server web:8000;
}

server {

    listen 80;

    location / {
        proxy_pass http://hello_django;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
    }

}

Review Using NGINX and NGINX Plus as an Application Gateway with uWSGI and Django for more info on configuring Nginx to work with Django.

Then, update the web service, in docker-compose.prod.yml, replacing ports with expose:

web:
  build:
    context: ./app
    dockerfile: Dockerfile.prod
  command: gunicorn hello_django.wsgi:application --bind 0.0.0.0:8000
  expose:
    - 8000
  env_file:
    - ./.env.prod
  depends_on:
    - db

Now, port 8000 is only exposed internally, to other Docker services. The port will no longer be published to the host machine.

For more on ports vs expose, review this Stack Overflow question.

Test it out again.

$ docker-compose -f docker-compose.prod.yml down -v
$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput

Ensure the app is up and running at http://localhost:1337.

Your project structure should now look like:

├── .env.dev
├── .env.prod
├── .env.prod.db
├── .gitignore
├── app
│   ├── Dockerfile
│   ├── Dockerfile.prod
│   ├── entrypoint.prod.sh
│   ├── entrypoint.sh
│   ├── hello_django
│   │   ├── __init__.py
│   │   ├── asgi.py
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   ├── manage.py
│   └── requirements.txt
├── docker-compose.prod.yml
├── docker-compose.yml
└── nginx
    ├── Dockerfile
    └── nginx.conf

Bring the containers down once done:

$ docker-compose -f docker-compose.prod.yml down -v

Since Gunicorn is an application server, it will not serve up static files. So, how should both static and media files be handled in this particular configuration?

Static Files

Update settings.py:

STATIC_URL = "/static/"
STATIC_ROOT = BASE_DIR / "staticfiles"

Development

Now, any request to http://localhost:8000/static/* will be served from the "staticfiles" directory.

To test, first re-build the images and spin up the new containers per usual. Ensure static files are still being served correctly at http://localhost:8000/admin.

Production

For production, add a volume to the web and nginx services in docker-compose.prod.yml so that each container will share a directory named "staticfiles":

version: '3.8'

services:
  web:
    build:
      context: ./app
      dockerfile: Dockerfile.prod
    command: gunicorn hello_django.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - static_volume:/home/app/web/staticfiles
    expose:
      - 8000
    env_file:
      - ./.env.prod
    depends_on:
      - db
  db:
    image: postgres:13.0-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    env_file:
      - ./.env.prod.db
  nginx:
    build: ./nginx
    volumes:
      - static_volume:/home/app/web/staticfiles
    ports:
      - 1337:80
    depends_on:
      - web

volumes:
  postgres_data:
  static_volume:

We need to also create the "/home/app/web/staticfiles" folder in Dockerfile.prod:

...

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
RUN mkdir $APP_HOME/staticfiles
WORKDIR $APP_HOME

...

Why is this necessary?

Docker Compose normally mounts named volumes as root. Since we're using a non-root user, we'll get a permission denied error when the collectstatic command is run if the directory does not already exist.

To get around this, you can either:

  1. Create the folder in the Dockerfile (source)
  2. Change the permissions of the directory after it's mounted (source)

We used the former.

Next, update the Nginx configuration to route static file requests to the "staticfiles" folder:

upstream hello_django {
    server web:8000;
}

server {

    listen 80;

    location / {
        proxy_pass http://hello_django;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
    }

    location /static/ {
        alias /home/app/web/staticfiles/;
    }

}

Spin down the development containers:

$ docker-compose down -v

Test:

$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput
$ docker-compose -f docker-compose.prod.yml exec web python manage.py collectstatic --no-input --clear

Again, requests to http://localhost:1337/static/* will be served from the "staticfiles" directory.

Navigate to http://localhost:1337/admin and ensure the static assets load correctly.

You can also verify in the logs -- via docker-compose -f docker-compose.prod.yml logs -f -- that requests to the static files are served up successfully via Nginx:

nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /admin/ HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /admin/login/?next=/admin/ HTTP/1.1" 200 2214 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/css/base.css HTTP/1.1" 304 0 "http://localhost:1337/admin/login/?next=/admin/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/css/nav_sidebar.css HTTP/1.1" 304 0 "http://localhost:1337/admin/login/?next=/admin/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/css/responsive.css HTTP/1.1" 304 0 "http://localhost:1337/admin/login/?next=/admin/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/css/login.css HTTP/1.1" 304 0 "http://localhost:1337/admin/login/?next=/admin/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/js/nav_sidebar.js HTTP/1.1" 304 0 "http://localhost:1337/admin/login/?next=/admin/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/css/fonts.css HTTP/1.1" 304 0 "http://localhost:1337/static/admin/css/base.css" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/fonts/Roboto-Regular-webfont.woff HTTP/1.1" 304 0 "http://localhost:1337/static/admin/css/fonts.css" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"
nginx_1  | 192.168.144.1 - - [23/Aug/2021:20:11:00 +0000] "GET /static/admin/fonts/Roboto-Light-webfont.woff HTTP/1.1" 304 0 "http://localhost:1337/static/admin/css/fonts.css" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36" "-"

Bring the containers down once done:

$ docker-compose -f docker-compose.prod.yml down -v

Media Files

To test out the handling of media files, start by creating a new Django app:

$ docker-compose up -d --build
$ docker-compose exec web python manage.py startapp upload

Add the new app to the INSTALLED_APPS list in settings.py:

INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",

    "upload",
]

app/upload/views.py:

from django.shortcuts import render
from django.core.files.storage import FileSystemStorage


def image_upload(request):
    if request.method == "POST" and request.FILES["image_file"]:
        image_file = request.FILES["image_file"]
        fs = FileSystemStorage()
        filename = fs.save(image_file.name, image_file)
        image_url = fs.url(filename)
        print(image_url)
        return render(request, "upload.html", {
            "image_url": image_url
        })
    return render(request, "upload.html")

Add a "templates", directory to the "app/upload" directory, and then add a new template called upload.html:

{% block content %}

  <form action="{% url "upload" %}" method="post" enctype="multipart/form-data">
    {% csrf_token %}
    <input type="file" name="image_file">
    <input type="submit" value="submit" />
  </form>

  {% if image_url %}
    <p>File uploaded at: <a href="{{ image_url }}">{{ image_url }}</a></p>
  {% endif %}

{% endblock %}

app/hello_django/urls.py:

from django.contrib import admin
from django.urls import path
from django.conf import settings
from django.conf.urls.static import static

from upload.views import image_upload

urlpatterns = [
    path("", image_upload, name="upload"),
    path("admin/", admin.site.urls),
]

if bool(settings.DEBUG):
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)

app/hello_django/settings.py:

MEDIA_URL = "/media/"
MEDIA_ROOT = BASE_DIR / "mediafiles"

Development

Test:

$ docker-compose up -d --build

You should be able to upload an image at http://localhost:8000/, and then view the image at http://localhost:8000/media/IMAGE_FILE_NAME.

Production

For production, add another volume to the web and nginx services:

version: '3.8'

services:
  web:
    build:
      context: ./app
      dockerfile: Dockerfile.prod
    command: gunicorn hello_django.wsgi:application --bind 0.0.0.0:8000
    volumes:
      - static_volume:/home/app/web/staticfiles
      - media_volume:/home/app/web/mediafiles
    expose:
      - 8000
    env_file:
      - ./.env.prod
    depends_on:
      - db
  db:
    image: postgres:13.0-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
    env_file:
      - ./.env.prod.db
  nginx:
    build: ./nginx
    volumes:
      - static_volume:/home/app/web/staticfiles
      - media_volume:/home/app/web/mediafiles
    ports:
      - 1337:80
    depends_on:
      - web

volumes:
  postgres_data:
  static_volume:
  media_volume:

Create the "/home/app/web/mediafiles" folder in Dockerfile.prod:

...

# create the appropriate directories
ENV HOME=/home/app
ENV APP_HOME=/home/app/web
RUN mkdir $APP_HOME
RUN mkdir $APP_HOME/staticfiles
RUN mkdir $APP_HOME/mediafiles
WORKDIR $APP_HOME

...

Update the Nginx config again:

upstream hello_django {
    server web:8000;
}

server {

    listen 80;

    location / {
        proxy_pass http://hello_django;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
        proxy_redirect off;
    }

    location /static/ {
        alias /home/app/web/staticfiles/;
    }

    location /media/ {
        alias /home/app/web/mediafiles/;
    }

}

Re-build:

$ docker-compose down -v

$ docker-compose -f docker-compose.prod.yml up -d --build
$ docker-compose -f docker-compose.prod.yml exec web python manage.py migrate --noinput
$ docker-compose -f docker-compose.prod.yml exec web python manage.py collectstatic --no-input --clear

Test it out one final time:

  1. Upload an image at http://localhost:1337/.
  2. Then, view the image at http://localhost:1337/media/IMAGE_FILE_NAME.

If you see a 413 Request Entity Too Large error, you'll need to increase the maximum allowed size of the client request body in either the server or location context within the Nginx config.

Example:

location / {
    proxy_pass http://hello_django;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
    proxy_redirect off;
    client_max_body_size 100M;
}

Conclusion

In this tutorial, we walked through how to containerize a Django web application with Postgres for development. We also created a production-ready Docker Compose file that adds Gunicorn and Nginx into the mix to handle static and media files. You can now test out a production setup locally.

In terms of actual deployment to a production environment, you'll probably want to use a:

  1. Fully managed database service -- like RDS or Cloud SQL -- rather than managing your own Postgres instance within a container.
  2. Non-root user for the db and nginx services

For other production tips, review this discussion.

You can find the code in the django-on-docker repo.

There's also an older, Pipenv version of the code available here.

Thanks for reading!

Django on Docker Series:

  1. Dockerizing Django with Postgres, Gunicorn, and Nginx (this article!)
  2. Securing a Containerized Django Application with Let's Encrypt
  3. Deploying Django to AWS with Docker and Let's Encrypt

Original article source at: https://testdriven.io/

#django #docker #postgres 

How to Configure Django To Run on Docker with Postgres
Oral  Brekke

Oral Brekke

1668842220

How to Basic and Full-text Search with Django and Postgres

Unlike relational databases, full-text search is not standardized. There are several open-source options like ElasticSearch, Solr, and Xapian. ElasticSearch is probably the most popular solution; however, it's complicated to set up and maintain. Further, if you're not taking advantage of some of the advanced features that ElasticSearch offers, you should stick with the full-text search capabilities that many relational (like Postgres, MySQL, SQLite) and non-relational databases (like MongoDB and CouchDB) offer. Postgres in particular is well-suited for full-text search. Django supports it out-of-the-box as well.

For the vast majority of your Django apps, you should, at the very least, start out with leveraging full-text search from Postgres before looking to a more powerful solution like ElasticSearch or Solr.

In this tutorial, you'll learn how to add basic and full-text search to a Django app with Postgres. You'll also optimize the full-text search by adding a search vector field and a database index.

This is an intermediate-level tutorial. It assumes that you're familiar with both Django and Docker. Review the Dockerizing Django with Postgres, Gunicorn, and Nginx tutorial for more info.

Objectives

By the end of this tutorial, you will be able to:

  1. Set up basic search functionality in a Django app with the Q object module
  2. Add full-text search to a Django app
  3. Sort full-text search results by relevance using stemming, ranking and weighting techniques
  4. Add a preview to your search results
  5. Optimize full-text search with a search vector field and a database index

Project Setup and Overview

Clone down the base branch from the django-search repo:

$ git clone https://github.com/testdrivenio/django-search --branch base --single-branch
$ cd django-search

You'll use Docker to simplify setting up and running Postgres along with Django.

From the project root, create the images and spin up the Docker containers:

$ docker-compose up -d --build

Next, apply the migrations and create a superuser:

$ docker-compose exec web python manage.py makemigrations
$ docker-compose exec web python manage.py migrate
$ docker-compose exec web python manage.py createsuperuser

Once done, navigate to http://127.0.0.1:8011/quotes/ to ensure the app works as expected. You should see the following:

Quote Home Page

Want to learn how to work with Django and Postgres? Check out the Dockerizing Django with Postgres, Gunicorn, and Nginx article.

Take note of the Quote model in quotes/models.py:

from django.db import models

class Quote(models.Model):
    name = models.CharField(max_length=250)
    quote = models.TextField(max_length=1000)

    def __str__(self):
        return self.quote

Next, run the following management command to add 10,000 quotes to the database:

$ docker-compose exec web python manage.py add_quotes

This will take a couple of minutes. Once done, navigate to http://127.0.0.1:8011/quotes/ to see the data.

The output of the view is cached for five minutes, so you may want to comment out the @method_decorator in quotes/views.py to load the quotes. Make sure to restore the decorator once done.
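
For reference, here's roughly what that caching setup looks like in quotes/views.py (a sketch; the exact view name and template in the repo may differ):

from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page
from django.views.generic import ListView

from .models import Quote


@method_decorator(cache_page(60 * 5), name="dispatch")  # cache the rendered view for 5 minutes
class QuoteList(ListView):  # hypothetical name for the quote list view
    model = Quote
    context_object_name = "quotes"
    template_name = "quote.html"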

Quote Home Page

In the quotes/templates/quote.html file, you have a basic form with a search input field:

<form action="{% url 'search_results' %}" method="get">
  <input
    type="search"
    name="q"
    placeholder="Search by name or quote..."
    class="form-control"
  />
</form>

On submit, the form sends the data to the backend. A GET request is used rather than a POST so that we have access to the query string both in the URL and in the Django view, allowing users to share search results as links.

Before proceeding further, take a quick look at the project structure and the rest of the code.

Basic Search

When it comes to search, with Django, you'll typically start by performing search queries with contains or icontains for exact matches. The Q object can be used as well to add AND (&) or OR (|) logical operators.

For instance, using the OR operator, override the SearchResultsList's default QuerySet in quotes/views.py like so:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        return Quote.objects.filter(
            Q(name__icontains=query) | Q(quote__icontains=query)
        )

Here, we used the filter method to filter against the name or quote fields, using the icontains lookup to check whether the query is present in either field (case insensitive). A positive result will be returned if a match is found.

Don't forget the import:

from django.db.models import Q

Try it out:

Search Page

For small data sets, this is a great way to add basic search functionality to your app. If you're dealing with a large data set or want search functionality that feels like an Internet search engine, you'll want to move to full-text search.

Full-text Search

The basic search that we saw earlier has several limitations, especially when you want to perform complex lookups.

As mentioned, with basic search, you can only perform exact matches.

Another limitation is that of stop words. Stop words are words such as "a", "an", and "the". These words are common and carry little meaning on their own, so they should be ignored. To test, try searching for a word with "the" in front of it. Say you searched for "the middle". In this case, you'll only see results for "the middle", so you won't see any results that have the word "middle" without "the" before it.

Say you have these two sentences:

  1. I am in the middle.
  2. You don't like middle school.

You'll get the following returned with each type of search:

Query        | Basic Search | Full-text Search
"the middle" | 1            | 1 and 2
"middle"     | 1 and 2      | 1 and 2

Another issue is that of ignoring similar words. With basic search, only exact matches are returned. However, with full-text search, similar words are accounted for. To test, try to find some similar words like "pony" and "ponies". With basic search, if you search for "pony" you won't see results that contain "ponies" -- and vice versa.

Say you have these two sentences:

  1. I am a pony.
  2. You don't like ponies.

You'll get the following returned with each type of search:

Query    | Basic Search | Full-text Search
"pony"   | 1            | 1 and 2
"ponies" | 2            | 1 and 2

With full-text search, both of these issues are mitigated. However, keep in mind that depending on your goal, full-text search may actually decrease precision (quality) and recall (quantity of relevant results). Typically, full-text search is less precise than basic search, since basic search yields exact matches. That said, if you're searching through large data sets with large blocks of text, full-text search is preferred since it's usually much faster.

Full-text search is an advanced searching technique that examines all the words in every stored document as it tries to match the search criteria. In addition, with full-text search, you can employ language-specific stemming on the words being indexed. For instance, the words "drives", "drove", and "driven" will be recorded under the single concept word "drive". Stemming is the process of reducing words to their word stem, base, or root form.
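
If you'd like to see stemming first-hand, you can call Postgres' to_tsvector directly through Django's database connection. A minimal sketch, assuming the dockerized app from this tutorial is running (docker-compose exec web python manage.py shell):

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("SELECT to_tsvector('english', 'I am a pony. You do not like ponies.')")
    print(cursor.fetchone()[0])
    # both "pony" and "ponies" reduce to the same lexeme, 'poni',
    # while stop words like "I", "am", and "a" are dropped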

Suffice it to say that full-text search is not perfect. It's likely to retrieve many documents that are not relevant (false positives) to the intended search query. However, there are some techniques based on Bayesian algorithms that can help reduce such problems.

To take advantage of Postgres full-text search with Django, add django.contrib.postgres to your INSTALLED_APPS list:

INSTALLED_APPS = [
    ...

    "django.contrib.postgres",  # new
]

Next, let's look at two quick examples of full-text search, on a single field and on multiple fields.

Single Field Search

Update the get_queryset function in the SearchResultsList view like so:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        return Quote.objects.filter(quote__search=query)

Here, we set up full-text search against a single field -- the quote field.

Search Page

As you can see, it takes similar words into account. In the above example, "ponies" and "pony" are treated as similar words.

Multi Field Search

To search against multiple fields and on related models, you can use the SearchVector class.

Again, update SearchResultsList:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        return Quote.objects.annotate(search=SearchVector("name", "quote")).filter(
            search=query
        )

To search against multiple fields, you annotated the queryset using a SearchVector. The vector is the data that you're searching for, which has been converted into a form that is easy to search. In the example above, this data is the name and quote fields in your database.

Make sure to add the import:

from django.contrib.postgres.search import SearchVector

Try some searches out.

Stemming and Ranking

In this section, you'll combine several methods such as SearchVector, SearchQuery, and SearchRank to produce a very robust search that uses both stemming and ranking.

Again, stemming is the process of reducing words to their word stem, base, or root form. With stemming, words like "child" and "children" will be treated as similar words. Ranking, on the other hand, allows us to order results by relevancy.

Update SearchResultsList:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        search_vector = SearchVector("name", "quote")
        search_query = SearchQuery(query)
        return (
            Quote.objects.annotate(
                search=search_vector, rank=SearchRank(search_vector, search_query)
            )
            .filter(search=search_query)
            .order_by("-rank")
        )

What's happening here?

  1. SearchVector - again, you used a search vector to search against multiple fields. The data is converted into another form since you're no longer just searching the raw text like you did when icontains was used. With this, you can easily match plurals: searching for "flask" and "flasks" will yield the same results because they are, well, basically the same thing.
  2. SearchQuery - translates the words provided as a query from the form, passes them through a stemming algorithm, and then looks for matches for all of the resulting terms.
  3. SearchRank - allows us to order the results by relevancy. It takes into account how often the query terms appear in the document, how close together the terms are in the document, and how important the part of the document is where they occur.

Add the imports:

from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
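
To get a feel for the rank values themselves, you can inspect them in the Django shell. A minimal sketch, assuming the quotes data loaded earlier:

from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
from quotes.models import Quote

vector = SearchVector("name", "quote")
query = SearchQuery("ponies")
results = (
    Quote.objects.annotate(rank=SearchRank(vector, query))
    .filter(rank__gt=0)
    .order_by("-rank")[:3]
)
for quote in results:
    print(round(quote.rank, 4), quote.quote[:60])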

Search Page

Compare the results from the basic search to those of the full-text search. There's a clear difference. In the full-text search, the results with the highest rank are shown first. This is the power of SearchRank. Combining SearchVector, SearchQuery, and SearchRank is a quick way to produce a much more powerful and precise search than the basic search.

Adding Weights

Full-text search gives us the ability to add more importance to some fields in our table in the database over other fields. We can achieve this by adding weights to our queries.

The weight should be one of the following letters: D, C, B, or A. By default, these weights refer to the numbers 0.1, 0.2, 0.4, and 1.0, respectively.

Update SearchResultsList:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        search_vector = SearchVector("name", weight="B") + SearchVector(
            "quote", weight="A"
        )
        search_query = SearchQuery(query)
        return (
            Quote.objects.annotate(rank=SearchRank(search_vector, search_query))
            .filter(rank__gte=0.3)
            .order_by("-rank")
        )

Here, you added weights to the SearchVector using both the name and quote fields. Weights of 0.4 and 1.0 were applied to the name and quote fields, respectively, so quote matches will prevail over name matches. Finally, you filtered the results to display only the ones with a rank greater than or equal to 0.3.
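
If the default 0.1, 0.2, 0.4, and 1.0 values don't fit your data, SearchRank also accepts a weights argument that overrides them, given in the order D, C, B, A. A short sketch:

rank = SearchRank(search_vector, search_query, weights=[0.2, 0.4, 0.6, 0.8])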

Adding a Preview to the Search Results

In this section, you'll add a short preview of each search result via the SearchHeadline class, which highlights the query terms within the result.

Update SearchResultsList again:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        search_vector = SearchVector("name", "quote")
        search_query = SearchQuery(query)
        search_headline = SearchHeadline("quote", search_query)
        return Quote.objects.annotate(
            search=search_vector,
            rank=SearchRank(search_vector, search_query)
        ).annotate(headline=search_headline).filter(search=search_query).order_by("-rank")

SearchHeadline takes in the field you want to preview, in this case the quote field, along with the query. Matched terms are wrapped in bold tags by default.
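
If you'd rather not use the default <b></b> tags, SearchHeadline also accepts start_sel and stop_sel arguments for custom markers. For example, to wrap matches in <mark> tags instead:

search_headline = SearchHeadline(
    "quote",
    search_query,
    start_sel="<mark>",
    stop_sel="</mark>",
)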

Make sure to add the import:

from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank, SearchHeadline

Before trying out some searches, update the <li></li> in quotes/templates/search.html like so:

<li>{{ quote.headline | safe }} - <b>By <i>{{ quote.name }}</i></b></li>

Now, instead of showing the quotes as you did before, only a preview of the full quote field is displayed along with the highlighted search query.

Boosting Performance

Full-text search is an intensive process. To combat slow performance, you can:

  1. Save the search vectors to the database with SearchVectorField. In other words, rather than converting the strings to search vectors on the fly, we'll create a separate database field that contains the processed search vectors and update the field any time there's an insert or update to either the quote or name fields.
  2. Create a database index, which is a data structure that enhances the speed of the data retrieval processes on a database. It, therefore, speeds up the query. Postgres gives you several indexes to work with that might be applicable for different situations. The GinIndex is arguably the most popular.

To learn more about performance with full-text search, review the Performance section from the Django docs.

Search Vector Field

Start by adding a new SearchVectorField field to the Quote model in quotes/models.py:

from django.contrib.postgres.search import SearchVectorField  # new
from django.db import models


class Quote(models.Model):
    name = models.CharField(max_length=250)
    quote = models.TextField(max_length=1000)
    search_vector = SearchVectorField(null=True)  # new

    def __str__(self):
        return self.quote

Create the migration file:

$ docker-compose exec web python manage.py makemigrations

Now, this field can only be populated when the quote or name data already exists in the database. Thus, we need to add a trigger to update the search_vector field whenever the quote or name fields are updated. To achieve this, create a custom migration file in "quotes/migrations" called 0003_search_vector_trigger.py:

from django.contrib.postgres.search import SearchVector
from django.db import migrations


def compute_search_vector(apps, schema_editor):
    Quote = apps.get_model("quotes", "Quote")
    Quote.objects.update(search_vector=SearchVector("name", "quote"))


class Migration(migrations.Migration):

    dependencies = [
        ("quotes", "0002_quote_search_vector"),
    ]

    operations = [
        migrations.RunSQL(
            sql="""
            CREATE TRIGGER search_vector_trigger
            BEFORE INSERT OR UPDATE OF name, quote, search_vector
            ON quotes_quote
            FOR EACH ROW EXECUTE PROCEDURE
            tsvector_update_trigger(
                search_vector, 'pg_catalog.english', name, quote
            );
            UPDATE quotes_quote SET search_vector = NULL;
            """,
            reverse_sql="""
            DROP TRIGGER IF EXISTS search_vector_trigger
            ON quotes_quote;
            """,
        ),
        migrations.RunPython(
            compute_search_vector, reverse_code=migrations.RunPython.noop
        ),
    ]

Depending on your project structure, you may need to update the name of the previous migration file in dependencies.

Apply the migrations:

$ docker-compose exec web python manage.py migrate

To use the new field for searches, update SearchResultsList like so:

class SearchResultsList(ListView):
    model = Quote
    context_object_name = "quotes"
    template_name = "search.html"

    def get_queryset(self):
        query = self.request.GET.get("q")
        return Quote.objects.filter(search_vector=query)
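
Filtering a SearchVectorField against a plain string effectively wraps the string in a SearchQuery for you. If you want explicit control over the search configuration (the stemming language, for example), you can wrap it yourself; a small sketch:

from django.contrib.postgres.search import SearchQuery

Quote.objects.filter(search_vector=SearchQuery(query, config="english"))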

Update the <li></li> in quotes/templates/search.html again:

<li>{{ quote.quote | safe }} - <b>By <i>{{ quote.name }}</i></b></li>

Index

Finally, let's speed up lookups against the search vector with a GIN index via GinIndex.

Update the Quote model:

from django.contrib.postgres.indexes import GinIndex  # new
from django.contrib.postgres.search import SearchVectorField
from django.db import models


class Quote(models.Model):
    name = models.CharField(max_length=250)
    quote = models.TextField(max_length=1000)
    search_vector = SearchVectorField(null=True)

    def __str__(self):
        return self.quote

    # new
    class Meta:
        indexes = [
            GinIndex(fields=["search_vector"]),
        ]

Create and apply the migrations one last time:

$ docker-compose exec web python manage.py makemigrations
$ docker-compose exec web python manage.py migrate

Test it out.

Conclusion

In this tutorial, you were guided through adding basic and full-text search to a Django application. We also took a look at how to optimize the full-text search functionality by adding a search vector field and a database index.

Grab the complete code from the django-search repo.

Original article source at: https://testdriven.io/

#django #postgres #search 

Basic and Full-text Search with Django and Postgres
Hermann  Frami

Hermann Frami

1668014280

Space-cloud: Develop, Deploy and Secure Serverless Apps on Kubernetes

Space cloud

Develop, Deploy and Secure Serverless Apps on Kubernetes.

Space Cloud is a Kubernetes based serverless platform that provides instant, realtime APIs on any database, with event triggers and unified APIs for your custom business logic.

Space Cloud helps you build modern applications without having to write any backend code in most cases.

It provides GraphQL and REST APIs which can be consumed directly by your frontend in a secure manner.

Features

View complete feature set here.

  • Powerful CRUD: Flexible queries, transactions, aggregations and cross-database joins
  • Realtime: Make live queries to your database
  • File storage: Upload/download files to scalable file stores (e.g., Amazon S3, Google Cloud Storage)
  • Extensible: Unified APIs for your custom HTTP services
  • Event-driven: Trigger webhooks or serverless functions on database or file storage events
  • Fine-grained access control: Dynamic access control that integrates with your auth system (e.g., auth0, firebase-auth)
  • Scalable: Written in Golang, it follows cloud-native practices and scales horizontally
  • Service Mesh: Get all the capabilities of a service mesh without having to learn any of that!
  • Scale down to zero: Auto scale your http workloads including scaling down to zero

Supported databases ❤️:

  • MongoDB
  • PostgreSQL and PostgreSQL compatible databases (e.g., CockroachDB, Yugabyte, etc.)
  • MySQL and MySQL compatible databases (e.g., TiDB, MariaDB, etc.)
  • SQL Server

Quick start

If you are new to Space Cloud, we strongly recommend following our step-by-step guide to get started.

Other guides

View the installation guides for Docker and Kubernetes.

Client-side tooling

Space Cloud exposes GraphQL and REST APIs. See setting up project guide to choose a client and set it up.

GraphQL APIs

GraphQL is the recommended way to use Space Cloud, and it works with any GraphQL client. However, we recommend using Apollo Client. See awesome-graphql for a list of clients.

REST APIs

You can use the REST APIs of Space Cloud if you are more comfortable with REST.

To make it easy to consume the REST APIs in web projects, we have created a Javascript SDK for you.

How it works

Space Cloud is meant to replace any backend PHP, Node.js, or Java code you may write to create your endpoints. Instead, it exposes your database over an external API that can be consumed directly from the frontend. In other words, it allows clients to fire database queries directly.

However, it's important to note that the client does not send database (SQL) queries to Space Cloud. Instead, it sends an object describing the query to be executed. This object is first validated by Space Cloud (using security rules). Once the client is authorized to make the request, a database query is dynamically generated and executed. The results are sent directly to the concerned client.

We understand that not every app can be built using only CRUD operations. Sometimes it's necessary to write business logic. For such cases, Space Cloud allows you to access your custom HTTP servers via the same consistent APIs of Space Cloud. In this scenario, Space Cloud acts merely as an API gateway between your services and the client. However, the cool part is that you can even perform joins on your microservices and database via the GraphQL API of Space Cloud.

Detailed Space Cloud architecture

Space Cloud integrates with Kubernetes and Istio natively to bring to you a highly scalable Serverless Platform. It encrypts all traffic by default and lets you describe communication policies to protect your microservices.

With that, it also provides autoscaling functionality out of the box including scaling down to zero.

Support & Troubleshooting

The documentation and community should help you troubleshoot most issues. If you have encountered a bug or need to get in touch with us, you can contact us using one of the following channels:

Contributing

Space Cloud is a young project. We'd love to have you onboard if you wish to contribute. To help you get started, here are a few areas you can help us with:

  • Writing the documentation
  • Making sample apps in React, Angular, Android, and any other frontend tech you can think of
  • Deciding the road map of the project
  • Creating issues for any bugs you find
  • And of course, with code for bug fixes and new enhancements

Download Details:

Author: Spacecloud-io
Source Code: https://github.com/spacecloud-io/space-cloud 
License: Apache-2.0 license

#serverless #mysql #graphql #kubernetes #postgres #firebase 

Space-cloud: Develop, Deploy and Secure Serverless Apps on Kubernetes
Hermann  Frami

Hermann Frami

1667736060

Nhost: The Open Source Firebase Alternative with GraphQL

Nhost

The Open Source Firebase Alternative with GraphQL


Nhost is an open source Firebase alternative with GraphQL, built with the following things in mind:

  • Open Source
  • GraphQL
  • SQL
  • Great Developer Experience

Nhost consists of open source software:

Architecture of Nhost

Visit https://docs.nhost.io for the complete documentation.

Get Started

Option 1: Nhost Hosted Platform

  1. Sign in to Nhost.
  2. Create Nhost app.
  3. Done.

Option 2: Self-hosting

Since Nhost is 100% open source, you can self-host the whole Nhost stack. Check out the example docker-compose file to self-host Nhost.

Sign In and Make a Graphql Request

Install the @nhost/nhost-js package and start building your app:

import { NhostClient } from '@nhost/nhost-js'

const nhost = new NhostClient({
  subdomain: '<your-subdomain>',
  region: '<your-region>'
})

await nhost.auth.signIn({ email: 'elon@musk.com', password: 'spaceX' })

await nhost.graphql.request(`{
  users {
    id
    displayName
    email
  }
}`)

Frontend Agnostic

Nhost is frontend agnostic, which means Nhost works with all frontend frameworks.

Resources

Nhost libraries and tools

Community ❤️

First and foremost: Star and watch this repository to stay up-to-date.

Also, follow Nhost on GitHub Discussions, our Blog, and on Twitter. You can chat with the team and other members on Discord and follow our tutorials and other video material at YouTube.

Nhost is Open Source

This repository, and most of our other open source projects, are licensed under the MIT license.


How to contribute

Here are some ways of contributing to making Nhost better:

Download Details:

Author: nhost
Source Code: https://github.com/nhost/nhost 
License: MIT license

#serverless #graphql #postgres #database #authentication

Nhost: The Open Source Firebase Alternative with GraphQL
Hermann  Frami

Hermann Frami

1667732160

Neon: Serverless Postgres

Neon

Neon is a serverless open-source alternative to AWS Aurora Postgres. It separates storage and compute and substitutes the PostgreSQL storage layer by redistributing data across a cluster of nodes.

The project used to be called "Zenith". Many of the commands and code comments still refer to "zenith", but we are in the process of renaming things.

Quick start

Join the waitlist for our free tier to receive your serverless postgres instance. Then connect to it with your preferred postgres client (psql, dbeaver, etc) or use the online SQL editor.

Alternatively, compile and run the project locally.

Architecture overview

A Neon installation consists of compute nodes and a Neon storage engine.

Compute nodes are stateless PostgreSQL nodes backed by the Neon storage engine.

The Neon storage engine consists of two major components:

  • Pageserver. Scalable storage backend for the compute nodes.
  • WAL service. The service receives WAL from the compute node and ensures that it is stored durably.

Pageserver consists of:

  • Repository - Neon storage implementation.
  • WAL receiver - service that receives WAL from WAL service and stores it in the repository.
  • Page service - service that communicates with compute nodes and responds with pages from the repository.
  • WAL redo - service that builds pages from base images and WAL records on Page service request

Running local installation

Installing dependencies on Linux

Install build dependencies and other applicable packages

  • On Ubuntu or Debian, this set of packages should be sufficient to build the code:
apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang pkg-config libpq-dev etcd cmake postgresql-client
  • On Fedora, these packages are needed:
dnf install flex bison readline-devel zlib-devel openssl-devel \
  libseccomp-devel perl clang cmake etcd postgresql postgresql-contrib

Install Rust

# recommended approach from https://www.rust-lang.org/tools/install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Installing dependencies on OSX (12.3.1)

Install XCode and dependencies

xcode-select --install
brew install protobuf etcd openssl

Install Rust

# recommended approach from https://www.rust-lang.org/tools/install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install PostgreSQL Client

# from https://stackoverflow.com/questions/44654216/correct-way-to-install-psql-without-full-postgres-on-macos
brew install libpq
brew link --force libpq

Rustc version

The project uses a rust toolchain file to define the version it's built with in CI for testing and local builds.

This file is automatically picked up by rustup that installs (if absent) and uses the toolchain version pinned in the file.

rustup users who want to build with another toolchain can use rustup override command to set a specific toolchain for the project's directory.

non-rustup users most probably won't get the same toolchain automatically from the file, so they are responsible for manually verifying that their toolchain matches the version in the file. Newer rustc versions will most probably work fine, yet older ones might not be supported due to some new features used by the project or the crates.

Building on Linux

Build neon and patched postgres. Note: the path to the neon sources can not contain a space.

git clone --recursive https://github.com/neondatabase/neon.git
cd neon

# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`nproc`"
make -j`nproc`

Building on OSX

Build neon and patched postgres. Note: the path to the neon sources can not contain a space.

git clone --recursive https://github.com/neondatabase/neon.git
cd neon

# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`sysctl -n hw.logicalcpu`"
make -j`sysctl -n hw.logicalcpu`


Dependency installation notes

To run the psql client, install the postgresql-client package or modify PATH and LD_LIBRARY_PATH to include pg_install/bin and pg_install/lib, respectively.

To run the integration tests or Python scripts (not required to use the code), install Python (3.9 or higher), and install python3 packages using ./scripts/pysync (requires poetry: https://python-poetry.org/) in the project directory.


Running neon database

1. Start pageserver and postgres on top of it (should be called from repo root):

# Create repository in .neon with proper paths to binaries and data
# Later that would be responsibility of a package install script
> ./target/debug/neon_local init
Starting pageserver at '127.0.0.1:64000' in '.neon'

Pageserver started
Successfully initialized timeline 7dd0907914ac399ff3be45fb252bfdb7
Stopping pageserver gracefully...done!

# start pageserver and safekeeper
> ./target/debug/neon_local start
Starting etcd broker using /usr/bin/etcd
Starting pageserver at '127.0.0.1:64000' in '.neon'

Pageserver started
Starting safekeeper at '127.0.0.1:5454' in '.neon/safekeepers/sk1'
Safekeeper started

# start postgres compute node
> ./target/debug/neon_local pg start main
Starting new postgres main on timeline de200bd42b49cc1814412c7e592dd6e9 ...
Extracting base backup to create postgres instance: path=.neon/pgdatadirs/tenants/9ef87a5bf0d92544f6fafeeb3239695c/main port=55432
Starting postgres node at 'host=127.0.0.1 port=55432 user=cloud_admin dbname=postgres'

# check list of running postgres instances
> ./target/debug/neon_local pg list
 NODE  ADDRESS          TIMELINE                          BRANCH NAME  LSN        STATUS
 main  127.0.0.1:55432  de200bd42b49cc1814412c7e592dd6e9  main         0/16B5BA8  running

2. Now, it is possible to connect to postgres and run some queries:

> psql -p55432 -h 127.0.0.1 -U cloud_admin postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
INSERT 0 1
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)


3. And create branches and run postgres on them:

# create branch named migration_check
> ./target/debug/neon_local timeline branch --branch-name migration_check
Created timeline 'b3b863fa45fa9e57e615f9f2d944e601' at Lsn 0/16F9A00 for tenant: 9ef87a5bf0d92544f6fafeeb3239695c. Ancestor timeline: 'main'

# check branches tree
> ./target/debug/neon_local timeline list
(L) main [de200bd42b49cc1814412c7e592dd6e9]
(L) ┗━ @0/16F9A00: migration_check [b3b863fa45fa9e57e615f9f2d944e601]

# start postgres on that branch
> ./target/debug/neon_local pg start migration_check --branch-name migration_check
Starting new postgres migration_check on timeline b3b863fa45fa9e57e615f9f2d944e601 ...
Extracting base backup to create postgres instance: path=.neon/pgdatadirs/tenants/9ef87a5bf0d92544f6fafeeb3239695c/migration_check port=55433
Starting postgres node at 'host=127.0.0.1 port=55433 user=cloud_admin dbname=postgres'

# check the new list of running postgres instances
> ./target/debug/neon_local pg list
 NODE             ADDRESS          TIMELINE                          BRANCH NAME      LSN        STATUS
 main             127.0.0.1:55432  de200bd42b49cc1814412c7e592dd6e9  main             0/16F9A38  running
 migration_check  127.0.0.1:55433  b3b863fa45fa9e57e615f9f2d944e601  migration_check  0/16F9A70  running

# this new postgres instance will have all the data from 'main' postgres,
# but all modifications would not affect data in original postgres
> psql -p55433 -h 127.0.0.1 -U cloud_admin postgres
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

postgres=# insert into t values(2,2);
INSERT 0 1

# check that the new change doesn't affect the 'main' postgres
> psql -p55432 -h 127.0.0.1 -U cloud_admin postgres
postgres=# select * from t;
 key | value
-----+-------
   1 | 1
(1 row)

If you want to run tests afterward (see below), you must stop all the pageserver, safekeeper, and postgres instances you have just started. You can terminate them all with one command:

> ./target/debug/neon_local stop

Running tests

Ensure your dependencies are installed as described here.

git clone --recursive https://github.com/neondatabase/neon.git

CARGO_BUILD_FLAGS="--features=testing" make

./scripts/pytest

Documentation

We use README files to cover design ideas and the overall architecture of each module, along with rustdoc-style documentation comments. See also /docs/ for a top-level overview of all available markdown documentation.

To view your rustdoc documentation in a browser, try running cargo doc --no-deps --open

Postgres-specific terms

Due to Neon's very close relation with PostgreSQL internals, numerous specific terms are used. The same applies to certain spelling: i.e., we use MB to denote 1024 * 1024 bytes; while MiB would be technically more correct, it's inconsistent with what PostgreSQL code and its documentation use.

To get more familiar with this aspect, refer to:

Join the development

Download Details:

Author: Neondatabase
Source Code: https://github.com/neondatabase/neon 
License: Apache-2.0 license

#serverless #rust #postgres #database 

Neon: Serverless Postgres
Daniel  Hughes

Daniel Hughes

1666525500

PostgresML: An End-to-end Machine Learning System

PostgresML 

Simple machine learning with PostgreSQL  

Train and deploy models to make online predictions using only SQL, with an open source extension for Postgres. Manage your projects and visualize datasets using the built-in dashboard.

PostgresML in practice

The dashboard makes it easy to compare different algorithms or hyperparameters across models and datasets.

PostgresML dashboard

See it in action — demo.postgresml.org

What's in the box

See the documentation for a complete list of functionality.

All your favorite algorithms

Whether you need a simple linear regression, or extreme gradient boosting, we've included support for all classification and regression algorithms in Scikit Learn and XGBoost with no extra configuration.

Managed model deployments

Models can be periodically retrained and automatically promoted to production depending on their key metric. Rollback capability is provided to ensure that you're always able to serve the highest quality predictions, along with historical logs of all deployments for long term study.

Online and offline support

Predictions are served via a standard Postgres connection to ensure that your core apps can always access both your data and your models in real time. Pure SQL workflows also enable batch predictions to cache results in native Postgres tables for lookup.

Instant visualizations

Run standard analysis on your datasets to detect outliers, bimodal distributions, feature correlation, and other common patterns. Everything is cataloged in the dashboard for easy reference.

Hyperparameter search

Use either grid or random searches with cross validation on your training set to discover the most important knobs to tweak on your favorite algorithm.

SQL native vector operations

Vector operations make working with learned embeddings a snap, for things like nearest neighbor searches or other similarity comparisons.

The performance of Postgres

Since your data never leaves the database, you retain the speed, reliability and security you expect in your foundational stateful services. Leverage your existing infrastructure and expertise to deliver new capabilities.

Open source

We're building on the shoulders of giants. These machine learning libraries and Postgres have received extensive academic and industry use, and we'll continue their tradition of building with the community. Licensed under MIT.

Quick Start

  1. Clone this repo:
$ git clone git@github.com:postgresml/postgresml.git
  2. Start dockerized services. PostgresML will run on port 5433, just in case you already have Postgres running:
$ cd postgresml && docker-compose up
  3. Connect to PostgreSQL in the Docker container with PostgresML installed:
$ psql postgres://postgres@localhost:5433/pgml_development
  4. Validate your installation:
pgml_development=# SELECT pgml.version();

 version
---------
 0.8.1
(1 row)

See the documentation for a complete guide to working with PostgresML.


Download Details:

Author: postgresml
Source Code: https://github.com/postgresml/postgresml

License: MIT license

#python #postgres 

PostgresML: An End-to-end Machine Learning System
Joseph  Norton

Joseph Norton

1664499602

Create a B2B App with Stripe, Postgres and REST API Backend

Ecommerce Website Tutorial – Create a B2B App with Stripe + Postgres + REST API Backend

Learn how to create three SaaS internal business tools with Postgres, Stripe API, and the Retool low-code platform. You will build an order management dashboard, an employee dashboard, and a developer portal.

You will use the Retool platform to build the business tools. Retool is a drag-and-drop no-code editor with many pre-built components for building internal CRUD (create, read, update, delete) apps as fast as possible.

⭐️ Contents ⭐️
⌨️ (00:03:40) The Employee App
⌨️ (01:06:20) The Manager/Admin App
⌨️ (01:43:12) The Developer App

Code: https://github.com/kubowania/mobee-psql-data

 

#stripe #postgres #api #webdev 

Create a B2B App with Stripe, Postgres and REST API Backend
Python  Library

Python Library

1662030060

A Template for Telegram Bot using Postgres, Redis, Python Asyncio

🚀 Getting Started

Running on Local Machine

  • install dependencies using Poetry
poetry install
  • configure environment variables in .env file
  • start bot in virtual environment
poetry run python -m bot

Launch in Docker

  • configure environment variables in .env file
  • start virtual environment
poetry shell
  • building the docker image
docker-compose build
  • start service
docker-compose up -d

🌍 Environment variables

  • BOT_TOKEN — Telegram bot token
  • PG_HOST — hostname or an IP address PostgreSQL database
  • PG_NAME — the name of the PostgreSQL database
  • PG_PASSWORD — password used to authenticate
  • PG_PORT — connection port number (defaults to 5432 if not provided)
  • PG_USER — the username used to authenticate
  • REDIS_HOST — hostname or an IP address Redis database
  • REDIS_PASSWORD — Redis database password, empty by default
  • REDIS_PORT — port from Redis database

I use Redis for the Finite State Machine and PostgreSQL as the database.
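
As a rough illustration of how these variables might be loaded (the template itself may use a different settings loader, so treat this as a sketch):

import os

from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # read the variables from the .env file

BOT_TOKEN = os.environ["BOT_TOKEN"]
PG_PORT = int(os.getenv("PG_PORT", "5432"))  # defaults to 5432 if not provided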

🔧 Tech Stack

  • aiogram — asynchronous framework for Telegram Bot API
  • asyncpg — asynchronous PostgreSQL database client library
  • poetry — development workflow
  • loguru — third party library for logging in Python
  • docker — to automate deployment
  • postgres — powerful, open source object-relational database system
  • redis — an in-memory data structure store

Download details:

Author: donBarbos
Source code: https://github.com/donBarbos/telegram-bot-template 
License: GPL-3.0 license

#python #bot #postgres #redis #docker

A Template for Telegram Bot using Postgres, Redis, Python Asyncio

Postgres.jl: Postgres Database interface for The Julia Language

Postgres

Postgres Database Interface for the Julia language.

Basic Usage

julia> using Postgres
julia> conn = connect(PostgresServer, db="julia_test", host="localhost")
julia> #conn = connect(PostgresServer, "postgresql://localhost/julia_test")
julia> #empty strings will cause the server to use defaults.
julia> #connect(interface, user, db, host, passwd, port)
julia> #conn = connect(PostgresServer, "", "julia_test", "localhost", "", "")
julia> curs = cursor(conn)
julia> df = query(curs, "select 1 from generate_series(1,5) as s")
5x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
| 2   | 1  |
| 3   | 1  |
| 4   | 1  |
| 5   | 1  |

Iteration

Memory management is automatic for the cursor interface.

Buffered (Normal) Cursor

julia> execute(curs, "select 1 from generate_series(1, 10)")
julia> for res in curs; println(res); end;
10x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
| 2   | 1  |
| 3   | 1  |
| 4   | 1  |
| 5   | 1  |
| 6   | 1  |
| 7   | 1  |
| 8   | 1  |
| 9   | 1  |
| 10  | 1  |
julia> for res in curs; println(res); end;
# nothing (memory already freed from server)

Streamed (Paged) Cursor

julia> streamed = cursor(conn, 3)
julia> execute(streamed, "select 1 from generate_series(1, 10)")
julia> for res in streamed; println(res); end;
3x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
| 2   | 1  |
| 3   | 1  |
3x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
| 2   | 1  |
| 3   | 1  |
3x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
| 2   | 1  |
| 3   | 1  |
1x1 DataFrames.DataFrame
| Row | x1 |
|-----|----|
| 1   | 1  |
0x1 DataFrames.DataFrame

Each iteration allocates and frees memory.

Result Interface

Cursor must be closed (or unreachable) to release server resources.

julia> using Postgres.Results
julia> result = execute(curs, "select 1, null::int, 'HI'::text, 1.2::float8  
            from generate_series(1, 5)")
5x4{Int32, Int32, UTF8String, Float64} PostgresResult
julia> result[1,1]     # array
Nullable(1)

julia> result[1, :]    # row; also row(curs, 1)
4-element Array{Any,1}:
 Nullable(1)      
 Nullable{Int32}()
 Nullable("HI")   
 Nullable(1.2) 

# columns are a lot faster to create
julia> result[:, 1]    # columns; also column(curs, 1)
5-element DataArrays.DataArray{Int32,1}:
 1
 1
 1
 1
 1
#row iteration
julia> for row in result; println(row); end
Any[Nullable(1),Nullable{Int32}(),Nullable("HI"),Nullable(1.2)]
# ...
close(curs) # free postgres resources

Transactions

julia> begin_!(curs)
INFO: BEGIN 
julia> rollback!(curs)
INFO: ROLLBACK 
julia> commit!(curs)
WARNING: WARNING:  there is no transaction in progress
INFO: COMMIT 
# transaction already ended by rollback

Base Types supported as Julia Types:

julia> for v in values(Postgres.Types.base_types)
            println(v)
       end

text -> UTF8String
varchar -> UTF8String
bpchar -> UTF8String
unknown -> UTF8String
bit -> BitArray{1}
varbit -> BitArray{1}
bytea -> Array{UInt8,1}
bool -> Bool
int2 -> Int16
int4 -> Int32
int8 -> Int64
float4 -> Float32
float8 -> Float64
numeric -> BigFloat
date -> Date
json -> UTF8String
jsonb -> UTF8String

Others supported as UTF8String.
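
As a quick illustration of these mappings (a minimal sketch reusing the conn and curs from Basic Usage; the column names i, f, and d are arbitrary), values arrive already converted to the Julia types listed above:

julia> df = query(curs, "select 1::int8 as i, 1.5::float8 as f, '2015-01-01'::date as d")
julia> eltype(df[:i])  # Int64, per the int8 mapping
julia> eltype(df[:f])  # Float64, per the float8 mapping
julia> eltype(df[:d])  # Date, per the date mapping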

Extended Types

Automatically determined on connection start up.

julia> types = collect(values(conn.pgtypes))
julia> enum_test = filter(x->x.name==:enum_test, types)[1]
enum_test ∈ Set(UTF8String["happy","sad"])
# pg def:
# Schema │   Name    │ Internal name │ Size │ Elements │
#────────┼───────────┼───────────────┼──────┼──────────┼
# public │ enum_test │ enum_test     │ 4    │ happy   ↵│
#        │           │               │      │ sad      │

julia> domain_test = filter(x->x.name==:domain_test, types)[1]
(domain_test <: int4) -> Int32
# pg def:
# Schema │    Name     │  Type   │ Modifier │               Check                │
#────────┼─────────────┼─────────┼──────────┼────────────────────────────────────┼
# public │ domain_test │ integer │          │ CHECK (VALUE >= 0 AND VALUE <= 10) │

Enum types will use PooledDataArrays!

Escaping

julia> user_input="1';select 'powned';"
julia> escape_value(conn, user_input)
"'1'';select ''powned'';'"

Error Info

julia> try query(curs, "select xxx")
       catch err  # err will be a PostgresServerError
           println(err.info)
       end
PostgresResultInfo(
            msg:ERROR:  column "xxx" does not exist
LINE 1: select xxx
               ^
            severity:ERROR
            state:syntax_error_or_access_rule_violation
            code:42703
            primary:column "xxx" does not exist
            detail:
            hint:
            pos:8
)

See Appendix A in the Postgres manual for error code/state lists.

Copy Support

# Commands use the same interface as selects.
# Messages are passed through to Julia as you are used to seeing them in psql.
julia> println(query(curs, """
    drop table if exists s; 
    drop table if exists news; 
    create table s as select 1 as ss from generate_series(1,10)"""))
NOTICE:  table "news" does not exist, skipping
INFO: SELECT 10 10
0x0 DataFrames.DataFrame

julia> df = query(curs, "select * from s")
julia> copyto(curs, df, "s")
INFO: COPY 10 10
0x0{} PostgresResult

julia> copyto(curs, df, "news", true)
INFO: table 'news' not found in database. creating ...
INFO: CREATE TABLE 
INFO: COPY 10 10
0x0{} PostgresResult

Custom Types

julia> using Postgres.Types

julia> type Point
        x::Float64
        y::Float64
       end

# find the oid (600 in this case) in the pg_type table in Postgres.
# Then instance the type.
julia> base_types[600] = PostgresType{Point}(:point, Point(0, 0))
point -> Point

# create the _in_ function from the database
julia> function Postgres.Types.unsafe_parse{T <: Point}(::PostgresType{T}, value::UTF8String)
    x, y = split(value, ",")
    x = parse(Float64, x[2:end])
    y = parse(Float64, y[1:end-1])
    Point(x, y)
end
unsafe_parse (generic function with 15 methods)

# create the _out_ function to the database
julia> Postgres.Types.PostgresValue{T <: Point}(val::T) =
    Postgres.Types.PostgresValue{T}(base_types[600], "($(val.x),$(val.y))")
Postgres.Types.PostgresValue

#reload conn so it picks up the new type
julia> close(conn)
PostgresConnection(@ 0 : not_connected)
julia> conn = connect(PostgresServer, db="julia_test", host="localhost")
PostgresConnection(@ 0x0b41b818 : ok)
julia> curs = cursor(conn)
Postgres.BufferedPostgresCursor(
    PostgresConnection(@ 0x0b41b818 : ok),
    Nullable{Postgres.Results.PostgresResult}())

julia> p1 = Point(1.1, 1.1)
Point(1.1,1.1)
julia> start = repr(PostgresValue(p1))
"'(1.1,1.1)'::point"
julia> p2 = query(curs, "select $start")[1][1]
Point(1.1,1.1)
julia> p1.x == p2.x && p1.y == p2.y
true

Control-C cancels the query at the server

julia> query(curs, "select 1 from generate_series(1, (10^9)::int)")
# oops; this will take forever
^CINFO: canceling statement due to user request
ERROR: PostgresError: No results to fetch
 in fetch at /home/xxx/.julia/v0.4/Postgres/src/postgres.jl:383
  in query at /home/xxx/.julia/v0.4/Postgres/src/postgres.jl:405

# no need to chase down zombie processes with ps or top :)

Download Details:

Author: NCarson
Source Code: https://github.com/NCarson/Postgres.jl 
License: View license

#julia #postgres 

Postgres.jl: Postgres Database interface for The Julia Language

LibPQ.jl: A Julia Wrapper for Libpq

LibPQ

LibPQ.jl is a Julia wrapper for the PostgreSQL libpq C library.   

Features

Current

  • Build
    • Installs libpq via BinaryBuilder.jl for macOS, GNU/Linux, and Windows
  • Connections
    • Connect via DSN
    • Connect via PostgreSQL connection string
    • UTF-8 client encoding
  • Queries
    • Create and execute queries with or without parameters
    • Execute queries asynchronously
    • Stream results using Tables
    • Configurably convert a variety of PostgreSQL types to corresponding Julia types (see the Type Conversions section of the docs)
  • Prepared Statements
    • Create and execute prepared statements with or without parameters
    • Stream table of parameters to execute the same statement multiple times with different data
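
To make the query and prepared-statement features above concrete, here is a minimal sketch (not taken from the package docs; the connection string is a placeholder for your own server):

using LibPQ, Tables

conn = LibPQ.Connection("host=localhost dbname=postgres")

# Parameterized query: LibPQ uses $1, $2, ... placeholders
# (escaped as \$ inside Julia string literals).
result = execute(conn, "SELECT \$1::int + \$2::int AS total", [1, 2])
data = Tables.columntable(result)  # stream results via the Tables interface

# Prepared statement, executed twice with different parameters.
stmt = prepare(conn, "SELECT \$1::text")
execute(stmt, ["hello"])
execute(stmt, ["world"])

close(conn)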

Goals

Note that these are goals and do not represent the current state of this package.

LibPQ.jl aims to wrap libpq as documented in the PostgreSQL documentation, including all non-deprecated functionality and handling all documented error conditions. Where possible, asynchronous functionality will be wrapped in idiomatic Julia control flow. All Oids returned in query results will have type conversions (to String by default) defined, as long as I can find documentation on their structure. Some effort will be made to integrate with other packages (e.g., Tables, already implemented) to facilitate conversion from query results to a malleable format.

Non-Goals

LibPQ.jl will not attempt to conform to a standard database interface, though anyone is welcome to write a PostgreSQL.jl library to wrap this package.

This package will not:

  • parse SQL
  • emit SQL
  • provide an interface for handling transactions or cursors
  • provide abstractions over common SQL patterns

Possible Goals

This package may or may not:

  • test on multiple install configurations
  • aim to support any particular versions of libpq or PostgreSQL
  • support conversion from some Oid to some type
  • provide easy access to every possible connection method
  • be as memory-efficient as possible

While I may never get to any of these, I welcome tested, documented contributions!

Download Details:

Author: invenia
Source Code: https://github.com/invenia/LibPQ.jl 
License: MIT license

#julia #postgres #database 

LibPQ.jl: A Julia Wrapper for Libpq

PGstore: A Postgres Session Store Backend for Gorilla/sessions

pgstore

A session store backend for gorilla/sessions - src.

Installation

make get-deps

Documentation

Available on godoc.org.

See http://www.gorillatoolkit.org/pkg/sessions for full documentation on the underlying interface.

Example

package examples

import (
    "log"
    "net/http"
    "time"

    "github.com/antonlindstrom/pgstore"
)

// ExampleHandler is an example that displays the usage of PGStore.
func ExampleHandler(w http.ResponseWriter, r *http.Request) {
    // Fetch new store.
    store, err := pgstore.NewPGStore("postgres://user:password@127.0.0.1:5432/database?sslmode=verify-full", []byte("secret-key"))
    if err != nil {
        log.Fatalf("%v", err)
    }
    defer store.Close()

    // Run a background goroutine to clean up expired sessions from the database.
    defer store.StopCleanup(store.Cleanup(time.Minute * 5))

    // Get a session.
    session, err := store.Get(r, "session-key")
    if err != nil {
        log.Fatalf("%v", err)
    }

    // Add a value.
    session.Values["foo"] = "bar"

    // Save.
    if err = session.Save(r, w); err != nil {
        log.Fatalf("Error saving session: %v", err)
    }

    // Delete session.
    session.Options.MaxAge = -1
    if err = session.Save(r, w); err != nil {
        log.Fatalf("Error saving session: %v", err)
    }
}

Breaking changes

  • 2016-07-19 - NewPGStore and NewPGStoreFromPool now return (*PGStore, error)

Thanks

I've stolen, borrowed, and gotten inspiration from the other session store backends available.

Thank you all for sharing your code!

What makes this backend different is that it's for PostgreSQL.

We've recently refactored this backend to use the standard database/sql driver instead of Gorp. This removes a dependency, keeps the package lightweight, and makes database interactions more transparent. It also simplifies unit testing: if you want to mock the database layer instead of requiring a real database, you can now use a package like go-sqlmock to do just that.

Download Details:

Author: Antonlindstrom
Source Code: https://github.com/antonlindstrom/pgstore 
License: MIT license

#go #golang #postgres 

PGstore: A Postgres Session Store Backend for Gorilla/sessions