If you have any comments, ideas, critiques, or you just want to say hi, don’t hesitate to send me an email at tarun.
Following the second video about Docker basics, in this video I explain the Docker architecture and the different building blocks of the Docker Engine: the Docker client, the API, and the Docker daemon. I also explain what a Docker registry is, and I finish the video with a demo illustrating how to use Docker Hub.
In this video lesson you will learn:
#docker #docker hub #docker host #docker engine #docker architecture #api
Now that we have understood the basic architecture of Docker in my previous tutorial, “Docker: Understanding Docker Architecture and Components”, let’s learn how to install Docker and run some basic commands.
3. The machine should have at least 2 GB of memory and at least a 2-core CPU.
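Assuming an Ubuntu or Debian host (an assumption; the exact steps depend on your distribution), a minimal sketch of installing Docker via Docker’s convenience script looks like this:

```bash
# Sketch: install Docker on an Ubuntu/Debian host using Docker's convenience script
# (assumes curl is available; on other distributions follow the official install docs).
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Verify the installation
docker --version
```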
The first thing we are going to do is run the **“docker run hello-world”** command.
This command looks for the “hello-world” image locally; if it is not found, it downloads the image from Docker Hub and runs a container from that image.
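A short sketch of that first run and how to confirm what happened afterwards:

```bash
# Run the hello-world container; Docker pulls the image from Docker Hub
# if it is not already present locally.
docker run hello-world

# Confirm the image was downloaded and a container was created from it.
docker images hello-world
docker ps -a --filter ancestor=hello-world
```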
#automation #containerization #docker-container #docker #docker-image
If you have recently come across the world of containers, it’s probably not a bad idea to understand the underlying elements that work together to offer containerisation benefits. But before that, there’s a question that you may ask. What problem do containers solve?
After building an application in a typical development lifecycle, the developer hands it over to the tester. However, since the development and testing environments are different, the code fails to work as expected.
Now, predominantly, there are two solutions to this – either you use a Virtual Machine or a containerised environment such as Docker. In the good old times, organisations used to deploy VMs for running multiple applications.
So, why did they start adopting containerisation over VMs? In this article, we will provide detailed answers to all such questions.
#docker containers #docker engine #docker #docker architecture
By design, Docker containers don’t hold persistent data. Any data you write inside a container’s writable layer is no longer available once the container is removed, and it can be difficult to get the data out of the container if another process needs it.
Also, a container’s writable layer is tightly coupled to the host machine where the container is running. You can’t easily move the data somewhere else.
Docker has two options for containers to store files on the host machine so that the files persist even after the container stops: volumes and bind mounts.
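As a quick sketch of both options (the container names, images, and paths below are illustrative, not tied to any particular project):

```bash
# Named volume: created and managed by Docker (under /var/lib/docker/volumes/ on Linux).
docker volume create app_data
docker run -d --name db -e POSTGRES_PASSWORD=secret \
  -v app_data:/var/lib/postgresql/data postgres

# Bind mount: any directory on the host, mapped into the container by absolute path.
docker run -d --name web -v "$(pwd)/html:/usr/share/nginx/html" nginx
```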
Volumes are stored in a part of the host filesystem that is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
Let’s understand them in detail one by one.
#docker-container #docker #docker-volume #containerization
Hello, in this post I will show you how to set up the official Apache Airflow image with PostgreSQL and LocalExecutor using Docker and docker-compose. I won’t be going through what Airflow is and how it is used; please check the official documentation for more information about that.
Before setting up and running Apache Airflow, please install Docker and Docker Compose.
In this chapter, I will show you the files and directories needed to run Airflow, and in the next chapter I will go through them file by file, line by line, explaining what is going on.
Firstly, in the root directory create three more directories: dags, logs, and scripts. Then create the following files: **.env**, **docker-compose.yml**, **entrypoint.sh**, and **dummy_dag.py**. Please make sure those files and directories follow the structure below.
```
#project structure
root/
├── dags/
│   └── dummy_dag.py
├── scripts/
│   └── entrypoint.sh
├── logs/
├── .env
└── docker-compose.yml
```
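If you are starting from an empty folder, one way to create this layout (a sketch, run from the project root) is:

```bash
# Sketch: create the layout above from an empty project root.
mkdir -p dags logs scripts
touch .env docker-compose.yml scripts/entrypoint.sh dags/dummy_dag.py
chmod +x scripts/entrypoint.sh   # the webserver service executes this file directly
```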
Created files should contain the following:
```yaml
#docker-compose.yml
version: '3.8'
services:
  postgres:
    image: postgres
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
  scheduler:
    image: apache/airflow
    command: scheduler
    restart: on-failure
    depends_on:
      - postgres
    env_file:
      - .env
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
  webserver:
    image: apache/airflow
    entrypoint: ./scripts/entrypoint.sh
    restart: on-failure
    depends_on:
      - postgres
      - scheduler
    env_file:
      - .env
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./scripts:/opt/airflow/scripts
    ports:
      - "8080:8080"
```
```bash
#entrypoint.sh
#!/usr/bin/env bash
airflow initdb
airflow webserver
```
```
#.env
AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CORE__EXECUTOR=LocalExecutor
```
```python
#dummy_dag.py
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from datetime import datetime

with DAG('example_dag', start_date=datetime(2016, 1, 1)) as dag:
    op = DummyOperator(task_id='op')
```
From the root directory, executing “docker-compose up” in the terminal should make Airflow accessible on localhost:8080. The image below shows the final result.
If you encounter permission errors, please run “chmod -R 777” on all subdirectories, e.g. “chmod -R 777 logs/”
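Putting both steps together, a minimal sketch of bringing the stack up looks like this:

```bash
# From the project root:
docker-compose up            # add -d to run the stack in the background
# Airflow UI: http://localhost:8080

# If the scheduler/webserver cannot write into the mounted directories:
chmod -R 777 logs/
```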
For the curious ones...
In layman’s terms, docker is used to manage individual containers, while docker-compose is used to manage multi-container applications. It also moves many of the options you would otherwise pass to docker run into the docker-compose.yml file for easier reuse. It works as a front-end “script” on top of the same Docker API used by docker, so you could do everything docker-compose does with docker commands and a lot of shell scripting.
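For instance, the webserver service defined later in docker-compose.yml corresponds roughly to the docker run invocation below (a hypothetical sketch; Compose also wires up a shared network so the containers can reach each other by service name, which is not reproduced here):

```bash
# Sketch: the webserver service expressed as a single docker run command.
docker run -d \
  --env-file .env \
  -p 8080:8080 \
  -v "$(pwd)/dags:/opt/airflow/dags" \
  -v "$(pwd)/logs:/opt/airflow/logs" \
  -v "$(pwd)/scripts:/opt/airflow/scripts" \
  --entrypoint ./scripts/entrypoint.sh \
  apache/airflow
```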
Before running our multi-container Docker application, docker-compose.yml must be configured. In that file we define the services that will be started by docker-compose up.
The first attribute of docker-compose.yml is version, the Compose file format version. For the most recent file format version and all configuration options, see the official Compose file reference.
The second attribute is services, and all attributes one level below services denote the containers used in our multi-container application: postgres, scheduler, and webserver. Each service has an image attribute which points to the base image used for that service.
For each service, we define the environment variables used inside its container. For postgres they are defined with the environment attribute, but for scheduler and webserver they are defined in the .env file. Because .env is an external file, we must point to it with the env_file attribute.
By opening the .env file we can see the two variables it defines: one sets the executor to be used, and the other holds the database connection string. Each connection string must be defined in the following manner:

`dialect+driver://username:password@host:port/database`

The dialect name is the identifying name of the SQLAlchemy dialect, such as sqlite, mysql, postgresql, oracle, or mssql, and the driver is the name of the DBAPI used to connect to the database, written in all lowercase letters. In our case the connection string is `postgresql+psycopg2://airflow:airflow@postgres/airflow`, where the host postgres is simply the name of the database service defined in docker-compose.yml.
Omitting the port after the host part means we will be using the default Postgres port (5432), which the postgres image exposes in its own Dockerfile.
Every service can define a command which will be run inside its Docker container. If a service needs to execute multiple commands, this can be done by defining a .sh file and pointing to it with the entrypoint attribute. In our case we have entrypoint.sh inside the scripts folder which, once executed, runs airflow initdb and airflow webserver; both are required for Airflow to run properly.
By defining the depends_on attribute, we can express dependencies between services. In our example, the webserver starts only after both the scheduler and postgres have started, and the scheduler itself starts only after postgres has started.
In case a container crashes, we can have it restarted automatically with the restart option; on-failure restarts the container only when it exits with an error. When a stack is deployed to a swarm, the equivalent restart_policy block lives under the deploy key and additionally accepts condition, delay, max_attempts, and window options.
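For reference, a hedged sketch of both forms (the delay, attempt, and window values are arbitrary examples, not taken from this project):

```yaml
# "restart" is honoured by docker-compose; the "deploy" block is honoured
# only when the stack is deployed to a swarm via "docker stack deploy".
services:
  scheduler:
    image: apache/airflow
    command: scheduler
    restart: on-failure
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
```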
Once a service is running, it is served on the container’s defined port. To access that service we need to publish the container’s port to a port on the host machine, which is done with the ports attribute. In our case we publish port 8080 of the container as TCP port 8080 on the host, so the webserver is reachable at localhost:8080. The left-hand side of the colon defines the host machine’s port and the right-hand side defines the container’s port.
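For example (the second mapping is an illustrative variation, not part of this project’s compose file):

```yaml
ports:
  - "8080:8080"            # host port 8080 -> container port 8080
  - "127.0.0.1:9090:8080"  # publish on host port 9090, bound to localhost only
```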
Lastly, the volumes attribute defines shared volumes (directories) between the host file system and the Docker container. Because Airflow’s default working directory is /opt/airflow/, we need to map our designated directories from the project root into the container’s working directory. This is done with the following mappings:
```
#general case for airflow
- ./<our-root-subdir>:/opt/airflow/<our-root-subdir>

#our case
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./scripts:/opt/airflow/scripts
...
```
This way, when the scheduler or webserver writes logs to its logs directory, we can access them from our file system inside the logs directory, and when we add a new DAG to the dags folder it is automatically picked up in the container’s DAG bag, and so on.
Originally published by Ivan Rezic at Towardsdatascience
#docker #how-to #apache-airflow #docker-compose #postgresql