Container Basics — Container Image
Images are the basis of containers. A container engine can launch different containers from different images. When a container becomes faulty, the service can be restored promptly by deleting the faulty container and launching a new one from the same image, which is possible because of the underlying container image technique[i].
- What Is a Container Image?
A container image is a package comprising a layered file system and the metadata that describes the image. It contains the application itself together with all the system libraries, runtime environments, and configurations the application needs. After being created, an image is uploaded to an image repository, from which users can obtain it and use it directly to build their own applications.
The Open Container Initiative (OCI), sponsored by the Linux Foundation, completed the first versions of its container runtime and image specifications in 2017[ii]. Docker has contributed heavily to the OCI: it developed and donated a majority of the OCI code and, as a maintainer of the project, has been instrumental in defining the OCI runtime and image specifications.
Compared with the system images used by virtual machines (VMs), container images do not contain a Linux kernel and have a distinctive format. A VM image is a single file that encapsulates an entire system, whereas a container image is not simply a file but a file system featuring layered storage.
- Characteristics of a Container Image
(1) Layered storage
Layered storage is the defining characteristic of container images. As shown in Figure 2.1, each image is built up from a series of layers, of which the uppermost is the read-write layer and all others are read-only. When a file in an image needs to be modified, the change is made only in the uppermost read-write layer and does not overwrite the contents of the file system in the lower layers. After the modification, the result must be committed to generate a new image. Only the changes made in the read-write layer are saved, so the unchanged lower layers can be shared between different container images.
[i] Docker – Containers and Container Cloud, V2.0
[ii] Open Container Initiative, https://github.com/opencontainers/image-spec/releases/tag/v1.0.0
Figure 2.1 Container image structure
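The sharing effect of layered storage can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how a container engine actually stores data: a layer is assumed to be a read-only mapping from file path to content, an image is a list of layers, and the names make_layer, commit, and stored_layers are hypothetical.

```python
from types import MappingProxyType


def make_layer(files):
    """Return a read-only layer: a mapping of file path -> content."""
    return MappingProxyType(dict(files))


def commit(image, rw_changes):
    """Create a new image that reuses every existing layer and saves only
    the changes from the read-write layer as one additional layer."""
    return image + [make_layer(rw_changes)]


def stored_layers(*images):
    """Count the distinct layer objects needed to store the given images."""
    return len({id(layer) for image in images for layer in image})


# Two images built on the same base and runtime layers (contents are made up).
base = make_layer({"/bin/sh": "shell", "/etc/os-release": "demo-os"})
runtime = make_layer({"/usr/bin/python3": "interpreter"})
image_a = [base, runtime, make_layer({"/app/main.py": "print('a')"})]
image_b = [base, runtime, make_layer({"/app/main.py": "print('b')"})]

# Committing a change to image_a adds one layer; the lower layers are reused.
image_a2 = commit(image_a, {"/app/config.ini": "debug=true"})

total_references = len(image_a) + len(image_b) + len(image_a2)
print(total_references, stored_layers(image_a, image_b, image_a2))  # 10 5
```

The three images reference ten layers in total, but only five distinct layers have to be stored, because the base and runtime layers are shared.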
(2) Copy-on-write
Container images use the copy-on-write (CoW) strategy to share images between containers. With this strategy, a container does not need its own copy of the image files when it is launched. Instead, all image layers are mounted read-only at a mount point and overlaid with a read-write layer. As long as a file is not changed, all containers share access to exactly the same data. When the file system is changed during container execution, the changes are written to the read-write layer, and the earlier version of the file in the read-only layer is hidden. In combination with the layered mechanism, CoW minimizes both the disk usage of images and the launch time of containers.
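The copy-on-write behavior described above can be illustrated with a toy model. The Container class and its read and write methods are hypothetical names chosen for this sketch, and layers are assumed to be plain dictionaries; a real engine relies on a storage driver rather than on in-memory lookups.

```python
class Container:
    """Toy container: shared read-only image layers plus a private
    read-write layer on top."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared, never modified
        self.rw_layer = {}                # private to this container

    def read(self, path):
        # The read-write layer hides any older version in the layers below.
        if path in self.rw_layer:
            return self.rw_layer[path]
        # Otherwise search the read-only layers from top to bottom.
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, content):
        # Changes never touch the image; they land in the read-write layer.
        self.rw_layer[path] = content


image = [
    {"/etc/motd": "welcome"},           # lower layer
    {"/app/settings": "mode=default"},  # upper layer
]

c1 = Container(image)
c2 = Container(image)  # no copy of the image is made at launch

c1.write("/app/settings", "mode=custom")

print(c1.read("/app/settings"))   # mode=custom
print(c2.read("/app/settings"))   # mode=default
print(image[1]["/app/settings"])  # mode=default
```

Only the first container sees its own change; the second container and the image itself still hold the original version.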
(3) Content-addressable storage
Docker 1.10 introduced content-addressable storage (CAS), which allows images and image layers to be retrieved by their content. A checksum is calculated over the content stored in each image layer to produce a hash value, which serves as the unique ID of that layer. This mechanism improves the security of images and makes it possible to verify data integrity after pull, push, load, and save operations.
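A minimal sketch of the idea follows. It assumes SHA-256 as the checksum algorithm and uses a hypothetical LayerStore class held entirely in memory; it shows the principle of deriving an ID from content and verifying it on retrieval, not Docker's actual storage code.

```python
import hashlib


class LayerStore:
    """Toy content-addressable store: a layer's ID is the hash of its content."""

    def __init__(self):
        self._blobs = {}

    @staticmethod
    def digest(content: bytes) -> str:
        return "sha256:" + hashlib.sha256(content).hexdigest()

    def put(self, content: bytes) -> str:
        layer_id = self.digest(content)
        self._blobs[layer_id] = content  # identical content is stored once
        return layer_id

    def get(self, layer_id: str) -> bytes:
        content = self._blobs[layer_id]
        # Integrity check: the content must still hash to its own ID.
        if self.digest(content) != layer_id:
            raise ValueError(f"layer {layer_id} failed verification")
        return content

    def __len__(self):
        return len(self._blobs)


store = LayerStore()
id_1 = store.put(b"contents of a layer")
id_2 = store.put(b"contents of a layer")  # same content, same ID
id_3 = store.put(b"contents of another layer")

print(id_1 == id_2, len(store))  # True 2

# Simulate corruption after a transfer: verification now fails.
store._blobs[id_3] = b"tampered"
try:
    store.get(id_3)
except ValueError as err:
    print("rejected:", err)
```

Because the ID is derived from the content, identical layers are stored only once, and any change to the stored bytes is detected when the layer is read back.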
(4) Union mount
The union mount technique allows multiple file systems to be mounted onto the same mount point. The contents of the directory originally at the mount point are merged with those of the mounted file systems, so that the final file system contains the files and directories of all layers. A file system that implements the union mount technique is usually called a union file system (UFS).
For container images, the file systems of the individual image layers are union-mounted onto one mount point to form a single union file system. Union mount can thus be seen as the method by which the lower-level storage drivers combine the layers into one view.
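The merged view produced by a union mount can be approximated with Python's collections.ChainMap. This is only an analogy under the assumption that each file system is a dictionary from path to content; union_mount is a hypothetical helper, and no real mounting takes place.

```python
from collections import ChainMap

# Each "file system" maps a path to its content (contents are made up).
lower = {"/bin/sh": "shell", "/etc/motd": "hello from lower"}
middle = {"/usr/lib/libdemo.so": "library"}
upper = {"/etc/motd": "hello from upper", "/app/run": "program"}


def union_mount(*layers):
    """Merge layers given from lowest to highest into one view.

    ChainMap searches its first mapping first, so the order is reversed
    to let upper layers take precedence over lower ones.
    """
    return ChainMap(*reversed(layers))


merged = union_mount(lower, middle, upper)

print(sorted(merged))
# ['/app/run', '/bin/sh', '/etc/motd', '/usr/lib/libdemo.so']
print(merged["/etc/motd"])
# hello from upper
```

The merged view lists the files of all three layers, and where the same path exists in several layers, the version in the upper layer is the one that is visible.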
- Image Building
Images, as the basis for container execution, can be obtained in two ways. One is to pull existing images from image repositories, both public and private. The other is to create a new image. A new image can be built with either of the following two methods.
(1) docker build
The docker build command builds images automatically from a Dockerfile, which is readable and easy to understand. The mechanism is as follows: each line of the Dockerfile is an instruction that is run in an intermediate container created from the result of the previous line, and the outcome is then committed. These steps are repeated until the desired image is produced.
In a Dockerfile, the instruction on each line (such as COPY, RUN, or CMD) generates a new layer that overlays the file system produced by the preceding instruction. Strictly speaking, an instruction such as CMD changes only the image metadata, so the layer it adds contains no files. Finally, all image layers are combined to form the file system of the new image.
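The relationship between Dockerfile lines and layers can be sketched as follows. The Dockerfile text is made up for illustration, plan_layers is a hypothetical helper, and the changes_files flag rests on the assumption that only ADD, COPY, and RUN alter files. The sketch only lists the layers a build would stack up; it does not run a build and ignores line continuations.

```python
# A made-up Dockerfile used only for illustration.
DOCKERFILE = """\
FROM python:3.12-slim
COPY app.py /app/app.py
RUN pip install flask
CMD ["python", "/app/app.py"]
"""

# Assumption: these instructions change files; others record metadata only.
FILE_CHANGING = {"ADD", "COPY", "RUN"}


def plan_layers(dockerfile_text):
    """Return (base_image, layers), one layer entry per instruction after FROM."""
    base_image = None
    layers = []
    for line in dockerfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        instruction, _, arguments = line.partition(" ")
        instruction = instruction.upper()
        if instruction == "FROM":
            base_image = arguments
            continue
        layers.append({
            "step": len(layers) + 1,
            "created_by": f"{instruction} {arguments}",
            "changes_files": instruction in FILE_CHANGING,
        })
    return base_image, layers


base_image, layers = plan_layers(DOCKERFILE)
print("base:", base_image)
for layer in layers:
    print(layer["step"], layer["created_by"], layer["changes_files"])
```

Each instruction after FROM contributes one entry, in order, on top of the base image, which mirrors how the build stacks one layer on another.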
(2) docker commit
With this method, a container is first launched from an existing image; all required operations are then performed inside the container; finally, the docker commit command is executed on the host to package the container into a new image. The advantage of this method is that changes are convenient to make and image faults are easy to troubleshoot. The disadvantages are that the image building process is not transparent and that images created this way are not easy to maintain.
- Image Repository
An image repository is a place where images are stored and an important channel for obtaining them. Depending on who may access them, image repositories are divided into public and private ones.
(1) Public repository
A public repository is available to all Internet users. A typical example is Docker Hub[i], which currently houses more than 15,000 images. Images for most common applications are stored there and can be downloaded directly for use.
In addition, users can upload images they have built to a public repository for others to use, and software vendors can release their software in the form of images.
(2) Private repository
Not all images can be released and shared over the Internet. Some images can only be shared within a team because they contain sensitive information.
The sensitivity of images, together with concerns about the availability and stability of the Internet, necessitates the use of private or local repositories. A private repository is one that can be accessed only by a specified group of users. A typical example is Harbor.[ii]
[i] Docker Hub, https://hub.docker.com/
[ii] VMware Harbor, https://vmware.github.io/harbor/cn/
Figure 2.2 Harbor architecture
Harbor, developed by the R&D team of VMware China, aims to help users quickly set up an enterprise-grade registry service. Built on Docker Registry, Harbor provides functions such as a management UI, role-based access control, AD/LDAP integration, audit logging, and image vulnerability scanning.
Table 2.1 Components of the Harbor service
|Component|Description|
|---|---|
|Registry|Stores Docker images and processes Docker push/pull commands|
|UI|Graphical user interface that helps users manage images on the Registry and grant permissions to users|
|Job Service|Task management service of Harbor|
|DB|Database service that stores data such as user permissions, audit logs, and image groups|
|AD/LDAP|Provides unified user identity authentication and permission control|
|Log Collector|Collects the logs of other modules for future analysis|
|Notary|Audits image contents to ensure the authenticity of images; optional integration|
|Clair|Scans images for vulnerabilities; optional integration|
- Use of Images
Besides docker pull and docker push, Docker provides other common commands for image-related operations, as listed below. For details, see the official Docker documentation[i].
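[i] Docker images command-line reference, https://docs.docker.com/engine/reference/commandline/images/
(1) docker export
Exports the file system of a container as a tar archive, for example to a local disk drive.
(2) docker import
Imports the contents of a local tar archive, such as one produced by docker export, to create an image.
(3) docker save
Saves one or more images to a tar archive, for example on a local disk drive.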
(To be continued)