C++ Microservices in Docker: Optimizing the Container Image
March 29, 2021

Optimizing the Docker Container Image for C++ Microservices

In previous posts, we covered the basics of a C++ Microservices deployment.

With those basics in place, this blog will focus on optimization of the container in a C++ Microservices deployment. We'll examine how to structure the Dockerfile and the resulting Docker image to reduce the number of layers and disk space used.

Optimizing a Docker Environment

Building the Dockerfile that we created in the last post in a clean environment (no cached items) produces an image (including all of its dependent layers) that takes ~905 MB of disk space.

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
hydraexpress        latest              3626de2aaf27        About a minute ago   905MB

The image is composed of 19 layers of varying sizes, as can be seen with docker history:

$ docker history hydraexpress
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
3626de2aaf27        3 minutes ago       /bin/sh -c #(nop)  ENTRYPOINT ["/entrypoint.…   0B                  
ffcff24b7483        3 minutes ago       /bin/sh -c #(nop) COPY file:3ea4017ec3012e11…   63B                 
b1d244ce4919        3 minutes ago       /bin/sh -c mkdir -p ${RWSF_HOME}/apps/servle…   296B                
e9c0c9190a67        3 minutes ago       /bin/sh -c mkdir -p ${RWSF_HOME}/apps-lib &&…   16kB                
156707926d63        3 minutes ago       /bin/sh -c cd /build && cmake3 ../src && make   187kB               
7241437ae952        4 minutes ago       /bin/sh -c mkdir -p /build                      0B                  
d6fc9a947aa2        4 minutes ago       /bin/sh -c #(nop) COPY dir:993b3c8a984443a43…   1.52kB              
b45339df3de8        4 minutes ago       /bin/sh -c yum install -y gcc-c++ make cmake3   287MB               
fa30f220f666        4 minutes ago       /bin/sh -c yum install -y epel-release          91.6MB              
378b34919a3f        4 minutes ago       /bin/sh -c /opt/download/hydraexpress.run   …   160MB               
27b2cc792760        4 minutes ago       /bin/sh -c #(nop)  ENV RWSF_HOME=/opt/perfor…   0B                  
4823d877e6a8        4 minutes ago       /bin/sh -c #(nop) COPY file:1cdab928b8ba6c2e…   122B                
2465aefd3096        4 minutes ago       /bin/sh -c chmod a+x /opt/download/hydraexpr…   35.2MB              
98d08398c045        4 minutes ago       /bin/sh -c wget -q -O /opt/download/hydraexp…   35.2MB              
ff84decf28b1        4 minutes ago       /bin/sh -c mkdir -p /opt/download               0B                  
6d662ccc0723        4 minutes ago       /bin/sh -c yum install -y wget                  92.2MB              
7e6257c9f8d8        5 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           5 weeks ago         /bin/sh -c #(nop)  LABEL org.label-schema.sc…   0B                  
<missing>           5 weeks ago         /bin/sh -c #(nop) ADD file:61908381d3142ffba…   203MB   

While some of that would be compressed in transit to and from the hosts where it would be deployed, the distribution size is still pretty large. Let’s see what we can do to reduce the overall size and number of layers in the final image.
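
To see which layers dominate, it can help to total the SIZE column instead of estimating it by eye. The following is a minimal shell sketch and not part of the original build. The helper name sum_layer_sizes_mb is hypothetical, and it assumes the sizes arrive one per line in Docker's human-readable form (for example 287MB or 16kB).

```shell
#!/bin/sh
# Hypothetical helper: total human-readable layer sizes read from stdin
# (one per line, e.g. "287MB", "16kB", "0B") and print the sum in MB.
sum_layer_sizes_mb() {
    awk '
        /^[0-9.]+GB$/ { sub(/GB$/, ""); total += $0 * 1000;    next }
        /^[0-9.]+MB$/ { sub(/MB$/, ""); total += $0;           next }
        /^[0-9.]+kB$/ { sub(/kB$/, ""); total += $0 / 1000;    next }
        /^[0-9.]+B$/  { sub(/B$/, "");  total += $0 / 1000000; next }
        END           { printf "%.1f\n", total }
    '
}

# Example with four sizes taken from the history listing above.
printf '203MB\n92.2MB\n16kB\n0B\n' | sum_layer_sizes_mb   # prints 295.2
```

Against a real image, the input could come from docker history --format '{{.Size}}' hydraexpress. Because each size is already rounded for display, the total may differ slightly from what docker images reports.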

Reduce the Size and Number of Layers

Starting at the top of the Dockerfile, we’re downloading and installing HydraExpress into the container. However, we aren’t removing the installation media, or even wget, the tool we used to download it. This step also accounts for seven of the layers in the image.

Let’s see if we can shrink this down using a multi-stage build of our Dockerfile. We’ll split the build into two steps.

First, we’ll create an image for downloading and installing HydraExpress. Second, we’ll create a fresh image that copies the HydraExpress installation space from the first image to the second. Since we’re only copying the installed product, all of the artifacts from the installation will be left behind, reducing the space required for the final image.

We’ll start by adjusting our Dockerfile by naming the first image:

Dockerfile
…
FROM centos:7 AS hydraexpress_install

…

Next, we’ll introduce a second image in our Dockerfile, after HydraExpress has been installed. This image will serve as the basis for the rest of the Dockerfile. We’ll then copy the HydraExpress installation directory from the previous image, and set the appropriate environment variables as we did before:

Dockerfile
…
RUN /opt/download/hydraexpress.run  \
    --mode unattended  \
    --prefix /opt/perforce/hydraexpress  \
    --license-file /opt/download/license.key

FROM centos:7

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_install ${RWSF_HOME} ${RWSF_HOME}

RUN yum install -y epel-release
…

Note that the COPY command in our Dockerfile is specifying the --from argument, referring back to the label that we assigned to the first image.
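
Because --from refers to a stage by name, a typo in that name only surfaces when the build runs. As a rough pre-flight check, the sketch below scans a Dockerfile and reports any --from name that was not declared by an earlier FROM ... AS line. The helper name undefined_stage_refs is hypothetical, and the check assumes every --from points at a named stage. A reference to an external image or a numeric stage index is also valid in Docker, and this sketch would flag it.

```shell
#!/bin/sh
# Hypothetical helper: print each --from=<name> in a Dockerfile that does not
# refer to a stage already declared by an earlier "FROM <image> AS <name>".
undefined_stage_refs() {
    awk '
        toupper($1) == "FROM" && toupper($3) == "AS" { defined[$4] = 1 }
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^--from=/) {
                    name = substr($i, 8)          # text after "--from="
                    if (!(name in defined)) print name
                }
            }
        }
    ' "$1"
}

# Demonstration on a throwaway file; the second COPY contains a deliberate typo.
demo=$(mktemp)
cat > "$demo" <<'EOF'
FROM centos:7 AS hydraexpress_install
FROM centos:7
COPY --from=hydraexpress_install /opt/perforce/hydraexpress /opt/perforce/hydraexpress
COPY --from=hydraexpress_instal /opt/perforce/hydraexpress /opt/perforce/hydraexpress
EOF
undefined_stage_refs "$demo"    # prints: hydraexpress_instal
rm -f "$demo"
```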

With those changes in place, we've optimized the container:

  • Overall image size is reduced to 742 MB.
  • Number of layers in the final image drops to 14.

That’s a nice start, but let's move on to the next stage of the build and see if we can find similar savings there.

Compile and Link Servlet

The next stage compiles and links our servlet instance. This requires a C++ compiler and related tools and produces a number of build artifacts that aren’t needed in the final image. Let’s split this out into its own build stage as well and see how that affects our final image.

First, as we did before, we’ll add a label to the second stage so that we can reference it later in our Dockerfile.

Dockerfile
…
RUN /opt/download/hydraexpress.run  \
    --mode unattended  \
    --prefix /opt/perforce/hydraexpress  \
    --license-file /opt/download/license.key

FROM centos:7 AS servlet_build

ENV RWSF_HOME /opt/perforce/hydraexpress
…

Second, we’ll introduce a new image after the build is complete. This will serve as the basis for the final image that will be produced. Similar to the build stage, we’ll set up the environment and copy HydraExpress to our new image:

Dockerfile
…
RUN cd /build && cmake3 ../src && make

FROM centos:7

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_install ${RWSF_HOME} ${RWSF_HOME}

RUN mkdir -p ${RWSF_HOME}/apps-lib &&  \
    cp -f /build/hello/libhello.so ${RWSF_HOME}/apps-lib
…

Finally, we need to copy the servlet files from the servlet_build stage to our final image. Since these files were already being copied into the appropriate locations under HydraExpress, we’ll simplify the steps by copying the files directly to their final locations:

Dockerfile
…
COPY --from=hydraexpress_install ${RWSF_HOME} ${RWSF_HOME}

COPY --from=servlet_build /build/hello/libhello.so ${RWSF_HOME}/apps-lib/
COPY --from=servlet_build /src/hello/WEB-INF ${RWSF_HOME}/apps/servlets/hello/WEB-INF/

COPY entrypoint.sh /entrypoint.sh
…

With those changes in place, the optimization provides substantial savings: 

  • Layer count drops to 9.
  • Overall image size reduced to 363 MB.

There’s still more that can be done to optimize the container.

Copy Only Required Files

We’ll focus next on where HydraExpress is copied into the container. Our HydraExpress installation is a full deployment, including debug libraries and development tools that aren’t required for deployment. Instead of copying everything from the HydraExpress installation, let’s target our copies to just those files that are required.

Similar to before, we’ll create a new staging image to pull the components that we need into our final image:

Dockerfile
…
RUN cd /build && cmake3 ../src && make

FROM centos:7 AS hydraexpress_deploy

FROM centos:7
…

Next, we’ll list the specific files that are needed from the base HydraExpress installation:

Dockerfile
…
FROM centos:7 AS hydraexpress_deploy

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_install ${RWSF_HOME} /tmp

COPY --from=hydraexpress_install ${RWSF_HOME}/bin/rwagent  \
                                 ${RWSF_HOME}/bin/rwsfserver*  \
                                 ${RWSF_HOME}/bin/rwsfvars*  \
                                 ${RWSF_HOME}/bin/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/loggers.xml  \
                                 ${RWSF_HOME}/conf/rwagent.xml  \
                                 ${RWSF_HOME}/conf/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/locale/  \
                                 ${RWSF_HOME}/conf/locale/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/servlet/  \
                                 ${RWSF_HOME}/conf/servlet/
COPY --from=hydraexpress_install ${RWSF_HOME}/lib/libcrypto.so.1.1  \
                                 ${RWSF_HOME}/lib/libicu*.so.58.2  \
                                 ${RWSF_HOME}/lib/librwsf_agent_methods20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_core20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_handlers20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_icu20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_message20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_net20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_rwagent*20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_servlet20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_servlet_xml20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_ssl20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_transport_http20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_xmlbinding20012d.so  \
                                 ${RWSF_HOME}/lib/libssl.so.1.1  \
                                 ${RWSF_HOME}/lib/
COPY --from=hydraexpress_install ${RWSF_HOME}/license/  \
                                 ${RWSF_HOME}/license/

FROM centos:7
…
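
This post does not describe how the library list above was derived. One plausible starting point is the output of ldd for the server binaries, filtered down to the libraries that live under the HydraExpress lib directory. The sketch below shows only the filtering step. The helper name libs_under_prefix is hypothetical, and the sample input is made up to match ldd's usual "name => path (address)" layout.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the resolved
# library paths that live under the given directory prefix.
libs_under_prefix() {
    awk -v prefix="$1" '$2 == "=>" && index($3, prefix) == 1 { print $3 }'
}

# Sample input; the library names echo the COPY list above and the
# addresses are invented for illustration.
libs_under_prefix /opt/perforce/hydraexpress/lib <<'EOF'
	librwsf_core20012d.so => /opt/perforce/hydraexpress/lib/librwsf_core20012d.so (0x00007f0000000000)
	libssl.so.1.1 => /opt/perforce/hydraexpress/lib/libssl.so.1.1 (0x00007f0000100000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)
EOF
```

The sample prints the two HydraExpress paths and drops libc. Note that ldd only reports link-time dependencies. Libraries that a server loads at run time from its configuration would not appear, so any list built this way still needs to be checked by starting the container.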

Since we’re only exposing the HTTP interface from HydraExpress, we’ll also disable the other protocols. Not only does this reduce the number of libraries that we need to deploy, but it also reduces the startup time for HydraExpress, and reduces the number of potential attack vectors on the container (fewer open ports). 

Dockerfile
…
COPY --from=hydraexpress_install ${RWSF_HOME}/license/  \
                                 ${RWSF_HOME}/license/

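RUN sed -i '/<rwsf:connector name="HTTPS/,/<\/rwsf:connector>/d' ${RWSF_HOME}/conf/rwagent.xml
RUN sed -i '/<rwsf:connector name="AJP/,/<\/rwsf:connector>/d' ${RWSF_HOME}/conf/rwagent.xml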

FROM centos:7
…
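
The connector removal relies on sed's range addressing: every line from the first pattern through the next line matching the second pattern is deleted. The complete Dockerfile at the end of this post uses this form for the HTTPS and AJP connectors. To try the expressions outside of a build, the sketch below runs them against a small stand-in file. The element layout and connector names in the sample are assumptions for illustration and are not copied from a real rwagent.xml.

```shell
#!/bin/sh
# Build a small stand-in for rwagent.xml. The element layout is assumed for
# illustration and is not copied from a real HydraExpress configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<rwsf:connector name="HTTP/1.1">
  <!-- connector settings -->
</rwsf:connector>
<rwsf:connector name="HTTPS/1.1">
  <!-- connector settings -->
</rwsf:connector>
<rwsf:connector name="AJP13">
  <!-- connector settings -->
</rwsf:connector>
EOF

# Delete every line from a connector's opening tag through its closing tag.
# "HTTPS" does not match the plain HTTP connector, so that block survives.
remaining=$(sed -e '/<rwsf:connector name="HTTPS/,/<\/rwsf:connector>/d' \
                -e '/<rwsf:connector name="AJP/,/<\/rwsf:connector>/d' "$conf")
printf '%s\n' "$remaining"
rm -f "$conf"
```

Only the three lines of the HTTP connector remain. The range form assumes each connector's opening and closing tags sit on separate lines. If a connector were written on a single line, the range would run on to the next closing tag and remove more than intended.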

Finally, we’ll update our final image to leverage our new deployment image instead of the original installation image:

Dockerfile
…
ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_deploy ${RWSF_HOME} ${RWSF_HOME}

COPY --from=servlet_build /build/hello/libhello.so ${RWSF_HOME}/apps-lib/
…

With those changes in place, our build time has crept up to ~77s, but our image size has dropped to just 243 MB:

…
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
hydraexpress        latest              1487eb5f6c68        3 minutes ago       243MB

We’ve also seen our layer count drop from a high of 19 layers down to 9.

$ docker history hydraexpress
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
1487eb5f6c68        5 minutes ago       /bin/sh -c #(nop)  ENTRYPOINT ["/entrypoint.…   0B                  
f58f257b5410        5 minutes ago       /bin/sh -c #(nop) COPY file:3ea4017ec3012e11…   63B                 
3202357ad0a2        5 minutes ago       /bin/sh -c #(nop) COPY dir:188a8bfd4cab69ae4…   296B                
fc10d0cfd32a        5 minutes ago       /bin/sh -c #(nop) COPY file:9a0d4bd22e3242de…   16kB                
4225ac6d2ca6        5 minutes ago       /bin/sh -c #(nop) COPY dir:9072495080d86316c…   39.9MB              
576d25cd2ccd        6 minutes ago       /bin/sh -c #(nop)  ENV RWSF_HOME=/opt/perfor…   0B                  
7e6257c9f8d8        6 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B                  
<missing>           6 weeks ago         /bin/sh -c #(nop)  LABEL org.label-schema.sc…   0B                  
<missing>           6 weeks ago         /bin/sh -c #(nop) ADD file:61908381d3142ffba…   203MB   

We’ve significantly reduced the size of our Docker image, but there’s one part we haven’t tackled: the base OS.

Selecting a Base OS

We originally chose CentOS 7 as it is a supported platform with HydraExpress. Unfortunately, the base image for CentOS is relatively large (203 MB), and other supported operating systems have similarly large footprints.

While deploying on another operating system is technically unsupported (any reported issues will need to be reproducible on a supported OS), there are many distributions that are compatible enough with CentOS that we can deploy our HydraExpress container on them. Since our goal is to reduce our image size as much as possible, let’s try deploying HydraExpress on Alpine Linux, which boasts a base OS image of just 5 MB.

To run HydraExpress on Alpine Linux, we need to deploy some additional packages beyond the base operating system. Namely, we’ll need bash (to support the HydraExpress environment scripts), libstdc++, and libc6-compat. We’ll replace the FROM statement in our final image to reflect these changes:

Dockerfile
…
RUN sed -i '/<rwsf:connector name="AJP/,/<\/rwsf:connector>/d' ${RWSF_HOME}/conf/rwagent.xml

FROM alpine:latest

RUN apk update --no-cache && apk upgrade --no-cache && apk add --no-cache bash libstdc++ libc6-compat

ENV RWSF_HOME /opt/perforce/hydraexpress
…

With those changes in place, our build time climbs to ~86s, but the size of our image drops to only 51 MB:

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
hydraexpress        latest              88530e028da6        About a minute ago   51MB

We can still start our HydraExpress container and verify that the “hello” service we wrote in the previous blog post is still up and running.

Successfully Reduce Container Size

With the steps outlined above we’ve successfully reduced the size of our HydraExpress container by over 70% on a supported operating system, and if we’re willing to deploy on an unsupported OS, by over 90%.
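
The percentages follow directly from the image sizes reported by docker images. As a quick check, the sketch below computes the whole-number saving for each result. The helper name pct_saved is hypothetical, and integer shell arithmetic truncates the result.

```shell
#!/bin/sh
# Hypothetical helper: whole-number percentage saved going from $1 to $2.
pct_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

pct_saved 905 243   # CentOS 7 base: prints 73
pct_saved 905 51    # Alpine base:   prints 94
```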

A similar process can be applied to other Docker containers to reduce their footprint as well. Next time we’ll continue to evolve our HydraExpress deployment, incorporating another C/C++ library.

Want to try this yourself? Contact us for an evaluation version.

For Reference and Further Reading

For reference, here is the complete Dockerfile after refactoring:

Dockerfile
FROM centos:7 AS hydraexpress_install

RUN yum install -y wget

RUN mkdir -p /opt/download

RUN wget -q -O /opt/download/hydraexpress.run \
    https://dslwuu69twiif.cloudfront.net/hydraexpress/2020/hydraexpress_2020_eval_linux_x86-64_gcc_4.8.run

RUN chmod a+x /opt/download/hydraexpress.run

COPY license.key /opt/download/license.key

ENV RWSF_HOME /opt/perforce/hydraexpress

RUN /opt/download/hydraexpress.run  \
    --mode unattended  \
    --prefix /opt/perforce/hydraexpress  \
    --license-file /opt/download/license.key

FROM centos:7 AS servlet_build

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_install ${RWSF_HOME} ${RWSF_HOME}

RUN yum install -y epel-release
RUN yum install -y gcc-c++ make cmake3

COPY src/ /src/

RUN mkdir -p /build
RUN cd /build && cmake3 ../src && make

FROM centos:7 AS hydraexpress_deploy

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_install ${RWSF_HOME} /tmp

COPY --from=hydraexpress_install ${RWSF_HOME}/bin/rwagent  \
                                 ${RWSF_HOME}/bin/rwsfserver*  \
                                 ${RWSF_HOME}/bin/rwsfvars*  \
                                 ${RWSF_HOME}/bin/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/loggers.xml  \
                                 ${RWSF_HOME}/conf/rwagent.xml  \
                                 ${RWSF_HOME}/conf/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/locale/  \
                                 ${RWSF_HOME}/conf/locale/
COPY --from=hydraexpress_install ${RWSF_HOME}/conf/servlet/  \
                                 ${RWSF_HOME}/conf/servlet/
COPY --from=hydraexpress_install ${RWSF_HOME}/lib/libcrypto.so.1.1  \
                                 ${RWSF_HOME}/lib/libicu*.so.58.2  \
                                 ${RWSF_HOME}/lib/librwsf_agent_methods20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_core20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_handlers20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_icu20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_message20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_net20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_rwagent*20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_servlet20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_servlet_xml20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_ssl20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_transport_http20012d.so  \
                                 ${RWSF_HOME}/lib/librwsf_xmlbinding20012d.so  \
                                 ${RWSF_HOME}/lib/libssl.so.1.1  \
                                 ${RWSF_HOME}/lib/
COPY --from=hydraexpress_install ${RWSF_HOME}/license/  \
                                 ${RWSF_HOME}/license/

RUN sed -i '/<rwsf:connector name="HTTPS/,/<\/rwsf:connector>/d' ${RWSF_HOME}/conf/rwagent.xml
RUN sed -i '/<rwsf:connector name="AJP/,/<\/rwsf:connector>/d' ${RWSF_HOME}/conf/rwagent.xml

FROM alpine:latest

RUN apk update --no-cache && apk upgrade --no-cache && apk add --no-cache bash libstdc++ libc6-compat

ENV RWSF_HOME /opt/perforce/hydraexpress

COPY --from=hydraexpress_deploy ${RWSF_HOME} ${RWSF_HOME}

COPY --from=servlet_build /build/hello/libhello.so ${RWSF_HOME}/apps-lib/
COPY --from=servlet_build /src/hello/WEB-INF ${RWSF_HOME}/apps/servlets/hello/WEB-INF/

COPY entrypoint.sh /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]