75

I'm trying to use docker to automate maven builds. The project I want to build takes nearly 20 minutes to download all the dependencies, so I tried to build a docker image that would cache these dependencies, but it doesn't seem to save them. My Dockerfile is

FROM maven:alpine
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
ADD pom.xml /usr/src/app
RUN mvn dependency:go-offline

The image builds, and it does download everything. However, the resulting image is the same size as the base maven:alpine image, so it doesn't seem to have cached the dependencies in the image. When I try to use the image to mvn compile it goes through the full 20 minutes of redownloading everything.

Is it possible to build a maven image that caches my dependencies so they don't have to be downloaded every time I use the image to perform a build?

I'm running the following commands:

docker build -t my-maven .

docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven my-maven mvn compile

My understanding is that whatever RUN does during the docker build process becomes part of the resulting image.

8
  • Make a data container which contains the downloaded artifacts...How many modules are you building? How many tests do you run? Cause 20 minutes sounds very long?
    – khmarbaise
    Commented Feb 13, 2017 at 16:07
  • Can you explain what you mean by data container? I thought I would end up with a maven image that had that data. Doesn't "mvn dependency:go-offline" save those dependencies on the local filesystem? Commented Feb 13, 2017 at 16:21
  • If you have changes on the local file system those will be thrown away if you restart your container...
    – khmarbaise
    Commented Feb 13, 2017 at 16:31
  • 2
    I get that, but I'm not talking about a container. I'm talking about the docker build process. My understanding is that the state of the filesystem at the end of docker build is part of the image. Commented Feb 13, 2017 at 16:35
  • You might consider this: stackoverflow.com/a/49891339/1054322 Commented Apr 18, 2018 at 4:55

16 Answers

69

You should also consider using mvn dependency:resolve or mvn dependency:go-offline accordingly as other comments & answers suggest.


Usually pom.xml does not change between image builds; only the source code does. In that case you can do this:

FROM maven:3-jdk-8

ENV HOME=/home/usr/app

RUN mkdir -p $HOME

WORKDIR $HOME

# 1. add pom.xml only here

ADD pom.xml $HOME

# 2. start downloading dependencies

RUN ["/usr/local/bin/mvn-entrypoint.sh", "mvn", "verify", "clean", "--fail-never"]

# 3. add all source code and start compiling

ADD . $HOME

RUN ["mvn", "package"]

EXPOSE 8005

CMD ["java", "-jar", "./target/dist.jar"]

So the key is:

  1. Add the pom.xml file.

  2. Run mvn verify --fail-never, which downloads the Maven dependencies.

  3. Add all your source files, then start the compilation (mvn package).

When pom.xml has changed, or when you build the image for the first time, Docker runs steps 1 -> 2 -> 3. When pom.xml has not changed, Docker reuses the cached layers for steps 1 and 2 and goes straight to step 3.

The same trick works with many other package managers (gradle, yarn, npm, pip).
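
As an illustration of the same layering idea for npm, here is a minimal sketch. The node:lts tag, the /app directory, and the build script name are assumptions for the example, not part of the original answer.

```dockerfile
FROM node:lts
WORKDIR /app

# 1. copy only the dependency manifests (assumes a package-lock.json exists)
COPY package.json package-lock.json ./

# 2. install dependencies; this layer is reused until the manifests change
RUN npm ci

# 3. copy the source and build (assumes package.json defines a "build" script)
COPY . .
RUN npm run build
```

Adding node_modules to .dockerignore keeps step 3 from overwriting the installed dependencies with whatever is on the host.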

7
  • 2
    Nice, wrote an article on this approach along with using squash, to reduce the final image size: medium.com/pismolabs/…
    – andriosr
    Commented Jul 18, 2018 at 15:32
  • 2
    This is a brilliant and elegant solution, thanks. This answer should be accepted. Came here expecting some sort of hacky workaround solution, but this solution works with docker caching to give exactly the expected behaviour. Awesome.
    – davnicwil
    Commented May 31, 2019 at 22:44
  • 1
    dependency:resolve will not download plugins. And unfortunately dependency:resolve-plugins also misses lifecycle plugins. Commented Jul 2, 2019 at 20:29
  • 2
    @AndrewTFinnell Therefore, use dependency:go-offline
    – timomeinen
    Commented Sep 2, 2019 at 8:00
  • 1
    I guess --fail-never is not a good idea when you are building docker images that are pushed to the Docker Hub registry via Autobuilds :)
    – Wlad
    Commented Sep 12, 2020 at 17:03
27

Using BuildKit

From Docker v18.03 onwards you can use BuildKit instead of volumes that were mentioned in the other answers. It allows mounting caches that can persist between builds and you can avoid downloading contents of the corresponding .m2/repository every time.

Assuming that the Dockerfile is in the root of your project:

# syntax = docker/dockerfile:1.0-experimental

FROM maven:3.6.0-jdk-11-slim AS build
COPY . /home/build
RUN mkdir /home/.m2
WORKDIR /home/.m2
USER root
RUN --mount=type=cache,target=/root/.m2 mvn -f /home/build/pom.xml clean compile

target=/root/.m2 mounts the cache at the path where the maven image keeps its local repository (see the maven image Dockerfile docs).

For building you can run the following command:

DOCKER_BUILDKIT=1 docker build --rm --no-cache .

More info on BuildKit can be found here.
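
On newer Docker releases the cache mount is, as far as I know, available through the stable Dockerfile frontend, so the experimental syntax line may no longer be needed. A minimal sketch under that assumption (the base image tag is taken from the example above; the package goal is an arbitrary choice):

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.6.0-jdk-11-slim AS build
WORKDIR /home/build
COPY . .
# /root/.m2 lives in a BuildKit cache, not in an image layer,
# so it survives across builds but is not part of the final image
RUN --mount=type=cache,target=/root/.m2 mvn -B -f pom.xml clean package
```

Because the cache is not stored in the image, a later docker run of this image will not see the downloaded dependencies; the approach speeds up docker build only.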

2
  • Does it run on Docker for Windows without WSL2? Commented Feb 25, 2020 at 16:58
  • Haven't tried it on windows myself. But according to this the process of using it on windows is not that smooth. Commented Feb 25, 2020 at 23:00
12

It turns out the image I'm using as a base has a parent image which defines

VOLUME "$USER_HOME_DIR/.m2"

see: https://github.com/carlossg/docker-maven/blob/322d0dff5d0531ccaf47bf49338cb3e294fd66c8/jdk-8/Dockerfile

The result is that during the build, all the files are written to $USER_HOME_DIR/.m2, but because it is expected to be a volume, none of those files are persisted with the container image.

Currently in Docker there isn't any way to unregister that volume definition, so it would be necessary to build a separate maven image, rather than use the official maven image.
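
A possible workaround that keeps the official image, sketched here rather than tested against every tag: point Maven at a local repository outside the declared volume using the standard maven.repo.local property. The /opt/m2-repo path is an arbitrary choice for this example.

```dockerfile
FROM maven:alpine
WORKDIR /usr/src/app
COPY pom.xml .

# Store downloads outside $USER_HOME_DIR/.m2 so they are written to an image layer
# (/opt/m2-repo is a hypothetical path; Maven creates it if it does not exist)
RUN mvn -B -Dmaven.repo.local=/opt/m2-repo dependency:go-offline
```

The same property has to be passed whenever the image is used, for example mvn -Dmaven.repo.local=/opt/m2-repo compile; otherwise Maven falls back to the default location inside the volume.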

9

There are two ways to cache maven dependencies:

  1. Execute "mvn verify" as part of a container execution, NOT build, and make sure you mount .m2 from a volume.

    This is efficient but it does not play well with cloud build and multiple build slaves

  2. Use a "dependencies cache container", and update it periodically. Here is how:

    a. Create a Dockerfile that copies the pom and downloads the dependencies for offline use:

    FROM maven:3.5.3-jdk-8-alpine
    WORKDIR /build
    COPY pom.xml .
    RUN mvn dependency:go-offline
    

    b. Build it periodically (e.g. nightly) as "deps:latest" (Docker image names must be lowercase)

    c. Create another Dockerfile to actually build the system per commit (preferably multi-stage), and make sure it starts FROM deps.

Using this system you will have fast, reconstructable builds with a mostly good-enough cache.
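
The per-commit Dockerfile from step c might look like the following sketch. It assumes the nightly image from step b was tagged deps:latest (lowercase) and that the downloaded dependencies actually persist in that image. The runtime stage, the openjdk:8-jre-alpine tag, and the app.jar name are hypothetical placeholders.

```dockerfile
# Build stage: starts from the nightly dependency image (assumed tag: deps:latest)
FROM deps:latest AS build
WORKDIR /build
COPY . .
RUN mvn -B package

# Runtime stage: only the built artifact is carried over
# (image tag and jar name are placeholders for this example)
FROM openjdk:8-jre-alpine
COPY --from=build /build/target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```

Anything added to pom.xml since the last nightly build is downloaded during mvn package, which is why the cache is only "mostly" good enough.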

8

I don't think the other answers here are optimal. For example, the mvn verify answer executes the following phases, and does a lot more than just resolving dependencies:

validate - validate the project is correct and all necessary information is available

compile - compile the source code of the project

test - test the compiled source code using a suitable unit testing framework. These tests should not require the code be packaged or deployed

package - take the compiled code and package it in its distributable format, such as a JAR.

verify - run any checks on results of integration tests to ensure quality criteria are met

All of these phases and their associated goals don't need to be run if you only want to resolve dependencies.

If you only want to resolve dependencies, you can use the dependency:go-offline goal:

FROM maven:3-jdk-12
WORKDIR /tmp/example/

COPY pom.xml .
RUN mvn dependency:go-offline

COPY src/ src/
RUN mvn package
1
  • that's right but in some cases (i.e. multi module project) mvn dependency:go-offline can breaks the build. i.e. I had to use mvn compile dependency:go-offline which brought me a step further but still some stuff was breaking and required me to make sure that this stuff is skipped. So sometimes a combination of dependency:go-offline + skipping stuff that breaks the build is the only solution.
    – Wlad
    Commented Sep 12, 2020 at 17:01
7

@Kim is closest, but it's not quite there yet. I don't think adding --fail-never is correct, even though it gets the job done.

The verify command causes a lot of plugins to execute which is a problem (for me) - I don't think they should be executing when all I want is to install dependencies! I also have a multi-module build and a javascript sub-build so this further complicates the setup.

But running only verify is not enough, because if you run install in the following commands, there will be more plugins used - which means more dependencies to download - maven refuses to download them otherwise. Relevant read: Maven: Introduction to the Build Lifecycle

You basically have to find what properties disable each plugin and add them one-by-one, so they don't break your build.

WORKDIR /srv

# cache Maven dependencies
ADD cli/pom.xml /srv/cli/
ADD core/pom.xml /srv/core/
ADD parent/pom.xml /srv/parent/
ADD rest-api/pom.xml /srv/rest-api/
ADD web-admin/pom.xml /srv/web-admin/
ADD pom.xml /srv/
RUN mvn -B clean install -DskipTests -Dcheckstyle.skip -Dasciidoctor.skip -Djacoco.skip -Dmaven.gitcommitid.skip -Dspring-boot.repackage.skip -Dmaven.exec.skip=true -Dmaven.install.skip -Dmaven.resources.skip

# cache YARN dependencies
ADD ./web-admin/package.json ./web-admin/yarn.lock /srv/web-admin/
RUN yarn --non-interactive --frozen-lockfile --no-progress --cwd /srv/web-admin install

# build the project
ADD . /srv
RUN mvn -B clean install

Some plugins are not that easily skipped, however. I'm not a Maven expert, so I don't know why this one ignores the CLI option (it might be a bug), but the following works as expected for org.codehaus.mojo:exec-maven-plugin:

<project>
    <properties>
        <maven.exec.skip>false</maven.exec.skip>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.codehaus.mojo</groupId>
                <artifactId>exec-maven-plugin</artifactId>
                <version>1.3.2</version>
                <executions>
                    <execution>
                        <id>yarn install</id>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                        <phase>initialize</phase>
                        <configuration>
                            <executable>yarn</executable>
                            <arguments>
                                <argument>install</argument>
                            </arguments>
                            <skip>${maven.exec.skip}</skip>
                        </configuration>
                    </execution>
                    <execution>
                        <id>yarn run build</id>
                        <goals>
                            <goal>exec</goal>
                        </goals>
                        <phase>compile</phase>
                        <configuration>
                            <executable>yarn</executable>
                            <arguments>
                                <argument>run</argument>
                                <argument>build</argument>
                            </arguments>
                            <skip>${maven.exec.skip}</skip>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Please note the explicit <skip>${maven.exec.skip}</skip>: other plugins pick this up from the CLI parameters, but this one does not (neither -Dmaven.exec.skip=true nor -Dexec.skip=true works on its own).

Hope this helps

4
  • Skipping the stuff that breaks the build while trying to just get the dependencies offline did the trick for me, too. In my case the compile phase required a running DB :O and even just mvn dependency:go-offline broke if no DB was running. What is the -B in mvn -B ... good for? (from --help I know it's --batch-mode)
    – Wlad
    Commented Sep 12, 2020 at 16:55
  • 1
    @Wlad IMHO it only affects how maven downloads dependencies (and prints progress), but I'm not 100% sure. Commented Sep 21, 2020 at 8:41
  • This is very likely the answer, but I really hope it isn't. I've been turning off things for plugins this entire afternoon. :(
    – Nephilim
    Commented Nov 14, 2023 at 12:45
  • FYI, I've since given up on this and have redesigned the build to avoid this problem. Commented Nov 15, 2023 at 15:21
4

Similar to @Kim's answer, but I use the dependency:resolve goal. Here's my complete Dockerfile:

FROM maven:3.5.0-jdk-8-alpine

WORKDIR /usr/src/app

# First copy only the pom file, since it changes less often than the source
COPY ./pom.xml .

# Download the dependencies so they are cached in a Docker image layer
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml dependency:resolve

# Copy the actual code
COPY ./ .

# Then build the code
RUN mvn -B -f ./pom.xml -s /usr/share/maven/ref/settings-docker.xml package

# The rest is same as usual
EXPOSE 8888

CMD ["java", "-jar", "./target/YOUR-APP.jar"]
4
  • After adding the dependency:resolve argument and adopting multi-stage images improved my build times considerably. Thank you very much! Commented Dec 22, 2018 at 22:28
  • Great solution! But my modules depend on sibling projects, so I need to somehow exclude those from downloading. Is there any way to do this? Commented Dec 30, 2018 at 6:59
  • Hi @Elessar.perm sorry I don't have any idea for that.
    – ikandars
    Commented Dec 31, 2018 at 9:04
  • 3
    dependency:go-offline worked better for me as it downloads plugins as well, whereas dependency:resolve downloads dependencies only. maven.apache.org/plugins/maven-dependency-plugin
    – Nufail
    Commented Mar 24, 2020 at 10:54
2

After a few days of struggling, I managed to get this caching working using an intermediate container. I'd like to summarize my findings here, as this topic is so useful and shows up frequently on the first page of Google results:

  1. Kim's answer only works under certain conditions: pom.xml must not change, and Maven checks for updates on a daily basis by default
  2. mvn dependency:go-offline -B --fail-never has a similar drawback, so if you need to pull fresh code from the repo, chances are high that Maven will trigger a full download every time
  3. Mounting a volume does not work either, because the dependencies need to be resolved while the image is being built
  4. Finally, I arrived at a workable combined solution (it may not work for others):
    • First build an image that resolves all the dependencies (not an intermediate image)
    • Create another Dockerfile that uses it as an intermediate image; sample Dockerfiles look like this:
# First Dockerfile: docker build -t dependencies .
FROM ubuntu
COPY pom.xml pom.xml
RUN mvn dependency:go-offline -B --fail-never

# Second Dockerfile: uses the dependencies image as an intermediate stage
FROM dependencies AS intermediate

FROM tomcat
RUN git pull repo.git (whatsoever)
RUN mvn package

The idea is to keep all the dependencies in a separate image that Maven can use immediately.

There may be other scenarios I haven't encountered yet, but this solution spares me from downloading 3GB of rubbish every time. I cannot imagine why Java became such a fat whale in today's lean world.

1
  • 1
    Haven't you forgotten to add something like COPY --from=intermediate /home/root/.m2 ? Because what you show is a multi-stage build, and AFAIK everything from the first stage is thrown away before the second stage starts. So you have to explicitly define what to carry over from one stage to the next.
    – Wlad
    Commented Sep 12, 2020 at 17:11
2

I had to deal with the same issue.

Unfortunately, as another contributor already said, dependency:go-offline and the other goals don't fully solve the problem: many dependencies are not downloaded.

I found a working solution, as follows.

# Cache dependencies

ADD settings.xml .
ADD pom.xml .

RUN mvn -B -s settings.xml -Ddocker.build.skip=true package test

# Build artifact

ADD src src
RUN mvn -B -s settings.xml -DskipTests package

The trick is to do a full build without sources, which produces a full dependency scan.

To avoid errors from some plugins (for example, the OpenAPI Maven generator plugin or the Spring Boot Maven plugin), I had to skip their goals while still letting Maven download their dependencies, by adding to each one a configuration setting like the following:

<configuration>
    <skip>${docker.build.skip}</skip>
</configuration>

Regards.

1

I think the general game plan presented among the other answers is the right idea:

  1. Copy pom.xml
  2. Get dependencies
  3. Copy source
  4. Build

However, exactly how you do step #2 is the real key. For me, using the same command I used for building to fetch dependencies was the right solution:

FROM java/java:latest

# Work dir
WORKDIR /app
RUN mkdir -p .
# Copy pom and get dependencies
COPY pom.xml pom.xml
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single

# Copy and build source
COPY . .
RUN mvn -Dmaven.repo.local=./.m2 install assembly:single

Any other command used to fetch dependencies left many things still to be downloaded during the build step. It makes sense that running the exact command you plan on running will get you the closest to everything you need to actually run that command.

1
  • How would I prevent random plugins from executing though? My project has a bunch, some of which can't be turned on and off by flags. Is there a blanket statement to just not execute anything, just build? I got minimizing steps, source generating steps, etc. and those fail for obvious reasons
    – Nephilim
    Commented Nov 14, 2023 at 12:40
1

Here is my working solution. The tricks are:

  • use a Docker multi-stage build
  • don't copy the project source into the image created in the first stage, only the pom (or poms, if your project is multi-module)

Here is my solution for a multi-module project using openjdk11:

## stage 1
FROM adoptopenjdk/maven-openjdk11 as dependencies
ENV HOME=/usr/maven
ENV MVN_REPO=/usr/maven/.m3/repository
RUN mkdir -p $HOME
RUN mkdir -p $MVN_REPO
WORKDIR $HOME
## copy all pom files of the modules tree with the same directory tree of the project
#reactor
ADD pom.xml $HOME
## api module
RUN mkdir -p $HOME/api
ADD api/pom.xml $HOME/api
## application module
RUN mkdir -p $HOME/application
ADD application/pom.xml $HOME/application
## domain module
RUN mkdir -p $HOME/domain
ADD domain/pom.xml $HOME/domain
## service module
RUN mkdir -p $HOME/service
ADD service/pom.xml $HOME/service

## download all dependencies in this docker image. The goal "test" is needed to avoid download of dependencies with <scope>test</scope> in the second stage
RUN mvn -Dmaven.repo.local=$MVN_REPO dependency:go-offline test

## stage 2
FROM adoptopenjdk/maven-openjdk11 as executable
ENV APP_HOME=/usr/app
ENV MVN_REPO=/usr/maven/.m3/repository
ENV APP_MVN_REPO=$MVN_REPO
RUN mkdir -p $APP_HOME
RUN mkdir -p $APP_MVN_REPO
WORKDIR $APP_HOME
ADD . $APP_HOME
## copy the dependencies tree from the "stage 1" dependencies image to this image
COPY --from=dependencies $MVN_REPO $APP_MVN_REPO
## package the application, skipping test
RUN mvn -Dmaven.repo.local=$APP_MVN_REPO package -DskipTests
## set ENV values
ENV NAME=VALUE

## copy the jar in the WORKDIR folder
RUN cp $APP_HOME/application/target/*.jar $APP_HOME/my-final-jar-0.0.1-SNAPSHOT.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar","/usr/app/my-final-jar-0.0.1-SNAPSHOT.jar" ,"--spring.profiles.active=docker"]
0

I had this issue just a little while ago. There are many solutions on the web, but the one that worked for me is to simply mount a volume for the maven modules directory:

mkdir /opt/myvolumes/m2

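then in the Dockerfile:

...
VOLUME /root/.m2
...

and bind-mount the host directory when starting the container, since a Dockerfile VOLUME instruction cannot name a host path:

docker run -v /opt/myvolumes/m2:/root/.m2 ...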

There are better solutions, but not as straightforward.

This blog post goes the extra mile in helping you to cache everything:

https://keyholesoftware.com/2015/01/05/caching-for-maven-docker-builds/

3
  • This does work, but it's not what I was trying to accomplish. I am trying to build an image which encapsulates the repository/dependencies so I can run the image anywhere, without needing to first prepare a mount point for the container volume with the dependencies. Commented Feb 13, 2017 at 20:18
  • Then build your project on the docker host once (to populate ~/.m2) and ADD the ~/.m2 directory before building the image. To put it differently, treat ~/.m2 as if it were part of your source code.
    – Bruno9779
    Commented Feb 13, 2017 at 20:42
  • 2
    Does this work? As far as I know, you can't mount a host directory from the Dockerfile using VOLUME <outdir>:<innerdir>
    – Avión
    Commented Apr 4, 2018 at 6:55
0

A local Nexus 3 Image running in Docker and acting as a local Proxy is an acceptable solution:

The idea is similar to Dockerizing an apt-cacher-ng service.

A comprehensive step-by-step guide is available in the GitHub repo.

It's really fast.

0

Another solution would be to use a repository manager such as Sonatype Nexus or Artifactory. You can set up a Maven proxy inside the repository manager and then use it as your source of Maven repositories.
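
Inside a Docker build, one way to route downloads through such a proxy is to pass Maven a settings file that declares it as a mirror. A sketch, assuming a settings.xml in the build context that already contains a <mirror> entry for your repository manager (the file itself is not shown here, and the image tag is only an example):

```dockerfile
FROM maven:3-jdk-8
WORKDIR /usr/src/app

# settings.xml is assumed to point Maven at the Nexus/Artifactory proxy
COPY settings.xml pom.xml ./
RUN mvn -B -s settings.xml dependency:go-offline

COPY src/ src/
RUN mvn -B -s settings.xml package
```

Downloads still happen whenever the dependency layer is rebuilt, but they come from the nearby proxy instead of the public repositories.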

0

This one did the trick very well for me:

edit config.toml

[runners.docker]
...
volumes = ["/cache","m2:/root/.m2"]
...

It will create an "m2" volume that persists across builds, and you know the rest :)

-2

If the dependencies are downloaded after the container is already up, then you need to commit the changes on this container and create a new image with the downloaded artifacts.

1
  • I added some clarification that I'm just building the Dockerfile. I'm not creating a container that would need to be committed after the fact. Commented Feb 13, 2017 at 16:17
