brgd.eu

Understanding Maven Better

2021-04-30

Initially, this article was meant to answer just three or four questions about Maven that I had always had, and it bothered me that I didn't know the answers. While writing down these questions, new ones came up. Then, while answering the questions, the same happened.

In the end, I almost ended up with a small Maven FAQ. I feel like I know Maven much better now, and I hope that this writeup will help others as well.

What does the -D stand for in -DskipTests?

-D defines a system property. In this particular case, the maven-surefire-plugin reads the system property skipTests and, if it is set, skips the tests.

Note that while -DskipTests is the system property I use most often, it obviously isn't the only one. Take these two commands, for example:

mvn exec:java -Dexec.mainClass="com.example.Main" -Dexec.args="arg1 arg2"

mvn package -Dmaven.test.skip=true

These are all different system properties, read by different plugins (or potentially by Java code that doesn't belong to a plugin). The first command executes the main method of the Main class.

The second example, -Dmaven.test.skip=true, is also read by the maven-surefire-plugin, and additionally by the maven-compiler-plugin. It therefore skips even compiling the tests, while -DskipTests just skips running them.
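
Since these are ordinary properties, they can also be set in the POM instead of on the command line. A minimal sketch, assuming the tests of a project should be skipped by default (a value passed with -D, such as -DskipTests=false, still takes precedence):

```xml
<properties>
    <!-- Same effect as -DskipTests: tests are compiled but not run -->
    <skipTests>true</skipTests>
</properties>
```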

Is there a shorthand for -DskipTests?

Unfortunately, no.

What is the difference between <plugins> and <pluginManagement>?

Both tags often appear one after another in POMs. However, they are not the same.

The key is to understand pluginManagement: this tag configures plugins just as <plugins> does. However, if a plugin is only mentioned there, it is not actually run. pluginManagement just allows the current POM and its children to reuse this plugin configuration. To reference it in a child, the plugin's groupId and artifactId are enough, for example:

<plugins>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
    </plugin>
</plugins>

For this to work, a plugin with exactly this artifactId needs to be specified (in more detail, with goals and phases) in a <pluginManagement> section.
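
The matching <pluginManagement> section in the parent POM might look roughly like this. It is a sketch only: the version, the execution id, and the choice of the copy-dependencies goal in the package phase are assumptions made for illustration.

```xml
<build>
    <pluginManagement>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-dependency-plugin</artifactId>
                <!-- Version chosen for illustration only -->
                <version>3.1.2</version>
                <executions>
                    <execution>
                        <!-- Free-form label for this execution -->
                        <id>copy-dependencies</id>
                        <phase>package</phase>
                        <goals>
                            <goal>copy-dependencies</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
```

A child that lists the plugin under <plugins> with only groupId and artifactId then picks up the version and the execution from here.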

More on this here and here.

Further note: in some POMs at work, I have found that people use both tags even though that specific POM doesn't have any child POMs, i.e. the plugin that is specified in pluginManagement is used in plugins in the same file. I don't know whether this makes a lot of sense. Maybe it is done to separate plugin definition from plugin execution.

What is the difference between a phase and a goal?

When I use a plugin, i.e. declare it with the <plugin> tag, I usually also want to specify a phase and a goal. Most of the time, though, I just copy these from whatever example I have at hand.

I will start with goals. Goals are defined by plugins: the plugin author decides which goals a plugin offers, so when I include a plugin in my POM, I also need to say which of its goals I want to execute.

An example: say I need to include resources from another module of my project. Standard Maven can't do this on its own; it needs a separate plugin, the maven-remote-resources-plugin, which has to be declared in the plugins section of my POM. On the page of the plugin, the heading "Goals Overview" is very prominent, and this is what to look for when trying to understand a new plugin. Here, the plugin has two possible goals: bundle and process. Note that even though the goals are written like remote-resources:bundle, I have written only bundle. The part before the colon is the plugin prefix (a short name for the plugin), and only the part after the colon is the actual goal name.

So if I want to use this plugin, I need to specify one of these two goals in my POM, in the same place where I declare the plugin itself. But when exactly will this goal be executed?

This leads us to Maven phases. A plugin author can bind each goal to a default phase. If we don't override the phase, that default is used. Using the example from just before, if we click on the bundle goal (which leads us here), we can find this text:

Binds by default to the lifecycle phase: generate-resources.

Maven phases are predefined; a list of them can be found here. Note, however, that the phases listed there are only the main ones and by far not all that exist. Scrolling down on the linked page, we find this, which lists every phase.

If we specify a phase for a plugin next to its goal, we say that we want this goal to execute in this phase. If the plugin doesn't really depend on other plugins, the standard (rough) phases will be fine. Otherwise, we have a much more detailed list of phases at our disposal.
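
Put together, a plugin declaration with an explicit goal and phase could look like the following sketch. The execution id is a made-up label, and the <phase> element is only spelled out for illustration, since generate-resources is already the default for this goal.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-remote-resources-plugin</artifactId>
    <!-- In a real build, pin a <version> here or in pluginManagement -->
    <executions>
        <execution>
            <!-- Free-form label, chosen here for illustration -->
            <id>bundle-shared-resources</id>
            <!-- Optional: this is the goal's default phase anyway -->
            <phase>generate-resources</phase>
            <goals>
                <!-- Goal name without the "remote-resources:" prefix -->
                <goal>bundle</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```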

When I run a phase, which other phases are run then?

As mentioned above, there are these "standard phases", and there are the much more detailed phases.

To answer this question, it is important to know that Maven also has something called lifecycles. Hierarchically, it basically goes like this: lifecycle -> phase -> goal, more or less at least.

Now we need to know that Maven has only three built-in lifecycles: clean, default, and site. Each of those lifecycles contains different phases. If some phase of a given lifecycle is run, then all phases that come before it in that lifecycle are also run. No phases of other lifecycles are run.

This is why we need to run mvn clean install: clean is not included in install, because it comes from a different lifecycle. compile, on the other hand, is included in install, so running mvn clean compile install has no benefit over mvn clean install.
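
Because goals are bound to phases rather than to lifecycles, the clean goal can also be bound to an early phase of the default lifecycle. The following is a sketch of that idea, not something taken from the projects described here; the execution id is an arbitrary label. With it, a plain mvn install would clean first.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-clean-plugin</artifactId>
    <executions>
        <execution>
            <id>auto-clean</id>
            <!-- initialize is one of the first phases of the default lifecycle -->
            <phase>initialize</phase>
            <goals>
                <goal>clean</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```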

If I have multiple modules, do I need to define the 'parent' in each submodule, and each submodule in the parent?

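In the usual setup, yes. Strictly speaking, these are two independent mechanisms: listing a submodule under <modules> in the parent makes the parent build it (aggregation), while the <parent> element in a submodule makes it inherit the parent's configuration (inheritance). Most multi-module projects want both.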
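
A minimal sketch of the two sides. All group ids, artifact ids, versions, and directory names are made up for illustration. First the POM in the project root:

```xml
<!-- pom.xml in the project root -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>example-parent</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <modules>
        <!-- Directory of the submodule, relative to this POM -->
        <module>example-app</module>
    </modules>
</project>
```

And the POM of the submodule, which points back to the parent:

```xml
<!-- example-app/pom.xml -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.example</groupId>
        <artifactId>example-parent</artifactId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <!-- groupId and version are inherited from the parent -->
    <artifactId>example-app</artifactId>
</project>
```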

As for -U, the documentation states:

-U --update-snapshots - Forces a check for updated releases and snapshots on remote repositories

So, as I understand it, it updates the snapshots. But what exactly does that mean? Why would something need to be updated if the version didn't change? And if the version changes, is it not updated automatically?

The answer is that both are true: the version doesn't change, and if it did, Maven would fetch the new version automatically. As I understand it from this article, the option is meant for SNAPSHOT dependencies, such as 1.0-SNAPSHOT. A new build of a snapshot can be published under the same version, so the copy in the local repository can become outdated without the version changing. Combined with this StackOverflow question, I conclude that Maven normally checks the remote repository for newer snapshots once a day, or immediately when the CLI option -U is given. (I have however not tested this.)
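
How often Maven looks for newer snapshots is configured per repository through updatePolicy. A sketch of such a repository entry, where the id and URL are placeholders:

```xml
<repositories>
    <repository>
        <!-- id and url are placeholders -->
        <id>example-snapshots</id>
        <url>https://repo.example.com/snapshots</url>
        <snapshots>
            <enabled>true</enabled>
            <!-- One of: always, daily (the default), interval:X (minutes), never -->
            <updatePolicy>daily</updatePolicy>
        </snapshots>
    </repository>
</repositories>
```

With the daily policy, -U is the way to force a check before the day is over.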

For -am, the documentation states the following:

-am --also-make - Build the specified projects, and any of their dependencies in the reactor

This is related to multi-module projects, which are covered in more detail in the question "What is there to know about multi-module projects?" below.

What is the difference between -P and -pl?

They are completely different and have nothing to do with each other.

-P specifies which build profile to activate. More about that in the question "What are build profiles?" below.

-pl specifies a project list, which has to do with multi-module projects, as covered in the question "What is there to know about multi-module projects?" below.

What is the difference between mvn package and mvn install?

On the internet, I often find the former, but I actually always use the latter.

The difference is fairly easy to guess once one knows a little more about Maven: mvn package packages the project, i.e. creates a jar file for example, and puts it in the local target directory (normally next to the POM). mvn install goes a step further: it takes this package and "installs" it in the local repository, making it available to other local projects.

The next step, which I personally never use because I don't need it, is mvn deploy. This one also deploys the package to the remote repository. Note further that mvn deploy is the same as mvn package install deploy; in other words, all phases that come before it are executed as well.

However, this last point only holds within a given Maven lifecycle. clean belongs to a different lifecycle, so it is not run as part of install or deploy. More on that in the question "When I run a phase, which other phases are run then?" above.

Sometimes there is talk about a "reactor". What is meant by that?

The reactor is the part of Maven that works out how to build multi-module projects. That is, if a project has submodules and all of them are to be built, the reactor computes how exactly that happens. This includes the order in which modules are built, so that a module is built after the modules it depends on, and when specific goals are executed. More on this topic can be found in this StackOverflow question.

What is there to know about multi-module projects?

Maven has something called the reactor that is responsible for building multi-module projects. I have explained this in slightly more detail in the question "Sometimes there is talk about a "reactor". What is meant by that?" above.

When we talk about multi-module projects, we usually mean a project that has a master POM at its root and, in the same directory, multiple folders that each have their own POM. These are the submodules. (Submodules can in turn declare their own submodules, although I have never seen that in practice.)

As explained in the online book Maven: The Complete Reference, there are some command line options that are important in regard to multi-module projects:

-rf, --resume-from
    Resume reactor from specified project 
-pl, --projects
    Build specified reactor projects instead of all projects 
-am, --also-make
    If project list is specified, also build projects required by the list 
-amd, --also-make-dependents
    If project list is specified, also build projects that depend on projects on the list 

The first two should be fairly obvious. The third, -am, was never very clear to me, but seen in this light it makes sense. I will use an example from work. There, we have a module that is responsible for all the code in the local docker containers, which are mostly used for integration tests. To build the docker containers, I just need to build this docker submodule (selected with -pl). However, my actual code most likely lives in another module that the docker module depends on. If I rebuild only the docker module after changing my code, the change will not end up in the containers. For that, I need to add -am, which also builds the modules that the docker module depends on.

-amd works in the opposite direction: if I select the module containing my code with -pl and add -amd, Maven also builds every module that depends on it, which here includes the docker module.
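
What -am and -amd follow are the dependencies declared between the modules. As a sketch with hypothetical names, assume the docker module lives in a directory docker and depends on a module example-app that contains the actual code:

```xml
<!-- docker/pom.xml (module and artifact names are hypothetical) -->
<dependencies>
    <dependency>
        <groupId>com.example</groupId>
        <artifactId>example-app</artifactId>
        <version>${project.version}</version>
    </dependency>
</dependencies>
```

Under these assumptions, mvn install -pl docker -am builds example-app first and then docker, while mvn install -pl example-app -amd starts from example-app and then also builds docker, because docker depends on it.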

What are build profiles?

Build profiles can be used to override the default values of a project's build. A profile can, for example, change how the project is built for production, for instance by disabling debugging. If this profile had the id "production", we would activate it like this:

mvn install -Pproduction
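
Such a profile could be declared in the POM roughly as follows. This is a sketch: it assumes that "disabling debugging" means compiling without debug information, which the maven-compiler-plugin controls through the maven.compiler.debug property.

```xml
<profiles>
    <profile>
        <id>production</id>
        <properties>
            <!-- Read by maven-compiler-plugin; false omits debug information -->
            <maven.compiler.debug>false</maven.compiler.debug>
        </properties>
    </profile>
</profiles>
```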

For me, this is currently all that I need in regard to build profiles. For more on it, see the Maven documentation or Maven: The Complete Reference.

Remaining Questions

Some questions still remain. Once I find the time, I want to come back to this article and update it.

Currently, only one question remains actually:

  • What variables are there, like ${project.build.directory}, and what is their content?