Java for Intermediate Developers

Apache Maven from 100,000 Feet

A Quick and Easy-to-Understand Guide to Java’s Most Popular Build Tool

Matt Speake
Java Easily

In this article we’re going to give an overview of what Apache Maven is, what it does and how it does it. So let’s jump in!

What is Apache Maven?

Apache Maven is the de facto standard build tool for a large share of Java projects in industry today, which means you can’t afford not to know it.

Now there are two other build tools you may have heard of, Apache Ant and Gradle, and while these are (or have been, in Ant’s case) great tools to use, Maven remains hugely popular, largely because it’s easy to pick up, has less of a learning curve and, frankly, gives you more bang for the buck (in that you typically have to write less build code and/or do less brainwork to get a standard build up and running compared with Ant and Gradle).

What is a Build Tool?

So now we know that Maven is important and we need to know it. What is a build tool anyway?

Well, a build tool is a tool that allows us to “build” our projects: that is, to take our source code artefacts and turn them into the finished product, the deployable unit (aka the build artefact). This is what the Java project, the collection of all those source code artefacts, is destined to turn into. It might be a jar file if the project is a library used by other Java projects, or a war file if the project is a web application which runs in a server like Tomcat or JBoss, for example.

Think of a build like a process:

  • the input is the source files of the project
  • the output is the thing you want to pass to others, run or deploy

What Does “Building a Project” Mean?

So this is what people mean when they say they’re “doing a build” or “building the project”, although it can sometimes just mean compiling the source code and/or running the tests (i.e. without actually producing the end build artefact, that jar or war file we just spoke about), depending on who you’re talking to and in what context.

But either way, when we’re talking about “building”, we’re taking those source code artefacts and processing them in some way to either create the end product or validate that the project we currently have is in a buildable state (i.e. it compiles (aka “builds”) and runs the tests ok).

So what’s in these source code artefacts then? Well these are the files within a Java project. Things like:

  • the Java source code of the application itself (.java files)
  • the classpath resources the application needs when running
  • the property files needed at runtime to configure our application
  • the third party libraries the application needs (those are the jar files you’ve seen in your projects)

A project can consist of other items too, but these are the most common that you’ll find.

What Does Maven Do?

A build tool like Maven lets you perform different types of build tasks (which is why the term “build the project” is sometimes overloaded to mean different things). For example, you can:

  • clean the project (delete the project files which are generated by the build)
  • compile the source code (pass the .java files to javac — the Java compiler — to produce the bytecode .class files)
  • run the tests (i.e. run code which exercises the so-called “production code” (that’s the real code that your application consists of) to make sure that production code does what it’s supposed to do)
  • generate project documentation (to create a kind of mini website or wiki of the project, which contains test reports, source code metrics, user guides etc.)
  • deploy the build artefact (i.e. install it in an application server so it’s a deployed, running webapp)

Maven lets you do all of these things quickly and easily, and a whole lot more besides once you dig into its documentation. And what’s not to love with easy and quick? 😃

How Does Maven Do It?

So in order to deliver on this “quick and easy” promise, Maven is built around some pretty awesome ideas. These are:

  • Convention over Configuration
  • Dependencies and their resolution
  • Repositories
  • Build lifecycles
  • Plugins

We’ve got a lot to cover — but we’ll make it as easy as we can. Let’s dive in!

Convention over Configuration

Maven uses the idea of Convention over Configuration (CoC). The big idea is that instead of you defining aspects of your build yourself (that’s the “configuration” part), Maven decides sensible defaults for you (that’s the “convention” part: Maven defines the convention of what these properties should be). You only configure something when you need to deviate from the convention.

So usually in a build we’d have to define things we now take for granted like

  • file paths for where the source code lives in the project and where the output files should be placed
  • filenames like what’s the name of our generated jar or war file etc

With Maven, we don’t have to do this.

This is different from Ant: with Ant you’d have to set all those properties up at the start of the build file.

The benefit of CoC is that you have to write less (you don’t need to define all those properties anymore), which is great. A bigger benefit though is that you now have a transferable skill: once you’ve worked on one Maven project, you can work on them all because the projects “look the same”! Awesome: less time writing redundant code, more standardisation (meaning fewer things to consider, worry about and break), and the ability to move to new Maven projects and be productive from the get-go. It’s a beautiful thing!

Some standards we see in Maven projects are:

  • src/main/java is where the production source code lives
  • src/main/resources is where the production classpath resources live
  • src/test/java is where the test source code lives

There are a ton more, but you get the idea.
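
To make the conventions concrete, here’s a minimal Java sketch (it is not part of Maven itself) that spells out where a conventional project keeps things. The class name, method names and the my-library project are made up for illustration. The paths are Maven’s documented defaults, including two this article hasn’t mentioned yet: the target output directory and the artifactId-version naming of the jar.

```java
import java.nio.file.Path;

// Illustrative only: a hypothetical helper that spells out Maven's default layout.
final class ConventionSketch {

    // Production source code lives here by convention.
    static Path mainSources(Path projectRoot) {
        return projectRoot.resolve("src/main/java");
    }

    // Production classpath resources live here by convention.
    static Path mainResources(Path projectRoot) {
        return projectRoot.resolve("src/main/resources");
    }

    // Test source code lives here by convention.
    static Path testSources(Path projectRoot) {
        return projectRoot.resolve("src/test/java");
    }

    // Default location and name of the packaged jar: target/<artifactId>-<version>.jar
    static Path defaultJar(Path projectRoot, String artifactId, String version) {
        return projectRoot.resolve("target").resolve(artifactId + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        Path root = Path.of("my-library"); // hypothetical project directory
        System.out.println(mainSources(root));
        System.out.println(mainResources(root));
        System.out.println(testSources(root));
        System.out.println(defaultJar(root, "my-library", "1.0.0"));
    }
}
```

The point is that none of these values appear in your build file: with Ant you’d have declared each one yourself, whereas Maven simply assumes them.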

Dependencies and Their Resolution

In the Maven world, you declare which libraries your project needs and Maven will take care of the rest: it will download the so-called dependency (i.e. the jar file you want to use in your project and have on your classpath) and make it available for your application to use. This means it will be available at compile time (when the code is being compiled — so that’s when you’re either coding in the IDE or when you’re doing a build as we saw before) and, if needed, at runtime too (i.e. it will be packaged up with the end build artefact — for example, if you’re working on a web application, the dependency would end up in your war file in the WEB-INF/lib directory inside it).

This is quite revolutionary: in the days of Ant (and worse still, even pre-Ant), you had to manually download the jarfiles you needed yourself and then set the classpaths of the project. On top of that, you had to download the other jarfiles that each library needed too. Before long, your project was a mess!

Now these additional jarfiles, the ones that the library you want to include itself depends on, are known as transitive dependencies, and the job of working out and downloading the full set is known as dependency resolution. Again, Maven handles this for you.

Let’s say you need Spring MVC in your application. Well, it’s not just the Spring MVC jarfile you need. Oh no. And that’s because Spring MVC itself needs libraries that it too depends on. And so Maven will “magically” download those too. And in turn, if any of those need extra libraries, they’re pulled down too. This dependency resolution forms a kind of “tree” of jar files that the project requires.
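
If you’d like to see the “tree” idea in code, here’s a minimal Java sketch of transitive resolution. It is not how Maven is actually implemented (the real thing also deals with versions, scopes, conflicts and exclusions), and the dependency graph below is entirely hypothetical: my-app, web-framework, core-utils, json-parser and logging are made-up names standing in for real libraries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: walks a made-up dependency graph to collect everything a project needs.
final class DependencyResolutionSketch {

    // Hypothetical graph: each library maps to the libraries it directly depends on.
    static final Map<String, List<String>> DIRECT_DEPENDENCIES = Map.of(
            "my-app", List.of("web-framework"),
            "web-framework", List.of("core-utils", "json-parser"),
            "core-utils", List.of("logging"),
            "json-parser", List.of("core-utils"),
            "logging", List.of());

    // Returns every library reachable from the root, breadth-first, each listed once.
    static Set<String> resolve(String root) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>(DIRECT_DEPENDENCIES.getOrDefault(root, List.of()));
        while (!toVisit.isEmpty()) {
            String library = toVisit.removeFirst();
            // add() returns false if we have already seen this library, so shared
            // dependencies (core-utils here) are only expanded once.
            if (resolved.add(library)) {
                toVisit.addAll(DIRECT_DEPENDENCIES.getOrDefault(library, List.of()));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // prints [web-framework, core-utils, json-parser, logging]
        System.out.println(resolve("my-app"));
    }
}
```

Notice that my-app only declared web-framework, yet it ends up with four libraries. That’s the bookkeeping you used to do by hand.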

So how does Maven do this? Well it views everything, both projects and dependencies, as having a so-called GAV coordinate. GAV stands for the three components that determine how a project or dependency is named, declared and located (spoiler alert: CoC brings much more than just conventions for properties). The three components are:

  • Group ID: this is kind of like the package concept in Java; it defines “where” (in which logical namespace, or folder if in a physical repository) the dependency lives
  • Artifact ID: this is the name of the dependency (Maven uses the American spelling, so it’s artifactId in the POM)
  • Version: this is the version the dependency is at

(Note that the first letters of this holy trinity of Maven projects and dependencies form that “GAV” acronym we just saw).

So to go back to our Spring MVC example, if we wanted to include that library in our project, we’d add the following dependency declaration to our Project Object Model, or POM. This is the one file you need to fill in to configure how Maven should build your project, and it lives at the root of the project in a file called pom.xml:

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-webmvc</artifactId>
    <version>5.3.4</version>
</dependency>

(Spring MVC is published under the artifact ID spring-webmvc, but don’t let that faze you. Most dependencies are named like you’d expect, and you can always search for a dependency you want to include in your project by hopping on over to the Maven Central Repository Search site.)

Repositories

So where do these dependencies come from? Well, Maven dependencies live in a repository, the largest of which is Maven Central: a humongous web-based archive of artefacts (Java libraries) that can be included in your projects declaratively like we’ve just seen, and pulled down through the dependency resolution mechanism we just described. And as you can see, it’s not magic, but it is elegant.

That’s because the project you work on (in fact any Maven project) is also “located” at its own GAV coordinates. So just as you refer to dependencies by their GAV coordinates, you declare your own project’s GAV coordinates too. And here’s where the elegance comes in: you can not only build your project to spit out that jar file, for example, but you can then deploy that jarfile (that end build artefact) to a Maven repository for other projects to “consume” (i.e. use in their projects). So you include what you need in terms of its GAV coordinates, and other projects can include your project in terms of your GAV coordinates.

So both projects and dependencies now have this common way of defining where they live so that they can be used harmoniously together. And where they ultimately live of course, is inside a Maven repository.
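
To show how a GAV coordinate pins down “where something lives”, here’s a small Java sketch that turns a coordinate into the path of its jar inside a repository, following the standard Maven repository layout (group ID with dots turned into folders, then artifact ID, then version). The class name and the com.example:my-library:1.0.0 coordinate are hypothetical, chosen purely for illustration.

```java
// Illustrative only: maps a GAV coordinate to a jar's location inside a Maven repository.
final class GavPathSketch {

    // Standard layout: <groupId with '.' as '/'>/<artifactId>/<version>/<artifactId>-<version>.jar
    static String jarPath(String groupId, String artifactId, String version) {
        return groupId.replace('.', '/')
                + "/" + artifactId
                + "/" + version
                + "/" + artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        // Hypothetical coordinate for a project of our own.
        // prints com/example/my-library/1.0.0/my-library-1.0.0.jar
        System.out.println(jarPath("com.example", "my-library", "1.0.0"));
    }
}
```

Because the same rule applies to your project and to every library you depend on, a repository needs no index of “who is what”: the coordinate is the address.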

Build Lifecycles

The next part of the secret sauce is build lifecycles. A build lifecycle is a sequence of steps that Maven has laid out to form a “build”. There are 3 build lifecycles in Maven, but we’ll just look at the most common one: the default lifecycle.

OK, so Maven, with its CoC approach, decides: “you know what, most builds are gonna consist of the same steps, so let’s specify these (we’ll call them phases) and call the whole sequence a lifecycle”. The default lifecycle is composed of the phases which Maven deems important for virtually all types of builds you’d do (remember a build might just compile, it might compile and test, or it might do all of those and create the end build artefact too, and any combination in between):

  • validate
  • compile
  • test
  • package

(There are more than these, but let’s stick with this simple model for now).

Let’s see how we can do some of the builds we saw before:

  • To compile the code: mvn compile
  • To compile and test the code: mvn test
  • To compile, test and create the end build artefact (it’s called “packaging” in Maven-speak): mvn package

In Maven, you say what phase you want to have happen, and it will execute all phases up to and including that phase. That’s because there’s a logical ordering to the phases: you can’t package the code until you’ve tested it (you don’t want buggy code after all, do you!); you can’t test the code until you’ve compiled the project (otherwise, there’d be no bytecode to run the tests against!) etc.
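
The “up to and including” rule is simple enough to sketch in a few lines of Java. This is only a model of the behaviour described above, using the four-phase simplification from this article (the real default lifecycle has more phases), and the class and method names are made up.

```java
import java.util.List;

// Illustrative only: models "run every phase up to and including the one you asked for".
final class LifecycleSketch {

    // The simplified default lifecycle used in this article, in execution order.
    static final List<String> PHASES = List.of("validate", "compile", "test", "package");

    // Returns the phases that would run for a command like "mvn <requestedPhase>".
    static List<String> phasesToRun(String requestedPhase) {
        int index = PHASES.indexOf(requestedPhase);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown phase: " + requestedPhase);
        }
        return PHASES.subList(0, index + 1);
    }

    public static void main(String[] args) {
        System.out.println(phasesToRun("compile")); // prints [validate, compile]
        System.out.println(phasesToRun("test"));    // prints [validate, compile, test]
        System.out.println(phasesToRun("package")); // prints [validate, compile, test, package]
    }
}
```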

Maven’s plugins can save work here too: if the compiler plugin sees that the production source code hasn’t changed since it was last compiled, it reports that there’s nothing to compile and moves on rather than redoing the work. So your repeat builds become quicker too.

Powerful stuff the notion of build lifecycles. But we still have one more piece of the jigsaw to cover: and that’s plugins…

Plugins

So as we’ve seen, a build lifecycle defines the phases of a build. These phases are basically hooks which so-called plugins can attach to in order to do something at that point in the build. A plugin contains functions (these are called the plugin’s goals) which can be invoked during a build.

There are lots of out of the box plugins to do essential things:

  • The clean plugin cleans the project (i.e. removes build-generated files like classfiles and the end build artefacts)
  • The compiler plugin compiles source code
  • The jar plugin packages compiled code into a jarfile

Not all plugins are used in all lifecycles.

Now binding a plugin’s goal to a phase in a build lifecycle is known as creating a plugin execution, and Maven comes with some default executions out of the box. That’s how it can compile the code just by you typing mvn compile, without you having to configure anything more in the POM file. (We haven’t seen the details of the POM file, but it’s very simple and is usually quite minimal relative to what you get buildwise, thanks to that bang for the buck we mentioned before.)

Some default plugin executions which Maven ships with are things like binding the compiler plugin’s compile goal to the compile phase and its testCompile goal to the test-compile phase (so it can compile both production source code and test source code). Without getting into the specifics, just know that Maven can do all the main things you’d expect out of a build without you having to configure a plugin yourself: each essential goal is already bound to the logical phase where you’d expect it to run. So Maven can clean, compile, package etc. all out of the box.
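
Here’s a rough Java model of what a default binding amounts to: a table from phase to plugin goal. The table is a simplified subset of Maven’s default bindings for a jar project (the resources plugin’s goals are left out, for instance), and it includes one plugin this article hasn’t mentioned, surefire, which is the plugin that runs your tests. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models phases as hooks, each with at most one goal bound to it.
final class PluginBindingSketch {

    // Phase -> bound goal, in lifecycle order. An empty string means nothing is bound.
    // A simplified subset of the default bindings for jar packaging.
    static final Map<String, String> BINDINGS = new LinkedHashMap<>();
    static {
        BINDINGS.put("validate", "");
        BINDINGS.put("compile", "compiler:compile");
        BINDINGS.put("test-compile", "compiler:testCompile");
        BINDINGS.put("test", "surefire:test");
        BINDINGS.put("package", "jar:jar");
    }

    // Returns the goals that would be invoked, in order, when building up to the given phase.
    static List<String> goalsFor(String requestedPhase) {
        if (!BINDINGS.containsKey(requestedPhase)) {
            throw new IllegalArgumentException("Unknown phase: " + requestedPhase);
        }
        List<String> goals = new ArrayList<>();
        for (Map.Entry<String, String> binding : BINDINGS.entrySet()) {
            if (!binding.getValue().isEmpty()) {
                goals.add(binding.getValue()); // a phase with no bound goal does nothing
            }
            if (binding.getKey().equals(requestedPhase)) {
                break; // stop once the requested phase has run
            }
        }
        return goals;
    }

    public static void main(String[] args) {
        System.out.println(goalsFor("validate")); // prints []
        System.out.println(goalsFor("test"));     // prints [compiler:compile, compiler:testCompile, surefire:test]
    }
}
```

A phase on its own does nothing; it’s the goals bound to it that do the work. Adding your own plugin execution to the POM is, in effect, adding another row to this table.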

So, as you’ve seen, Maven has a whole lot to offer: it makes your developer life so much simpler and takes the headache out of taming your Java builds. Start using it today and you’ll soon be, just like me, very glad you did!

If you’d like to learn more about Maven then check out our course Apache Maven Essentials where we get hands-on with the tool and get you up and running with it quickly and easily!

Have a great day and good luck in your Java travels! 😉

Matt Speake is a principal trainer at Java Easily where you can find him writing and recording about all things Java.