Maven 2 introduced the notion of transitive dependencies. This feature lets the user specify a dependency on some library, say Struts, and a resolution mechanism then retrieves all the dependencies of Struts and makes them available to the build process. Maven 1, on the other hand, required users to list explicitly each and every library to be retrieved, including the dependencies of dependencies.
However, I have found a number of customer scenarios for which the Maven 2.0.x dependency resolution is not sufficient.
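To make the transitive mechanism concrete, here is a minimal sketch of it as a breadth-first walk over dependency metadata. This is a toy model, not Maven's resolver: the class name, the `sampleRepository` helper, and the dependency lists in it are all invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of transitive resolution; not Maven's actual resolver. */
final class TransitiveSketch {

    /** Hypothetical metadata: artifact id -> ids of its direct dependencies. */
    static Map<String, List<String>> sampleRepository() {
        Map<String, List<String>> repo = new HashMap<>();
        repo.put("my-app", Arrays.asList("struts"));
        repo.put("struts", Arrays.asList("commons-beanutils", "commons-digester"));
        repo.put("commons-digester", Arrays.asList("commons-beanutils", "commons-logging"));
        return repo;
    }

    /** Breadth-first walk returning every artifact reachable from root, excluding root itself. */
    static Set<String> resolve(String root, Map<String, List<String>> repo) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String dep : repo.getOrDefault(current, Collections.<String>emptyList())) {
                if (resolved.add(dep)) { // only follow artifacts not seen before
                    queue.add(dep);
                }
            }
        }
        resolved.remove(root); // in case a cycle leads back to the root
        return resolved;
    }
}
```

With the sample data, the user declares only struts on my-app, yet the walk also returns the three commons libraries. In the Maven 1 model the user would have had to list all four.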
Maven 2.0.x Dependency Features
Currently, Maven 2 lets you specify a dependency as:
<dependency>
  <groupId/>
  <artifactId/>
  <version/>
  <type/>
  <classifier/>
  <scope/>
  <systemPath/>
  <exclusions>
    <exclusion>
      <artifactId/>
      <groupId/>
    </exclusion>
  </exclusions>
  <optional/>
</dependency>
groupId and artifactId are pretty straightforward and are fixed. They represent the identifier of the dependency.
Variability is introduced with the other fields.
- Version - Currently this is very overloaded. A version can be a fixed version or a range. It can also be expressed with build numbers and with keywords such as SNAPSHOT (for date) and RELEASE (for latest). Finally, a version can be free text (e.g. 3.0-RC2). Resolving which artifact matches a version spec is deterministic, but some combinations cannot be specified, such as the 4th build of 3.0-BETA (i.e. 3.0-BETA-4).
- Type - For specifying JAR, EJB, POM, etc.
- Classifier - This might be new for 2.1. Currently the M2 resolver doesn't use it. The current documentation suggests its use for additional matching (perhaps for vendor, JDK version, etc.).
- Scope - (compile, provided, runtime, test, system) Scope controls how deep to traverse the dependency chain and affects the classpath for the build tools.
- Exclusions - Putting these into a dependency removes the named artifacts from the chain. Is this done just by identifier (group/artifact)?
- Optional - Marks the dependency for retrieval if it is the root, but the path will not be followed transitively otherwise.
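The way exclusions and the optional flag prune the chain can be sketched as follows. This is a toy traversal under stated assumptions, not Maven's code: the `Dep` type is hypothetical, exclusions are matched purely by an id standing in for group/artifact, and version, type, classifier, and scope are ignored.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy traversal showing how exclusions and the optional flag prune the chain. */
final class PruningSketch {

    /** Hypothetical, simplified view of a dependency element. */
    static final class Dep {
        final String id;              // stands in for groupId:artifactId
        final boolean optional;
        final Set<String> exclusions; // ids to drop from everything reached through this dependency

        Dep(String id, boolean optional, String... exclusions) {
            this.id = id;
            this.optional = optional;
            this.exclusions = new HashSet<>(Arrays.asList(exclusions));
        }
    }

    /** repo maps an artifact id to the dependencies declared in its POM. */
    static Set<String> resolve(String root, Map<String, List<Dep>> repo) {
        Set<String> resolved = new LinkedHashSet<>();
        walk(root, repo, true, Collections.<String>emptySet(), resolved);
        return resolved;
    }

    private static void walk(String id, Map<String, List<Dep>> repo, boolean atRoot,
                             Set<String> excluded, Set<String> resolved) {
        for (Dep dep : repo.getOrDefault(id, Collections.<Dep>emptyList())) {
            if (excluded.contains(dep.id)) continue; // removed by an exclusion higher up the path
            if (dep.optional && !atRoot) continue;   // optional is only honoured on the root POM
            if (!resolved.add(dep.id)) continue;     // already visited (simplification: first path wins)
            Set<String> childExcluded = new HashSet<>(excluded);
            childExcluded.addAll(dep.exclusions);
            walk(dep.id, repo, false, childExcluded, resolved);
        }
    }
}
```

Note the simplification in the "already visited" check: an artifact reached first through one path is not revisited through another path that carries different exclusions.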
I have encountered a number of cases that the current dependency mechanism cannot meet. Some of these are requirements for applications using M2 both in development and production, and one is to provide better tool integration.
Virtual dependencies are sometimes known by the community as 'spec JARs'. Essentially, many frameworks have a dependency on standard APIs, normally J2SE or J2EE, and access to the interface classes is needed at compile time. These APIs can have multiple providers with different licenses (e.g. Apache vs Sun) as well as different implementations (open source, app server specific, etc.). Furthermore, in some cases these APIs are delivered as part of the J2SE environment (e.g. JAXP comes as part of J2SE 1.4). Basically, I need POMs to be able to specify a dependency on something virtual (e.g. JAXP 1.2) and then specify the chosen implementation separately.
Related to virtual dependencies, the artifact resolution process needs to take in environmental parameters as context. As an example, J2SE 1.4 contains interfaces and implementations for standard APIs. Or the web application may be deployed to a known server environment. These characteristics should be made available to resolve the dependencies, rather than requiring the resolved artifact to be present in the local or remote repository (i.e. it might be present in the CLASSPATH already).
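A minimal sketch of what resolving a virtual dependency against an environment might look like. Everything here is hypothetical: the `Environment` type, the spec key format ("jaxp:1.2"), and the provider coordinates are assumptions for illustration, not anything Maven 2.0.x offers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Sketch of resolving a virtual (spec) dependency using environmental context. */
final class VirtualDependencySketch {

    /** Hypothetical description of the target environment (JDK, app server, ...). */
    static final class Environment {
        final Set<String> providedSpecs; // specs already available on the CLASSPATH

        Environment(String... providedSpecs) {
            this.providedSpecs = new HashSet<>(Arrays.asList(providedSpecs));
        }
    }

    /**
     * Returns the concrete artifact to fetch for a spec, or empty when the
     * environment already provides it and nothing needs to be retrieved.
     */
    static Optional<String> resolve(String spec, Environment env,
                                    Map<String, String> chosenProviders) {
        if (env.providedSpecs.contains(spec)) {
            return Optional.empty(); // satisfied by the environment itself
        }
        String provider = chosenProviders.get(spec);
        if (provider == null) {
            throw new IllegalStateException("No provider chosen for virtual dependency " + spec);
        }
        return Optional.of(provider);
    }
}
```

The point of the design is that the POM only names the spec, while the provider mapping and the environment description live outside it, for example in per-application or system-wide settings.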
Pinning Dependencies (Overrides)
A mechanism is needed for overriding an artifact in the dependency chain. A primary use of this is providing hotfixes and patches to release distributions. In many cases the dependency chain for a given release is fixed, and in fact for production use, variation in these dependencies is not acceptable. However, when there are critical updates to an artifact in the dependency chain, it needs to be possible to rebuild a given application using the same dependencies, but slotting in a user-specified artifact in place of the original.
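As a rough sketch of the override idea, the resolved chain could be post-processed with a user-supplied pin table. The coordinate format (group:artifact:version), the class name, and the pin table are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of pinning: swap the version of selected artifacts in an otherwise fixed chain. */
final class PinningSketch {

    /**
     * resolved holds coordinates of the form group:artifact:version.
     * pins maps group:artifact to the user-specified replacement version.
     */
    static List<String> applyPins(List<String> resolved, Map<String, String> pins) {
        List<String> result = new ArrayList<>();
        for (String coords : resolved) {
            int lastColon = coords.lastIndexOf(':');
            if (lastColon < 0) {
                throw new IllegalArgumentException("Expected group:artifact:version but got " + coords);
            }
            String key = coords.substring(0, lastColon);
            String pinnedVersion = pins.get(key);
            // Everything without a pin passes through untouched, so the rest of the chain stays fixed.
            result.add(pinnedVersion == null ? coords : key + ":" + pinnedVersion);
        }
        return result;
    }
}
```

For example, pinning org.example:util to 2.3-hotfix-1 changes only that entry and leaves every other resolved artifact at its released version.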
While Maven 2.0.x provides some filters (namely version ranges) that enable resolution of a concrete artifact for a dependency, there are additional parameters which need to be passed into the resolution mechanism and perhaps should also be present in the artifact metadata in the repository. The primary ones I have found to be of interest are:
- Required J2SE Version - What is the minimum (and maximum?) version required for a given artifact? JARs are currently stored in Maven repositories without any indication of the Java version of the class files. This will cause trouble for artifacts compiled with Java 5 that a user expects to run in a Java 1.4 environment.
- License - (Apache, GPL, LGPL, etc.) Customers often have specific open source policies in place that define the acceptable open source licenses.
- Signing - Artifacts must be signed by a given authority.
- Vendor - Artifacts must be retrieved from a given repository or provided by a given vendor.
- Tags/Release - For commercially backed distributions, it would be very helpful to be able to filter by vendor version (mycompany-distro-2.1) and tag the artifacts as appropriate. Defining an all-encompassing POM is not the same as tagging an artifact.
- Debug/Obfuscation - Users want to be able to choose debug-enabled vs not, or obfuscated vs not.
All of these parameters should be configurable as settings for the application or on a system-wide basis, and be applied as an overlay on the resolution mechanism. It should be possible to express them either as hard requirements (stop if they can't be satisfied) or as soft preferences, e.g. (+debug or ==JDK1.4).
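The distinction between hard requirements and soft preferences can be sketched as a small selection routine. This is only an illustration: the `Candidate` and `Rule` types and the metadata keys are hypothetical, and I am assuming that in the notation above "==" marks a hard requirement and "+" a soft preference.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of filtering candidates with hard requirements and soft preferences. */
final class FilterSketch {

    /** Hypothetical candidate artifact with repository metadata attached. */
    static final class Candidate {
        final String id;
        final Map<String, String> metadata = new HashMap<>();

        Candidate(String id, String... keyValuePairs) {
            this.id = id;
            for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
                metadata.put(keyValuePairs[i], keyValuePairs[i + 1]);
            }
        }
    }

    /** hard = true: reject on mismatch; hard = false: merely prefer a match. */
    static final class Rule {
        final String key;
        final String value;
        final boolean hard;

        Rule(String key, String value, boolean hard) {
            this.key = key;
            this.value = value;
            this.hard = hard;
        }
    }

    /** Picks the candidate that passes all hard rules and matches the most soft rules. */
    static Candidate select(List<Candidate> candidates, List<Rule> rules) {
        Candidate best = null;
        int bestScore = -1;
        for (Candidate candidate : candidates) {
            int score = 0;
            boolean rejected = false;
            for (Rule rule : rules) {
                boolean matches = rule.value.equals(candidate.metadata.get(rule.key));
                if (!matches && rule.hard) {
                    rejected = true; // hard requirement not satisfied
                    break;
                }
                if (matches && !rule.hard) {
                    score++;         // soft preference satisfied
                }
            }
            if (!rejected && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        if (best == null) {
            throw new IllegalStateException("No candidate satisfies the hard requirements");
        }
        return best;
    }
}
```

Ties between equally scored candidates go to the earliest one in the list, which is an arbitrary choice for the sketch; a real resolver would need a defined ordering.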
Related to filtering, in some cases an artifact deployed to a repository may have problems, as in the case of a bad build, and that specific artifact needs to be blacklisted. Currently repositories can be blacklisted, but this feature could be extended to exclude specific artifacts and versions, possibly also by MD5 checksum, so that they are removed as candidates for dependency resolution.
This is useful when A depends on B, which depends on C. If B uses a version range that would match a new build of C, but the developer only has control over A's POM and not B's, then the developer could blacklist C's latest build.
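The A, B, C case can be sketched as a selection step that skips blacklisted versions. The names and the blacklist key format (group:artifact:version) are hypothetical, and the version range is modelled as a plain predicate rather than Maven's range syntax.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

/** Sketch of removing blacklisted versions from the candidates for a version range. */
final class BlacklistSketch {

    /**
     * Picks the newest version that lies inside the range and is not blacklisted.
     * versionsOldestFirst is assumed to be sorted already; blacklist entries have
     * the form group:artifact:version.
     */
    static Optional<String> selectNewest(String artifact, List<String> versionsOldestFirst,
                                         Predicate<String> range, Set<String> blacklist) {
        for (int i = versionsOldestFirst.size() - 1; i >= 0; i--) {
            String version = versionsOldestFirst.get(i);
            if (range.test(version) && !blacklist.contains(artifact + ":" + version)) {
                return Optional.of(version);
            }
        }
        return Optional.empty(); // nothing acceptable is left for this dependency
    }
}
```

Here the developer of A supplies the blacklist from A's own settings, so B's POM and its version range stay untouched.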
Human License Acknowledgment
There are two primary aspects to this requirement. One is the case where a vendor requires a user to confirm a license before downloading. This is the case, for example, with Sun and the BCL for many specification JARs. Eclipse has implemented this feature in the plug-in manager to require license acceptance. The other is the case where, through transitive resolution, a user is actually importing more code than they think. For example, a user may specify an artifact with an Apache license that has a dependency on an LGPL artifact. The user should be notified that by using the code they are accepting not only the Apache license of the prime artifact, but also the LGPL through a dependency.
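The second aspect amounts to collecting the licenses of everything in the resolved chain and comparing them with what the user has already accepted. A minimal sketch, assuming license metadata were available per artifact (which it is not in the repository today); all names are hypothetical.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of finding licenses pulled in transitively that the user has not yet acknowledged. */
final class LicenseSketch {

    /**
     * resolvedArtifacts is the full transitive chain, including the prime artifact.
     * Artifacts with no license metadata are reported as "UNKNOWN" so they are not silently accepted.
     */
    static Set<String> pendingLicenses(Collection<String> resolvedArtifacts,
                                       Map<String, String> licenseByArtifact,
                                       Set<String> acknowledged) {
        Set<String> pending = new TreeSet<>();
        for (String artifact : resolvedArtifacts) {
            String license = licenseByArtifact.getOrDefault(artifact, "UNKNOWN");
            if (!acknowledged.contains(license)) {
                pending.add(license);
            }
        }
        return pending;
    }
}
```

A client could prompt the user for each entry in the returned set before any download starts.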
Exposing Dependency Resolution to Clients
Maven 2.0.x provides an ArtifactResolver, whose purpose is to download the JAR associated with an artifact, and an ArtifactCollector, which does the dependency resolution. There are cases where a user of a client tool would like to visualize the dependency resolution mechanism in order to determine how applying different filters (for example) affects the traversal of the dependency graph. I'd like an API for getting the dependency graph of a given root artifact, with environmental parameters. The graph should highlight the potential artifacts for a given node, as well as which artifact would actually be selected. This would also seriously help with debugging a dependency chain.
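To illustrate the kind of graph I have in mind, here is a sketch of a node type that records both the candidates and the selection, plus a text renderer a client tool might use. This is not an existing Maven API; `Node` and `render` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a dependency graph that exposes candidates as well as the selected artifact. */
final class GraphSketch {

    /** Hypothetical graph node: one dependency, its candidates, and the chosen one. */
    static final class Node {
        final String dependency;       // the dependency as declared, e.g. "lib [1.0,2.0)"
        final List<String> candidates; // every artifact that could satisfy it
        final String selected;         // the candidate the resolver would actually pick
        final List<Node> children = new ArrayList<>();

        Node(String dependency, List<String> candidates, String selected) {
            this.dependency = dependency;
            this.candidates = candidates;
            this.selected = selected;
        }
    }

    /** Renders the graph as indented text, marking the selected candidate with '*'. */
    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.dependency).append(" ->");
        for (String candidate : node.candidates) {
            out.append(' ').append(candidate.equals(node.selected) ? "*" + candidate : candidate);
        }
        out.append('\n');
        for (Node child : node.children) {
            render(child, depth + 1, out);
        }
    }
}
```

Rerunning the resolution with a different filter set and comparing the rendered output would show directly how the filters change the traversal.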
Maven 2.1.x design suggestions
This might be out of place for a wiki document discussing needed features. However, after diving into maven-artifact 2.0.x, both to see whether the features could be implemented using the current APIs and to trace through the code to fix various resolution issues, I find that maven-artifact seems designed as an extension of artifact handling for Maven 1 plus the addition of a transitive dependency mechanism. I would suggest a fresh look at the dependency handling, given the added features.
As an example, the Artifact object actually means two different things, a dependency and a resolved dependency. Likewise, the ArtifactResolver actually means both dependency resolution, and JAR downloading resolution. I would suggest that Artifact handling be refactored into both a maven-artifact-dependency package and maven-artifact. A dependency can be resolved into multiple artifacts, and in fact a dependency chain has a number of different successful 'walks'. Separating this from artifact resolution (which is obtaining the JAR) would help not only client tools, but also make it clearer for integrating new features, and debugging.