Background

Maven 2 introduced the notion of transitive dependencies. This feature lets the user specify a dependency on some library, say Struts, and a resolution mechanism then retrieves all of Struts' own dependencies and makes them available to the build process. Maven 1, on the other hand, required users to list explicitly each and every library to be retrieved, including the dependencies of dependencies.

However, I have found a number of customer scenarios for which the Maven 2.0.x dependency resolution is not sufficient.

Maven 2.0.x Dependency Features

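Currently, Maven 2 lets you specify a dependency using the following fields: groupId, artifactId, version, type, classifier, scope, exclusions and optional.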

groupId and artifactId are pretty straightforward and are fixed. They represent the identifier of the dependency.

Variability is introduced with the other fields.

  • Version - Currently this is very overloaded. A version can be a fixed version or a range. It can also carry a build number or a keyword such as SNAPSHOT (resolved by date) or RELEASE (the latest release). Finally, a version can be free text (e.g. 3.0-RC2). The resolution of which artifact matches a version spec is deterministic, but some combinations are not possible to specify, such as the 4th build of 3.0-BETA (i.e. 3.0-BETA-4).
  • Type - For specifying JAR, EJB, POM, etc.
  • Classifier - This might be new for 2.1. Currently the M2 resolver doesn't use it. The current documentation suggests its use for additional matching (perhaps for vendor, JDK version, etc.).
  • Scope - (compile, provided, runtime, test, system) Scope controls how deep to traverse the dependency chain and affects the classpath for the build tools.
  • Exclusions - Specifically putting these into a dependency will remove those artifacts from the chain. Is this done just by identifier (group/artifact)?
  • Optional - The dependency is retrieved when the declaring project is the root of the build, but it is not followed transitively when that project is itself a dependency.
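
The traversal rules for Optional and Exclusions can be sketched in a few lines. This is a minimal illustration, not Maven's implementation: the class and method names are hypothetical, and it assumes exclusions match on the groupId/artifactId identifier only, which is the open question raised above.

```java
import java.util.Set;

/** Hypothetical model of the fields listed above; these are not Maven's own classes. */
final class DependencyFields {

    /** One declared dependency; groupId plus artifactId form its identifier. */
    record Dep(String groupId, String artifactId, String version, String type,
               String classifier, String scope, boolean optional, Set<String> exclusions) {
        String key() {
            return groupId + ":" + artifactId;
        }
    }

    /**
     * Decides whether a dependency is traversed.
     * depth 1 means the root project declares it directly.
     * excludedOnPath holds "groupId:artifactId" keys excluded by ancestors
     * (assumption: exclusions match on identifier only).
     */
    static boolean isFollowed(Dep dep, int depth, Set<String> excludedOnPath) {
        if (excludedOnPath.contains(dep.key())) {
            return false; // an ancestor excluded this identifier
        }
        // An optional dependency is honored only when the root declares it.
        return !(dep.optional() && depth > 1);
    }

    public static void main(String[] args) {
        Dep logging = new Dep("commons-logging", "commons-logging", "1.0.4", "jar",
                null, "compile", true, Set.of());
        System.out.println(isFollowed(logging, 1, Set.of())); // true
        System.out.println(isFollowed(logging, 2, Set.of())); // false
    }
}
```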

Requested Features

I have encountered a number of cases that the current dependency mechanism cannot meet. Some of these are requirements for applications using M2 both in development and production, and one is to provide better tool integration.

Virtual Dependencies

This is sometimes known by the community as 'spec JARs'. Essentially, many frameworks have a dependency on standard APIs, normally J2SE or J2EE, and access to the interface classes is needed at compile time. These APIs can have multiple providers with different licenses (e.g. Apache vs Sun) as well as different implementations (open source, app server specific, etc.). Furthermore, in some cases, these APIs are delivered as part of the J2SE environment (e.g. JAXP comes as part of J2SE 1.4). Basically, I need POMs to be able to specify a dependency on something virtual (e.g. JAXP 1.2) and then specify the chosen implementation separately.
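
One way to picture the split between "what the POM asks for" and "what is chosen" is a lookup from a spec to a provider kept outside the POM. The sketch below is only an illustration under that assumption: Spec, Provider and chooseProvider are hypothetical names, and the provider coordinates are placeholders rather than a real artifact.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical types for illustration; none of these exist in Maven 2.0.x. */
final class VirtualDependencies {

    /** What a POM would declare: an API and its version, with no provider named. */
    record Spec(String api, String version) {}

    /** A concrete artifact that implements a spec. */
    record Provider(String groupId, String artifactId, String version) {}

    /** Looks up the provider chosen for a spec in separately maintained bindings. */
    static Optional<Provider> chooseProvider(Spec spec, Map<Spec, Provider> bindings) {
        return Optional.ofNullable(bindings.get(spec));
    }

    public static void main(String[] args) {
        Spec jaxp = new Spec("JAXP", "1.2");
        // Placeholder coordinates, not a real artifact.
        Map<Spec, Provider> bindings =
                Map.of(jaxp, new Provider("org.example.xml", "jaxp-impl", "1.0"));
        System.out.println(chooseProvider(jaxp, bindings).isPresent());                    // true
        System.out.println(chooseProvider(new Spec("JAXP", "1.3"), bindings).isPresent()); // false
    }
}
```

Keeping the bindings in a separate map mirrors the request that the implementation be chosen separately from the POM that declares the virtual dependency.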

Environmental Context

Related to virtual dependencies, the artifact resolution process needs to take in environmental parameters as context. As an example, J2SE 1.4 contains interfaces and implementations for standard APIs. Or, the web application may be deployed to a known server environment. These characteristics should be made available when resolving the dependencies, rather than requiring the resolved artifact to be present in the local or remote repository (i.e. it might be present in the CLASSPATH already).
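
A minimal sketch of how environmental context could feed into resolution, assuming the environment can declare which specs it already provides. All names are hypothetical, the versions are illustrative, and the repository coordinates are a placeholder.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; the environment description is an assumption, not a Maven feature. */
final class EnvironmentAwareResolution {

    /** An API and version that something can require or provide. */
    record Spec(String api, String version) {}

    /** Either the environment supplies the API, or a repository artifact must be fetched. */
    record Outcome(boolean providedByEnvironment, String artifactToFetch) {}

    static Outcome resolve(Spec spec, Set<Spec> environmentProvides,
                           Map<Spec, String> repositoryBindings) {
        if (environmentProvides.contains(spec)) {
            return new Outcome(true, null); // already on the CLASSPATH, nothing to download
        }
        String artifact = repositoryBindings.get(spec);
        if (artifact == null) {
            throw new IllegalStateException("No provider for " + spec);
        }
        return new Outcome(false, artifact);
    }

    public static void main(String[] args) {
        Spec jaxp = new Spec("JAXP", "1.2");
        // Placeholder coordinates, not a real artifact.
        Map<Spec, String> repo = Map.of(jaxp, "org.example.xml:jaxp-impl:1.0");
        System.out.println(resolve(jaxp, Set.of(jaxp), repo).providedByEnvironment()); // true
        System.out.println(resolve(jaxp, Set.of(), repo).artifactToFetch());
    }
}
```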

Pinning Dependencies (Overrides)

A mechanism is needed for overriding an artifact in the dependency chain. A primary use of this is to provide hotfixes and patches to release distributions. In many cases the dependency chain for a given release is fixed, and in fact, for production use, variation in these dependencies is not acceptable. However, when there are critical updates to an artifact in the dependency chain, it needs to be possible to rebuild a given application using the same dependencies, but slotting in or overriding a given artifact with one that is user specified.
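
As a rough illustration of pinning, the sketch below takes an already resolved (fixed) dependency list and swaps in user-specified versions for selected identifiers, leaving everything else untouched. The names and coordinates are hypothetical; this is not how Maven 2.0.x works today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of overriding artifacts in a fixed dependency list. */
final class PinnedDependencies {

    record Artifact(String groupId, String artifactId, String version) {
        String key() {
            return groupId + ":" + artifactId;
        }
    }

    /** Returns a copy of the resolved list with pinned versions substituted by identifier. */
    static List<Artifact> applyPins(List<Artifact> resolved, Map<String, String> pins) {
        List<Artifact> result = new ArrayList<>();
        for (Artifact artifact : resolved) {
            String pinned = pins.get(artifact.key());
            result.add(pinned == null
                    ? artifact
                    : new Artifact(artifact.groupId(), artifact.artifactId(), pinned));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder coordinates for a released application's dependency list.
        List<Artifact> release = List.of(
                new Artifact("org.example", "core", "2.0"),
                new Artifact("org.example", "util", "1.3"));
        List<Artifact> patched = applyPins(release, Map.of("org.example:util", "1.3.1-hotfix"));
        System.out.println(patched.get(1).version()); // 1.3.1-hotfix
    }
}
```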

Filtering Parameters

While Maven 2.0.x provides some filters (namely version ranges) that enable resolution of a concrete artifact for a dependency, there are additional parameters which need to be passed into the resolution mechanism and perhaps should also be present in the artifact metadata in the repository. The primary ones I have found to be of interest are:

  • Required J2SE Version - What is the minimum (and maximum?) version required for a given artifact? JARs are currently stored in Maven repositories without any indication of the Java version of the class files. This will cause trouble for artifacts compiled with Java 5 that a user expects to run in a Java 1.4 environment.
  • License - (Apache, GPL, LGPL, etc.) Customers often have specific open source policies in place that define the acceptable open source licenses.
  • Signing - Artifacts must be signed by a given authority.
  • Vendor - Artifacts must be retrieved from a given repository or provided by a given vendor.
  • Tags/Release - For commercially backed distributions, it would be very helpful to be able to filter by vendor version (mycompany-distro-2.1) and tag the artifacts as appropriate. Defining an all-encompassing POM is not the same as tagging an artifact.
  • Debug/Obfuscation - It should be possible to request debug-enabled or non-debug builds, and obfuscated or non-obfuscated builds.

All of these parameters should be configurable as settings for the application or on a system-wide basis, and would be applied as an overlay on the resolution mechanism. They should be expressible either as hard requirements (stop if they can't be satisfied) or as soft preferences (e.g. +debug or ==JDK1.4).
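
The difference between hard requirements and soft preferences can be sketched as follows. This is an illustration only: the metadata keys, the candidate names and the scoring rule (most soft matches wins, first candidate on a tie) are assumptions, as is reading "==" as a hard requirement and "+" as a soft preference.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of filter parameters applied during resolution. */
final class ResolutionFilters {

    /** A candidate artifact with repository metadata such as jdk, license or debug. */
    record Candidate(String id, Map<String, String> metadata) {}

    /** A required (hard) or preferred (soft) metadata value. */
    record Constraint(String key, String value, boolean hard) {
        boolean matches(Candidate candidate) {
            return value.equals(candidate.metadata().get(key));
        }
    }

    /**
     * Drops candidates that fail any hard constraint, then picks the one that
     * matches the most soft constraints. Empty means the build should stop.
     */
    static Optional<Candidate> select(List<Candidate> candidates, List<Constraint> constraints) {
        Candidate best = null;
        int bestScore = -1;
        for (Candidate candidate : candidates) {
            int score = 0;
            boolean acceptable = true;
            for (Constraint constraint : constraints) {
                if (constraint.matches(candidate)) {
                    if (!constraint.hard()) {
                        score++;
                    }
                } else if (constraint.hard()) {
                    acceptable = false;
                    break;
                }
            }
            if (acceptable && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("lib-jdk14", Map.of("jdk", "1.4", "debug", "false")),
                new Candidate("lib-jdk14-debug", Map.of("jdk", "1.4", "debug", "true")),
                new Candidate("lib-jdk15-debug", Map.of("jdk", "1.5", "debug", "true")));
        List<Constraint> constraints = List.of(
                new Constraint("jdk", "1.4", true),     // ==JDK1.4
                new Constraint("debug", "true", false)); // +debug
        System.out.println(select(candidates, constraints).get().id()); // lib-jdk14-debug
    }
}
```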

Blacklisted Artifacts

Related to filtering, in some cases an artifact deployed to a repository may have problems and the specific artifact needs to be blacklisted, like in the case of a bad build. Currently repositories can be blacklisted, but this feature could be extended to exclude specific artifacts and versions, possibly by MD5 checksum also, so that they can be removed as candidates for dependency resolution.

This is useful when A depends on B, which depends on C. If B uses a version range that would match a new build of C, but the developer only controls A's POM and not B's, then the developer could blacklist C's latest build.
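
A minimal sketch of the A, B, C case, assuming the versions of C that match B's range are already known and ordered oldest first. The blacklist here is keyed by "groupId:artifactId:version"; a checksum-based key would work the same way. All names and versions are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of excluding specific builds from candidate selection. */
final class ArtifactBlacklist {

    /**
     * Picks the newest version that is not blacklisted.
     * versionsOldestFirst is assumed to contain only versions matching the range.
     */
    static Optional<String> newestAllowed(String groupId, String artifactId,
                                          List<String> versionsOldestFirst,
                                          Set<String> blacklist) {
        for (int i = versionsOldestFirst.size() - 1; i >= 0; i--) {
            String version = versionsOldestFirst.get(i);
            if (!blacklist.contains(groupId + ":" + artifactId + ":" + version)) {
                return Optional.of(version);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> versionsOfC = List.of("1.0", "1.1", "1.2");
        Set<String> blacklist = Set.of("org.example:c:1.2"); // the bad build
        System.out.println(newestAllowed("org.example", "c", versionsOfC, blacklist).get()); // 1.1
    }
}
```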

Human License Acknowledgment

There are two primary aspects to this requirement. One is the case where a vendor requires a user to confirm a license before downloading. This is the case, for example, with Sun and the BCL for many specification JARs. Eclipse has implemented this feature in the plug-in manager to require license acceptance. The other case is when, through transitive resolution, a user is actually importing more code than they think. For example, a user may specify an artifact with an Apache license that has a dependency on LGPL code. The user should be notified that by using the code they are accepting not only the Apache license through the prime artifact, but also the LGPL through a dependency.
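
The second case amounts to walking the dependency graph and collecting every license on the way. A minimal sketch, assuming the graph and the per-artifact license are available as simple maps (hypothetical names and data):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of gathering the licenses a user implicitly accepts. */
final class LicenseAudit {

    /** Collects the licenses of the root and everything reachable from it. */
    static Set<String> licensesReachableFrom(String root,
                                             Map<String, List<String>> dependsOn,
                                             Map<String, String> licenseOf) {
        Set<String> licenses = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (!visited.add(current)) {
                continue; // already handled, also guards against cycles
            }
            String license = licenseOf.get(current);
            if (license != null) {
                licenses.add(license);
            }
            queue.addAll(dependsOn.getOrDefault(current, List.of()));
        }
        return licenses;
    }

    /** Licenses that still need an explicit acknowledgment from the user. */
    static Set<String> unacknowledged(Set<String> required, Set<String> acknowledged) {
        Set<String> missing = new TreeSet<>(required);
        missing.removeAll(acknowledged);
        return missing;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = Map.of("app", List.of("a"), "a", List.of("b"));
        Map<String, String> licenseOf = Map.of("a", "Apache", "b", "LGPL");
        Set<String> required = licensesReachableFrom("app", dependsOn, licenseOf);
        System.out.println(unacknowledged(required, Set.of("Apache"))); // [LGPL]
    }
}
```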

Exposing Dependency Resolution to Clients

Maven 2.0.x provides an ArtifactResolver, whose purpose is to download the JAR associated with an artifact, and an ArtifactCollector, which does the dependency resolution. For client tools, there are cases where a user would like to visualize the dependency resolution mechanism in order to determine how applying different filters (for example) affects the traversal of the dependency graph. I'd like an API for getting the dependency graph of a given root artifact, with environmental parameters. The graph should highlight the potential artifacts for a given node, as well as which artifact would actually be selected. This would also seriously help with debugging a dependency chain.
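
To make the request concrete, here is one possible shape for a graph node that carries both the potential artifacts and the selected one, together with a plain-text rendering a tool could use for debugging. The Node type and render method are hypothetical; they are not part of the ArtifactResolver or ArtifactCollector APIs.

```java
import java.util.List;

/** Hypothetical read-only view of a resolved dependency graph. */
final class DependencyGraphView {

    /** One node: the dependency, its candidate versions, and the version selected. */
    record Node(String dependency, List<String> candidates, String selected,
                List<Node> children) {}

    /** Renders the graph as indented text, one node per line. */
    static String render(Node node, int depth) {
        StringBuilder text = new StringBuilder();
        text.append("  ".repeat(depth))
            .append(node.dependency())
            .append(" candidates=").append(node.candidates())
            .append(" selected=").append(node.selected())
            .append('\n');
        for (Node child : node.children()) {
            text.append(render(child, depth + 1));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Node lib = new Node("lib", List.of("1.0", "1.1"), "1.1", List.of());
        Node app = new Node("app", List.of("1.0"), "1.0", List.of(lib));
        System.out.print(render(app, 0));
    }
}
```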

Maven 2.1.x design suggestions

This might be out of place in a wiki document discussing needed features. However, after diving into maven-artifact 2.0.x, both to see whether these features could be built on the current APIs and to trace through the code to fix various resolution issues, I find that maven-artifact seems designed as an extension of Maven 1's artifact handling plus a transitive dependency mechanism. I would suggest a fresh look at the dependency handling, given the added features.

As an example, the Artifact object actually means two different things, a dependency and a resolved dependency. Likewise, the ArtifactResolver actually means both dependency resolution, and JAR downloading resolution. I would suggest that Artifact handling be refactored into both a maven-artifact-dependency package and maven-artifact. A dependency can be resolved into multiple artifacts, and in fact a dependency chain has a number of different successful 'walks'. Separating this from artifact resolution (which is obtaining the JAR) would help not only client tools, but also make it clearer for integrating new features, and debugging.
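
The suggested separation could look roughly like the sketch below: a Dependency is a request, an Artifact is a concrete result, and resolving one into the other is a different interface from fetching the file. Every type here is hypothetical and is not the maven-artifact 2.0.x API; the catalog data is illustrative.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical split of dependency resolution from artifact retrieval. */
final class SplitResolution {

    /** A request: an identifier plus a rule for acceptable versions. */
    record Dependency(String groupId, String artifactId, Predicate<String> acceptsVersion) {}

    /** A concrete, fully versioned result that could be downloaded. */
    record Artifact(String groupId, String artifactId, String version) {}

    /** Dependency resolution: maps a request to every artifact that satisfies it. */
    interface DependencyResolver {
        List<Artifact> candidates(Dependency dependency);
    }

    /** Artifact resolution: obtaining the file itself, kept as a separate concern. */
    interface ArtifactFetcher {
        Path fetch(Artifact artifact) throws IOException;
    }

    /** In-memory resolver over a catalog of "groupId:artifactId" to known versions. */
    static DependencyResolver catalogResolver(Map<String, List<String>> catalog) {
        return dependency -> {
            String key = dependency.groupId() + ":" + dependency.artifactId();
            List<Artifact> matches = new ArrayList<>();
            for (String version : catalog.getOrDefault(key, List.of())) {
                if (dependency.acceptsVersion().test(version)) {
                    matches.add(new Artifact(dependency.groupId(),
                            dependency.artifactId(), version));
                }
            }
            return matches;
        };
    }

    public static void main(String[] args) {
        DependencyResolver resolver =
                catalogResolver(Map.of("struts:struts", List.of("1.1", "1.2.4", "1.2.9")));
        Dependency struts12 = new Dependency("struts", "struts", v -> v.startsWith("1.2"));
        System.out.println(resolver.candidates(struts12).size()); // 2
    }
}
```

A client tool would only need the DependencyResolver side to inspect candidates, without triggering any download.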

1 Comment

  1. virtual dependencies: these have been in the maven 2.1 design plans since about last August and proposed as a feature since about 2003. We should get around to it "real soon now" :) More specific implementation details are welcome. I don't imagine they are going to be too complicated.

    environmental context: in some ways, I'd figured that'd be covered by spec dependencies + the provided scope. I'm not entirely sure how you are anticipating these are used (at build time, when the provided scope is in play, or at runtime in some plugin/artifact client to detect the existence of libraries and filter them out of dependencies)

    pinning deps: there's some discussion if depMgmt should be utilised for this. Currently, including a dependency in the current POM that is building the final application can be used to set the version globally as nearest wins. Restricting it to a single version also works.

    filtering:
    required java version is definitely required, and can be used to populate plugin configuration (eg, source, target, etc) at build time.
    license is already there, but introducing known values is needed.
    Signing - depends on if you mean pgp signing or jar signing. Probably better to inspect the JAR or detached sig from the local repo rather than affect the resolution mechanism.
    vendor = groupID or organization, already in the pom
    tags/release - afraid I don't quite get this. Isn't this a new artifact that provides the other one? this is very similar to spec dependencies.
    debug/obfusc - traditionally the classifier. We are looking at adding platform/architecture explicitly, but for these I think the classifier suffices. I'm not sure whether adding queryable metadata on this will be helpful since only the ones actually produced will be present anyway.

    blacklisted - there are already global excludes in jira; this would be global excludes with a version. BTW, blacklisting of a repository is an internal mechanism to stop using something that is not working, it is not a persistent part of the build.

    human license ack is already in jira. This is pretty dangerous in a build, so it would need to be backed by sufficient batchmode configurability.

    exposing dr: this can sort of be done by passing a listener in and re-resolving the artifact, but the api could be improved.

    the problems with maven-artifact's design are fairly well known, and it is probably the oldest code in m2. Redesigning was considered, but not deemed important. The same consideration should keep being given through future releases, though binary compatibility is very important now for plugins to continue operating as expected.

    I'd suggest creating jira issues for the things not already in there and putting them in this doc for tracking.