Repository Layout Definition
This is the proposed layout for the Maven 2.x repository. It is current as of March 15, 2005, and all comments made before that date are considered incorporated.
Issues with the old layout
- It doesn't scale physically. The disk at ibiblio is completely thrashed during peak hours. I've had cases where it took a minute and a half to run ls in the root.
- For a user browsing the repository through a browser, it's hard to locate the wanted artifacts because of the number of files in each directory. This might not be a big problem if we build an application on top of the repository for browsing.
- The layout doesn't distinguish between "primary" and "secondary" artifacts ("secondary" artifacts being javadoc, sources, etc. that do not have a POM of their own).
The current layout looks like this:
The proposed layout:
Changes from the old layout include:
- Overall, the entire directory tree is much deeper. This serves two purposes. It scales better in terms of the number of files and directories in a single directory, so the hard disks hosting the repository aren't thrashed the way they are on ibiblio today. It also makes it easier for a user to get an overview of an artifact, its versions, and its secondary artifacts.
- For each primary artifact there is a pom file.
- No symlinks.
For primary artifacts:
For secondary artifacts:
$groupId is an array of strings made by splitting the groupId on "." into directories. The group org.apache.maven would then yield org/apache/maven. This should mostly mirror the package structure in Java, though it can apply to any language.
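The groupId splitting described above can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, and the file name patterns are assumptions, since the pattern lines themselves are not reproduced in this text. The sketch assumes $groupId/$artifactId/$version/$artifactId-$version.$extension for primary artifacts, with an extra suffix (called classifier here, an assumed name, for example sources or javadoc) before the extension for secondary artifacts.

```java
/** Hypothetical helper for illustration; not part of Maven itself. */
final class RepositoryPaths {
    private RepositoryPaths() {}

    /** Splits the groupId on "." and joins the pieces as directories. */
    static String groupDirectory(String groupId) {
        return String.join("/", groupId.split("\\."));
    }

    /** Assumed pattern: $groupId/$artifactId/$version/$artifactId-$version.$extension */
    static String primaryArtifactPath(String groupId, String artifactId,
                                      String version, String extension) {
        return artifactPath(groupId, artifactId, version, null, extension);
    }

    /** Assumed pattern: same directory, with "-$classifier" before the extension. */
    static String secondaryArtifactPath(String groupId, String artifactId,
                                        String version, String classifier,
                                        String extension) {
        return artifactPath(groupId, artifactId, version, classifier, extension);
    }

    private static String artifactPath(String groupId, String artifactId,
                                       String version, String classifier,
                                       String extension) {
        StringBuilder fileName = new StringBuilder(artifactId).append('-').append(version);
        if (classifier != null) {
            fileName.append('-').append(classifier);
        }
        fileName.append('.').append(extension);
        return groupDirectory(groupId) + "/" + artifactId + "/" + version + "/" + fileName;
    }
}
```

Under these assumptions, RepositoryPaths.groupDirectory("org.apache.maven") gives org/apache/maven, and primary and secondary artifacts of the same version end up side by side in one version directory.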
For each primary artifact there will be a POM:
- A $artifactId-$version.$extension.pom
POMs that are exclusively parents (packaging = pom) will not have such a file however.
Secondary artifacts do not need a POM - they will reference the associated primary POM.
For each file that is in the repository there must be a file containing its checksum, typically MD5 or SHA-1. There may also be a digital signature.
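A minimal sketch of how such checksum files could be produced, using the standard java.security.MessageDigest API. The class and method names are hypothetical, and the naming of the checksum file (the original file name plus .md5 or .sha1) is an assumption that this text does not spell out.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Maven itself. */
final class Checksums {
    private Checksums() {}

    /** Returns the lowercase hex digest of the data for "MD5" or "SHA-1". */
    static String hexDigest(byte[] data, String algorithm) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            // Mask to an unsigned value so every byte prints as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Assumed naming: file name + "." + algorithm in lowercase without dashes. */
    static String checksumFileName(String fileName, String algorithm) {
        return fileName + "." + algorithm.toLowerCase(Locale.ROOT).replace("-", "");
    }
}
```

With this naming, Checksums.checksumFileName("maven-core-2.0.jar", "SHA-1") yields maven-core-2.0.jar.sha1, and the file would contain the string returned by hexDigest for the artifact's bytes.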
A complete example
To support different uses and archival policies, three repository types will be added later:
- SNAPSHOT repository, where only snapshots are stored
- Archive repository, a read-only repository from which old releases can be downloaded
- Release repository, where official, current releases go
Repositories can be any combination of these.
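One way to model "any combination" of the three types is a set of flags. The sketch below is an illustration only: the type and method names are hypothetical, and the deployment rule is an assumption layered on the descriptions above (snapshots are recognised by a version ending in -SNAPSHOT, and the archive role, being read-only, never accepts new deployments by itself).

```java
import java.util.EnumSet;

/** The three repository types listed above. */
enum RepositoryType { SNAPSHOT, ARCHIVE, RELEASE }

/** Hypothetical model of a repository that plays one or more of these roles. */
final class RepositoryRoles {
    private final EnumSet<RepositoryType> types;

    RepositoryRoles(EnumSet<RepositoryType> types) {
        this.types = EnumSet.copyOf(types);
    }

    /** Assumed rule: snapshots need the SNAPSHOT role, everything else needs RELEASE. */
    boolean acceptsDeployment(String version) {
        boolean snapshot = version.endsWith("-SNAPSHOT");
        return snapshot
                ? types.contains(RepositoryType.SNAPSHOT)
                : types.contains(RepositoryType.RELEASE);
    }
}
```

A repository created with EnumSet.of(RepositoryType.SNAPSHOT, RepositoryType.RELEASE) would accept both kinds of version under this rule, while one created with only RepositoryType.ARCHIVE would accept neither.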
The POM will always reference the primary artifact.
The conflict ID of a dependency is:
and the full versioned ID is:
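The two ID formats are not reproduced in this text. As a sketch under that caveat, the following assumes the conflict ID is groupId:artifactId:type and that the full versioned ID appends :version to it. The class and method names are hypothetical.

```java
/** Hypothetical helper for illustration; the ID formats are assumptions. */
final class ArtifactIds {
    private ArtifactIds() {}

    /** Assumed format: groupId:artifactId:type (no version). */
    static String conflictId(String groupId, String artifactId, String type) {
        return groupId + ":" + artifactId + ":" + type;
    }

    /** Assumed format: the conflict ID followed by ":" and the version. */
    static String versionedId(String groupId, String artifactId,
                              String type, String version) {
        return conflictId(groupId, artifactId, type) + ":" + version;
    }
}
```

The point of the split is that two dependencies with the same conflict ID but different versioned IDs refer to the same artifact at different versions, so only one of them can be selected.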
Subtypes are created by particular mojos: e.g. the ejb packaging will create both an ejb and an ejb-client artifact, and apidocs:package will create a JAR of the javadocs.
Dependencies will generally only reference the type corresponding to the packaging. However, in some cases a dependency on a subtype will also be required, e.g.:
This subtype will be mapped to the same artifact handler that deals with the ejb packaging, so that the main POM can be found.
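The mapping of a subtype to the handler of its packaging can be pictured as a simple lookup table. The sketch below is hypothetical throughout: the registry class, its methods, and the use of plain strings as handler names are illustrative and do not describe Maven's actual artifact handler API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry mapping dependency types to a handler name. */
final class HandlerRegistry {
    private final Map<String, String> handlerByType = new HashMap<>();

    /** Registers one handler for a packaging type and all of its subtypes. */
    void register(String handler, String... types) {
        for (String type : types) {
            handlerByType.put(type, handler);
        }
    }

    /** Looks up the handler responsible for the given dependency type. */
    String handlerFor(String type) {
        String handler = handlerByType.get(type);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for type: " + type);
        }
        return handler;
    }
}
```

After registry.register("ejb", "ejb", "ejb-client"), a dependency of type ejb-client resolves to the same handler as ejb, which is what lets the primary artifact's POM be located.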
Alternatively, we could drop the extension and always use:
This change is not currently planned.
This would require that one artifact ID only ever be associated with one packaging, which is a best practice. The only opposing use case seems to be something such as a TLD - however, this is really a subtype of a taglib JAR.