Senior Software Engineer
BEA Systems, Inc.
Business Interaction Division
This document is long and terse; I apologize for that.
I've divided it up into four sections: one section of Definitions, one section of Use Cases, and two "Requirement Chains".
The section on Definitions is very important, because it attempts to discuss SCM "version numbers" while being totally agnostic as to whether CVS, SVN or P4 is being used under the hood. I've made up a few terms in an effort to remain agnostic; I want to especially discourage anyone from using the term "Revision Number" while discussing this document, because each provider uses that term differently.
After that, I define four use cases. Each of them is followed by a "Failure Use Case".
Finally, there are two "Requirement Chains". The Requirement Chains are so-called because they discuss multiple requirements that flow from one another. If earlier Requirements aren't satisfied, later Requirements are irrelevant. If only earlier Requirements are satisfied, later Requirements become more important; in some cases it would even be harmful to satisfy earlier Requirements without satisfying later Requirements.
Also, I've been very explicit about the logic that I used to come up with these requirements. That's because I want to make sure we can all see why we want these things. My claim is that if you accept the top-level Requirements and Assertions, you'll be forced to accept the Derived Requirements below, and you'll also know why they're necessary.
Finally, as a note, these are not the only requirements BEA needs from Maven. There's still other stuff not mentioned here ("who depends on me?" reports, VStudio integration, dependency conflict reports) that don't require an elaborate requirements document to explain/describe, and won't necessitate any changes to the Maven core or Continuum. (Some of them are already on the Maven roadmap.) Still, issues like these are very important to us; even if all of the issues in this doc were fixed today, we'd still have quite a bit of work ahead of us before we could actually use Maven for everything we do.
OK, on with the show!
SCM - Source Configuration Management. We have to do this so we can keep track of our source files.
SCM Provider - CVS, SVN, Perforce, ClearCase, etc. Each of these is a different provider of SCM. (You could call them vendors, but nobody really "vends" CVS.)
SCM System - One particular SCM machine (or a cluster of machines that behave as if they were one machine) running some SCM software.
Consumer - A Maven project that "consumes" some Producer as a dependency.
Producer - A Maven project that provides code for a Consumer to use.
Maven Version Number - The number that appears in the /project/version tag in some project's pom.xml file.
Checked-in Maven Version Number - The Maven Version Number that appears in the pom.xml file that appears in the SCM system.
Continuum Published Maven Version Number - The Maven Version Number that appears in the pom.xml file that Continuum publishes to the remote repository. (Normally this is the same as the Checked-in Maven Version Number, but these can be different if, say, Continuum changes the Maven Version Number as it is published.)
Suppose you have a directory layout like this:
Single File SCM Number - A version number that applies only to a particular file under source control. Each file has its own Single File SCM Number. In the example above, One.java has its own Single File SCM Number separate from Two.java's Single File SCM Number. Check-ins that only modify One.java do not change the Single File SCM number of Two.java, and vice-versa.
Global SCM Number - A version number that applies to the entire SCM system at once. There should be only one Global SCM Number at a time for any given SCM system. (In CVS, this version number can simply be the date of the last check-in.)
Directory Specific SCM Number - A version number that applies only to a particular directory in the SCM system. In SCM systems that have Global SCM Numbers, you can find a Directory Specific SCM Number by finding the largest Global SCM Number that affected files in that directory. (In CVS, this version number can simply be the date of the last check-in in the specified directory.)
"Revision" Number (deprecated term) - Means different things across various SCM systems, and should not be used in a technical discussion that intends to remain agnostic to SCM systems. In particular, CVS *only* has Single File SCM Numbers, called "Revision" numbers. Subversion does not have Single File SCM numbers, but only has the Global SCM Number, and calls it the "Revision" number. Perforce has Single File SCM Numbers (called "Revision" numbers, just like CVS), and it also has a Global SCM Number, called a "Changelist" number.
Let's refer to Single File SCM Numbers by reference to a file name followed by a # and a number. So One.java#4 means One.java's fourth Single File SCM Number. Global SCM Numbers will just be called "G" followed by a number, e.g. G123.
Let's say this happens:
Alice checks in One.java#1: G1
Bob checks in Two.java#1 and Three.java#1: G2
Alice checks in One.java#2: G3
Bob checks in Two.java#2: G4
Alice checks in One.java#3 and Two.java#3: G5
Bob checks in Two.java#4: G6
At this point, "bar"'s Directory Specific SCM Number is 5, because One.java was updated in G1, G3 and G5, and not since then. "baz"'s Directory Specific SCM Number is 6, because Two.java was updated in G6. (Note that Three.java never changed.)
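The walk-through above can be reproduced with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the directory paths are an assumption (the text says One.java lives under "bar" and Two.java under "baz"; placing Three.java under "baz" is a guess that does not affect the result).

```python
from collections import defaultdict

# Check-in history from the example: (Global SCM Number, files touched).
# ASSUMPTION: the paths below are illustrative; only the "bar"/One.java and
# "baz"/Two.java pairings come from the text.
CHECKINS = [
    (1, ["bar/One.java"]),
    (2, ["baz/Two.java", "baz/Three.java"]),
    (3, ["bar/One.java"]),
    (4, ["baz/Two.java"]),
    (5, ["bar/One.java", "baz/Two.java"]),
    (6, ["baz/Two.java"]),
]


def directory_specific_scm_numbers(checkins):
    """Map each directory to the largest Global SCM Number that touched it."""
    latest = defaultdict(int)
    for global_number, paths in checkins:
        for path in paths:
            directory = path.rsplit("/", 1)[0]
            latest[directory] = max(latest[directory], global_number)
    return dict(latest)
```

Calling `directory_specific_scm_numbers(CHECKINS)` gives 5 for "bar" and 6 for "baz", matching the numbers worked out by hand above.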
Alice, a QA engineer, finds and files a bug in the "P" project, whose Maven Version Number is "2.2". The developer Bob attempts to fix the bug in Foo.java. He checks some changes into the SCM system, without modifying pom.xml. The SCM system gives Bob an SCM Number (of some kind) that represents the check-in... let's call it "123".
The continuous build system automatically creates a build containing Bob's changes; the build is a success. Bob marks the bug as "Coded - Please Verify." But Alice reports that the bug is "Not Fixed."
Bob is skeptical, and asks Alice whether she tested with a binary containing check-in "123". Without using a pom.xml file or the SCM system itself, Alice examines the binary's metadata and verifies whether it contains that check-in number.
Failure Use Case: The Maven Version Number of P contains only "2.2", so P's binary metadata doesn't contain any more information than that. Alice has no way to tell whether she has the correct copy of P version "2.2".
Tuesday morning, the automated continuous build system generates a build for project "P". Alice tests this project on Tuesday afternoon.
A day passes, and no one checks-in any changes to P. On Wednesday, someone kicks-off an "on demand" build of P. P builds successfully as it did on Tuesday, and the P binaries are given to Alice to test. Alice examines the binary metadata and discovers that she does not need to retest P.
Failure Use Case: P's builds are identified by "build numbers", which go up even when the source does not change. Alice is unable to tell whether this binary needs to be retested or not.
Developer Carol works on the "C" project, which consumes Bob's "P" project. Carol wishes to continuously integrate with P, to make sure that she's constantly using the latest version of P.
On Tuesday, the automated continuous build system creates a build of C, which runs a series of automated tests. The build+test process passes; Carol makes no further changes to C.
On Wednesday, Bob checks-in a change to project P, which builds successfully on the automated continuous build system, but it doesn't work correctly with C. The build system automatically updates Carol's C pom.xml file, pointing at the new version of P, and then builds C using the new P.
Even though no other code changes went into C, C automatically gets a new version number, because it is using a newer version of P. The continuous build system therefore automatically builds the new C. This new version of C fails to pass the automated tests. Carol remembers that she hasn't changed C herself, so she looks in the SCM history to see what has changed in C. She sees that the automated build system has modified C to point at a newer version of P.
Carol synchronizes her source code back to the configuration it was in on Tuesday. She runs that build on her machine and finds that it works, just like it did on Tuesday. She syncs to the latest Wednesday version of C and attempts to build that. She finds that the automated tests no longer pass. Realizing what has happened, she opens a bug against Bob. Bob fixes the bug in P and then kicks off a build of P. The automated build system automatically updates C to use the new version of P, and builds C. C now builds successfully.
Finally, later in the development cycle, Carol decides that she doesn't want C to be automatically upgraded to the latest version of P. She makes a change to her pom.xml file to indicate that automatic upgrades are not desired. Now, when a new version of P is released, C remains unaffected.
Failure Use Case 1: Carol is using a generic or "soft" version number in her C pom.xml, instead of a hard literal version number. Carol's code does not change when there's a newer version of P, so the automated build system does not build a new version. Later, when Carol changes C, she finds that the build is failing, but not due to any change of her own. Carol rolls back her changes and finds that the build still doesn't work. After investigation, she finds that P was to blame, so Bob releases a new version of P. Then, C's build magically starts working again.
Failure Use Case 2: On Wednesday, C is using a newer version of P than Tuesday, but C has not changed. Alice looks at C, observes that it hasn't changed, and decides not to retest it, not realizing that there are now bugs in that version of C that weren't there on Tuesday.
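The automatic update step in this use case could look roughly like the sketch below. It is not how Maven or Continuum work today; the function names, the plain dictionaries standing in for pom.xml and the remote repository, and the simple dotted version format are all assumptions made for illustration. The point is that C always records a hard version of P, and that the upgrade can be switched off per Producer.

```python
def parse_version(text):
    """Turn a dotted version such as '1.10' into a comparable tuple (1, 10)."""
    return tuple(int(part) for part in text.split("."))


def update_pinned_versions(pinned, published, auto_update):
    """Return the hard versions a Consumer should declare after an update.

    pinned      -- {producer: hard version currently declared by the Consumer}
    published   -- {producer: [versions available in the remote repository]}
    auto_update -- set of producers for which automatic upgrades are enabled

    HYPOTHETICAL helper: writing the result back to pom.xml and checking it
    in to the SCM system are left out of this sketch.
    """
    updated = dict(pinned)
    for producer, current in pinned.items():
        if producer not in auto_update:
            continue  # Carol opted out: keep the declared version as-is
        candidates = published.get(producer, []) + [current]
        updated[producer] = max(candidates, key=parse_version)
    return updated
```

With `pinned={"P": "1.0"}` and `published={"P": ["1.0", "1.1"]}`, enabling auto-update for P moves the declaration to "1.1"; with an empty `auto_update` set the declaration stays at "1.0". Either way the declared version is a literal one, so Tuesday's source still rebuilds Tuesday's binary.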
Bob, a developer, owns a project P that builds a Windows library file called "Z.dll"; Z.dll is consumed by another library Y.dll owned by developer Carol, which is consumed by another library X.dll, which is consumed by an executable called "Foo.exe".
Carol declares in her Y pom.xml file that she depends on Z.dll version 1.0. Later, Bob releases Z.dll version 1.1. (In this scenario, Carol is not auto-updating as in the previous case.)
Dave, a customer, calls complaining of a critical problem in Foo.exe. Eventually, the problem is transferred to Carol. Carol reproduces the problem on a test machine, and suspects that the problem may be remedied by replacing Z.dll with version 1.1. Without using Maven, Carol copies Z.dll version 1.1 directly from the remote repository and copies it into her installation directory. She finds that this fixes the problem.
Carol checks-in an update to Y's pom.xml and kicks off builds and tests. Carol finds that no regressions are introduced, so she sends Z.dll version 1.1 to Dave as a critical update, without sending a new X.dll or Foo.exe, since these files were not changed. Dave copies only Z.dll into the installation directory, and finds that the problem is fixed.
Failure Use Case 1: The Windows library in the Maven repository is called "Z-1.1.dll", but Y.dll declares in its linking metadata that it depends on "Z-1.0.dll". Carol kicks off a new build of Y using the new Z. Y now has a new version number, so she has to kick off a build of X to consume the new Y. But now X has a new version number, so she has to rebuild Foo.exe as well to use the new X. When she's done, she finds that this does fix the problem, but now Dave must replace not only Z.dll but also Y.dll, X.dll and Foo.exe to benefit from the fix.
Failure Use Case 2: The Windows library in the Maven repository is called "Z-1.1.dll", even though it was built as "Z.dll". Carol copies this into her installation directory, but it doesn't work, and she doesn't know why.
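To make Failure Use Case 2 concrete, here is a sketch of the renaming that Carol (or any tool copying files out of the repository) would have to perform by hand. The helper name is hypothetical, and it assumes the repository file name has the form name-version.extension, as in the "Z-1.1.dll" example.

```python
def build_time_name(repository_name, version):
    """Strip a '-<version>' suffix from a repository file name.

    ASSUMPTION: repository names look like '<name>-<version>.<extension>'.
    Names that do not match are returned unchanged.
    """
    stem, dot, extension = repository_name.rpartition(".")
    suffix = "-" + version
    if dot and stem.endswith(suffix):
        return stem[: -len(suffix)] + dot + extension
    return repository_name
```

`build_time_name("Z-1.1.dll", "1.1")` yields "Z.dll". Note that the caller has to know the version already in order to undo the renaming, which is exactly the kind of out-of-band knowledge Carol lacks in this scenario.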
At BEA, we have a facility called "J2C" which we can use to convert (or "jump") Java code into C#, which we then compile into .NET assembly DLLs. Jumpable code will thus have two sets of dependencies: one set of "Java" dependencies, and another set of ".NET" dependencies.
Normally all of the source files are Java, though there may be some source files that are "non-jumpable", in which case you'll have three sets of source: the jumpable Java, the non-jumpable Java, and the native C#.
Bob, a developer, maintains a jumpable project, Producer P. Carol is a developer on another jumpable project, Consumer C. When Bob runs builds of P, the build automatically runs J2C to convert his Java code into C#. It then builds the Java JAR output, as well as the C# DLL output. It also runs unit tests against both the Java and the C#.
Bob then checks in his code and publishes it. When published, the C# and the Java code appear in the repository as a single project with multiple artifacts. Both of these artifacts appear in the same directory in the remote repository. They necessarily have the exact same version numbers (because they both descend from the same source code in SCM).
Carol's Consumer C depends on P in both of its forms: the Java build of C depends on the Java part of P; the .NET build of C depends on the .NET build of P. Carol declares this in her pom.xml file; she only has to declare the version number of P once, because there is just one project and just one version number. Carol's build then generates Java and .NET artifacts.
Failure Use Case 1: Bob creates a build that has multiple artifacts, but only one of them is allowed to be "primary"; he must declare the packaging (filetype) of the primary artifact explicitly. He selects the Java artifact to be primary, so P's packaging is "jar". Carol then can't add the secondary .NET artifact to her .NET classpath, because her declaration of dependency on P means that she only gets the "primary" artifact, whose packaging is "jar".
Failure Use Case 2: Bob must maintain two projects, "p-java" and "p-dotnet". Bob arranges his artifacts like this:
The "p-dotnet" pom.xml declares that its source directory is "../p-java/src".
Bob checks in a change to One.java. This automatically changes the Directory Specific SCM Number of "p-java", but it doesn't change the Directory Specific SCM Number of "p-dotnet". Therefore "p-dotnet"'s version number isn't kept up to date with "p-java"'s version number. The build system therefore knows to automatically kick off a build of p-java, but forgets to build p-dotnet, because p-dotnet apparently hasn't changed.
Failure Use Case 3: Exactly as above in Failure Use Case 2, but somehow, the build system is smart enough to know to rebuild p-dotnet. But p-dotnet's Directory Specific SCM Number hasn't changed, so its version number doesn't change. Therefore Alice, a QA Engineer, can't tell whether or not to retest p-dotnet.
Failure Use Case 4: As above in Failure Use Case 2. An automated system automatically updates p-dotnet after Bob checks in changes to p-java. Therefore p-dotnet is automatically rebuilt with a new version number, but this version number is different from the version number of p-java. Carol must declare her dependency on Bob's project twice, with two different version numbers, even though the source is the same, doubling the reporting workload, and introducing new opportunities for errors.
Here comes a bunch of logic. Simple "Requirements" are requirements that I think everyone already accepts. Simple "Assertions" are statements that I think everyone already knows.
"Derived Assertions" are Assertions that follow logically from other Assertions. "Derived Requirements" are Requirements that follow logically from other Requirements or Assertions.
Consider a Consumer build project called "C" and a Producer build project called "P".
1) Requirement: P's version number must always change if its source has changed, and should not change if its source has not changed.
    1a) Assertion: "Build numbers" generated by the build engine (Continuum) increase even if the source has not changed.
    1b) Assertion: "Global SCM Numbers" increase even if the source of P has not changed.
    1c) Derived Requirement: On account of 1a and 1b, official (Continuum) builds should be identified by their Directory Specific SCM Number.
    1d) Derived Requirement: On account of 1c, Continuum should publish P's built artifacts using P's Directory Specific SCM Number (appending it to P's Checked-in Maven Version Number) when it does official builds.
2) Requirement: C's official build must be reproducible using the source of C and the build repository.
    2a) Clarification: In particular, if on Tuesday C's build succeeds but on Wednesday C's build fails, it must be because there was some change in C that caused the failure.
    2b) Derived Requirement: On account of 2a, I should be able to synchronize the source of C back to Tuesday's source and rebuild that; that C build should build successfully if it built successfully on Tuesday.
    2c) Derived Requirement: On account of 2a and 2b, C must always declare in C's source code that it depends on a *specific* "hard" version of P, and never use a generic or "soft" term like "LATEST" or "SNAPSHOT" in an official build.
3) Assertion: Updating C manually every time you want to use a slightly newer version of P is very tedious and error-prone.
    3a) Derived Requirement: On account of 2c and 3, it should be possible (but not required) to automatically update and check-in C to make it use a newer version of P. (These automatic updates must change C's source code to ensure that C is reproducible.)
4) Assertion: If you change C's source code during the build of C, then its Directory Specific SCM Number changes during its own build.
    4a) Assertion: Therefore, Directory Specific SCM Numbers are vague if check-ins happen during the build: you may need to consider the difference between the Directory Specific SCM Number when you began the build and the Directory Specific SCM Number when you finished the build.
    4b) Assertion: Since all check-ins modify the latest code in the repository, if C's build checks-in to C, it is impossible to rebuild old versions of C without clobbering new versions of C.
    4c) Derived Requirement: On account of 4a and 4b, C should never check-in to itself during its own build.
5) Assertion: A builder of C must poll the remote repository to see if there's a newer version of P; the remote repository can't/won't send a just-in-time message to clients informing them when there is new code available.
    5a) Clarification: In other words, it can't be P's job to update all consumers of P. A consumer C must check on all of its dependency declarations to see if any of them need to be updated.
    5b) Derived Requirement: On account of 4c and 5, C should (be able to) poll the remote repository for newer versions of its dependencies and check-in an updated version of its POM immediately before C builds (in a pre-build step).
6) Requirement: Single-sourced "jumpable" builds (and other builds with multiple artifacts of various filetypes) must assign the same version number to all of the derived filetypes.
    6a) Assertion: If you try creating multiple POMs in multiple "sibling" directories pointing to the same source files (as in Use Case 4/Failure Use Case 2), not all the various POMs' Directory Specific SCM Numbers will be changed when code is checked-in.
    6b) Derived Requirement: On account of 6 and 6a, a single POM needs to be able to generate multiple packaging types (perhaps via profiles).
1d) Continuum should publish P's Directory Specific SCM Number (appending it to P's Checked-in Maven Version Number) when it does official builds.
5b) Consumer C should (be able to) poll the remote repository for newer versions of its dependencies and check-in an updated version of its POM immediately before C builds (in a pre-build step).
6b) A single POM needs to be able to generate multiple packaging types (perhaps via profiles).
Notice that these requirements depend on each other. There's little point in coding requirements 5b/6b without coding requirement 1d. Furthermore, it's actually worse to code requirement 1d without coding requirements 5b/6b... it would be better to leave off 1d entirely than to code it without requirements 5b/6b.
Furthermore, note that if requirement 3a (automatic updating) is coded as part of the build (against requirement 5b, which explicitly requires that updates happen pre-build), you'll actually violate reproducibility, again making the problem worse, not better.
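As a small illustration of requirement 1d, the sketch below builds a published version from the Checked-in Maven Version Number and the Directory Specific SCM Number, then uses it for the retest decision from the Tuesday/Wednesday use case. The function names and the "-" separator are assumptions; this document does not prescribe a format for the combined number.

```python
def published_version(checked_in_version, directory_scm_number):
    """Append the Directory Specific SCM Number to the checked-in version.

    ASSUMPTION: a '-' separator is used purely for illustration.
    """
    return f"{checked_in_version}-{directory_scm_number}"


def needs_retest(tested_version, new_version):
    """A binary needs retesting only if its published version differs."""
    return tested_version != new_version


# Tuesday's build and Wednesday's on-demand rebuild of unchanged source
# carry the same identifier, unlike an ever-increasing build number.
tuesday = published_version("2.2", 123)
wednesday = published_version("2.2", 123)
```

Here `needs_retest(tuesday, wednesday)` is false because nothing under P's directory was checked in between the two builds. A later check-in would raise the Directory Specific SCM Number and change the published version with it.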
6) Assertion: Windows has no symlink feature and no automated LD-style r=
    6a) Assertion: Therefore, Windows libraries (DLL files) must have the same name at build-time and at run-time; they will not work if they are renamed after they are built (unless they are re-renamed back to their build-time names).
7) Requirement: It should be possible to test a Windows Consumer library Consumer.dll with a newer version of the Producer library Producer.dll without rebuilding Consumer.dll.
    7a) Assertion: Under the current Maven naming standard, Windows libraries (like all other files) include version numbers in their names.
    7b) Derived Assertion: On account of 6a and 7a, under the current Maven naming standard, requirement 7 is not fulfilled.
    7c) Derived Requirement: On account of 7, Windows libraries should not include version numbers in their names, either at build-time or at run-time.
8) Assertion: In Maven, you can use the /project/build/finalname tag in pom.xml to make files use a specified name at build-time. If you use <finalname>, the file will be renamed to include the version number in the default repository layout.
    8a) Derived Assertion: On account of 8 and 6a, Windows libraries will not work at run-time if users copy them directly from the Maven default repository layout without renaming them.
9) Assertion: As it currently stands, Maven plug-ins that use Windows libraries need to rename them back to their "finalname" before using them. Each Maven plug-in that can touch a Windows library must be coded to handle this special case.
10) Requirement: If a file has been built using Maven, it should be possible to tell what version it is without access to any other file.
    10a) Derived requirement: On account of 10, if possible, Maven should add version numbers to built file names.
    10b) Assertion: Windows libraries and Java JAR files can contain metadata containing version information. JARs contain metadata in a meta-inf/manifest.mf file zipped into the JAR; Windows libraries embed VersionInfo metadata that you can see by right-clicking on the file and viewing the file properties.
    10c) Derived requirement: On account of 10, if possible, Maven should burn version numbers into files that it builds, in Windows file meta data or in Java "manifest.mf" files, or by some other appropriate standard mechanism.
    10d) Derived requirement: On account of 10, if a file's name does not include its version number, its version number should at least be somehow burned into the file itself.
11) Assertion: End-users use the <finalname> tag to indicate that it is important that the file have a certain name at runtime; they assert that this requirement is more important than the requirement 10a.
    11a) On account of 6a and 11, Windows libraries should have their "finalname" set to their build-time name.
12) Requirement: It should be easy to consume files built with Maven just by copying them out of the remote repository, even if the consumers aren't using Maven.
    12a) Derived Requirement: Therefore, files built with Maven should not spend any time under a name that makes them unusable.
    12b) Derived Requirement: On account of 11 and 12, files built with Maven that have a <finalname> tag should always have that name, even in the default repository layout, because this is more important than requirement 10a.
    12c) Derived Requirement: On account of 6a, 11a and 12, Windows libraries built with Maven should always have the same name they had at build time, even in the default repository layout, because this is more important than requirement 10a.
13) Requirement: Maven plug-ins that handle files with special naming requirements must always use those files under a name that works.
    13a) Derived Requirement: Therefore, the Maven Core should always provide working filenames to plugins.
    13b) Derived Requirement: On account of 11 and 13a, when Maven uses files that have the <finalname> tag, the Maven Core should provide files to plug-ins under the specified finalname.
    13c) Derived Requirement: On account of 6a, 11a and 13a, when Maven uses Windows libraries, the Maven Core should provide those library files to plugins under the same name under which they were built.
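Requirements 10 and 10b through 10d ask that a file's version be discoverable from the file alone. For JARs, a consumer could read it out of the manifest along these lines. This is a sketch under assumptions: it uses the standard `Implementation-Version` manifest attribute as the place where the version is burned in (this document does not say which attribute should be used), the helper names are made up, and it ignores manifest continuation lines and per-entry sections.

```python
import io
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"  # standard manifest location in a JAR


def read_manifest_version(jar_bytes, attribute="Implementation-Version"):
    """Return the value of a main-section manifest attribute, or None."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        text = jar.read(MANIFEST_PATH).decode("utf-8")
    for line in text.splitlines():
        name, separator, value = line.partition(": ")
        if separator and name == attribute:
            return value.strip()
    return None


def make_demo_jar(version):
    """Build a tiny in-memory JAR whose manifest carries the given version.

    HYPOTHETICAL helper, used only so the example is self-contained.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        manifest = "Manifest-Version: 1.0\r\nImplementation-Version: " + version + "\r\n"
        jar.writestr(MANIFEST_PATH, manifest)
    return buffer.getvalue()
```

`read_manifest_version(make_demo_jar("2.2-123"))` returns "2.2-123" regardless of what the JAR file itself is called, which is the property that lets the file name stay free of version numbers.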
7c) Windows libraries should not include version numbers in their names, either at build-time or at run-time.
11a) Windows libraries should have their "finalname" set to their build-time name.
12b) Files built with Maven that have a <finalname> tag should always have that name, even in the default repository layout.
12c) Windows libraries built with Maven should always have the same name they had at build time, even in the default repository layout.
13b) When Maven uses files that have the <finalname> tag, the Maven Core should provide files to plug-ins under the specified finalname.
13c) When Maven uses Windows libraries, the Maven Core should provide those library files to plugins under the same name under which they were built.
The problems I associate with Windows Libraries apply to native win32 Windows libraries, to Windows COM libraries, and to Microsoft .NET assemblies. None of these files can be renamed after build time and be expected to work. (Note that all of these files, which are very different types of files, have the .DLL extension.)
I am aware that our proposed changes involve a change to the Maven Core. I believe that this change is for the better and that everyone would benefit from it.
I believe that there may be some temptation to avoid changing the Maven Core and push the work of auto-re-renaming finally-named files onto the plug-ins. I believe that it would be a mistake to force the plug-ins to deal with this, because it inappropriately distributes code that should be written once in the Maven Core.
I also suspect there must be some temptation to ignore the point about <finalname>, and just add this special case code to Windows library plug-ins as a special case. But Windows development is not and cannot be a "special case". Most developers are Windows developers. Windows libraries must stand as first-class citizens in the Maven universe.
It's especially important that the Maven repository not just be a "jar repository," in which you could also (sort-of) put Windows DLLs. It shouldn't even be a library repository. It should be an artifact repository, which includes shell scripts, settings files, installers, and the whole menagerie of stuff that everyone needs to use their software.
Finally, note that the problem of including version numbers in library names is bad now, but it would become much worse if Requirement Chain 1 were satisfied without satisfying the Requirements in Requirement Chain 2, because it would require changing linking metadata much more frequently.