This page is easier to follow if you first explore the rest of the Boo Compiler, especially the Compiler Steps, the Abstract Syntax Tree structures, and the Boo Parser.
Say you have a line of code like:

    m = x.something()
What is m? What is x? What does something do? You're in the same boat that the compiler is in when it processes code. To us and to the compiler, those are just words or names. They refer to something, but we don't yet know to what.
Even without knowing what the names refer to, the parser can tell certain things, which it uses to generate the Abstract Syntax Tree. The above code is a statement (not an enum or class or import...). Seeing the "X = Y" form, it knows the statement is a binary assignment expression. Seeing the "X.Y" form, it knows that "something" should refer to some member of "x". Seeing the parentheses (), it knows we are invoking either a method or other callable object named "something" or, if "something" is a type like a class, its constructor (like "m = SomeNameSpace.SomeClass()").
So if we wanted to generate the equivalent AST by hand, we would say something like:
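As a rough illustration of the tree shape described above, here is a minimal Python model for a line of the form m = x.something(). The class names echo the Boo AST node names used on this page, but the constructors and fields are assumptions made for this sketch, not the Boo.Lang.Compiler.Ast API.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for Boo AST node types. The names follow this page;
# the fields and constructor signatures are assumptions for illustration.
@dataclass
class ReferenceExpression:
    name: str


@dataclass
class MemberReferenceExpression:
    target: object  # the expression to the left of the dot
    name: str       # the member name to the right of the dot


@dataclass
class MethodInvocationExpression:
    target: object  # the expression being invoked


@dataclass
class BinaryExpression:
    operator: str
    left: object
    right: object


def build_example_ast():
    """Build the tree for a statement shaped like: m = x.something()"""
    return BinaryExpression(
        operator="assign",
        left=ReferenceExpression("m"),
        right=MethodInvocationExpression(
            MemberReferenceExpression(ReferenceExpression("x"), "something")
        ),
    )


ast = build_example_ast()
```

Note that nothing in this tree says what "m", "x", or "something" actually are; it only records how the names are arranged.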
So now the compiler has to find out "what" everything in the AST is. By "what" I mean some kind of code object that exists either in some external assembly or somewhere else in your code. Besides a "Name" property, AST nodes like ReferenceExpressions have an "Entity" property that stores information about the kind of type that needs to be created for that node in the EmitAssembly step.
What are the different kinds of types a node can possibly be? See EntityType.cs. A name might refer to a system type (which means class, enum, interface, or struct), or to a method, field, property, and so on.
Now in our example we'll skip ahead to the ProcessMethodBodiesWithDuckTyping step in the compiler pipeline (actually ProcessMethodBodies, its superclass). We can do this because our example code doesn't define any new types of its own (we have no "class" statement, for example). Any new types that have been created or imported were handled in earlier steps like BindTypeDefinitions, BindBaseTypes, and BindTypeMembers.
In ProcessMethodBodies.cs the compiler visits the two simple reference expressions "m" and "x" in the OnReferenceExpression method.
It retrieves the appropriate entity by calling the Resolve(name) method in NameResolutionService.cs.
The name resolution service first asks the type system service whether the name refers to a built-in primitive (like "int" or "date"). If it does, we know the entity type, because a hashtable maps primitive names to their corresponding entity types (which correspond to real .NET/Mono types like System.Int32 or System.DateTime).
If not, it starts a hierarchical search through each namespace context in which the reference expression is enclosed. Each namespace may have its own hashtable mapping names to entity types. Let's say the line of code is in the global namespace (actually that code will have been moved inside a "Main" method inside a module; see the IntroduceModuleClasses step). To resolve "m" and "x" it has to start with the global or module-level namespace.
Back in the InitializeNameResolutionService step, a global namespace and module namespaces were created. When asked to resolve a name, these namespaces search the external assembly references for a type matching that name, or the internal modules in your code for any types you have created yourself, like new classes.
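The lookup order described above (primitives first, then each enclosing namespace in turn) can be sketched in a few lines of Python. This is only a model under stated assumptions: the Namespace class, the resolve function, and the string placeholders for entities are hypothetical and do not reproduce NameResolutionService.cs.

```python
# Hypothetical primitive table. The page says primitive names map to
# entity types that correspond to .NET/Mono types like these.
PRIMITIVES = {"int": "System.Int32", "date": "System.DateTime"}


class Namespace:
    """A scope with its own name table and an optional enclosing scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.entities = {}

    def declare(self, name, entity):
        self.entities[name] = entity


def resolve(name, scope):
    """Check the primitive table, then walk outward through enclosing scopes."""
    if name in PRIMITIVES:
        return PRIMITIVES[name]
    while scope is not None:
        if name in scope.entities:
            return scope.entities[name]
        scope = scope.parent
    return None  # the name could not be resolved


# Assumed nesting for the example: global > module > generated Main method.
global_ns = Namespace("global")
module_ns = Namespace("module", parent=global_ns)
main_ns = Namespace("Main", parent=module_ns)
global_ns.declare("x", "entity for x")
main_ns.declare("m", "entity for m")
```

Resolving "x" from inside main_ns falls through two empty tables before it is found in global_ns, which is the hierarchical search the text describes.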
The "something" member reference expression is processed in the ProcessMemberReferenceExpression method of ProcessMethodBodies.cs. It asks the target of the member reference ("x") for its namespace (unless "something" is a type itself, in which case we ask for its constructor). Expressions and types have their own namespaces (see INamespace.cs, and IType and the other interfaces contained in IEntity.cs), which may store a list of the child entities they contain and can retrieve an entity type given a name.
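A minimal Python sketch of that member lookup follows. The Entity class, the "kind" strings, and the "constructor" member key are all assumptions made for illustration; the real interfaces are the ones named above (INamespace, IType, IEntity).

```python
class Entity:
    """Hypothetical entity that doubles as a namespace for its members."""

    def __init__(self, name, kind, members=None):
        self.name = name
        self.kind = kind  # e.g. "type", "method", "field"
        self.members = members or {}

    def resolve_member(self, name):
        return self.members.get(name)


def resolve_member_reference(target, member_name):
    """Ask the target's own namespace for the named member."""
    member = target.resolve_member(member_name)
    if member is None:
        raise LookupError(f"'{member_name}' is not a member of '{target.name}'")
    return member


def callable_for(entity):
    """If the invoked entity is a type, the call goes to its constructor."""
    if entity.kind == "type":
        return entity.resolve_member("constructor")
    return entity


# x is a class with a method named "something".
something = Entity("something", "method")
x = Entity("x", "type", {"something": something})

# SomeClass is a type, so invoking it means invoking its constructor.
ctor = Entity("constructor", "method")
some_class = Entity("SomeClass", "type", {"constructor": ctor})
```

Here resolve_member_reference(x, "something") returns the method entity, while callable_for(some_class) returns the constructor, mirroring the two cases in the paragraph above.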
To see the names being bound to their respective types, run the boo compiler (booc.exe) with the "-vvv" option. This very verbose option prints all the references and their corresponding entities during the compile pipeline.
A little on Type Inference.
You can see, though, that if "m" was not declared earlier in the code, the compiler cannot find out its type until it finds out the type of the "something" member reference. If "m" was declared earlier (like "m as MyClass"), then when the compiler visits that declaration it binds the type created for MyClass to "m". If the compiler then visits the binary expression and finds that the type "something" returns is not assignable to the type of "m", it will complain. In our sample line of code, the minimum requirements are that "x" is defined elsewhere (perhaps it is a class type or a namespace), that "x" contains an entity matching the name "something", and that "something" refers either to a method or callable entity (if, for example, "x" is a class) or to a type entity, like a class, with an accessible constructor ("m = SomeNameSpace.SomeClass()").
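The two cases above (infer the type of "m" from the right-hand side, or check the right-hand side against an earlier declaration) can be sketched as follows. The function names, the string type names, and the assignability rule are hypothetical simplifications, not the compiler's actual logic.

```python
class TypeMismatchError(Exception):
    """Raised when a value's type is not assignable to the declared type."""


def is_assignable(target_type, value_type):
    # Hypothetical rule for this sketch: identical types, or target is "object".
    return target_type == value_type or target_type == "object"


def bind_assignment(bindings, name, value_type):
    """Infer the type of an undeclared name, or check a declared one."""
    declared = bindings.get(name)
    if declared is None:
        # No earlier declaration: the name takes the type of the right side.
        bindings[name] = value_type
        return value_type
    if not is_assignable(declared, value_type):
        raise TypeMismatchError(
            f"cannot assign {value_type} to '{name}' of type {declared}"
        )
    return declared


# "m" is undeclared, so its type is inferred from what "something" returns.
bindings = {}
inferred = bind_assignment(bindings, "m", "MyClass")
```

With bindings = {"m": "MyClass"} already in place, calling bind_assignment(bindings, "m", "string") raises TypeMismatchError, which corresponds to the compiler complaining.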
When you create a new class with a line like "class MyClass...", for example, the boo compiler creates a new instance of the InternalClass entity and sets the ClassDefinition's Entity property to that object. The ClassDefinition also stores the name you used ("MyClass"). The InternalClass handles name resolution and is later used in the EmitAssembly step to help generate the correct IL assembly code to define the type you created.
See Syntactic Macros. Once you understand the type system and other features of the Boo Compiler, it is easier to understand how AST macros work, and their limitations. Currently, macros are processed before the types are processed. Often a macro just rearranges the AST structure or adds new references to save typing. But say you need to know the type of a parameter passed to your macro. The Entity property will be null at that point. Boo may eventually incorporate "type-safe" macros that are processed later in the compiler pipeline, after the type system has done its thing.
Look at the "with" macro on the Syntactic Macros page and you'll understand why it has to check for an underscore "_" at the beginning of a reference expression in order to know whether that reference should be turned into a member reference expression targeting fooInstanceWithReallyLongName. It can't use a leading period as Visual Basic does, because that is an illegal name. And we can't simply use no prefix (i.e., "f1" instead of "_f1") as some other languages do, because how would we distinguish which references refer to members of fooInstanceWithReallyLongName and which do not? We can't, since we do not know the type of fooInstanceWithReallyLongName at that point.
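To make the underscore convention concrete, here is a small Python sketch of a purely syntactic rewrite. It works on plain name strings rather than AST nodes, and the function name and the choice to drop the underscore in the result are assumptions; see the Syntactic Macros page for the actual macro.

```python
def expand_with(target, references):
    """Rewrite underscore-prefixed names as members of `target`.

    No type information is used: the leading underscore is the only signal
    available, because entities have not been bound when macros run.
    """
    expanded = []
    for name in references:
        if name.startswith("_"):
            # Assumed behaviour: strip the marker and qualify with the target.
            expanded.append(f"{target}.{name[1:]}")
        else:
            # Without the marker we cannot tell whether this is a member.
            expanded.append(name)
    return expanded


result = expand_with("fooInstanceWithReallyLongName", ["_f1", "count", "_f2"])
```

The name "count" is left alone even if fooInstanceWithReallyLongName happens to have a member called count, which is exactly the limitation discussed above.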
A type-safe macro that is processed after the type system would have to be more careful in how it processes the AST, so as not to break the name-type bindings. Remember that the correct type is determined according to the hierarchical structure of a node's enclosing namespaces. If the nodes are rearranged, a name might refer to a completely different type if that name exists in multiple namespaces. And if you move the AST node for "x.something()" before "x = MyClass()", then x is undefined at that point, which should be a type error.