Predicates and Expressions - Part 2
foreach assembly in assembly list
    foreach type in assembly
        clear list of invariant predicates
        clear list of processed methods
        foreach invariant attribute on type
            transform pragmas on predicate
            add pre/post analysis objects to list
            generate code for predicate
            append code to processedInvariants list
        end foreach invariant
        foreach member in type
            clear list of requires attributes
            foreach requires attribute on member
                transform pragmas on predicate
                add pre/post analysis objects to list
                generate code for predicate
                append code to processedRequires list
            end foreach requires attribute
            clear list of ensures attributes
            foreach ensures attribute on member
                transform pragmas on predicate
                extract list of variables to be snapshot
                add to processedSnapshots list
                add pre/post analysis objects to list
                generate code for predicate
                append code to processedEnsures list
            end foreach ensures attribute
            insert snapshots into code template
            insert pre/post analysis objects into template
            insert code for requires guards into template
            insert code for ensures guards into template
            generate code
            add generated code to processed methods list
        end foreach member
        insert code for members into type template
        clear list of invariant predicates
        generate code for type
        append code to processed types list
    end foreach type
    set up C# compiler with config params
    compile all types in processed types list
    store assembly as per instructions
end foreach assembly
This algorithm conveniently glosses over the complexities of the depth-first event-propagation mechanism described in previous posts, but ultimately the outcome is the same.
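The last three steps of the loop (set up the compiler, compile the processed types, store the assembly) map quite directly onto the CodeDOM compiler API. Here is a minimal sketch of that tail end; the ProxyCompiler name and the error handling are my own invention for illustration, not part of the framework:

using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

public class ProxyCompiler
{
    // compile the generated proxy source and store the assembly at outputPath
    public static Assembly CompileProxies(string generatedSource, string outputPath)
    {
        ICodeCompiler compiler = new CSharpCodeProvider().CreateCompiler();
        CompilerParameters options = new CompilerParameters();      // "config params"
        options.OutputAssembly = outputPath;                        // "store assembly as per instructions"
        options.ReferencedAssemblies.Add("System.dll");
        CompilerResults results = compiler.CompileAssemblyFromSource(options, generatedSource);
        if (results.Errors.HasErrors)
            throw new ApplicationException("proxy compilation failed: " + results.Errors[0]);
        return results.CompiledAssembly;
    }
}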
Code Generation Patterns - Part 3
This time I'm going to describe the source snippet that I posted last time. In previous posts I showed my implementation of the Observer pattern, which fires off events notifying the code generator about the code structures it has found. The code generator was shown last time, and you can see that there are a whole bunch of event handlers that take note of the code structures and perform code generation tasks as required.
A typical example is the NewType handler, which generates code for classes, structures and interfaces. As you will recall from my last post, I use depth-first recursion in my scanner so that my code generator can generate in one pass. That means there will be a lot of generated code knocking about from all of the invariant predicates, methods, events, fields and properties that were detected before the type event was invoked.
The first thing the CG does is add the namespace of the type to a list of namespaces that is kept for insertion at the top of the proxy's code. Obviously the proxy needs to know about the target object. As a general rule you should use the fully qualified name of any type that you refer to in argument lists, return types and so on. Readability is not much of an issue here, and not having to keep track of all the namespace references that might be needed is worth the stricture.
Following the namespace manipulations, the generated code is added to the context variables for the NVelocity template; these are declared in the Context Variables region of the generator. For common code structures such as assemblies, types and methods the generated code is added to an ArrayList, and when the next most senior CG template needs to include them it just iterates through the list, inserting everything it finds. The Hashtables are an implementation convenience that helps the CG maintain a unique list of variables that need to have snapshots taken.
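Those context variables only work the way the snippet further down this page uses them (assigning null to ProcessedMethods and then calling Add() on it) if the capitalised names are lazily initialising property wrappers over the private fields. The wrappers were chopped from the listing, so this is my reconstruction of what one probably looks like:

using System.Collections;

private ArrayList processedMethods = null;

// Reconstruction, not the original source: a lazily initialising wrapper,
// so handlers can Add() without null checks and reset the list by assigning null.
public ArrayList ProcessedMethods
{
    get
    {
        if (processedMethods == null)
            processedMethods = new ArrayList();   // created on first use
        return processedMethods;
    }
    set { processedMethods = value; }
}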
After the CG has added all of the snippets of code that have been generated so far, it invokes the template for the type itself. The template looks like this:
namespace ${type.Namespace}.Proxies
{
    #region Class ${type.FullName}Proxy
    public class ${type.Name}Proxy
    {
        internal ${type.FullName} TheTargetObject = null;

        public ${type.Name}Proxy(${type.FullName} target)
        {
            TheTargetObject = target;
        }

#foreach($method in $methods)
        $method
#end
#foreach($property in $properties)
        $property
#end
#foreach($field in $fields)
        $field
#end
#foreach($event in $events)
        $event
#end
    } // end class ${type.FullName}Proxy
    #endregion
} // end namespace ${type.Namespace}.Proxies
As you can see the template is minimal. All of the interesting work has been done in other templates or "Pragma processors". I'll be going into pragma processors in depth next time. Work ahead of me includes making the proxy implement the same interfaces as the target object. In the template above the proxy class is generated with exactly the same name as the target class, but with "Proxy" appended. It has the same namespace with ".Proxies" appended. The proxy stores a reference to the target object called TheTargetObject. TheTargetObject is delegated to when all the predicate checks have been performed. The template for a method is where the action is:
#if(!$utils.IsProperty($method))
#if($utils.IsVoidType($method.ReturnType))
#set($returntype = "void")
// found a void return type
#else
#set($returntype = ${method.ReturnType.FullName})
// didn't find a void return type
#end## if IsVoidType
public new $returntype ${method.Name}(#set($comma = "")
#foreach($arg in ${method.GetParameters()})
$comma ${arg.ParameterType.FullName} ${arg.Name}#set($comma = ", ")#end){
    // take the 'before' snapshots
#foreach($snapshot in $ssbefore)
    $snapshot.TypeName ${snapshot.After} = ${snapshot.Before};
#end
    // invariant checks (before the call)
#foreach($inv in $invariants)
    $inv
#end
    // requires checks
#foreach($req in $requires)
    $req
#end
## now the call to the real object
#if(!$utils.IsVoidType($method.ReturnType))
    $returntype result =
#end##if is void
#if($method.IsStatic)
    ${fullclsname}.${method.Name}(#set($comma = "")
#else
    TheTargetObject.${method.Name}(#set($comma = "")
#end
#foreach($arg in ${method.GetParameters()})
$comma${arg.Name}#set($comma = ", ")#end);
    // take the 'after' snapshots
#foreach($snapshot in $ssafter)
    $snapshot.TypeName ${snapshot.After} = ${snapshot.Before};
#end
    // ensures checks
#foreach($ens in $ensures)
    $ens
#end
    // invariant checks (after the call)
#foreach($inv in $invariants)
    $inv
#end
#if(!$utils.IsVoidType($method.ReturnType))
    return result;
#end##if is void
}#end ## if is not property
The templates are pre-loaded with a "utils" object that performs various type-analysis jobs that are beyond the capabilities of NVelocity. We use it to check whether the method is a property or not. Properties in C# are handled as methods in the CLR; if we ignored that and generated properties as methods we would lose a lot of the syntactic flexibility that C# makes available.
We then need to check whether the method returns void. If so, we emit the short form of the void type, since the code will not compile if we use "System.Void". Not sure why that is the case. Answers on a postcard please!
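I can't show the real utils source here, but a minimal version of the two helpers the template uses could look like the sketch below. The CLR facts it leans on are solid: property accessors are compiler-generated methods named get_X/set_X and flagged IsSpecialName, and the void keyword is System.Void under the covers.

using System;
using System.Reflection;

// A guess at the "utils" helper handed to the templates; only the two
// methods used above are sketched.
public class TemplateUtils
{
    public bool IsProperty(MethodInfo method)
    {
        // property accessors are special-name get_/set_ methods in the CLR
        return method.IsSpecialName &&
            (method.Name.StartsWith("get_") || method.Name.StartsWith("set_"));
    }

    public bool IsVoidType(Type type)
    {
        return type == typeof(void);   // i.e. System.Void
    }
}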
Having determined the return type, we can create the method signature. We are only ever notified about public methods, since those are all that external users of the proxy or the target object will ever see; we don't need to wrap protected, private or internal methods, so we can hard-code the public access specifier. Next, the method parameters are inserted. They are a complete copy of the target method's arguments, but with the type names fully expanded.
Now we get to the whole purpose of this exercise: inserting predicate checks into the method body to check parameters, members and other features before and after invocation of the method on the target object. The pattern of the checks is as follows:
Invariants are predicates that must be true at all times: they must hold both before and after the method is called. We therefore bracket the call with the invariant tests, and run the requires checks before the call and the ensures checks after it. For more information on these predicates take a look at the language documentation for Eiffel. Since the invariants are invoked before and after every method call, you should make sure they are not too expensive.
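Put together, the merged template might emit something like the following for a hypothetical Account.Withdraw method. The predicate code itself is the subject of the next post, so the checks shown here (and the exception type) are invented for illustration:

public new void Withdraw(System.Decimal amount)
{
    // 'before' snapshot
    System.Decimal old_Balance = TheTargetObject.Balance;
    // invariant check (before the call)
    if (!(TheTargetObject.Balance >= 0))
        throw new ApplicationException("Invariant failed: Balance >= 0");
    // requires check
    if (!(amount > 0))
        throw new ApplicationException("Require failed: amount > 0");
    // delegate to the target object
    TheTargetObject.Withdraw(amount);
    // ensures check, using the snapshot
    if (!(TheTargetObject.Balance == old_Balance - amount))
        throw new ApplicationException("Ensure failed: Balance == old Balance - amount");
    // invariant check (after the call)
    if (!(TheTargetObject.Balance >= 0))
        throw new ApplicationException("Invariant failed: Balance >= 0");
}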
At the end of the method template the script again checks whether the method returns void, and if so skips returning a result; it also never bothered to collect one in the first place.
Each method is built this way, converted into a string and appended to the ProcessedMethods property in the CG. When the type event is finally emitted, ProcessedMethods contains a wrapper for each method the scanner found. That should be every public method that the target object implements or inherits. Obviously, the invariant predicates of the object must hold across all properties, fields, events and methods, so it is not enough just to wrap the members that have Require and Ensure attributes attached; the Invariant attributes apply to the whole type at all times.
Next time I'll show you how I convert the assertions made in the Invariant, Require and Ensure attributes into working code that can be inserted into the proxy. I'll leave you with this exercise: how would you convert such an assertion into a piece of working code? I'll show you how I do it. Let me know if your approach is better, because I'd be glad to use it!
Code Generation Patterns - Part 2
Last time I described typed and un-typed events, and multicasting events to several listeners, one of which would be a code generator. This time I'll go on to describe some of the infrastructure that such a design requires. Generally this is all about tying the events together: the context in which an event takes place is as important as the event itself.
When you're generating code progressively, you need to keep track of what you have done and what still needs to be done. In this case that means the code already generated, and the messages that have been received but not yet turned into code. Context also tells you what code structure(s) an incoming predicate applies to.
There are two complementary modes of operation that a scanner/code-generator system can use for broadcasting events; they are variations on tree-search algorithms. In the first, the most senior element is broadcast first: an event is generated for a type before the events for its methods are fired. I shall call this (for want of a better name) the "depth-last recursive search". The second approach is a true "depth-first search", since elements that are lowest in the object tree are announced sooner. These two modes support different code-generation plans, and the choice affects what sort of state you have to keep and how long it has to hang around. More on that later.
With depth-first recursion a method event is broadcast before the corresponding type event, and a predicate such as an Ensure attribute is received before the method to which it was attached. Therefore, when you are generating code, you can't just slot what you get into what you already have, because you don't have anywhere to put it yet.
With a depth-first search, context variables track the predicates detected until an event for the method is broadcast, at which point we can generate the whole code for the method. We still have to keep the generated code around until we get the type event! Needless to say, we could hunt for such information when the method event arrives; we could use reflection to navigate up the type object graph as far as the assembly if we wanted to. But if we rely too much on that sort of navigation we break the whole broadcast idiom, and morph the algorithm into a depth-last recursive search (which has its own unique requirements that I'll come onto next).
In the depth-last search the event for the more senior object is fired before those of its subordinates. That means we get the event for the type before that of the method, and the event for the method before those of its predicates. That's helpful because we now have something to hang the subsequent events off of. If you were producing a whole object graph in memory then this would be ideal, since the tree would always be properly connected rather than fractured. This approach is not without its drawbacks, not least of which is that you have to build up an object graph in memory before you can generate the code for it! With depth-first recursion you know that when you get the event for the method there are no more predicates coming, so you know when it's safe to generate code. With the depth-last approach you have to send a "finish" message that says when the stream of sub-objects has stopped. On the whole I've found for this project that depth-first navigation of the assembly object graph works fine, and simplifies the event interface of the listeners that I want to use. In other projects I've done the opposite, and everything has gone OK; it really depends on the size of the data stream and the performance characteristics required. There are drawbacks to either approach, and you should decide on the basis of the criteria just mentioned: the size of the data stream, how much state you must hold and for how long, and the runtime performance you need.
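To make the difference concrete, here is a rough sketch of how the two broadcast orders fall out of a reflection-based scanner. The Fire* methods are stand-ins for the multicast events described last time, not real framework calls:

using System;
using System.Reflection;

public class ScannerSketch
{
    // depth-first: leaves (predicates) are announced before their owners
    public void DepthFirst(Type t)
    {
        foreach (MethodInfo m in t.GetMethods())
        {
            foreach (object attr in m.GetCustomAttributes(true))
                FirePredicateEvent(m, attr);   // predicate events arrive first...
            FireMethodEvent(m);                // ...then the method that owns them
        }
        FireTypeEvent(t);                      // the type event arrives last
    }

    // depth-last: the senior object is announced first, so an explicit
    // "finish" message is needed to say the stream of sub-objects has ended
    public void DepthLast(Type t)
    {
        FireTypeEvent(t);
        foreach (MethodInfo m in t.GetMethods())
        {
            FireMethodEvent(m);
            foreach (object attr in m.GetCustomAttributes(true))
                FirePredicateEvent(m, attr);
        }
        FireTypeFinishedEvent(t);
    }

    void FirePredicateEvent(MethodInfo m, object attr) { }
    void FireMethodEvent(MethodInfo m) { }
    void FireTypeEvent(Type t) { }
    void FireTypeFinishedEvent(Type t) { }
}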
The snippet below shows some of the code I use to process events in the code generator. Much has been chopped out, of course, but from this you should see how all the talk about depth-first searches and events translates into code generation.
public class DbcProxyCodeGenerator : DbcSupertype
{
    public DbcProxyCodeGenerator()
    {
        InitialiseTemplates();
    }

    #region Context Variables
    private ArrayList processedAssembly = null;
    private ArrayList processedTypes = null;
    private ArrayList processedMethods = null;
    private ArrayList processedProperties = null;
    private ArrayList processedFields = null;
    private ArrayList processedEvents = null;
    private Hashtable processedSnapshotsBefore = null;
    private Hashtable processedSnapshotsAfter = null;
    private ArrayList processedInvariants = null;
    private ArrayList processedEnsures = null;
    private ArrayList processedRequires = null;
    private Hashtable processedNamespaces;
    #endregion

    // (the capitalised ProcessedXxx property wrappers over these fields are
    // among the code chopped out of this listing)

    public void NewAssembly(object sender, NewAssemblyEventArgs e)
    {
        vtAssembly.SetAttr("assembly", e.TheAssembly);
        string[] namespaces = new string[ProcessedNamespaces.Keys.Count];
        ProcessedNamespaces.Keys.CopyTo(namespaces, 0);
        vtAssembly.SetAttr("namespaces", namespaces);
        vtAssembly.SetAttr("types", ProcessedTypes);
        ProcessedAssembly.Add(vtAssembly.Merge());
    }

    public void NewType(object sender, NewTypeEventArgs e)
    {
        ProcessedNamespaces.Add(e.TheType.Namespace, null);
        vtType.SetAttr("type", e.TheType);
        vtType.SetAttr("methods", ProcessedMethods);
        vtType.SetAttr("fields", ProcessedFields);
        vtType.SetAttr("properties", ProcessedProperties);
        vtType.SetAttr("events", ProcessedEvents);
        ProcessedTypes.Add(vtType.Merge());
        // reset the per-type context now that the type's code is generated
        ProcessedMethods = null;
        ProcessedFields = null;
        ProcessedProperties = null;
        ProcessedEvents = null;
        ProcessedInvariants = null;
    }

    public void NewMethod(object sender, NewMethodEventArgs e)
    {
        vtMethod.SetAttr("method", e.Method);
        vtMethod.SetAttr("invariants", ProcessedInvariants);
        vtMethod.SetAttr("requires", ProcessedRequires);
        vtMethod.SetAttr("ensures", ProcessedEnsures);
        ArrayList beforeSnapshots = SnapshotProcessor.GetBeforeSnapshots(
            e.Method as MemberInfo, ProcessedSnapshotsBefore.Keys);
        ArrayList afterSnapshots = SnapshotProcessor.GetAfterSnapshots(
            e.Method as MemberInfo, ProcessedSnapshotsAfter.Keys);
        vtMethod.SetAttr("ssbefore", beforeSnapshots);
        vtMethod.SetAttr("ssafter", afterSnapshots);
        ProcessedMethods.Add(vtMethod.Merge());
        // reset the per-method context
        ProcessedEnsures = null;
        ProcessedRequires = null;
    }

    public void NewInvariantAttribute(object sender, NewInvariantAttributeEventArgs e)
    {
        EnsureSpecification es = DbcPragmaProcessor.ProcessEnsure(e.Invariant.Predicate);
        SnapshotProcessor.RegisterSnapshots(es,
            ref this.processedSnapshotsBefore, ref this.processedSnapshotsAfter);
        vtInvariant.SetAttr("invariant", es);
        ProcessedInvariants.Add(vtInvariant.Merge());
    }

    public void NewEnsuresAttribute(object sender, NewEnsuresAttributeEventArgs e)
    {
        EnsureSpecification es = DbcPragmaProcessor.ProcessEnsure(e.Ensure.Predicate);
        SnapshotProcessor.RegisterSnapshots(es,
            ref this.processedSnapshotsBefore, ref this.processedSnapshotsAfter);
        vtEnsure.SetAttr("ensure", es);
        ProcessedEnsures.Add(vtEnsure.Merge());
    }
}
A seemingly simple decision, like the order in which you notify interested parties, leads to significant performance consequences that you need to understand in advance. To illustrate this, consider the following scenario:
I want to use this framework in a number of different scenarios. I want to be able to statically generate the code for a proxy object that I then use as though it were the real object; that proxy will enforce all of my rules at runtime. Projects that use my proxies must reference the proxy assembly as well as the assembly of the target object. That is the simple case. I also need to be able to dispense with that and use dynamic proxies: a situation where I use dynamic code generation at runtime. To do that I need to request access to the object through a class factory that generates the proxy on the fly. In other situations I want the framework to dovetail with aspect-oriented programming frameworks like ACA.NET; there the generated code could be either static or dynamic, but the specifics need to be abstracted into a configuration file.
As you can see from these requirements, our needs can swing between static and dynamic code generation, and we may even want to use both in the same program. Performance must take precedence if we want to use dynamic code generation. Static code generation won't suffer much if we choose a strategy that is biased towards dynamic generation, since its costs are paid at compile time and won't affect runtime performance.
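To ground the dynamic scenario, a class factory along these lines would do the job. CreateProxySource is a hypothetical stand-in for the whole template pipeline described in these posts, and the factory name is invented:

using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

public class DbcProxyFactory
{
    public static object CreateProxy(object target)
    {
        Type targetType = target.GetType();
        string source = CreateProxySource(targetType);   // run the NVelocity templates
        CompilerParameters options = new CompilerParameters();
        options.GenerateInMemory = true;                 // dynamic case: no assembly on disk
        CompilerResults results = new CSharpCodeProvider().CreateCompiler()
            .CompileAssemblyFromSource(options, source);
        // the proxy follows the naming convention from the type template
        Type proxyType = results.CompiledAssembly.GetType(
            targetType.Namespace + ".Proxies." + targetType.Name + "Proxy");
        return Activator.CreateInstance(proxyType, new object[] { target });
    }

    private static string CreateProxySource(Type t)
    {
        return null; // stand-in for the template pipeline described above
    }
}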