Thursday, September 22, 2011

AST Transformations: The transformation itself

So far we have looked at the different compile phases, the form an annotation has to take to wire an AST transformation into the compilation process, and the classes that make up the AST structure, and we have discussed the idea of a transformation that implements a precondition as an annotation. Now we get to the transformation itself.

Access to the AST
We know that the compiler, during its work, creates an abstract syntax tree that is the basis for the bytecode it finally generates. The compiler provides ordered access to this tree by allowing us to register objects implementing the interface ASTTransformation for execution in a specific phase of the compiler.

The idea is that for each phase a tree walker is executed that, for every node it encounters, calls the method visit() of the registered objects to add information to the nodes and/or change the respective subtree. For local AST transformations this method is always called with two arguments. The first argument is an ASTNode array containing the AnnotationNode that triggered the call and a node representing the source code element that has been annotated. For example, if you have annotated a method, then the second element in that array is an object of type MethodNode. The second argument is the SourceUnit instance that represents the actual source code.

Interlude - Defensive Programming
Now, this is more or less the description of the informal contract between the compiler and your ASTTransformation. So nothing more has to be done and we can directly start with modifying the abstract syntax tree. But wait, do you really trust the compiler? What if, through some mysterious ways (and I don't mean the X Files kind, but the "oh, someone upgraded the compiler to a new major version with different semantics for the interface" way), the compiler decides to call your transformation with some other arguments? Then your code fails and, with it, the compiler as a whole, without giving the user any sensible information about what happened.

In addition to that, Hamlet D'Arcy pointed out "that the contract [between AST transformation and compiler] is enforced by unit tests, not the type system and not a language specification. Who knows what will change in Groovy 1.9 and 2.0. It is better to be defensive and fail *without causing an exception* because that will make the compilation proceed even if your transform is broken in the future."

So, if we are fooling around with the compiler, a defensive programming style is the first thing you should adopt. This means that the following code,

def annotationType = Requires.class.name

private boolean checkNode(astNodes, annotationType) {
    // we need both the annotation node and the annotated node
    if (!astNodes || astNodes.length < 2) return false
    if (! astNodes[0]) return false
    if (! astNodes[1]) return false
    if (!(astNodes[0] instanceof AnnotationNode))       return false
    // compare with != here: "! a == b" negates a first and then never rejects anything
    if (astNodes[0].classNode?.name != annotationType)  return false
    if (!(astNodes[1] instanceof MethodNode))           return false
    true
}

public void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
    if (!checkNode(astNodes, annotationType)) {
        // add an error message or a warning
        return
    }
    ...
}


while totally redundant and needless, still is a guard against the worst. Look up Byzantine faults if you truly want to become paranoid; this is actually a good thing if you are writing AST transformations (remember the old saying that being paranoid does not mean that you are not being followed).
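The comment "add an error message or a warning" in the guard above can be filled in without throwing an exception. What follows is only a minimal sketch: reportError() is a hypothetical helper that is not part of the original transformation, and it assumes that recording a SyntaxErrorMessage through the SourceUnit's ErrorCollector is an acceptable way to signal the problem.

```groovy
import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.control.SourceUnit
import org.codehaus.groovy.control.messages.SyntaxErrorMessage
import org.codehaus.groovy.syntax.SyntaxException

// Hypothetical helper (an assumption, not from the original post):
// records a compile error instead of throwing from inside visit().
void reportError(SourceUnit sourceUnit, ASTNode node, String message) {
    // fall back to position 0/0 when no node is available
    int line = node ? node.lineNumber : 0
    int column = node ? node.columnNumber : 0
    def cause = new SyntaxException(message, line, column)
    sourceUnit.errorCollector.addErrorAndContinue(new SyntaxErrorMessage(cause, sourceUnit))
}

// Stand-alone demonstration with a throwaway SourceUnit
def demoUnit = SourceUnit.create('demo', 'println 1')
reportError(demoUnit, null, '@Requires was applied to an unexpected element')
```

As far as I can tell, an error recorded this way still fails the compilation once the current phase has finished, but the user gets a readable message. If the build should proceed regardless, the ErrorCollector also accepts warnings.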

It is important to note that, in principle, the visit() method of your AST transformation could be called by the compiler in parallel for different parts of the source code that have been annotated with our annotation. While the current compiler does not parallelize (at least up to Groovy 1.8.2), you should be aware of this and not only program defensively, but allow for parallel execution as well.

The AST Transformation
Actually implementing the visit()-method is straightforward. First of all we verify that we have been called with the correct arguments, then we check that the value provided by the annotation is of the correct type (since we are using a simple String constant this will be a ConstantExpression), call a method createStatements() that creates the new AST subtree that we want to add at the beginning of the method, and insert the statements at the beginning of the subtree of statements that represents the method that has been annotated.

public void visit(ASTNode[] astNodes, SourceUnit sourceUnit) {
    if (!checkNode(astNodes, annotationType)) {
        // add an error message or a warning
        return
    }

    MethodNode annotatedMethod = astNodes[1]
    def annotationExpression = astNodes[0].members.value

    // add better error handling
    if (!(annotationExpression instanceof ConstantExpression)) return

    String annotationValueString = annotationExpression.value
    BlockStatement block = createStatements(annotationValueString)

    // prepend the generated check to the statements of the annotated method
    def methodStatements = annotatedMethod.code.statements
    methodStatements.add(0, block)
}


Piece of cake... if it weren't for the method createStatements(). In this method we could manually create the different node objects and assemble an AST representing the code that we want to add to the beginning of the method (see the previous blog entry). This works, but is tedious. Or...
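To give an impression of the manual route, here is a rough sketch that assembles the nodes for one fixed precondition by hand. The condition x > 0 and the variable name x are assumptions chosen purely for illustration. A real createStatements() would have to turn an arbitrary clause string into an expression tree, which is exactly the tedious part.

```groovy
import org.codehaus.groovy.ast.ClassHelper
import org.codehaus.groovy.ast.expr.ArgumentListExpression
import org.codehaus.groovy.ast.expr.BinaryExpression
import org.codehaus.groovy.ast.expr.BooleanExpression
import org.codehaus.groovy.ast.expr.ConstantExpression
import org.codehaus.groovy.ast.expr.ConstructorCallExpression
import org.codehaus.groovy.ast.expr.NotExpression
import org.codehaus.groovy.ast.expr.VariableExpression
import org.codehaus.groovy.ast.stmt.BlockStatement
import org.codehaus.groovy.ast.stmt.EmptyStatement
import org.codehaus.groovy.ast.stmt.IfStatement
import org.codehaus.groovy.ast.stmt.ThrowStatement
import org.codehaus.groovy.syntax.Token
import org.codehaus.groovy.syntax.Types

// Hand-built equivalent of:
//   if (!(x > 0)) { throw new Exception('Precondition violated: x > 0') }

// x > 0 (hypothetical precondition)
def condition = new BinaryExpression(
    new VariableExpression('x'),
    Token.newSymbol(Types.COMPARE_GREATER_THAN, -1, -1),
    new ConstantExpression(0))

// new Exception('Precondition violated: x > 0')
def exceptionCall = new ConstructorCallExpression(
    ClassHelper.make(Exception),
    new ArgumentListExpression([new ConstantExpression('Precondition violated: x > 0')]))

// { throw ... }
def body = new BlockStatement()
body.addStatement(new ThrowStatement(exceptionCall))

// if (!(...)) { ... } with an empty else branch
def ifStatement = new IfStatement(
    new BooleanExpression(new NotExpression(condition)),
    body,
    EmptyStatement.INSTANCE)
```

Even for this single fixed condition the node bookkeeping outweighs the code it represents, which is the motivation for the builder discussed next.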

Enter The ASTBuilder
The AstBuilder was written by Hamlet D'Arcy expressly to make it easy to create ASTs using a simple builder DSL. This is done using the method buildFromSpec() that takes a closure (containing the DSL statements) as an argument. Additionally, the AstBuilder provides a method buildFromCode() to create an AST directly from code expressed as a closure and a method buildFromString() that creates the AST from source code provided as a String. Interestingly, the method buildFromCode() recreates a String representation from the supplied code and feeds that to the method buildFromString() to do its work.
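As a small illustration of two of these entry points, the following sketch builds the same trivial statement once through the builder DSL and once from a source string. The statement return 'Hello' is an arbitrary example and has nothing to do with the precondition transformation. buildFromCode() is left out to keep the sketch short.

```groovy
import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.stmt.BlockStatement
import org.codehaus.groovy.ast.stmt.ReturnStatement
import org.codehaus.groovy.control.CompilePhase

def builder = new AstBuilder()

// 1. Builder DSL: every node is spelled out explicitly
List<ASTNode> fromSpec = builder.buildFromSpec {
    block {
        returnStatement {
            constant 'Hello'
        }
    }
}

// 2. Source string: the compiler builds the nodes for us
List<ASTNode> fromString = builder.buildFromString(
    CompilePhase.SEMANTIC_ANALYSIS, "return 'Hello'")

// Both results start with a block that holds the return statement
println fromSpec[0].statements[0].getClass().simpleName
println fromString[0].statements[0].getClass().simpleName
```

The DSL variant gives full control over every node, while the string variant is usually shorter when the code to generate is easy to write down as source text.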

Completing the Transformation
We will use the method buildFromString() to create the AST representing our if statement.


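def createStatements(String clause) {
    // the precondition is interpolated into both the condition and the message;
    // note that a clause containing a single quote would break the generated string literal
    def statements = """
        if(!($clause)) {
            throw new Exception('Precondition violated: ${clause}')
        }
    """

    AstBuilder ab = new AstBuilder()
    List<ASTNode> res = ab.buildFromString(CompilePhase.SEMANTIC_ANALYSIS, statements)
    BlockStatement bs = res[0]
    return bs
}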


First of all we define a GString that contains the static parts of the code and into which we embed the precondition, both as the if-clause and in the message of the created exception. Then we use an AstBuilder object to call the buildFromString() method on our GString and let it create the AST for the phase "Semantic Analysis" (as explained in an earlier blog entry).
The AstBuilder returns a list of ASTNodes. The method buildFromString() offers an additional parameter statementsOnly (which is set to true by default). If this additional parameter is set to false, then the method, in addition to the AST representing the statements, returns an AST representing the whole Script class surrounding the code fragment as the second entry in the list of ASTNodes. Regardless of the way you call it, the first entry is always the AST representing your statements. In our case this is a block statement that in turn contains our if statement.
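The effect of the statementsOnly parameter can be checked with a short script. This is a sketch under the assumption that the precondition is x > 0, where x is just an illustrative name that the script leaves undefined, much like the clause in createStatements().

```groovy
import org.codehaus.groovy.ast.ASTNode
import org.codehaus.groovy.ast.ClassNode
import org.codehaus.groovy.ast.builder.AstBuilder
import org.codehaus.groovy.ast.stmt.BlockStatement
import org.codehaus.groovy.ast.stmt.IfStatement
import org.codehaus.groovy.control.CompilePhase

// Hypothetical precondition, written out the way createStatements() would generate it
def source = "if (!(x > 0)) { throw new Exception('Precondition violated: x > 0') }"
def builder = new AstBuilder()

// statementsOnly = true (the default): only the statements are returned
List<ASTNode> onlyStatements =
    builder.buildFromString(CompilePhase.SEMANTIC_ANALYSIS, true, source)

// statementsOnly = false: the surrounding Script class is returned as well
List<ASTNode> withScriptClass =
    builder.buildFromString(CompilePhase.SEMANTIC_ANALYSIS, false, source)

println "statementsOnly = true  -> ${onlyStatements.size()} node(s)"
println "statementsOnly = false -> ${withScriptClass.size()} node(s)"
```

In both cases the first entry is the block statement wrapping the if statement, which is why createStatements() can simply take res[0].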

Tying it all Together
The final step is to tell the compiler in which phase the AST transformation will run. We choose the same phase that we used for our AstBuilder, namely "Semantic Analysis". This is done with another annotation as follows:


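import org.codehaus.groovy.control.CompilePhase
import org.codehaus.groovy.transform.ASTTransformation
import org.codehaus.groovy.transform.GroovyASTTransformation

@GroovyASTTransformation(phase = CompilePhase.SEMANTIC_ANALYSIS)
public class RequiresTransformation implements ASTTransformation {
...
}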


End
What remains? Testing and better error signalling. Stay tuned for the next blog entries.

Sourcecode
Requires.groovy    The Requires-Annotation
RequiresTransformation.groovy    The AST Transformation for the Requires-Annotation

Interesting Classes
org.codehaus.groovy.ast.builder.AstBuilder    The builder that makes it simple to create ASTs.
org.codehaus.groovy.ast.builder.AstStringCompiler    The class responsible for creating ASTs from Strings.
org.codehaus.groovy.transform.ASTTransformation    The interface between compiler and transformation.
