ExcelXMLWriter

Posted by Aaron Feng Fri, 27 Apr 2007 04:23:21 GMT

As Steve already mentioned, we are working on generation of Excel files this week. After doing some research, I found a free ExcelXMLWriter by Carlos Aguilar. It is written in C#, and does not require Excel to be installed in order to generate the files. We played around with it for a few days, it seems to do what we want. Carlos also offers a code generator tool that transforms Excel file into C# code.

Translation Part 2 via Abstract Syntax Tree

Posted by Aaron Feng Fri, 27 Apr 2007 03:27:00 GMT

Last post I created a simple translator by using ANTLR V3 that translated C# (or Java) like syntax into PL/SQL. The translation relied heavily on String Template and token rewrite. It generated the following PL/SQL statements from a single class declaration:

  • Drop table
  • Drop sequence
  • Create table
  • Create sequence
  • Create Trigger (sequence and trigger are for auto increment id column)

Notice there was not a one-to-one relationship between the input and the output. Output like this is ideal for template based translation, but not when you are constructing an Abstract Syntax Tree (AST). The reason is the tree nodes will have to be duplicated in order to render multiple items from a single input.

In this example, my language will stay the same, but output will only contain the creation of table. In the future post, I will change the language into something simpler, and the ability to output all the previous items (drop table, create auto-incremented column) separately. In this example I only want to focus on the basics on creating AST by keeping everything else as simple as possible.

Let's take look at the new parser and lexer grammar.

// CSharpSQL.g

grammar CSharpSQL;
options {
 output = AST;
 ASTLabelType = CommonTree;
}

tokens {
 CLASSDEF;
 VARDEF;
}

// parser
program : (declaration { System.out.println($declaration.tree.toStringTree()); } ) + ;

declaration : class_statement '{' (variable_statement)* '}'
                -> ^(class_statement variable_statement*) ;

class_statement : scope_modifier 'class' ID
                   -> ^(CLASSDEF ID) ;

variable_statement : scope_modifier type ID  ';'
                      -> ^(VARDEF type ID) ;

scope_modifier : 'public' ;

// more can be added
type : 'string'
     | 'int'
     | 'decimal'
     | 'DateTime' ;

// lexer
ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_') * ;

WS  :   ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;

The first thing to notice is the tokens declaration before the program rule. Since the the output will be a tree, tokens declaration defines a set of virtual tokens that can be used as tree nodes. Virtual token help the tree parser to understand the relationship between nodes. In this example there are two virtual tokens, CLASSDEF and VARDEF. We need a way to distinguish the difference between variable name and the name of the class since we cannot tell by the token itself. Class name will have CLASSDEF as its root node, and variable name will have VARDEF as its root node.

Each rule returns a subtree by "rewriting" the input using the -> notation. Everything on the right hand side of -> encodes the hierarchy of the subtree. The token closest to the ^ symbol is the root node, and rest within the parentheses are the children of the root. Notice that we only keep the meaningful nodes in the tree, and discard any unnecessary input by "rewriting" them. Some rules do not return a subtree, but simply delegate the other rules to return the subtree.

Now we need a tree walker to visit each node in the tree so it can render the final output. The tree walker grammar is shown below:

// Translate.g
tree grammar Translate;

options {
 tokenVocab = CSharpSQL;
 ASTLabelType = CommonTree;
}

@members {
 String className;
 List<String> columns = new ArrayList<String>();
}

program : (declaration
          {
           String table = "CREATE TABLE " + className + '\n' + "(" + '\n';
           String seperator = ",";
           Object[] arrayColumns = columns.toArray();
           for(int i = 0; i < arrayColumns.length; i++) {
            if(i == arrayColumns.length - 1) seperator = "";
            table += " " + arrayColumns[i].toString() + seperator + '\n';
           }
           table += ")";
           System.out.println(table);
           columns.clear();
          } ) + ;

declaration : ^(CLASSDEF ID variable_statement*) { className = $ID.text; } ;

variable_statement : ^(VARDEF type ID)
                     {
                      columns.add($ID.text + " " + $type.value + " NOT NULL");
                     } ;

type returns [String value]
    : 'string' { $value = "nvarchar(255)"; }
    | 'int'   { $value = "integer"; }
    | 'decimal' { $value = "number(21,6)"; }
    | 'DateTime'  { $value = "date"; };

Here is where all the translation happens. Once again we have a couple of member variables declared in @members to help collect all the required fields for the translation. The tree grammar rule is very similar to parser or lexer rule. Since the input is a tree, you need to define rules that will match the nodes in the tree. You construct the rules by specifying the relationships between nodes using the same syntax parser used to create the tree. We know whenever the tree walker sees the CLASSDEF one of the child nodes (ID) is the class name. CLASSDEF can have zero or more variable_statement which means it can have zero or more VARDEF as its children. Each VARDEF contains exactly two children nodes, a type and ID which is used to construct column definition.

Let's run the recognizer using the following test program:

// Test.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import java.io.*;

public class Test {
 public static void main(String[] args) throws Exception {
  ANTLRInputStream input = new ANTLRInputStream(System.in);
  CSharpSQLLexer lexer = new CSharpSQLLexer(input);
  TokenRewriteStream tokens = new TokenRewriteStream(lexer);

  CSharpSQLParser parser = new CSharpSQLParser(tokens);
  CSharpSQLParser.program_return r = parser.program();

  CommonTree tree = (CommonTree)r.getTree();
  CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
  Translate walker = new Translate(nodes);
  walker.program();
 }
}

Notice that the output of the parser is passed into the Translate tree walker for the translation.

CSharp to PL/SQL 3

Posted by Aaron Feng Sun, 15 Apr 2007 01:25:00 GMT

UPDATED: Part 2 of this translation has been added.

After reading the ANTLR book, I decided to build a simple translator that will recognize C# like syntax and output PL/SQL. Even though I heard about ANTLR two years ago, I have not kept up with it, so I decided to start out simple. More complicated translator often involves creating an Abstract Syntrax Tree (AST), and a tree parser that will walk the tree during translation. In my translator, I will only use token rewrite and String Template in order to keep it simple. Rewrite allows you to "rewrite" the input token to something else. String Template is a template engine that will be used to specify the overall structure of the output.

The input language will look like the following:

public class Users {
    public int UserId;
    public string FirstName;
    public string LastName;
}

The output will be a script to create a Users table with 3 columns defined in the Users class. It will also create a sequence and a trigger on UserId column to auto increment the value on insert. Before creating any objects in the database it will check for the existence and drop them if they already exist.

The grammar used to generate lexer and parser is shown below (CSharpSQL.g):

grammar CSharpSQL;

options {
 output = template;
 rewrite = true;
}

@members {
  String className;
  List columns = new ArrayList();
}

// parser
program : declaration+ -> translate(
 name = { className }, 
 id = { className.substring(0,className.length() - 1) + columns.get(0) }, 
 columns = { columns } 
) ;

declaration : class_statment '{' (variable_statment)* '}' ;


class_statment : scope_modifier 'class' ID
{  
  className = $ID.text; 
} ;

variable_statment : scope_modifier type ID  ';'  
{    
  String tmp = $ID.text;
  if(tmp.toLowerCase().equals("id"))
   tmp = className.substring(0, className.length() - 1) + tmp;
  columns.add(tmp + " " + $type.text + " NOT NULL");
} ;

scope_modifier : 'public' ;

type : 'string' -> template() "nvarchar(255)"
     | 'int' -> template() "integer"
     | 'decimal' -> template() "number(21,6)"
     | 'DateTime' -> template() "date" ;
     // add more types if needed

// lexar
ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_') * ;

WS  :   ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;

The grammar used here is a combined grammar which means it contains the rule for the lexer and parser. It uses a couple of member variables to remember matched tokens, so it can be passed to the template for the final generation.

The template used for the generation looks like the following (CSharpSQL.stg):

group CSharpSQL;

translate(name, id, columns) ::= <<
BEGIN
EXECUTE IMMEDIATE 'DROP TABLE <name> CASCADE CONSTRAINTS PURGE';
EXCEPTION WHEN OTHERS THEN NULL;
END;
/

BEGIN
EXECUTE IMMEDIATE 'DROP SEQUENCE <name>_SEQ';
EXCEPTION WHEN OTHERS THEN NULL;
END;
/

CREATE TABLE <name>
(
  <columns; separator=",\n">
)
/

CREATE SEQUENCE <name>_SEQ
        START WITH 1
        INCREMENT BY 1
/

CREATE OR REPLACE TRIGGER <name>_TR
BEFORE INSERT ON <name>
FOR EACH ROW
DECLARE TEMP_NO int;
 BEGIN
     SELECT <name>_SEQ.NEXTVAL INTO :NEW.<id>Id FROM DUAL;
     SELECT <name>_SEQ.CURRVAL INTO GLOBALPKG.IDENTITY FROM DUAL;
END;
/
>>

The recognizer can be tested with this test class (Test.java):

import org.antlr.runtime.*;
import org.antlr.stringtemplate.*;
import java.io.*;

public class Test {
 public static void main(String[] args) throws Exception {
  FileReader groupFileReader = new FileReader("CSharpSQL.stg");
  StringTemplateGroup templates = new StringTemplateGroup(groupFileReader);
  groupFileReader.close();

  ANTLRInputStream input = new ANTLRInputStream(System.in);
  CSharpSQLLexer lexer = new CSharpSQLLexer(input);

  TokenRewriteStream tokens = new TokenRewriteStream(lexer);

  CSharpSQLParser parser = new CSharpSQLParser(tokens);
  parser.setTemplateLib(templates);

  CSharpSQLParser.program_return r = parser.program();
  StringTemplate output = r.getTemplate();
  System.out.println(output.toString());
 }
}

The language I defined here is fairly limited, but it can be expanded if needed. The output can be easily retargeted to a different database if a different template is specified. The reason I created this translator is because writing PL/SQL can be a tedious task, and error prone. With a simple translator, it can do most of the work for me.

Method inheritance? 1

Posted by Aaron Feng Tue, 13 Mar 2007 03:05:00 GMT

A couple of days ago I was poking around Park Place source code and I saw something resembling the following syntax:

# ... is some arbitrary string in this case
class A < B '...'
 # ...
end

I am no Ruby expert, in fact, I have written very little Ruby code so far. However, the code above looked really odd to me. It looks like class A is inheriting from class B, and class B is taking some arbitrary string as an argument.

Well, after closer inspection I was completely wrong on what B is. B is not a class at all, in fact, it is a method under cover.

def B str
 # do something with str variable here
end

'...' is some argument to the B method, which makes a lot of sense, but how can class A inherit from a method? If you try the code above, whatever is in method B will execute, but you will get an error when class A is trying to inherit from B. The reason is simple, class A is expecting to inherit from a class not a method. To prevent the error, you need to somehow construct a class in method B.

def B str
 # do something with str variable here
 Class.new
end

Ruby automatically returns the last line as the return value. In this case, you will not get the error anymore.

The first question one should ask is, why would anyone want to do this? The code I showed above is a very simple version of the real code that appeared in Park Place. In Park Place it creates a new class and adds methods to the class based on the input. This is powerful because it allows one to hide the base class since it does not actually exist. On top of that, using this technique one can write code that writes code. Ruby on Rails uses similar techniques to achieve all the magic.

Host your own Amazon S3

Posted by Aaron Feng Thu, 08 Mar 2007 14:17:00 GMT

Park Place is an Amazon S3 clone in Ruby minus the SOAP support. It is amazing how little amount of code is required to achieve what S3 does.

Another interesting aspect of Park Place is that it uses Camping microframework. Camping is similar to Rails in the sense that it is a web framework that utilize the MVC design pattern. It is small web framework that weighs less than 4k! It is described as "a little white blood cell in the vein of Rails" on [Camping's][camp] wiki. You can see all the code in the framework in one page of your browser.

You can actually migrate from Camping to Rails fairly easily. That got me thinking, maybe [Shoechicken][shoechicken] should use Camping instead of Rails to start with. We can then migrate to Rails as the need arises.

Older posts: 1 2 3 4