Grokking Rhino Mocks
Recently I have been doing research about integrating Rhino Mocks into the project at work. It brought back some memories because I used Rhino Mocks at my last job. The API has changed some since I used it last, but it took me no time to get started. In the past, we talked about experimenting with a mocking framework at my current job, but we never pursued it.
Typically we write stubs to isolate the class we are testing. This approach works great for testing the state of an object, but it can become cumbersome when behavior needs to be tested. Some of the pros and cons of using stubs are as follows:
Pros:
- Fast and lightweight.
- Easier to write and understand.
Cons:
- Specialized methods are required in order to verify states.
- Doesn't test the behavior.
- More non-production code needs to be written and maintained.
- Time consuming for complicated interactions.
- Requires maintenance when the logic changes.
- Mundane and repetitive.
Currently we have over 2000 tests in the project and over 100 stubs! It would be nice if we could eliminate all the stubs. So far I have converted around 40 tests, and everything is going much smoother than I expected.
Using Rhino Mocks promotes behavior testing, which has helped me become more aware of what the code is actually doing. The tests fail automatically if a method is called unexpectedly. It is also simple to verify that expected calls happen in the correct sequence when order is important. One of my favorite features is the ability to test abstract base classes. We usually create a testable class that extends the abstract class in order to test the base behavior. I'm not a big fan of this approach, but it does work.
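To make the idea of a strict, ordered mock concrete, here is a minimal hand-rolled sketch in Java. It is not Rhino Mocks (which is a .NET library) and does not use its API. The AuditLog interface, the OrderService class, and the strictMock helper are all hypothetical names invented for the illustration. The mock records the expected calls up front and fails as soon as the class under test calls anything else or calls things out of order.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class StrictMockSketch {

    // Hypothetical collaborator of the class under test.
    public interface AuditLog {
        void open();
        void write(String message);
        void close();
    }

    // Hypothetical class under test: its "behavior" is the calls it makes.
    public static class OrderService {
        private final AuditLog log;

        public OrderService(AuditLog log) {
            this.log = log;
        }

        public void place(String item) {
            log.open();
            log.write("placed " + item);
            log.close();
        }
    }

    // Creates a mock that accepts only the expected method names, in order.
    public static AuditLog strictMock(Deque<String> expected) {
        InvocationHandler handler = (proxy, method, args) -> {
            String next = expected.poll();
            if (!method.getName().equals(next)) {
                throw new AssertionError("Unexpected call: " + method.getName()
                        + ", expected: " + next);
            }
            return null; // every AuditLog method returns void
        };
        return (AuditLog) Proxy.newProxyInstance(
                AuditLog.class.getClassLoader(),
                new Class<?>[] { AuditLog.class },
                handler);
    }

    // Returns true when the service made exactly the expected calls, in order.
    public static boolean verifyPlaceOrder() {
        Deque<String> expected =
                new ArrayDeque<>(Arrays.asList("open", "write", "close"));
        new OrderService(strictMock(expected)).place("book");
        return expected.isEmpty(); // nothing left over means all calls happened
    }

    public static void main(String[] args) {
        System.out.println("all expectations met: " + verifyPlaceOrder());
    }
}
```

A mocking framework generates this kind of plumbing for you, which is why the hand-written stubs can go away.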
After I converted the tests to use Rhino Mocks instead of stubs I noticed the following things:
- Removed unnecessary or duplicated method calls.
- General refactoring to make the code cleaner.
- More expressive tests using expectations (behavior testing) and fewer asserts (state testing).
- Deletion of unnecessary stubs.
Overall, this has been a really good experience in taking a closer look at the tests we first wrote. I believe if the tests are healthy the code will also be. Of course, there's a penalty to be paid for using a mocking framework. It caused the tests to run a little bit slower than before. Personally, I don't mind the slower test runs if I can actually test the behavior and reduce the amount of code required.
An interesting article on mock objects: Mocks Aren't Stubs
DiffMerge 3.0
For anyone who is tired of not being able to do a 3-way merge or directory comparison with their diff tool: SourceGear is offering a cross-platform diff tool that does just that. Best of all, it's FREE. DiffMerge 3.0 can be downloaded from here.
How should your application be configured?
Deployment is an important process of the software cycle. It needs the same level of care as the application itself in order to have a professional and successful product. Deployment needs to make a good first impression. After all, it is the first thing a customer sees.
Deployment itself is a big topic. In this post, I only want to focus on the actual deployment package. Since I'm working on a Windows platform, I will specifically be talking about the Windows Installer technology.
Besides the daily software development, I'm also involved with the direction of our deployment package. As our software matures, the required configuration also increases. The first intuition is to let the MSI package do all the configuration. This approach has several drawbacks, and they might even ruin the professional feel of the application.
Let's start off with how one would allow the MSI to do all the fancy configurations. The only way to do it in the MSI world is to use CustomActions (CA). Anyone who has written a CA will tell you it is not trivial to build it reliably. There are several ways CA can be written, but the most robust way is to use unmanaged C++. This requires not only the knowledge of C++, but a solid understanding of the internals of the MSI. On top of that, the UI facility in the MSI is far less simple than standard Winforms. Complicated configurations might require a fancier UI. This will also not be easy to achieve.
Let's say that in a perfect world, CAs come for free. The problem doesn't stop there. What if the customer needs to reconfigure the software? Since the configuration is embedded into the MSI, customers would have no choice but to uninstall and reinstall. Even worse, some machine- or user-specific files might be lost during the uninstall, since it removes all the files that came with the package. Sure, there must be some way to change the configuration manually, right? But if the MSI did it for the customer in the first place, he or she may well have no idea how to do it manually.
I would like to propose a different approach: an external configuration tool that ships with the package. The MSI can even fire up the configuration tool after a successful installation. This would solve all the problems I described above, and the code base can be more homogeneous with the application (assuming you are not building a C++ app). It will be much easier to build an intelligent external tool that provides real-time feedback on any configuration errors. The tool can even receive command-line arguments passed in from the MSI for silent installation.
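As a rough sketch of what the command-line side of such a tool could look like, the Java example below parses a silent switch plus key=value settings and validates them before anything is applied. The /silent switch, the server and port setting names, and the ConfigToolSketch class are all hypothetical and chosen only for illustration. A real tool would use whatever settings the application actually needs, and could be written in the application's own language.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfigToolSketch {

    // Parsed command line: a silent flag plus key=value settings.
    public static final class Options {
        public boolean silent;
        public final Map<String, String> settings = new LinkedHashMap<>();
    }

    // Parses arguments such as: /silent server=db01 port=1521
    public static Options parse(String[] args) {
        Options options = new Options();
        for (String arg : args) {
            if (arg.equalsIgnoreCase("/silent")) {
                options.silent = true;
                continue;
            }
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + arg);
            }
            options.settings.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return options;
    }

    // Returns a list of human-readable problems; an empty list means valid.
    public static List<String> validate(Options options) {
        List<String> errors = new ArrayList<>();
        if (!options.settings.containsKey("server")) {
            errors.add("Missing required setting: server");
        }
        String port = options.settings.get("port");
        if (port != null && !port.matches("\\d{1,5}")) {
            errors.add("Port must be a number: " + port);
        }
        return errors;
    }

    public static void main(String[] args) {
        Options options = parse(args);
        List<String> errors = validate(options);
        if (errors.isEmpty()) {
            System.out.println("Configuration accepted: " + options.settings);
            return;
        }
        errors.forEach(System.err::println);
        if (options.silent) {
            // No UI in silent mode, so report failure through the exit code.
            System.exit(1);
        }
    }
}
```

Because the same validate step runs whether the tool is started by the installer or by the customer later, reconfiguring no longer requires an uninstall and reinstall.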
One could argue that the configuration can be done inside the MSI via CAs while still shipping an external tool to configure the application. But who would want to duplicate this effort? If you are in the .NET world, you will have to support two different codebases that configure the application.
I think everyone would agree that reducing the number of CAs can improve the reliability and the compatibility of the installation package.
I'm not saying CAs should be completely avoided, but use them wisely. What I'm saying is the application should be configurable outside of the deployment package.
Is Excel the ultimate business solution?
Sometimes I wonder why exporting to Excel is so commonly found in software. It seems like the first requirement for all business-like software is the ability to export to Excel. This is also apparent from the development side, since many UI controls, especially 3rd party controls, have this magical ability baked in right out of the box.
I understand Excel is a powerful tool, and it solves problems for many. It's just ironic to me that we are building software that exports to some other format, so people can use an external tool to make sense of the data. Shouldn't the software itself help people to analyze the data?
I have actually seen software that has no real ability to analyze the underlying data. The only way to make sense of the data is to export to Excel then run some fancy macro in Excel.
The baked-in exporting feature also creates another dilemma. If the controls are made by different vendors (and maybe even when they come from the same vendor), they probably don't export the same way in terms of layout and formats. The facility to change the exporting behavior is usually limited or nonexistent.
One could argue that Excel is just another format that allows people to share data, and that they should be able to manipulate the data in Excel however they want. I'm totally fine with that. What I have a problem with is exporting to Excel being used as a replacement for features with real business value. We shouldn't design the application around some magical control just because it can export to Excel.
Google Dahon Bikes
I have been checking out Dahon bikes for over a year now. Dahon specializes in folding bikes. A folding bike is perfect for me since I take the commuter train to work. The primary reason I have not purchased one yet is that I cannot figure out which model will best fit my needs. I am sure people from Google do not have this problem. Google recently pulled an Ikea by giving Dahon bikes to all of the Google employees. Well, not all Google employees, just those in Europe, Africa, and the Middle East. I guess Google feels people in America would not trade in their Escalade for a bike.
Currently I have my eye on the Matrix (no, I did not pick the bike because of its name).

A full list of Dahon's 2007 bikes can be found here. Which one would you get?
Custom Actions
I have recently put my installer hat on again. At work, we use Windows Installer XML (WiX) to create our MSIs. I have been away from WiX since I came up with a way to automate the MSI generation on each build.
In order to customize and improve the installation experience, custom actions are often required. Custom actions in the MSI world allow you to inject custom behavior during installation via VBScript, JScript or C++.
It is crucial to understand custom actions fully before starting to build them. Steven Bone wrote a great three-part tutorial on this topic a while back.
- Custom Action Tutorial Part I – Custom Action Types and Sequences
- Custom Action Tutorial Part II – Creating the Project
- Custom Action Tutorial Part III – What we did in Part II
I really like this tutorial since there is so little material on this topic, especially given that creating custom actions is not a trivial task. If you do not believe me, give it a read ;)
ExcelXMLWriter
As Steve already mentioned, we are working on generating Excel files this week. After doing some research, I found a free ExcelXMLWriter by Carlos Aguilar. It is written in C#, and does not require Excel to be installed in order to generate the files. We played around with it for a few days, and it seems to do what we want. Carlos also offers a code generator tool that transforms an Excel file into C# code.
Translation Part 2 via Abstract Syntax Tree
In the last post, I created a simple translator using ANTLR v3 that translated C# (or Java) like syntax into PL/SQL. The translation relied heavily on String Template and token rewrite. It generated the following PL/SQL statements from a single class declaration:
- Drop table
- Drop sequence
- Create table
- Create sequence
- Create Trigger (sequence and trigger are for auto increment id column)
Notice there was not a one-to-one relationship between the input and the output. Output like this is ideal for template based translation, but not when you are constructing an Abstract Syntax Tree (AST). The reason is the tree nodes will have to be duplicated in order to render multiple items from a single input.
In this example, my language will stay the same, but the output will only contain the table creation. In a future post, I will change the language into something simpler and add the ability to output all the previous items (drop table, create auto-incremented column) separately. Here I only want to focus on the basics of creating an AST by keeping everything else as simple as possible.
Let's take a look at the new parser and lexer grammar.
// CSharpSQL.g
grammar CSharpSQL;

options {
    output = AST;
    ASTLabelType = CommonTree;
}

tokens {
    CLASSDEF;
    VARDEF;
}

// parser
program : (declaration { System.out.println($declaration.tree.toStringTree()); } )+ ;

declaration : class_statement '{' (variable_statement)* '}'
        -> ^(class_statement variable_statement*) ;

class_statement : scope_modifier 'class' ID
        -> ^(CLASSDEF ID) ;

variable_statement : scope_modifier type ID ';'
        -> ^(VARDEF type ID) ;

scope_modifier : 'public' ;

// more can be added
type : 'string'
     | 'int'
     | 'decimal'
     | 'DateTime' ;

// lexer
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;

The first thing to notice is the tokens declaration before the program rule. Since the output will be a tree, the tokens declaration defines a set of virtual tokens that can be used as tree nodes. Virtual tokens help the tree parser understand the relationship between nodes. In this example there are two virtual tokens, CLASSDEF and VARDEF. We need a way to distinguish a variable name from a class name, since we cannot tell them apart by the token itself (both are ID tokens). The class name will have CLASSDEF as its root node, and each variable name will have VARDEF as its root node.
Each rule returns a subtree by "rewriting" the input using the -> notation. Everything on the right-hand side of -> encodes the hierarchy of the subtree. The element immediately after ^( is the root node, and the rest within the parentheses are the children of the root. Notice that we keep only the meaningful nodes in the tree and discard any unnecessary input, such as the braces and semicolons, by leaving it out of the rewrite. Some rules do not build a subtree of their own, but simply delegate to other rules to return the subtree.
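To picture the tree these rewrite rules describe, here is a small stand-alone Java sketch that builds the same shape by hand and prints it in the parenthesized form. It does not use the ANTLR runtime. The Node class and the varDef and usersTree helpers are hypothetical stand-ins written only for this illustration, and the exact text ANTLR's own toStringTree() prints may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class AstSketch {

    // Minimal tree node: a label plus ordered children.
    public static final class Node {
        public final String text;
        public final List<Node> children = new ArrayList<>();

        public Node(String text, Node... kids) {
            this.text = text;
            for (Node kid : kids) {
                children.add(kid);
            }
        }

        public Node add(Node child) {
            children.add(child);
            return this;
        }

        // Renders the parenthesized form: (root child1 child2 ...).
        public String toStringTree() {
            if (children.isEmpty()) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node child : children) {
                sb.append(' ').append(child.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    // Hand-built equivalent of the rewrite ^(VARDEF type ID).
    public static Node varDef(String type, String name) {
        return new Node("VARDEF", new Node(type), new Node(name));
    }

    // Tree for: public class Users { public int UserId; public string FirstName; }
    // The modifiers, braces and semicolons are gone; only meaningful nodes remain.
    public static Node usersTree() {
        return new Node("CLASSDEF", new Node("Users"))
                .add(varDef("int", "UserId"))
                .add(varDef("string", "FirstName"));
    }

    public static void main(String[] args) {
        // prints: (CLASSDEF Users (VARDEF int UserId) (VARDEF string FirstName))
        System.out.println(usersTree().toStringTree());
    }
}
```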
Now we need a tree walker to visit each node in the tree so it can render the final output. The tree walker grammar is shown below:
// Translate.g
tree grammar Translate;

options {
    tokenVocab = CSharpSQL;
    ASTLabelType = CommonTree;
}

@members {
    String className;
    List<String> columns = new ArrayList<String>();
}

program : (declaration
    {
        // Render one CREATE TABLE statement per class declaration.
        String table = "CREATE TABLE " + className + '\n' + "(" + '\n';
        String separator = ",";
        Object[] arrayColumns = columns.toArray();
        for (int i = 0; i < arrayColumns.length; i++) {
            // No trailing comma after the last column.
            if (i == arrayColumns.length - 1) separator = "";
            table += " " + arrayColumns[i].toString() + separator + '\n';
        }
        table += ")";
        System.out.println(table);
        columns.clear();
    } )+ ;

declaration : ^(CLASSDEF ID variable_statement*) { className = $ID.text; } ;

variable_statement : ^(VARDEF type ID)
    {
        columns.add($ID.text + " " + $type.value + " NOT NULL");
    } ;

type returns [String value]
    : 'string'   { $value = "nvarchar(255)"; }
    | 'int'      { $value = "integer"; }
    | 'decimal'  { $value = "number(21,6)"; }
    | 'DateTime' { $value = "date"; } ;

Here is where all the translation happens. Once again we have a couple of member variables declared in @members to help collect the fields required for the translation. A tree grammar rule is very similar to a parser or lexer rule. Since the input is a tree, you need to define rules that match the nodes in the tree. You construct the rules by specifying the relationships between nodes using the same syntax the parser used to create the tree. Whenever the tree walker sees CLASSDEF, we know its first child node (ID) is the class name. CLASSDEF can be followed by zero or more variable_statement matches, which means it can have zero or more VARDEF subtrees as its children. Each VARDEF contains exactly two child nodes, a type and an ID, which are used to construct the column definition.
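The same walk can be sketched in plain Java, without the ANTLR runtime, to show what the tree grammar's actions amount to. The Node class and the translate and sqlType methods below are hypothetical stand-ins for the generated tree parser, not ANTLR APIs. The type mapping and the layout of the output copy the actions in the grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TreeWalkSketch {

    // Minimal tree node standing in for ANTLR's CommonTree.
    public static final class Node {
        public final String text;
        public final List<Node> children;

        public Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Same mapping as the "type" rule in the tree grammar.
    public static String sqlType(String type) {
        switch (type) {
            case "string":   return "nvarchar(255)";
            case "int":      return "integer";
            case "decimal":  return "number(21,6)";
            case "DateTime": return "date";
            default: throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    // Walks ^(CLASSDEF ID VARDEF*) and renders a CREATE TABLE statement.
    public static String translate(Node classDef) {
        if (!classDef.text.equals("CLASSDEF")) {
            throw new IllegalArgumentException("Expected CLASSDEF but got: " + classDef.text);
        }
        String className = classDef.children.get(0).text; // first child is the ID
        List<String> columns = new ArrayList<>();
        for (Node varDef : classDef.children.subList(1, classDef.children.size())) {
            // Each VARDEF has exactly two children: the type, then the ID.
            String type = varDef.children.get(0).text;
            String name = varDef.children.get(1).text;
            columns.add(" " + name + " " + sqlType(type) + " NOT NULL");
        }
        return "CREATE TABLE " + className + "\n(\n"
                + String.join(",\n", columns) + "\n)";
    }

    public static void main(String[] args) {
        Node users = new Node("CLASSDEF",
                new Node("Users"),
                new Node("VARDEF", new Node("int"), new Node("UserId")),
                new Node("VARDEF", new Node("string"), new Node("FirstName")));
        System.out.println(translate(users));
    }
}
```

For the Users tree in main, this prints a CREATE TABLE Users statement with one NOT NULL column per VARDEF, separated by commas.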
Let's run the recognizer using the following test program:
// Test.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import java.io.*;

public class Test {
    public static void main(String[] args) throws Exception {
        // Lex and parse standard input into an AST.
        ANTLRInputStream input = new ANTLRInputStream(System.in);
        CSharpSQLLexer lexer = new CSharpSQLLexer(input);
        TokenRewriteStream tokens = new TokenRewriteStream(lexer);
        CSharpSQLParser parser = new CSharpSQLParser(tokens);
        CSharpSQLParser.program_return r = parser.program();

        // Feed the resulting tree to the tree walker.
        CommonTree tree = (CommonTree)r.getTree();
        CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
        Translate walker = new Translate(nodes);
        walker.program();
    }
}

Notice that the output of the parser is passed into the Translate tree walker for the translation.
CSharp to PL/SQL
UPDATED: Part 2 of this translation has been added.
After reading the ANTLR book, I decided to build a simple translator that recognizes C#-like syntax and outputs PL/SQL. Even though I heard about ANTLR two years ago, I have not kept up with it, so I decided to start out simple. More complicated translators often involve creating an Abstract Syntax Tree (AST) and a tree parser that walks the tree during translation. In my translator, I will only use token rewrite and String Template in order to keep it simple. Rewrite allows you to "rewrite" an input token to something else. String Template is a template engine that will be used to specify the overall structure of the output.
The input language will look like the following:
public class Users {
    public int UserId;
    public string FirstName;
    public string LastName;
}

The output will be a script that creates a Users table with the 3 columns defined in the Users class. It will also create a sequence and a trigger on the UserId column to auto-increment the value on insert. Before creating any objects in the database, it will drop them if they already exist.
The grammar used to generate lexer and parser is shown below (CSharpSQL.g):
grammar CSharpSQL;

options {
    output = template;
    rewrite = true;
}

@members {
    String className;
    List columns = new ArrayList();
}

// parser
// id is the singular form of the class name (Users -> User);
// the template appends "Id" to it to name the key column.
program : declaration+ -> translate(
        name = { className },
        id = { className.substring(0, className.length() - 1) },
        columns = { columns }
    ) ;

declaration : class_statement '{' (variable_statement)* '}' ;

class_statement : scope_modifier 'class' ID
    {
        className = $ID.text;
    } ;

variable_statement : scope_modifier type ID ';'
    {
        String tmp = $ID.text;
        // A column named just "Id" is prefixed with the singular class name.
        if (tmp.toLowerCase().equals("id"))
            tmp = className.substring(0, className.length() - 1) + tmp;
        columns.add(tmp + " " + $type.text + " NOT NULL");
    } ;

scope_modifier : 'public' ;

type : 'string'   -> template() "nvarchar(255)"
     | 'int'      -> template() "integer"
     | 'decimal'  -> template() "number(21,6)"
     | 'DateTime' -> template() "date" ;
// add more types if needed

// lexer
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;

The grammar used here is a combined grammar, which means it contains the rules for both the lexer and the parser. It uses a couple of member variables to remember matched tokens so they can be passed to the template for the final generation.
The template used for the generation looks like the following (CSharpSQL.stg):
group CSharpSQL;
translate(name, id, columns) ::= <<
BEGIN
EXECUTE IMMEDIATE 'DROP TABLE <name> CASCADE CONSTRAINTS PURGE';
EXCEPTION WHEN OTHERS THEN NULL;
END;
/
BEGIN
EXECUTE IMMEDIATE 'DROP SEQUENCE <name>_SEQ';
EXCEPTION WHEN OTHERS THEN NULL;
END;
/
CREATE TABLE <name>
(
<columns; separator=",\n">
)
/
CREATE SEQUENCE <name>_SEQ
START WITH 1
INCREMENT BY 1
/
CREATE OR REPLACE TRIGGER <name>_TR
BEFORE INSERT ON <name>
FOR EACH ROW
DECLARE TEMP_NO int;
BEGIN
SELECT <name>_SEQ.NEXTVAL INTO :NEW.<id>Id FROM DUAL;
SELECT <name>_SEQ.CURRVAL INTO GLOBALPKG.IDENTITY FROM DUAL;
END;
/
>>
The recognizer can be tested with this test class (Test.java):
import org.antlr.runtime.*;
import org.antlr.stringtemplate.*;
import java.io.*;

public class Test {
    public static void main(String[] args) throws Exception {
        // Load the template group that defines the output structure.
        FileReader groupFileReader = new FileReader("CSharpSQL.stg");
        StringTemplateGroup templates = new StringTemplateGroup(groupFileReader);
        groupFileReader.close();

        // Lex and parse standard input, then render the template.
        ANTLRInputStream input = new ANTLRInputStream(System.in);
        CSharpSQLLexer lexer = new CSharpSQLLexer(input);
        TokenRewriteStream tokens = new TokenRewriteStream(lexer);
        CSharpSQLParser parser = new CSharpSQLParser(tokens);
        parser.setTemplateLib(templates);
        CSharpSQLParser.program_return r = parser.program();
        StringTemplate output = r.getTemplate();
        System.out.println(output.toString());
    }
}

The language I defined here is fairly limited, but it can be expanded if needed. The output can easily be retargeted to a different database by specifying a different template. I created this translator because writing PL/SQL by hand is tedious and error prone. A simple translator can do most of the work for me.
ANTLR

I have been following ANother Tool for Language Recognition (ANTLR) on and off for about two years now. I have always felt ANTLR is one of the most exciting, yet lesser-known, tools out there. With ANTLR you can build a recognizer, a program that is able to recognize input, by specifying the grammar for a given input language. ANTLR does all the dirty work for you: when you feed it a BNF-style grammar, it builds the lexical analyzer (character stream in, tokens out) and the parser (tokens in, syntax tree out). Once you have a recognizer you can do all kinds of cool things, like transformation. Basically, you can build a Domain Specific Language (DSL) that helps you solve a problem.
I am glad to see there will be a Definitive ANTLR Reference hitting bookstores soon. Currently the book is in beta, and it is available for purchase in PDF form. After reading the book, I feel that I understand ANTLR at a much deeper level. I am planning to build some simple transformation tools with ANTLR in the near future. I will talk about them in later posts. If you want to be a savvy programmer, this book is a must-read.
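To give a feel for the first half of that pipeline, here is a tiny hand-written lexer in Java that turns a character stream into tokens. It is only a sketch of the kind of code a tool like ANTLR generates from a grammar, not ANTLR's actual output. The LexerSketch class and its tokenize method are made up for this illustration, and the input is a hypothetical C#-like declaration.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerSketch {

    // Splits input into identifier tokens and single-character symbols,
    // skipping whitespace between them.
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (Character.isLetter(c) || c == '_') {
                // Identifier: a letter or underscore, then letters, digits, underscores.
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i))
                            || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else {
                // Anything else is treated as a one-character token.
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints: [public, int, UserId, ;]
        System.out.println(tokenize("public int UserId;"));
    }
}
```

A parser would then consume this token list and check it against the grammar rules, which is the part that becomes tedious to write by hand as the language grows.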
