Implementing Interface in JRuby 1
Recently I decided to convert one of the tools I wrote in C# to JRuby. During the process I needed to implement a Java interface in JRuby. This sounds really weird to me because Ruby has no need for an interface because of the duck-typing system. The syntax for implementing an interface in JRuby is shown below:
class SomeClass
include javax.swing.tree.TreeCellEditor
# ...
endIn this case, SomeClass implements TreeCellEditor Java inteface. It uses mix-in like syntax. Prior to the 1.0 release, I believe it used ruby inheritance syntax instead of the current mix-in syntax. That would look even more weird because Ruby doesn't support multiple inheritance and it will be confusing when you are implemneting an interface and extending a class.
I don't know about you, but implementing an interface in Ruby still feels weird to me.
Time to go Functional
In my college days I avoided functional programming languages such as LISP and Prolog like the plague. Recently I decided to revisit and learn a functional language. My choice of language is Erlang (as you can see from my last post).
One of the primary reasons I chose Erlang over other functional languages is Erlang's ability to handle concurrency and distributed programming. Before I get into concurrent programming using Erlang I need to learn the basics. I started my Erlang adventure by reading Getting Started with Erlang. I would recommend anyone that is interesting in Erlang to start there. I found the reading was easy and helpful for someone who doesn't know anything about Erlang. Like many other dynamic lanaguages, Erlang comes with an interpreter. I found it very helpful to enter the code into the interpreter while I'm reading the examples. Next step is to read Joe Armstrong's book: Programming Erlang Maybe one day I can find a good use for Erlang.
Erlang in Action
I decided to see if I could rewrite the problem I solved in a previous post in Erlang. Here is the problem again:
A/BC + D/EF + G/HI = 1
-module(constraint).
-export([solve/0]).
solve() ->
solve(generate_unique()).
solve([A,B,C,D,E,F,G,H,I]) when A/(10 * B + C) +
D/(10 * E + F) +
G/(10 * H + I) /= 1 ->
solve();
solve([A,B,C,D,E,F,G,H,I]) ->
io:format("~w/~w~w + ~w/~w~w + ~w/~w~w = 1 ~n",
[A,B,C,D,E,F,G,H,I]).
generate_unique() ->
generate_unique(0, []).
generate_unique(9, List) ->
List;
generate_unique(N, List) ->
Number = random:uniform(9),
case lists:member(Number, List) of
true -> generate_unique(N, List);
false -> generate_unique(N + 1, [Number|List])
end.Gecode/r Under Mac OSX
I had a little bit of trouble getting gecode/r running at first. Gecode/r is a wrapper around Gecode, so you'll need to also install Gecode before installing Gecoder/r. I started out using the disk image for Mac OSX 10.4, but Gecode/r would produce the following error when I tried to require Gecode/r:
irb(main):002:0> require 'gecoder'
dyld: NSLinkModule() error
dyld: Symbol not found: __ZNK6Gecode9Exception4whatEv
Referenced from:
/usr/local/lib/ruby/gems/1.8/gems/gecoder-0.3.0/lib/gecode.bundle
Expected in: flat namespaceIt turns out, I have to download Gecode source and compile it on my Mac. After that, all I had to do is: gem install gecoder.
Gecode/r
Recently I solved my first constraint programming problem using C# (Steve Eichert also solved it via Ruby). I decided to poke around the Internet to see if there's an API for constraint programming. After all, there are real world problems that can be solved using constraint programming. I can't imagine people just hand roll there own API for complex problems.
After few searches in google, I came across Gecode. Gecode is an open source API implemented in C++ distributed under the BSD style license. I noticed there is also Gecode/J which is a Java interface to Gecode. I started to wonder, is there a Ruby version of Gecode? It turns out Andreas Launila recently started Gecode/r under Google Summer of Code.
Gecode/r is still at the early stage, but there is a downloadable alpha version. I tried to use it to solve the same problem, however, I ran into some difficulties. It currently only supports 3 mathematical operations: +, -, *. No division for now. Andreas suggested I rewrite the equation by removing the division. The only problem with rewriting is that the equation actually becomes more complex. I haven't been able to get my version to work yet.
For those who are interested in what Gecode/r looks like, below is an example of solving the classic send more money problem using gecode/r, which is taken from gecode/r's example page:
class SendMoreMoney < Gecode::Model
def initialize
s,e,n,d,m,o,r,y = @letters = int_var_array(8, 0..9)
(equation_row(s, e, n, d) + equation_row(m, o, r, e)).must ==
equation_row(m, o, n, e, y)
s.must_not == 0
m.must_not == 0
@letters.must_be.distinct
branch_on @letters, :variable => :smallest_size, :value => :min
end
def to_s
%w{s e n d m o r y}.zip(@letters).map do |text, letter|
"#{text}: #{letter.val}"
end.join(', ')
end
private
# A helper to make the linear equation a bit tidier. Takes a number of
# variables and computes the linear combination as if the variable
# were digits in a base 10 number. E.g. x,y,z becomes
# 100*x + 10*y + z .
def equation_row(*variables)
variables.inject{ |result, variable| variable + result*10 }
end
end
puts SendMoreMoney.new.solve!.to_sBrain Teaser: Constraint programming
I had never heard of constraint programming until a co-worker wrote the following problem on the board a few days ago:
A/BC + D/EF + G/HI = 1
All variables are unique digits from 1 - 9. Two variables next to each other is equivalent to 10 * var1 + var2. For example, if B = 2, C = 5 then the result is 10 * 2 + 5 which equals 25.
I decided to whip up a program to find out the answer.
class Program {
static List<int> ints = new List<int>();
static Random r = new Random();
static void Main(string[] args) {
double A = 0, B = 0, C = 0, D = 0, E = 0, F = 0, G = 0, H = 0, I = 0;
while (!(((A / (10 * B + C)) + (D / (10 * E + F)) + (G / (10 * H + I))) == 1)) {
GenerateUniqueInts();
A = ints[0]; B = ints[1]; C = ints[2];
D = ints[3]; E = ints[4]; F = ints[5];
G = ints[6]; H = ints[7]; I = ints[8];
}
Console.WriteLine(String.Format("{0}/{1}{2} + {3}/{4}{5} + {6}/{7}{8} = 1",
A, B, C, D, E, F, G, H, I));
}
public static void GenerateUniqueInts() {
ints.Clear();
int value = (r.Next() % 9) + 1;
while (!(ints.Count == 9)) {
if (!ints.Contains(value)) ints.Add(value);
value = (r.Next() % 9) + 1;
}
}
}It only takes a couple of seconds for the program to figure out the answer. It turns out there is only one solution to this problem. However, on my first implementation I instantiated the Random object in the GenerateUniqueInts() method. For some reason, that can take up 10 minutes to run. If you are interested in the answer, try the program out.
Grokking Rhino Mocks 2
Recently I have been doing research about integrating Rhino Mocks into the project at work. It brought back some memories because I used Rhino Mocks at my last job. The API has changed some since I used it last, but it took me no time to get started. In the past, we talked about experimenting with a mocking framework at my current job, but we never pursued it.
Typically we write stubs to isolate the class we are testing. This approach works great for testing the states of the object, but it can become cumbersome when the behavior needs to be tested. Some of the pros and cons using stubs are as follows:
Pros:
- Fast and light weight.
- Easier to write and understand.
Cons:
- Specialized methods are required in order to verify states.
- Doesn't test the behavior.
- More non-production code needs to be written and maintained.
- Time consuming for complicated interactions.
- Requires maintenance when the logic changes.
- Mundane and repetitive.
Currently we have over 2000 tests in the project and over 100 stubs! It would be nice if we could eliminate all the stubs. So far I have converted around 40 tests, and everything is going much smoother than I expected.
Using Rhino Mocks promotes behavior testing which has helped me to become more aware of what the code is actually doing. The tests will fail automatically if a method is unexpectedly called. It is also simple to ensure if expected calls are in correct sequence when order is important. One of my favorite features is the ability to test abstract base classes. We usually create a testable class which extends the abstract class in order to test the base behavior. I'm not a big fan of this approach but it does work.
After I converted the tests to use Rhino Mocks instead of stubs I noticed the following things:
- Removed unnecessary or duplicated method calls.
- General refactoring to make the code cleaner.
- More expressive tests using expectations (behaviors testing) and less asserts (state testing).
- Deletion of unnecessary stubs.
Overall, this has been a really good experience in taking a closer look at the tests we first wrote. I believe if the tests are healthy the code will also be. Of course, there's a penalty to be paid for using a mocking framework. It caused the tests to run a little bit slower than before. Personally, I don't mind the slower test runs if I can actually test the behavior and reduce the amount of code required.
Interesting article on mock object: Mocks aren't stubs
Custom Actions
I have recently put my installer hat on again. At work, we use Windows Installer XML(Wix) to create our MSIs. I have been away from Wix since I came up with a way to automate the MSI generation on each build.
In order to customize and increase installation experiences, custom actions are often required. Custom actions in the MSI world allow you to inject custom behaviors during installation via VBScript, JScript or C++.
It is crucial to understand custom action fully before starting to build them. Steven Bone wrote a great three part tutorial on this topic a while back.
- Custom Action Tutorial Part I – Custom Action Types and Sequences
- Custom Action Tutorial Part II – Creating the Project
- Custom Action Tutorial Part III – What we did in Part II
I really like this tutorial since there is so little material on this topic. Especially since creating custom actions is not a trivial task. If you do not believe me, give it a read ;)
ExcelXMLWriter
As Steve already mentioned, we are working on generation of Excel files this week. After doing some research, I found a free ExcelXMLWriter by Carlos Aguilar. It is written in C#, and does not require Excel to be installed in order to generate the files. We played around with it for a few days, it seems to do what we want. Carlos also offers a code generator tool that transforms Excel file into C# code.
Translation Part 2 via Abstract Syntax Tree
Last post I created a simple translator by using ANTLR V3 that translated C# (or Java) like syntax into PL/SQL. The translation relied heavily on String Template and token rewrite. It generated the following PL/SQL statements from a single class declaration:
- Drop table
- Drop sequence
- Create table
- Create sequence
- Create Trigger (sequence and trigger are for auto increment id column)
Notice there was not a one-to-one relationship between the input and the output. Output like this is ideal for template based translation, but not when you are constructing an Abstract Syntax Tree (AST). The reason is the tree nodes will have to be duplicated in order to render multiple items from a single input.
In this example, my language will stay the same, but output will only contain the creation of table. In the future post, I will change the language into something simpler, and the ability to output all the previous items (drop table, create auto-incremented column) separately. In this example I only want to focus on the basics on creating AST by keeping everything else as simple as possible.
Let's take look at the new parser and lexer grammar.
// CSharpSQL.g
grammar CSharpSQL;
options {
output = AST;
ASTLabelType = CommonTree;
}
tokens {
CLASSDEF;
VARDEF;
}
// parser
program : (declaration { System.out.println($declaration.tree.toStringTree()); } ) + ;
declaration : class_statement '{' (variable_statement)* '}'
-> ^(class_statement variable_statement*) ;
class_statement : scope_modifier 'class' ID
-> ^(CLASSDEF ID) ;
variable_statement : scope_modifier type ID ';'
-> ^(VARDEF type ID) ;
scope_modifier : 'public' ;
// more can be added
type : 'string'
| 'int'
| 'decimal'
| 'DateTime' ;
// lexer
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_') * ;
WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;The first thing to notice is the tokens declaration before the program rule. Since the the output will be a tree, tokens declaration defines a set of virtual tokens that can be used as tree nodes. Virtual token help the tree parser to understand the relationship between nodes. In this example there are two virtual tokens, CLASSDEF and VARDEF. We need a way to distinguish the difference between variable name and the name of the class since we cannot tell by the token itself. Class name will have CLASSDEF as its root node, and variable name will have VARDEF as its root node.
Each rule returns a subtree by "rewriting" the input using the -> notation. Everything on the right hand side of -> encodes the hierarchy of the subtree. The token closest to the ^ symbol is the root node, and rest within the parentheses are the children of the root. Notice that we only keep the meaningful nodes in the tree, and discard any unnecessary input by "rewriting" them. Some rules do not return a subtree, but simply delegate the other rules to return the subtree.
Now we need a tree walker to visit each node in the tree so it can render the final output. The tree walker grammar is shown below:
// Translate.g
tree grammar Translate;
options {
tokenVocab = CSharpSQL;
ASTLabelType = CommonTree;
}
@members {
String className;
List<String> columns = new ArrayList<String>();
}
program : (declaration
{
String table = "CREATE TABLE " + className + '\n' + "(" + '\n';
String seperator = ",";
Object[] arrayColumns = columns.toArray();
for(int i = 0; i < arrayColumns.length; i++) {
if(i == arrayColumns.length - 1) seperator = "";
table += " " + arrayColumns[i].toString() + seperator + '\n';
}
table += ")";
System.out.println(table);
columns.clear();
} ) + ;
declaration : ^(CLASSDEF ID variable_statement*) { className = $ID.text; } ;
variable_statement : ^(VARDEF type ID)
{
columns.add($ID.text + " " + $type.value + " NOT NULL");
} ;
type returns [String value]
: 'string' { $value = "nvarchar(255)"; }
| 'int' { $value = "integer"; }
| 'decimal' { $value = "number(21,6)"; }
| 'DateTime' { $value = "date"; };Here is where all the translation happens. Once again we have a couple of member variables declared in @members to help collect all the required fields for the translation. The tree grammar rule is very similar to parser or lexer rule. Since the input is a tree, you need to define rules that will match the nodes in the tree. You construct the rules by specifying the relationships between nodes using the same syntax parser used to create the tree. We know whenever the tree walker sees the CLASSDEF one of the child nodes (ID) is the class name. CLASSDEF can have zero or more variable_statement which means it can have zero or more VARDEF as its children. Each VARDEF contains exactly two children nodes, a type and ID which is used to construct column definition.
Let's run the recognizer using the following test program:
// Test.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import java.io.*;
public class Test {
public static void main(String[] args) throws Exception {
ANTLRInputStream input = new ANTLRInputStream(System.in);
CSharpSQLLexer lexer = new CSharpSQLLexer(input);
TokenRewriteStream tokens = new TokenRewriteStream(lexer);
CSharpSQLParser parser = new CSharpSQLParser(tokens);
CSharpSQLParser.program_return r = parser.program();
CommonTree tree = (CommonTree)r.getTree();
CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
Translate walker = new Translate(nodes);
walker.program();
}
}Notice that the output of the parser is passed into the Translate tree walker for the translation.
