Page snatcher
A couple of months ago I wrote a utility that downloads a web page with all its dependencies (CSS and images) to your hard drive. All the references in the web page are changed to point to your local copy.
I wrote it as a prototype in 30 to 40 minutes, so I'm sure there is room for improvement. I pointed it at a few web pages, such as Amazon, eBay, Google, and my blog, and it worked pretty well!
The code requires why the lucky stiff's Hpricot library and the rio gem. Without further ado, here is the code:
require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'rio'

# Helper predicates added to Hpricot elements.
module Hpricot
  class Elem
    # True for <link> elements that reference a style sheet.
    def is_css
      if self.name == "link"
        self["type"] == "text/css"
      else
        false
      end
    end

    # True when the href/src attribute is an absolute http:// URL.
    def is_full_path
      if self.name == "link"
        self["href"][0..6] == "http://"
      elsif self.name == "img"
        self["src"][0..6] == "http://"
      else
        false
      end
    end
  end
end

if ARGV.size.zero?
  puts "Missing web page you wish to snatch."
  exit
end

url_scheme = "http://"
url = ARGV[0]
doc = Hpricot(open(url_scheme + url))
# Everything is saved into a directory named after the site.
Dir.mkdir(url) unless File.directory?(url)

# Download each style sheet and point the page at the local copy.
doc.search("link") do |item|
  if item.is_css
    if item.is_full_path
      rio(item['href']) > rio(url)
    else
      rio(url_scheme + url + item['href']) > rio(url)
    end
    # nested style sheets in another style sheet
    css_path = File.dirname(item['href'])
    css_file = File.basename(item['href']).scan(/(.*?\.css)/m).flatten.join
    file = File.open(url + "/" + css_file, "r")
    inner_css = file.read.scan(/@import '(.*?\.css)';/m).flatten
    inner_css.each do |css|
      css_url = url_scheme + url + css_path + "/" + css
      rio(css_url) > rio(url)
    end
    file.close
    item['href'] = css_file
  end
end

# Download each image and point the page at the local copy.
doc.search("img") do |item|
  if item.is_full_path
    rio(item["src"]) > rio(url)
  else
    rio(url_scheme + url + item["src"]) > rio(url)
  end
  item["src"] = item["src"].split("/")[-1]
end

# Write the rewritten page next to its dependencies.
File.open(url + "/" + url + ".html", "w") do |file|
  file << doc.to_s
end
Slime video and transcript
Learning Lisp has been a challenge so far. I'm plunging my way through Practical Common Lisp at a decent pace, though. Learning the Lisp development environment is another story. Since all the hardcore Lisp hackers use emacs with slime, I figured I would do the same. I have no previous experience with emacs, so I'm learning three things at once.
One night I came across Marco Baringer's slime video. After I watched the video, all the dots in my head started to connect. In the video, Marco demonstrates slime while developing a morse code application. I kept rewinding to figure out the keystrokes he used. Recently Peter Christensen wrote a transcript for the slime video, which made the video much easier to follow. Christensen also has a transcript for Baringer's Hello World video using the Uncommon Web framework.
Besides the transcripts, Christensen is also working on an emacs/slime cheat sheet. As a noob learning Lisp, I need all the help I can get.
Coming out of the Lisp dungeon
For the past few weeks I have been playing around with Lisp during pretty much all my free time. Yes, Lisp. This is not my first encounter with Lisp. I played around with it when I was in college, and I hated every moment of it. Functional languages have always been foreign to me, and I tried to stay far away from that path. I'm at the point in my career where I can no longer ignore the existence of functional languages.
Last year I decided to learn Erlang since it received a lot of buzz in the community. Shortly after learning the basics, I lost interest. Even though I no longer keep up with Erlang on a regular basis, it has eased me into the realm of functional languages.
Back in 2004, I read Hackers & Painters by Paul Graham. The chapter entitled "Beating the Averages" has stuck in my mind since I first read it. In this chapter, Graham described how he was able to overtake the big corporations during the internet boom using an unconventional programming language: Lisp. I decided to give the chapter another read to refresh my mind. After reading it again, I was inspired to give Lisp another try. To set up my Lisp environment, I installed emacs, slime, and sbcl on my MacBook.
After going through a few online Common Lisp tutorials, I just couldn't get enough of it. The more I learned about it, the more deeply I wanted to understand it. Eventually I stumbled onto the book Practical Common Lisp by Peter Seibel. The book is very well written and easy to read. By the nature of the language, Lisp tends to be more theoretical. Seibel connects the theoretical with the practical, which makes the book relevant and enjoyable. Best of all, the whole book is freely available online. Even if you have no desire to learn Lisp, just read the first 3 chapters (they are very short). Who knows, you might just continue to read the whole thing.
Project Euler Solutions 1 - 5
Click on each problem for a more detailed solution.
1. Add all the natural numbers below 1000 that are multiples of 3 or 5.
start = Time.now
total = 0
(1...1000).each do |n|
  total += n if (n % 3).zero? or (n % 5).zero?
end
puts "Took: #{Time.now - start} seconds"
puts total
Took: 0.000977 seconds
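The loop above visits every number below 1000. As an alternative sketch (not the approach used above), the same total can be computed without a loop by combining the arithmetic-series formula with inclusion-exclusion. The helper names `sum_of_multiples` and `sum_multiples_of_3_or_5` are made up for this illustration.

```ruby
# Sum of all positive multiples of k strictly below limit, using
# k * (1 + 2 + ... + m) = k * m * (m + 1) / 2, where m = (limit - 1) / k.
def sum_of_multiples(k, limit)
  m = (limit - 1) / k
  k * m * (m + 1) / 2
end

# Multiples of 15 are counted under both 3 and 5, so subtract them once.
def sum_multiples_of_3_or_5(limit)
  sum_of_multiples(3, limit) + sum_of_multiples(5, limit) - sum_of_multiples(15, limit)
end

puts sum_multiples_of_3_or_5(1000)
```

The work no longer grows with the limit, which only matters for limits far larger than the one in this problem.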
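2. Find the sum of all the even-valued terms in the Fibonacci sequence that do not exceed one million.
start = Time.now
def fib(n1, n2, total)
  return total if n2 > 1000000
  total += n2 if (n2 % 2).zero?
  fib(n2, n1 + n2, total)
end
# Note: the elapsed time is printed before fib is called, so it excludes the computation itself.
puts "Took: #{Time.now - start} seconds"
puts fib(1, 2, 0)
Took: 1.0e-05 seconds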
3. Find the largest prime factor of 317584931803.
def next_prime(start_num, max_num)
  is_prime = true
  prime = 0
  (start_num + 1..max_num).each do |n|
    prime = n
    (2..n - 1).each do |nn|
      if (n % nn).zero? then is_prime = false; break; end
    end
    if is_prime then break; end
    is_prime = true
  end
  prime
end
start = Time.now
n = 1
biggest_prime = 0
num = 317584931803
while(num != 1 or num > n)
  n = next_prime(n, num)
  if n > 0
    result = num % n
    if (result).zero?
      biggest_prime = n
      num = num / n
    end
  end
  n += 1
end
puts "Took: #{Time.now - start} seconds"
puts biggest_prime
Took: 0.477182 seconds
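The solution above searches for each next prime before testing it as a divisor. A shorter variant, sketched here as an alternative rather than as the method used above, is plain trial division: divide out every factor as soon as it is found, so any divisor that succeeds is necessarily prime. The function name `largest_prime_factor` is an assumption for this sketch.

```ruby
# Largest prime factor by trial division.
def largest_prime_factor(num)
  largest = 1
  factor = 2
  while factor * factor <= num
    if (num % factor).zero?
      largest = factor
      num /= factor   # strip this factor and retry the same divisor
    else
      factor += 1
    end
  end
  # Whatever remains above 1 is a prime larger than any factor removed so far.
  num > 1 ? num : largest
end

puts largest_prime_factor(317584931803)
```

Because the loop stops once `factor * factor` exceeds what is left of `num`, no separate primality test is needed.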
4. Find the largest palindrome made from the product of two 3-digit numbers.
start = Time.now
result = 0
left = 0
right = 0
(100...1000).to_a.reverse.each do |l|
  (100...1000).to_a.reverse.each do |r|
    temp = (l * r).to_s
    if temp == temp.reverse and temp.to_i > result
      result = temp.to_i
      left = l
      right = r
    end
  end
end
puts "Took: #{Time.now - start} seconds"
puts "#{left} * #{right} = #{result}"
Took: 1.501993 seconds
5. What is the smallest number divisible by each of the numbers 1 to 20?
def is_prime(num)
  if num < 2 then return false; end
  (2..num - 1).each do |n|
    if (num % n).zero?
      return false
    end
  end
  return true
end
def smallest_factor(num)
  (2..num - 1).each do |n|
    if (num % n).zero? then return n; end
  end
  return num
end
start = Time.now
result = 1
(1..20).each do |n|
  if is_prime(n)
    result = result * n
  elsif not (result % n).zero?
    result = result * smallest_factor(n)
  end
end
puts "Took: #{Time.now - start} seconds"
puts result
Took: 0.00018 seconds
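The number being asked for is the least common multiple of 1 through 20. As a hedged alternative to the prime-based loop above, the range can simply be folded with `Integer#lcm`. This assumes Ruby 1.9 or later, where `lcm` is part of the core Integer class; the name `smallest_multiple` is made up for this sketch.

```ruby
# Smallest positive number divisible by every integer in 1..n,
# computed as the running least common multiple of the range.
def smallest_multiple(n)
  (1..n).reduce(1) { |acc, k| acc.lcm(k) }
end

puts smallest_multiple(20)
```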
Project Euler
A while back my friend James Horsley told me about Project Euler. I just pushed it onto my stack of things to look into. Recently, Steve Eichert reminded me of it at work, so I decided to give it a try.
Project Euler contains a collection of mathematical problems ranging in difficulty. A problem can be solved using pencil and paper or using a computer program. The only requirement for a computer program is that it should run in under one minute. It's an honor system because all you need to submit is the answer.
My weapon of choice is Ruby. I have been watching Ruby from the sidelines for the past 5 years (not entirely true, but mostly), so I figured it's time to roll up my sleeves. I'll be posting my solutions in batches for those who are interested. In addition to posting the solutions, I'll also post the amount of time each solution took. However, I will not post the answers. You can run the code on your own machine if you wish to see the answers. This way I will not ruin it for people who are interested in solving the problems themselves. All the code has been run on my MacBook with an Intel Core 2 Duo 2.0 GHz and Ruby 1.8.
Hardcore Erlang
There has definitely been a lot of momentum behind Erlang recently, and more is about to come. A few months ago Joe Armstrong released Programming Erlang, which set off the initial Erlang awareness for many people, including myself. Earlier this month, Channel 9 posted two videos with Armstrong on Erlang (part1 and part2).
Now another Erlang book is in progress: Hardcore Erlang by Joel Reymont. It is also a Pragmatic Programmers book. The initial project for the book was a poker server, but now the focus is on a stock exchange program. A quote from Reymont:
"So lets build a stock exchange! Not just any stock exchange but one running on the biggest Erlang cluster in the world. This cluster does not exist yet but can be put together on a moments notice, using Amazon EC2."
This book might just keep Erlang on the hotness list for the year of 2008.
Another reason for Rhino Mocks Generic Constraint
Jeffrey Palermo recently had a post about Generic Constraints for Rhino Mocks - make unit tests more readable. I would like to touch on an alternative reason why you might want to use Generic Constraint. I'll reiterate Jeffrey's example before I start, with minor modification, and offer possible alternative ways the tests can be written.
Jeffrey implemented a GenericConstraint class to capture the parameter of a method call on a mock object. The code below resembles his original example:
[Test]
public void ShouldSaveObjectWithAllInformation() {
    string firstName = "Aaron";
    string lastName = "Feng";
    MockRepository mocks = new MockRepository();
    IPersonRepository personRepository = mocks.CreateMock<IPersonRepository>();
    personRepository.SavePerson(null);
    GenericConstraint<Person> personConstraint = new GenericConstraint<Person>();
    LastCall.On(personRepository).Constraints(personConstraint);
    mocks.ReplayAll();
    PersonController controller = new PersonController(personRepository);
    controller.PersonFirstName = "Aaron";
    controller.PersonLastName = "Feng";
    controller.Save();
    mocks.VerifyAll();
    Person person = personConstraint.GetParameterObject();
    Assert.AreEqual(person.FirstName, firstName);
    Assert.AreEqual(person.LastName, lastName);
}
Once personConstraint has captured the Person being saved, he asserts that the values are as expected. This makes the test look more like a typical unit test.
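The GenericConstraint class itself is not shown here. To illustrate the capture-then-assert idea in isolation, here is a minimal Ruby sketch that uses a hand-rolled test double instead of Rhino Mocks. It is not Jeffrey's implementation: `CapturingRepository`, `Person`, and this Ruby `PersonController` are hypothetical stand-ins for the types in the C# test.

```ruby
Person = Struct.new(:first_name, :last_name)

# Test double that records the argument it receives instead of checking it.
class CapturingRepository
  attr_reader :captured_person

  def save_person(person)
    @captured_person = person
  end
end

# Stand-in for the controller under test.
class PersonController
  attr_accessor :person_first_name, :person_last_name

  def initialize(repository)
    @repository = repository
  end

  def save
    @repository.save_person(Person.new(person_first_name, person_last_name))
  end
end

repository = CapturingRepository.new
controller = PersonController.new(repository)
controller.person_first_name = "Aaron"
controller.person_last_name = "Feng"
controller.save

# The checks come after the action, as in an ordinary unit test.
person = repository.captured_person
raise "first name mismatch" unless person.first_name == "Aaron"
raise "last name mismatch" unless person.last_name == "Feng"
puts "captured #{person.first_name} #{person.last_name}"
```

The double only captures; all expectations stay visible at the bottom of the test.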
Jeffrey's goal was to avoid the following code:
public delegate void Proc<P>(P p);
[Test]
public void ShouldSaveObjectWithAllInformationUsingBuildInConstraint() {
    string firstName = "Aaron";
    string lastName = "Feng";
    MockRepository mocks = new MockRepository();
    IPersonRepository personRepository = mocks.CreateMock<IPersonRepository>();
    personRepository.SavePerson(null);
    LastCall.On(personRepository).IgnoreArguments().Do(
        new Proc<Person>(delegate(Person person) {
            Assert.AreEqual(person.FirstName, firstName);
            Assert.AreEqual(person.LastName, lastName);
        })
    );
    mocks.ReplayAll();
    PersonController controller = new PersonController(personRepository);
    controller.PersonFirstName = "Aaron";
    controller.PersonLastName = "Feng";
    controller.Save();
    mocks.VerifyAll();
    // Notice no asserts
}
An astute reader might say: "Hey, you don't have to do that, just implement the Equals method on the Person class." That would look like the following:
[Test]
public void ShouldSaveObjectWithAllInformationUsingEquals() {
    MockRepository mocks = new MockRepository();
    IPersonRepository personRepository = mocks.CreateMock<IPersonRepository>();
    // Have to implement Equals on Person
    personRepository.SavePerson(new Person("Aaron", "Feng"));
    mocks.ReplayAll();
    PersonController controller = new PersonController(personRepository);
    controller.PersonFirstName = "Aaron";
    controller.PersonLastName = "Feng";
    controller.Save();
    mocks.VerifyAll();
    // Notice no asserts again
}
The last example, which relies on an Equals implementation on the Person object, makes the test look too clean: the asserts are invisible. On top of that, you have to implement an Equals method on an object even though the real system might never call it. I believe this is the real power behind Jeffrey's generic constraint approach. One should avoid writing code that the real system never uses just to satisfy a test.
List comprehension kicks ass
Recently I've been on an Erlang high, so I have been playing around with it as much as I can. It's very common for an application to create a new list based on an existing list. For example, in C# you would do something like the following:
public List<string> QualifiedUserNames(List<User> users) {
    List<string> names = new List<string>();
    foreach(User user in users) {
        if(user.Age >= 30) {
            names.Add(user.Name);
        }
    }
    return names;
}
Equivalent code in Erlang:
% Function names are atoms, so they must start with a lowercase letter.
qualified_user_names(Users) -> [Name || {user,{name,Name},{age,Age}} <- Users, Age >= 30].
The Erlang function uses a list comprehension to do all the dirty work. It walks every item in the Users list and picks out only the items that match the pattern {user,{name,Name},{age,Age}}, that is, items of the user "type". The pattern is needed because Erlang is dynamically typed, so a list doesn't have to contain homogeneous items. Age >= 30 is a predicate that decides whether the user belongs in the new list; if so, the Name is added.
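For comparison, here is a rough Ruby sketch of the same three steps: keep only the items of the right kind, apply the predicate, and collect the names. The `User` struct and the sample data are assumptions made for this illustration, and `grep(User)` only approximates the Erlang pattern match by filtering on class.

```ruby
# Hypothetical record type standing in for the {user, {name, Name}, {age, Age}} tuple.
User = Struct.new(:name, :age)

def qualified_user_names(items)
  users = items.grep(User)                       # skip anything that is not a User
  qualified = users.select { |u| u.age >= 30 }   # the predicate
  qualified.map { |u| u.name }                   # the value placed in the new list
end

items = [User.new("Ann", 34), :not_a_user, User.new("Bob", 25), User.new("Cy", 30)]
p qualified_user_names(items)
```

Ruby spreads the work over three chained calls where the comprehension states the pattern, the predicate, and the result in a single expression.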
Pretty cool, right? I think so. This capability is one of the many reasons why Erlang programs are usually shorter than programs written in other languages. Well, back to programming in Erlang some more.
The beauty of code
A few days ago, Steve Eichert sent me this link to Marcel Molina's presentation at Ruby Hoedown 2007. Marcel explores what makes code beautiful. He lists the following three attributes:
- Proportion
- Integrity
- Clarity
Proportion refers to the amount of code needed to make a feature work. You wouldn't expect 27 lines of code to multiply two numbers together. The code has integrity if it doesn't break down under non-trivial cases. Lastly, the code should be clear as to what it is trying to accomplish.
This is a very simple and elegant way to describe beautiful code.
Watch the video if you want to see how Marcel came up with these assertions.
JRuby Dilemma
I submitted the following code to the JRuby mailing list, and I haven't received any response yet. Can you spot the problem?
package my;
import java.util.Vector;
public class MyClassInJava {
    Vector vector;
    public MyClassInJava(java.util.Vector vector) {
        this.vector = vector;
    }
    public Object getVector() {
        return vector;
    }
}
Here is my Ruby code which calls the Java code above:
include Java
require 'my.jar'
class MyVector < java.util.Vector
  def my_method
  end
end
class MyRuby
  def initialize
    my_vec = MyVector.new
    c = Java::my.MyClassInJava.new(my_vec)
    vec_from_java = c.getVector()
    if vec_from_java.respond_to?(:my_method)
      puts "found"
    else
      puts "not found"
      puts vec_from_java.java_class
    end
  end
end
r = MyRuby.new
The output from the Ruby code:
not found
org.jruby.javasupport.proxy.gen.Vector$Proxy0
