Hello, no, I built it with a Ruby script. I did insert two codes by hand, though: http:// and .com. Everything else was selected by a weight based on probability and length, computed over the following test data: English books in .txt format and various .html pages from Wikipedia.
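
To give a rough idea of what a probability-and-length weight can look like, here is a minimal sketch in Ruby. It is not the actual script: the method names `rank_substrings` and `build_codebook`, the weight formula (occurrence count times substring length), and the `max_len` and `size` parameters are all assumptions made for illustration. Only the two hand-inserted entries, http:// and .com, come from the description above.

```ruby
# Hypothetical sketch, not the original script.
# Count every substring of up to max_len characters and rank by weight.
def rank_substrings(text, max_len: 5, top: 10)
  counts = Hash.new(0)
  (1..max_len).each do |len|
    (0..text.length - len).each do |i|
      counts[text[i, len]] += 1
    end
  end
  # Assumed weight: occurrences * length, so frequent and long strings win.
  # Ties are broken alphabetically to keep the result deterministic.
  counts.sort_by { |s, c| [-(c * s.length), s] }.first(top).map(&:first)
end

# Put the hand-picked entries first, then fill the remaining slots
# with the highest-weighted substrings from the corpus.
def build_codebook(text, manual: ["http://", ".com"], size: 8)
  ranked = rank_substrings(text, top: size)
  (manual + ranked).uniq.first(size)
end
```

In practice the corpus would be the concatenated text of the books and Wikipedia pages, for example `build_codebook(File.read("corpus.txt"), size: 254)`, where the file name and the size are again placeholders.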