Scala Simple Build Tool — Not so simple after all, at least for now…

Update:I got sbt working by building directly from the master branch from their github repo. The current version is 0.7.5. The tagged 0.9.4 version is actually an older version. Anyway, tried it and kinda loved it.

This is just another late night rambling…I was trying to get a proper scala build system setup. I was using Maven scala plugin for a while, but longing for something simpler and more scalanic (is there such a word?). I was pretty happy at Cake, the Clojure build system and expected SBT to allow me to break away from using Maven to build Scala projects…boy, was I wrong…

First off, when you google ‘simple build tool’, you get a link to the SBT Google code home page. Well, nothing wrong there, except the “latest” version on Google code was 0.7.4 and it was half a year ago…Maybe it’s not that outdated, so I downloaded it, followed this instruction and setup my ~/bin/sbt script. Running it, it asked me to setup projects, and it only supported up until Scala 2.7.7…Hrm, 2.8 was out for a while now, so obviously, SBT 0.7.4 isn’t the latest. Reading their home page more carefully, they’re moving the repository to Github. Awesome! I’d pick Github over Google Code any time too.

Heading over to their Github repo, and found the latest stable version is 0.9.2. Good! So it should support Scala 2.8 now! Downloaded the zip, unzipped it, and of course it wasn’t executable. You need to build it. There’s a README.md, so quickly I less’ed it. For step 1, it asked me to go to the setup wiki page on Google Code (!), which is the steps I did setting up 0.7.4…I guess they’re using 0.7.4 as a bootstrapping build…Anyways, I did that. Step 2 was to run `sbt update “project Launcher” proguard “project Simple Build Tool” “publish-local”`. Of course it didn’t work. It’s complained 0.7.4 version of sbt-launch can’t download Scala 2.7.7 from any of the repository…bummer! But hey, I can download Scala 2.7.7 lib from Maven! So I quickly updated pom.xml of one of my projects to use Scala 2.7.7 and did an upgrade. Now 2.7.7 is happily in my local Maven repo. Ran that command again, hooray! It started to build, and judging by the number of packages it’s building, “simple” isn’t the first adjective that comes into my mind. Anyway, it’s building at least, so even if it’s a little complicated, so be it…Except…of course it broke half way… and why?

[info] Post-analysis: 107 classes.
[info] == Precompiled 2.7.7 / compile ==
[info]
[info] Precompiled 2.8.0 / compile …
[info]
[info] == Precompiled 2.8.0 / compile ==
[info] Source analysis: 9 new/modified, 0 indirectly invalidated, 0 removed.
[info] Compiling main sources…
[warn] there were deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[info] Compilation successful.
[info] Post-analysis: 108 classes.
[info] == Precompiled 2.8.0 / compile ==
java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
at java.lang.ClassLoader.defineClass(ClassLoader.java:616)

You’ve gotta be kidding me! I set -Xmx512M and it’s not enough? And why is it building every version of Scala *from source*?? Is there something called a…JAR?

Anyway, increased -Xmx from 512 to 1024M, ran again, wait, and same thing happened again! Out of PermGen space…urrgh…

I decided to give up, at least for the day… SBT is anything but simple, at least from my experience. I know it’s open source and people put efforts into it without compensation, so I shouldn’t be critical about it. I’ll give it a try again, and hopefully it’s worth the time investment.

My impression on Scala so far

I’ve been exploring Scala on and off for some time now. Here’s my highly subjective and very limited impression of Scala.

What I like about Scala:

0) It’s statically-typed language.

That’s right! I don’t care what you ninjas say. As much as I love dynamic languages, I just prefer statically typed language for big projects. The benefit of having type information is enormous for a project with a large code base. I know you have to write unit tests for it anyways, but when 80% of your unit test code is checking types, it’s just counter-productive.

1) The ability to combine functional programming style with OOP.

This is what Scala is known for – a hybrid language that boasts the better in both OOP world and FP world. I’m not a functional programming purist, but I think Scala did a good job blending the functional programming elements into OOP. Writing Scala programs, I find myself much more empowered to be able to choose from both styles where I see fit. The ability to pass functions around and apply high order functions can significantly reduce boilerplate code and visual clutter.

2) The ability to create your own control flow (well, sort of).

I had a blog post about how to use function curry to create a customized control flow. This is very empowering and yet, it’s not an one-off syntactic sugar that’s sprinkled randomly into the language like some other language would do. Customization of control flow is a result of combining generic language features (such as currying, by-name parameter, implicit conversion, etc).

3) Object instead of static

Scala does away with the Java static keyword. Instead, it provides “object” keyword to define a class that only has one instance. It has the same power as singleton design pattern, without the extra boilerplate code. (if you think implementing singleton in Java is simple, think again. Especially in the multi-threaded context). Also, you can use the same construct to define “companion” objects, which can be used to implement the factory pattern, and put in miscellaneous methods, implicit conversion methods and so on. Extremely powerful yet elegant.

4) Operator overloading (well, sort of).

OK, it’s not exactly operator overloading – Scala allows many non-alpha-numeric characters in the method name, so operators are essentially methods. duh! if you think about it…why should + be treated any different from “add()”? Being able to use special characters in method names makes code easy to understand. However, I’m a bit disappointed that question mark (?) cannot be used in method names…One of the things I liked about LISP is you can define a function (odd? (x) (…)) and readers immediately know this function returns a boolean. I don’t have to ponder whether I should name it isOdd() or hasChildren(). Anyway…

5) Traits

In most cases, multiple inheritance is manageable. Seems Java threw the baby out with the bath water – rejecting multiple inheritance completely. It’s ironic that the AOP guys cracks open Java classes in the byte code level to do the “mixins”…With Scala, this seems to be natural.

6) Built-in parser-combinator library

It’s easy to do some non-mission critical parsing using the Scala standard library. The library itself is implemented as an internal DSL such that writing parser rules feels like writing EBNF directly.

Here’s what’s not so hot for me so far:

1) The generics still is obtrusive sometimes.

That said, however, it’s vastly superior than Java’s generics system with far better type inferencing. But sometimes, you still have to write down a lot of types especially on a parameterized method. What’s worse is because Scala is more strict than Java wrt types, you cannot ignore generic types the way you can in Java code. Moreover, because Scala runs on JVM, and because of type-erasure, you still can’t write code such as val t = new T()…although we can’t blame this on Scala.

2) The collection library.

For each collection data structure, Scala has the mutable implementation, immutable implementation, and the native Java Collection Library. If you’re writing code that calls Java library, very often you have to wrap JCL lists/maps/sets into the Scala ones, because you want to use the nice functional features Scala provides. I’m not smart enough to know a better solution to this, I’m just merely pointing out a sore spot. Clojure has a very clever and elegant way of integrating JCL data structures into Clojure code, but since Clojure and Scala are vastly different, I don’t know how relevant this is.

3) Start up time

One of Scala’s goal is to be able to both scale up and scale down. Scala programs can be run as scripts by the interpreter, but it can’t compete with Python or Ruby in terms of development turn-around time in terms of scripting. Again, this is because of the dreadful JVM startup process.

4) Lack of good tooling

The best Scala development environment I found so far is the Scala plugin for IntelliJ. It’s still not great for debugging, but for coding and integration with Maven and such, it’s great. The Eclipse plugin lags significantly, despite having built a brand new website and seemingly having more resource pour into it…I know I shouldn’t be overly critical of a community effort, but it’s just frustrating that it can’t even do auto import of classes…

Overall, I think Scala really has the potential to become a big name in the programming language landscape. It can do everything Java can and does better. It has so much features that Java doesn’t dare to add to keep backward compatibility or keep the language dead simple or politics or whatever reason…

Scala is not a hype, but in order to be adopted more rapidly, I think it needs to

1) Stablize the language and core library.

2) Tooling, tooling, tooling! Seriously.

Use function currying to reduce repetition and make code clean

Lately, I’ve been writing a parser using the Scala parser-combinator framework to parse some saves from a game. As a responsible programmer (:P), I write unit tests for each rule. However, I found myself having to write the following code over and over again:

@Test
def testRule1() {
  parserRule.apply(new CharSequence("someInput")) match {
    case Success(result, _) => {
      assertEquals("expected", result)
      /* other asserts if the result is a collection of something else */
      }
    case NoSuccess(msg, _) => fail(msg)
  }
}

It’s worth noting that you cannot pass in a string value to a Parser. A parser expects a CharSequence CharSequenceReader object. Luckily, with Scala, you can define implicit conversion to turn a string into a CharSequence CharSequenceReader:

implicit def str2charSeqReader(v:String) = new CharSequenceReader(v)

Another goody Scala offers is that obj.apply(arg) method can be invoked by obj(arg), so the above code becomes:

/* ... */
parserRule("someInput") match {
/* ... */
}

It’s much cleaner.

However, the matching part is not clean still. It’s not that bad for a single test case, but it’s hardly a good practice to copy/paste it all over the test code base. We need to refactor this. The first thing comes to mind is to extract the matching part into a separate method, and let the caller provide the assertion part. Because Scala is a functional language, we can declare the method takes a function as a parameter, instead of using the delegate pattern, creating another class to encapsulate the method to be called, as you would do in Java:

def tryMatch[T](pr:ParseResult[T], f:(T)=>Unit) {
  pr match {
    case Success(r, _) => f(r)
    case NoSuccess(msg, _) => fail(msg)
  }
}

Now the consumer (the test code) looks like this:

@Test
def test1() {
  tryMatch(rule1("input"), (result)=>{assertEquals("expected",result)})
}

It’s better, except that when you have multiple assert* methods need to be called, it quickly becomes ugly:

@Test
def test2() {
  tryMatch(listRule("1,2,3,4"), (result)=>{assertEquals(1,result(0));assertEquals(2,result(1));assertEquals(3,result(2));assertEquals(4,result(3))})
}

To do better, we can use function currying. I read about function currying before, but not until now did it dawn on me that I can use this technique to make my code look cleaner. Basically, currying means that a function with n parameters is the same as the function being applied one parameter at a time in succession. e.g., function add(x1:Int,x2:Int,…,xn:Int) adds all its parameters. add(1,2)=[add(1)](2), where add(1) becomes a partially applied function that takes an integer and returns an integer. Then the partial function is applied to the second parameter which gives the eventual result.

This is all very theoretical, but the practical implication is that we can transform tryMatch() method shown above from a method with 2 parameters into a curried function. Why? Because with Scala, when you have a method that only takes a single parameter (as is the case with the partially applied function), you can replace parentheses with curly brackets! Let’s see how it works:

def tryMatch[T](pr:ParseResult[T])(f:(T)=>Unit) {
  pr match {
    case Success(r, _) => f(r)
    case NoSuccess(msg, _) => fail(msg)
  }
}

Basically, we break the two parameter definitions. This creates a PartialFunction in Scala. No other code needs to be changed at all. Now we can call tryMatch like this:

@Test
def testList() {
  tryMatch(listRule("1 2 3")) {
    result => {
      assertEquals(1, result(0))
      assertEquals(2, result(1))
      assertEquals(3, result(2))
    }
  }
}

This is like creating your own control flow (like try-catch-finally). Ain’t that neat?

Finding Happy Numbers using Scala

The problem was posted on Programming Praxis. The algorithm itself is pretty straightforward, anyone can do it with a few if/else/fors, but to coerce myself to think functionally, I decide to practice writing it in Scala.

A number is a happy number if the sum of square of its digits eventually arrive at 1. For example, 7=>72=49=>42+92=97=>92+72=>130=12+32+02=10=>12+02=1, so 7 is a happy number. 17 is not a happy number because by applying the above process, it goes into a loop.

Step1: calculating the sum of squares of a number

To get a list of numbers from a given number, we can first convert the number into a string, and then map every character of the string to its corresponding integer value. A more mathematical way is to divide the number continually by 10 until the original number becomes 0, adding the remainder to a list each time…The first method is easier to visualize, so here it goes:
n.toString.toCharArray.map{digit=>Integer.valueOf(""+digit)}

It’s quite wordy but I’ve yet to find a better way. I’m pretty sure there is so if you know, do inform me.
Now I need to square each item in the list. So modify the above statement:
n.toString.toCharArray.map{digit=>Math.pow(Integer.valueOf(""+digit).doubleValue,2.0)}
One thing bothers me is that I have to explicitly call someInt.doubleValue…I thought Scala does the implicit conversion for me? Then I realized Integer.valueOf(…) gives me java.lang.Integer, not scala.lang.Int. So I have to write a implicit conversion function myself:

implicit def integer2double(i:Integer):Double = i.doubleValue

Now I can get rid of the hideous someInt.doubleValue.
Since we’re doing implicit conversion already, why not just implicitly convert a numeric character to a double so that it can be accepted by Math.pow?

implicit def char2double(ch:Char):Double = Integer.valueOf("" + ch).doubleValue

Now the code can be shortened to

n.toString.toCharArray.map { digit => Math.pow(digit, 2) }

Isn’t that sweet? Implicit conversion is cool but it’s easy to get carried away and do everything implicit, which makes the code hard to maintain, so there gotta be a balance somewhere. In the scope of this small exercise, I guess it’s OK to use it.

Now that we have the squares of individual digits in a list, we can calculate the sum by reducing or folding it:

((n.toString.toCharArray.map { digit => Math.pow(digit, 2) }).foldLeft(0.0) { _ + _ }).toInt

{_ + _} seems a lot like line noise. The underscore converts the statement into a closure. It’s a shortcut for (a,b)=>a+b. It’s succinct yet should be used sparingly.

Step2:return

lol, yeah, no extra fluff is needed. We can capture the constraints in one return statement, but before that, I need to decide on what states I need to carry on from each recursive step. (You know I’m going to use recursion, don’t you? :P)

There are a couple of states involved:
1) the limit. It’s the number of steps we allow before the final “1” is reached. It took number seven 5 steps to reach 1. This variable is the cut-off: if the given number can’t reach 1 before the limit, it’s considered unhappy (return false)
2) the current number of steps
3) the set of numbers already appeared. In the case of seven, the set is {49 97 130 10}. This set is used to determine if the calculations fall into a loop. If the current calculated number appears in this set, the original number is unhappy.

So here’s our method:

def isHappyNumber(n:Int, limit:Int, numOfTries:Int, alreadySeen:Set[Int]):Boolean = {
val sos = ((n.toString.toCharArray.map { digit => Math.pow(digit, 2) }).foldLeft(0.0) { _ + _ }).toInt
return sos == 1 ? true : (!alreadySeen.contains(sos) && numOfTries+1 <= limit && isHappyNumber(sos, limit, numOfTries+1, alreadySeen+sos))
}

The second line basically says: when sos (sum of squares) is 1, return true, otherwise, is the number already in the set of numbers seen during the calculation?, if not, does the number of calculation exceed the limit? if not, repeat the calculation, with sos being the “original” number, increase the counter and put sos in the alreadySeen set.

Note that this is a tail recursive function, we can add @tailrec annotation to the method to let the compiler optimize it – turn the recursion into a loop so that it won’t grow in stack.

Now that we’ve had the body of the function, we can write an overload method that provides initial values:

def isHappyNumber(n:Int):Boolean = isHappyNumber(n, 10, 0, Set[Int]())

To find all happy numbers between 1 and 100:

println (1 to 100 filter { isHappyNumber(_) })