Use Python decorator to curry functions

It’s been a while since the last time I wrote about Python. This morning, I was listening to a podcast on my way to work. They were discussing functional programming and dynamic languages…I learned Python before I went into Computer Science, and then I learned about functional programming and through learning of Scala and Clojure, my functional programming concepts have been enriched. As I was listening, it suddenly appeared to me that there isn’t a way in Python to curry a function. Not that it’s critical to everyday development, but wouldn’t it be neat if I can curry a function in Python?

Then the hosts of the podcast discussed how dynamic languages are so flexible that you can pretty much do anything to it. “You can take a function as parameter, return a function from a function, and so on.” Hey, isn’t that what Python’s decorator can do? I learned decorators before, but I haven’t used it beyond the scope of creating properties and certainly haven’t written any decorators. I thought this would be a good exercise for learning decorators.

Here’s a simple example of what function currying: suppose you have a method

def add(x,y):
  return x+y

Then calling add(1,2) should be the same as add(1)(2). add(1) is what they call a partially applied function. It’s a function that takes one parameter.

Our goal here is to write a decorator “curried” that takes a function with n parameters and transform it in a way that can be applied n times and get the final result.

We’ll start with unit tests first:

import unittest

class CurryTest(unittest.TestCase):

	def test_with_no_args(self):
		@curried
		def do_nothing():
			return ""
		self.assertEquals("", do_nothing())

	def test_with_int_args(self):
		@curried
		def add_int(x,y):
			return x+y
		self.assertEquals(3, add_int(1)(2))
	def test_with_str_args(self):
		@curried
		def add_str(x,y):
			return "%s%s"%(x,y)
		self.assertEquals("ab", add_str("a")("b"))

So we make sure that a currying on a function takes no parameter is valid but should be a pass through, and also the “curried” decorator can be applied to any function with arguments (excluding positional arguments and keyword arguments)

A decorator is simply a function that takes a function as parameter:

def curried(fn):
  pass

and @curried is simply a syntactic sugar for:

def fn(...): ...
fn=curried(fn)

So, now we can write “curried” decorator.
To make the test for function with no argument pass, in curried() function, we can test to see if fn has arguments. Python’s standard library provides inspect.getargspec method:

def curried(fn):
  argspec = inspect.getargspec(fn)
  if len(argspec.args)==0:
    return fn
  else:
    # later

Now the first test passes.

For the other two cases, here’s the strategy. In Python, when a class defines __call__ method, the instance of that class is said to be “callable”. For instance:

class A(object):
  def __call__(self, arg):
    return arg

f=A()
f("echo")  # this gives you "echo"

This is very similar to Scala’s apply() function. Now that we have this in our inventory, we can define a `PartialFunction` class, take all the required parameters of the original function, and allow them to be applied one at a time. So the __call__ method of PartialFunc will look like this:

def __call__(self, value):
  # Xxx

If all the required parameters are passed in, PartialFunc should evaluate the original function with the complete argument list. Otherwise, PartialFunc stores the parameter in an instance variable, and returns itself.

Here’s the complete code:

class PartialFunc(object):
	def __init__(self, fn, argspec):
		self.fn = fn
		self.argspec = argspec 
		self.args = []

	def __call__(self, value):
		self.args.append(value) 
		if len(self.args) == len(self.argspec.args):
			arglist = ",".join(["self.args[%d]"%i for i in range(0, len(self.args))])
			return eval("self.fn(" + arglist + ")")
		else:
			return self

and the curried decorator:

def curried(fn):
	argspec = inspect.getargspec(fn)
	if len(argspec.args) == 0:
		return fn
	else:
		return PartialFunc(fn, argspec)

It’s pretty straightforward. When the parameters are complete, I construct a python statement that calls the original function with the complete argument list, and then pass the statement into an eval statement. I know evals are evil, but I can’t find a way in Python to dynamically change the signature of the original method and make it accept a variable length argument (varargs).

So this is it. It’s quite simple. Python methods can have varargs and keyword args, the situation gets a little more complicated. The thing is, both varargs and keyword args are not mandatory, so it’s hard for the curried function to know whether the argument list has been completed…Also, if you take default values into account, it could get even more complicated.

Advertisements

Use function currying to reduce repetition and make code clean

Lately, I’ve been writing a parser using the Scala parser-combinator framework to parse some saves from a game. As a responsible programmer (:P), I write unit tests for each rule. However, I found myself having to write the following code over and over again:

@Test
def testRule1() {
  parserRule.apply(new CharSequence("someInput")) match {
    case Success(result, _) => {
      assertEquals("expected", result)
      /* other asserts if the result is a collection of something else */
      }
    case NoSuccess(msg, _) => fail(msg)
  }
}

It’s worth noting that you cannot pass in a string value to a Parser. A parser expects a CharSequence CharSequenceReader object. Luckily, with Scala, you can define implicit conversion to turn a string into a CharSequence CharSequenceReader:

implicit def str2charSeqReader(v:String) = new CharSequenceReader(v)

Another goody Scala offers is that obj.apply(arg) method can be invoked by obj(arg), so the above code becomes:

/* ... */
parserRule("someInput") match {
/* ... */
}

It’s much cleaner.

However, the matching part is not clean still. It’s not that bad for a single test case, but it’s hardly a good practice to copy/paste it all over the test code base. We need to refactor this. The first thing comes to mind is to extract the matching part into a separate method, and let the caller provide the assertion part. Because Scala is a functional language, we can declare the method takes a function as a parameter, instead of using the delegate pattern, creating another class to encapsulate the method to be called, as you would do in Java:

def tryMatch[T](pr:ParseResult[T], f:(T)=>Unit) {
  pr match {
    case Success(r, _) => f(r)
    case NoSuccess(msg, _) => fail(msg)
  }
}

Now the consumer (the test code) looks like this:

@Test
def test1() {
  tryMatch(rule1("input"), (result)=>{assertEquals("expected",result)})
}

It’s better, except that when you have multiple assert* methods need to be called, it quickly becomes ugly:

@Test
def test2() {
  tryMatch(listRule("1,2,3,4"), (result)=>{assertEquals(1,result(0));assertEquals(2,result(1));assertEquals(3,result(2));assertEquals(4,result(3))})
}

To do better, we can use function currying. I read about function currying before, but not until now did it dawn on me that I can use this technique to make my code look cleaner. Basically, currying means that a function with n parameters is the same as the function being applied one parameter at a time in succession. e.g., function add(x1:Int,x2:Int,…,xn:Int) adds all its parameters. add(1,2)=[add(1)](2), where add(1) becomes a partially applied function that takes an integer and returns an integer. Then the partial function is applied to the second parameter which gives the eventual result.

This is all very theoretical, but the practical implication is that we can transform tryMatch() method shown above from a method with 2 parameters into a curried function. Why? Because with Scala, when you have a method that only takes a single parameter (as is the case with the partially applied function), you can replace parentheses with curly brackets! Let’s see how it works:

def tryMatch[T](pr:ParseResult[T])(f:(T)=>Unit) {
  pr match {
    case Success(r, _) => f(r)
    case NoSuccess(msg, _) => fail(msg)
  }
}

Basically, we break the two parameter definitions. This creates a PartialFunction in Scala. No other code needs to be changed at all. Now we can call tryMatch like this:

@Test
def testList() {
  tryMatch(listRule("1 2 3")) {
    result => {
      assertEquals(1, result(0))
      assertEquals(2, result(1))
      assertEquals(3, result(2))
    }
  }
}

This is like creating your own control flow (like try-catch-finally). Ain’t that neat?

Use delegation to write map/filter in Java

The problem

In Java, imagine you have a list of User objects, each encapsulates the user’s id, first name, last name and age. Then you want to call a web service UserService.deleteUsersByIds(List<Integer> userIds) to delete the users from your data store. It doesn’t sound too hard, does it? All you need to do is to transform you List<User> to List<Integer>. So you go ahead and write the following code:

List<Integer> ids = new ArrayList<Integer>(users.size());
for (User user : users) {
  ids.append(user.getId());
}

Then you go ahead and use your ids list, and everything is fine and dandy.

However, two minutes later, you find yourself having to provide another API method with a list of user’s names in String. So, again, you exercise your CSC101 skill:

List<String> names = new ArrayList<String>(users.size());
for (User user : users) {
  names.append(new StringBuilder(user.getFirstName()).append(" ").append("user.getLastName()));
}

Now, something else comes along and you need to write a piece of code that returns a list of names that belong to people who are under 21 years of age in the list…You get the idea. Well, things get boring pretty quickly.

As it turns out, these are two very important functions in functional programming: map and filter.

  • map(coll, f) “loops” over the collection, calls the function f on each element, and add the return of the f(element) to the return collection.
  • filter(coll, f) “loops” over the collection, calls f(element), and only add element to the return list when f(element) returns true

Use delegation for generic-ity

Now we take our first step in designing our generic map function:

<FromType, ToType> List<ToType> map(ArrayList<FromType> list) {
  List<ToType> retval = new ArrayList<ToType>(list.size());
  for (FromType item : list) {
    [...]
  }
  return retval;
}

What we left out in the above code snippet is how the input is mapped to the output. This is where delegates come in. Unfortunately, Java doesn’t have the language-level delegate. We need to design an interface for this delegate.

interface MapDelegate<FromType, ToType> {
  ToType map(FromType obj);
}

The delegate is parameterized (to provide more type safety) with FromType and ToType. FromType is the type of the objects in the original list, and ToType is the type of objects in the mapped list. Now we need to change our method signature to incorporate the delegate.

<FromType, ToType> List<ToType> map(ArrayList<FromType> list, MapDelegate<FromType, ToType> mapDelegate) {
  List<ToType> retval = new ArrayList<ToType>(list.size());
  for (FromType item : list) {
    retval.add(mapDelegate.map(item));
  }
  return retval;
}

Now the client code will look like this:

List<User> users = getUserListFromSomeWhere();
List<String> ids = map(users, new MapDelegate<User,String>() {
  public String map(User obj) {
    return new StringBuilder(user.getFirstName()).append(" ").append("user.getLastName()).toString();
  }
});

Similarly, we can write a filter function:

<T> List<T> filter(List<T> list, FilterDelegate<T> filterDelegate) {
  List<T> retval = new ArrayList<T>(list.size());
  for (T item : list) {
    if (filterDelegate.filter(item)
      retval.add(item);
  return retval;
}
interface FilterDelegate<T> {
  boolean filter(T item);
}

What about return value creation?

Use delegation, we can separate the parts of an algorithm in terms of their interfaces and leave the implementation to the caller. However, given the above filter and map methods, what if I don’t want the return type to be ArrayList? What if I want a LinkedList or a HashSet? Doesn’t the statement

  List<T> retval = new ArrayList<T>(list.size());

an implementation by itself?

Absolutely! For more flexibility, the “new” statement in the implementation body has to be delegated out as well. We introduce a ReturnDelegate interface:

interface ReturnDelegate<R extends Collection<?>> {
  R createReturnCollection();
}

and plug in the return delegate to the map method:

<FromType, ToType, R extends Collection<?>> R map(Collection<FromType> coll, MapDelegate<FromType, ToType> mapDelegate, ReturnDelegate<R> returnDelegate) {
  R retval = returnDelegate.createReturnCollection();
  for (FromType item : list) {
    retval.add(mapDelegate.map(item));
  }
  return retval;
}

Now the actual implementation has been completely separated. I know you can probably achieve flexibility without return delegate with the use of reflection, but on some systems (like GWT, which is what I’m working on and what this code is originally designed for), reflection is off limits.

Explore Clojure: Building a Bifid cipher

Lately I’ve been teaching myself Clojure, a Lisp dialect on the JVM platform. I still love Erlang and still learning it, but Clojure has a special draw for me being a JVM language and its Lisp roots. I studied Scheme (another Lisp dialect) in my college years and deemed it purely academic. However, Clojure has the potential of changing this and bring the expressiveness of Lisp and the power of functional programming to the Java world.

The best way to learn a language is to use it to solve some non-trivial practical problems. I stumbled this problem on Programming Praxis. It’s about building a Bifid cipher, which I thought could be a good practical problem for me to solve in Clojure.

Polybius Square

Bifid cipher is based on Polybius Square, which looks like this:

1 2 3 4 5
1 B G W K Z
2 Q P N D S
3 I O A X E
4 F C L U M
5 T H Y V R

Every encodable letter is given a vector value, which is the coordinate in this square. For example, X=[3 4], H=[5 2]. The first task would be building the Polybius square in Clojure.

My function polybius-square is going to take 2 parameters: charset is a list of encodable characters, and size is the size of the Polybius square.

(defn polybius-square
  "Polybius square by the given charset and size"
  [charset square-size]
  (map #(vector %1 [%2 %3]) charset
    (for [x (range 1 (+ 1 square-size)), y (range 1 (+ 1 square-size))] x)
    (take (* square-size square-size) (cycle (range 1 (+ 1 square-size))))))

Let’s see how this works. Suppose you’re calling this function with
(polybius-square "ABCDE" 3), charset is bound to “ABCDE” and size is bound to 3. What I want is a seq of vectors in the form of: (\Letter [x y]) where x and y are the coordinates. It’s natural to use map, with the function-to-map taking three parameters and vectorize them. That’s what the anonymous function does:
#(vector %1 [%2 %3])

%1 is the letter, %2 and %3 are the coordinates. I want the coordinate sequence to be like this: [1 1], [1 2], [1 3], [2 1], [2 2], [2 3]. In imperative language, you’d use a nested for loop, but we’ve established that’s ugly. We want the parameters fed to map to be of the following:
"ABCDEFGHJI" (1 1 1 2 2 2 3 3 3) (1 2 3 1 2 3 1 2 3 1 2 3)

To generate the first list, we use list comprehension:
(for [ x (range 1 (+ 1 square-size)), y (range 1 (+ 1 square-size))] x)
It reads: for every x in range 1 to square-size+1 (because “range” is right-exclusive) and for every y in the same range, output the value of x and take the whole output as the list. Because the binding of y comes later, and Lisp evaluates from right to left, the result of this list comprehension is (1 1 1 2 2 2 3 3 3)

Now let’s look at the other list: (1 2 3 1 2 3 1 2 3) – a cyclic list. Fortunately, Clojure has a built-in cycle function, which returns a lazy sequence of cycles of a collection. We want to call it with cycle '(1 2 3), so it’s going to be cycle (range 1 (+ 1 square-size)). Because cycle will be evaluated to an infinite sequence, we have to tell it when to stop.
When you see the square as a flat list, it’s clear that the cycle should stop when it reaches square-size^2, hence (take (* square-size square-size) (cycle (range 1 (+ 1 square-size)))).

The codec maps

After the Polybius square is built, we have the bases for our encryption. However, it’s not user-friendly: it’s a flat list of tuples. We want two maps for fast searching from letters to their encoding forms (letter-2-code) and from transposed vectors to the letters (transposed-2-letter).

(defn- letter-2-code
  "Produce a mapping between letters and their codes"
  [square]
  (letfn [(helper [the-square the-map]
          (if (empty? the-square)
            the-map
            (let [the-item (first the-square)
                  the-key (first the-item)
                  the-number (second the-item)]
              (recur (rest the-square) (assoc the-map the-key the-number)))))]
    (helper square (empty hash-map))))

The strategy here is to walk through the list, add each item to a hash-map. Here we define an inner function, which is basically a recursive helper. The function itself is pretty straightforward. Take note that in Clojure, when you need to do tail recursion, always use recur instead of calling the function by its name. This is because JVM cannot do automatic tail cut optimization (TCO), but Clojure does the optimization through the recur keyword. After the binding, we kick off the recursion by calling it with the initial list.

In the similar fashion, we have:

(defn- transposed-2-letter
  "Produce a mapping between the transposed codes to letter"
  [square]
  (letfn [(helper [the-square the-map]
            (if (empty? the-square)
              the-map
              (let [the-item (first the-square)
                    the-key (second the-item)
                    the-letter (first the-item)]
                (recur (rest the-square) (assoc the-map the-key the-letter)))))]
    (helper square (empty hash-map))))

Encode

Now comes the encode function. It’s important to note that the encoding will be different if a different polybius square is provided. Therefore, both encode and decode functions take the Polybius square as parameter.

During the encoding phase, we first convert every character in the message into a vector, then reorganize the vectors. To better illustrate, we use an example:
Suppose we have the following mapping:
{I [3 3], H [3 2], G [3 1], F [2 3], E [2 2], D [2 1], C [1 3], B [1 2], A [1 1]}
and we want to encode the word “CEDE”
CEDE => ([1 3] [2 2] [2 1] [2 2]), then we want to transpose the “matrix” composed of the above vectors, so it would be like:
([1 2] [2 2] [3 2] [1 2]), and finally looking up every item in the transposed-2-letter mapping
{[3 3] I, [3 2] H, [3 1] G, [2 3] F, [2 2] E, [2 1] D, [1 3] C, [1 2] B, [1 1] A}
=>BEHB

To get a mapping of message to vectors, we again map a function to the list:
(map #(get l2c % %) message) (l2c has been bound)
Then we apply the interleave function to the result, this gives:
(1 2 2 2 3 2 1 2). This is good but we want a list of vectors instead of a flat list. Clojure provides partition function, which does exactly it:
(partition 2 '(1 2 2 2 3 2 1 2))
gives:
([1 2] [2 2] [3 2] [1 2])

Then, we want to lookup every item in the result in the transposed-to-letter map and get a list of mapped characters. To achieve this, again, we use map, providing a different mapping:
(map #(get tr2l % %) ...) (tr2l is bound)
This returns a list of characters. We want a string, so we concatenate the characters by applying str function:
(apply str (map #(get tr2l % %) ...))

Here’s the complete code:

(defn encode
  "Return the message encoded by the specified polybius-square"
  [message polybius-square]
  (let [l2c (letter-2-code polybius-square)
        tr2l (transposed-2-letter polybius-square)]
    (apply str
      (map #(get tr2l % %) (partition 2 (apply interleave (map #(get l2c % %) message)))))))

Here you see the expressiveness of Clojure: all operations are chained together and formed in expression.

The decode function behaves similarly. The only trick is we need to split the translated code list in two. We use partition method again, with size being half of the entire list.

(defn decode
  "Return the decoded message using the specified polybius-square"
  [encoded polybius-square]
  (let [l2c (letter-2-code polybius-square)
        c2l (transposed-2-letter polybius-square)
        codes (apply concat (map #(get l2c % %) encoded))
        len (count codes)
        parts (partition (/ len 2) codes)]
    (apply str (map #(get c2l %1 %1) (map #(vector %1 %2) (first parts) (second parts))))))

Putting it together

Now with all the pieces are in place, we can put it together by writing a simple test:

(defn -main
  "A simple test case entry point"
  []
  (let [square (polybius-square "ABCDEFGHIKLMNOPQRSTUVWXYZ" 5)]
    (do (println (encode "BIFIDCIPHER" square))
      (println (decode (encode "BIFIDCIPHER" square) square)))))

Output:

BGAHFRQTOXW
BIFIDCIPHER

Have the complete source code here at http://github.com/kevinjqiu/bifid-clj, I’m sure there are a lot to be improved of the above implementation of the Bifid cipher. Especially currently I have no elegant way to solve the situation where the letter to be encoded is not in the Polybius square. Also, I’m sure my code isn’t entirely the idiomatic Clojure. If you have any suggestions, don’t hesitate to write a comment.

Although I have been learning Clojure for 3 days, I’m already enjoying its expressiveness and the clean look. As a Java programmer, Clojure also gives me the benefit of full access to the Java land. The potential is endless!