Saturday, January 12, 2013

Java to Scala: Motivation

If you are a Java programmer and haven't heard about Scala already you might wonder if it is worth the effort learning. Here I show two examples that motivated me to switch from Java to Scala.

This first one I stole from somewhere ― I just can't remember where from, sigh ― and modified it slightly. It is a simple implementation of a class that stores the contact information (name and phone number) of a person.

public class Contact implements Serializable {
    private final String name;
    private final String phone;

    public Contact(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }

    public String getName() {
        return name;
    }

    public String getPhone() {
        return phone;
    }

    public Contact withName(String name) {
        return new Contact(name, phone);
    }

    public Contact withPhone(String phone) {
        return new Contact(name, phone);
    }

    public boolean equals(Object o) {
        if (this == o)  return true;
        if (o == null || getClass() != o.getClass()) return false;
        
        Contact p = (Contact) o;
        if (name != null ? !name.equals(p.name) : p.name != null) {
            return false;
        }
        if (phone != null ? !phone.equals(p.phone) : p.phone != null) {
            return false;
        }
        return true;
    }

    public int hashCode() {
        int result = name != null ? name.hashCode() : 0;
        result = 31 * result + (phone != null ? phone.hashCode() : 0);
        return result;
    }

    public String toString() {
        return "Contact(" + name + "," + phone + ")";
    }
} 

There are no surprises or difficulties in it. Just a lot of boilerplate code. Any Java programmer will have written hundreds of classes similar to this one - most of them a lot more complex. With the Contact class one can create new contacts from scratch or use one of the copy constructors as shown below:

Contact stefanOffice = new Contact("Stefan", "62606");
Contact johnOffice   = new Contact("John", stefan.getPhone());
Contact stefanHome   = stefanOffice.withPhone("401251497");

These 20 plus lines of the the Contact class in Java boil down to a single line in Scala with essentially identical functionality:

case class Contact(name: String, phone: String)


val stefanOffice = Contact("Stefan", "62606")
val johnOffice   = Contact("John", stefan.phone)
val stefanHome   = stefanOffice.copy(phone = "401251497")

A case class in Scala provides default implementations for setters, getters, equality tests and copy constructors. In many cases they are sufficient and if not they can easily be overridden.

While this example is a bit extreme with respect to the amount of code reduction (typically the factor is between two and five) it is not an unrealistic example. But I also wanted to demonstrate the benefits of Scala on a slightly more complex problem.

We are going to look at some code that generates an index of over keywords. For instance, let's say we have the keywords "Apple", "Ananas", "Mango", "Banana", "Blackberry" and we would like to have an alphabetically sorted index that lists the keywords (again alphabetically sorted) starting with the same letter:

A: Ananas, Apple
B: Banana, Blackberry
M: Mango

Not very challenging; should be easy. Well, here comes the Java implementation and as you can see it is far from elegant:

import java.util.*;

public class Index {    
  public static void main(String[] args) {
    List<String> keywords = 
       Arrays.asList("Apple", "Ananas", "Mango", "Banana", "Blackberry"); 
    Map<Character, List<String>> ch2words = new TreeMap<>(); 
    for(String keyword : keywords) {   
      char firstChar = keyword.charAt(0);     
      if(!ch2words.containsKey(firstChar))      
        ch2words.put(firstChar, new ArrayList<String>());       
      ch2words.get(firstChar).add(keyword); 
    } 
    
    for(List<String> list : ch2words.values()) 
      Collections.sort(list); 
      
    StringBuilder str = new StringBuilder();  
    for(Character firstChar : ch2words.keySet()) {
      str.append(firstChar+": ");
      Iterator<String> words = ch2words.get(firstChar).iterator(); 
      while(words.hasNext()) {
        str.append(words.next());  
        str.append(words.hasNext() ? ", " : "\n");  
      }
    }
    System.out.print(str);      
  }
}

We take the keywords and put them into a TreeMap to map the first letter to the list of words. I chose a TreeMap to have the map entries automatically sorted according to the first character. We then need to sort the keywords in each of the entries of the TreeMap and finally use a StringBuilder to write the formatted index to a String.

Now I am going to write some bad and truly disgusting Scala code that essentially is a one-to-one mapping from Java to Scala. While far from optimal it shows what already be gained without using any novel concepts such as functional programming or pattern matching.

import scala.collection.mutable.{Map,StringBuilder}

val keywords = List("Apple", "Ananas", "Mango", "Banana", "Blackberry")
val ch2words = Map[Char, List[String]]()
for(keyword <- keywords) {
  val firstChar = keyword(0)
  if(!ch2words.contains(firstChar))      
    ch2words(firstChar) = List()       
  ch2words(firstChar) = keyword::ch2words(firstChar)
}
val str = new StringBuilder()
for(ch <- ch2words.keys.toSeq.sorted) {
  str ++= ch2words(ch).sorted.mkString(ch+": ", ", ", "\n")      
}  
println(str)

Largely due to type inference in Scala the code becomes less cluttered and the logical flow is easier to recognize. Also the confusing distinction between primitive data types such as char and their corresponding objects (Character) is gone.

Now let's do it again and see how a solution that takes advantage of Scala features could look like:

val keywords = List("Apple", "Ananas", "Mango", "Banana", "Blackberry")
val ch2words = keywords.groupBy(_(0)).mapValues(_.sorted).toList.sortBy(_._1)
val lines = ch2words.map{case (ch,words) => ch+": "+words.mkString(", ")}
print(lines.mkString("\n"))

First of all it is much shorter. But in addition, it is also more readable provided an understanding of some basic concepts in Scala has been acquired. Let's have a closer lock at the code.

The first line is trivial. It creates a list of all the keywords and that's it. The second line is interesting. First the keywords get grouped by their first character (keywords.groupBy(_(0))). That produces a map of letters to keyword lists:

Map(M -> List(Mango), A -> List(Apple, Ananas), B -> List(Banana, Blackberry))

As you can see neither the letters nor the lists of keyword are sorted. We take the values of the map (those are the lists of keywords) and sort them. mapValues(_.sorted) does the job.

Map(M -> List(Mango), A -> List(Ananas, Apple), B -> List(Banana, Blackberry))

To sort the map according to letters we convert the map to a list, which returns a list of tuples, containing letters and their keywords, and sort the list according to the first tuple component ― the letter (toList.sortBy(_._1)).

List((A,List(Ananas, Apple)), (B,List(Banana, Blackberry)), (M,List(Mango)))

Now the data structure is sorted and we simply need to create a formatted string. Pattern matching is used to extract the letters and the corresponding keyword lists (case (ch,words)) and the lines of the index are constructed by mapping (ch,words) tuples to strings. Finally, all lines are glued together via mkString("\n") and printed out.

A: Ananas, Apple
B: Banana, Blackberry
M: Mango

To conclude, in almost all cases Scala allows one to write shorter code than in Java; and it is frequently even more readable. But it also makes it possible to write code that causes a PERL coder to blush. To quote Spider-man: "With great powers come great responsibilities!".