Thursday, November 27, 2014

Longest word

I recently came across this post, where Joey Gibson presents two implementations of a function to find the longest word in an array of strings. The procedural implementation, which often is what you start with when converting from Java to Scala, looks like this:

def longestWord(words: Array[String]) = {
  var word = words(0)
  var idx = 0
 
  for (i <- 1 until words.length)
    if (words(i).length > word.length) {
      word = words(i)
      idx = i
    }
 
  (word, idx)
}

It is a bit nicer than a corresponding implementation in Java but doesn't buy you much. The functional version Joey presented, is much shorter:

def longestWord(words: Array[String]) =
  (("", -1) /: words.zipWithIndex) {(old, cur) =>
    if (cur._1.length > old._1.length) cur
    else old
  }
}

It always gives me a kick to see how much shorter and more readable functional/Scala code is in comparison to procedural/Java code. In this case, however, readers rightfully argued that this isn't very readable. But luckily we can make it readable and even shorter! Firstly, instead of using a fold left (/:), we could use a reduceLeft

  def longestWord(words: Iterable[String]) = {
    words.zipWithIndex.reduceLeft{(a,b) => 
      if (a._1.length > b._1.length) a else b}
  }  
}

which already is an improvement in readability. Note that I also changed the parameter type from Array[String] to Iterable[String]. This way the function works for lists, arrays, ..., anything Iterable.

Now I am getting greedy. I am not sure if the maxBy function was around in 2010 when Joey wrote his post, but these days we can simply write

  def longestWord(words: Iterable[String]) = 
    words.zipWithIndex.maxBy(_._1.length)

which is super short and very readable. If you don't like the tuple notation then this version is a bit more explicit and the most readable, I believe:

  def longestWord(words: Iterable[String]) = 
    words.zipWithIndex.maxBy{case (word,_) => word.length}

Of course, if you only want the longest word (and not its index) words.maxBy(_.length) will do the job.

No comments:

Post a Comment