Seven languages in 70 minutes

a whirlwind tour in functional programming


The Problem

(AKA "SLB Programming Challenge 06", also on Reddit)

Find the (last) longest sub-string of the given string that contains at most two unique characters.


"abcbcbcdeffgggfgfabdcbgfddffgggfgfcdab"

The algorithm

\(ch\) 0 current character
\(ch_0\) 0 current "older" character (cached, same as \(str[c_b]\))
\(ch_1\) 0 current "newer" character (cached, same as \(str[c_s]\))
\(c_b\) 0 current begin index
\(c_s\) 0 current start of the latest \(ch_1\)-only sunstring
\(c_e\) 0 current end index
\(m_b\) 0 current max substring begin index
\(m_e\) 0 current max substring end index

Reference C implementation

char* longest2UniqueCharSubstring(char* str, char** end) {
        char ch0 = 0, ch1 = 0; /* current substring's characters       */
        int cb = 0, ce = 0;    /* current substring begin/end          */
        int mb = 0, me = 0;    /* max (so far) substring begin/end     */
        int cs = 0;            /* start of the last ch1-only substring */
        char ch;               /* current character                    */
        while (ch = str[ce]) {
            if (ch != ch1) {
                if (ch != ch0) {
                    /* met a new character */
                    cb = cs; cs = ce; ch0 = ch1; ch1 = ch;
                }else {
                    /* met ch0 again */
                    cs = ce; ch0 = ch1; ch1 = ch;
                }            
            } /* else do nothing - still at the ch1 */
            
            ce++; /* advance the current character */ 
            if (me - mb <= ce - cb) {
                /* update the current max */
                me = ce; mb = cb;
            } 
        }
        
        *end = &str[me];
        return &str[mb];
    }

Functional Programming solution

It's like math!

\[ In: \{S^{i}\}_{i=0}^{n-1}\\ c_b^{0}=0, c_e^{0}=0, c_{s}^{0}=0, m_b^{0}=0, m_e^{0}=0\\ \forall i = \{1, .., n-1\}:\\ (m_b^{i+1}, m_e^{i+1})= \cases{ (c_b^{i+1}, c_e^{i+1}) & \text{if } c_e^{i+1} - c_b^{i+1} \geq m_e^{i} - m_b^{i}\cr (m_b^{i}, m_e^{i}) & \text{otherwise} }\\ c_e^{i+1} = c_e^{i}+1\\ (c_b^{i+1}, c_{s}^{i+1}) = \cases{ (c_b^{i}, c_{s}^{i}) & \text{if } S^{i+1}=S^{c_{s}^{i}}\cr (c_b^{i}, c_e^{i}) & \text{if } S^{i+1}=S^{c_b^{i}}\cr (c_{s}^{i}, c_e^{i}) & \text{otherwise} }\\ Out: \{S^{k}\}, k = \{m_b^{(n-1)}, .., m_e^{(n-1)}-1\} \]

It's like math!

  • \(S^{i}\) - current character
  • \(c_b^{i}, c_e^{i}\) - current substring (begin/end index)
  • \(c_{s}^{i}\) - current start of the latest second-char-only substring
  • \(m_b^{i}, m_e^{i}\) - current max substring (begin/end index)

Functional Programming principles

  • Program is a "mathematical function"
  • Everything is an expression
  • No side effects
  • Functions are first-class. Higher-order functions.
  • ...and other functional programming features/idioms

What we will use

  • Immutability
  • Tuples/Destructuring
  • Higher-order functions (fold)
  • Closures/Lambdas

Immutability

  • Assign (bind) once, never change
  • Kind of "const" in C++
  • "Changing" via creating a different value
  • Good practice regardless of the language
  • Example (Scala REPL):
scala> val x = 10
x: Int = 10

scala> x = 11
:8: error: reassignment to val
       x = 11
         ^

scala> var y = 10
y: Int = 10

scala> y = 11
y: Int = 11

Tuples and Destructuring

  • Tuples are just an ordered bunch of values
  • Fixed, usually small size
  • Can be of different data types
  • Destructuring is used to extract contents
  • Mathematical notion of set product
  • Example (F# REPL):

        > let myTuple = ("The", ("answer", "is"), 42);;
         val myTuple : string * (string * string) * int = ("The", ("answer", "is"), 42)

        > let (a, b, c) = myTuple;;
         val c : int = 42
         val b : string * string = ("answer", "is")
         val a : string = "The"

        > let (_, (y, _), z) = myTuple;;
         val z : int = 42
         val y : string = "answer"

Higher-order functions: Fold

  • AKA "reduce", "accumulate", "aggregate", "inject"
  • Function that takes:
    • a sequence
    • starting value of acc
    • function: (curVal, acc) -> acc
  • "Wraps" accumulator through the sequence
  • Example (Haskell REPL):

        Prelude> let addFunc cur acc = acc + cur
        Prelude> foldl addFunc 100 [1,2,3,4,5]
        115

Lambdas and Closures

  • In-place, anonymous functions:
    Prelude> foldl (\ curr acc -> curr + acc) 0 [1,2,3,4,5]
    15
    Prelude> foldl (+) 0 [1,2,3,4,5]
    15
  • Closures "remember" the local outside environment:
    Prelude> let addN n = (\x -> n + x)
    Prelude> let add3 = addN 3
    Prelude> let add5 = addN 5
    Prelude> add3 7
    10
    Prelude> add5 7
    12

The program structure

  • Represent "state" by a tuple: \(((m_b^{i}, m_e^{i}, c_b^{i}, c_e^{i}), c_s^{i}, ch_0^{i}, ch_1^{i})\)
  • Use it as accumulator in "fold"

Code

module Challenge06 = 
      let longest2UniqueCharSubstring (s:string) = 
        let ((mb, me, _, _), _, _, _) = 
            Seq.fold (fun ((mb, me, cb, ce), cs, ch0, ch1) ch -> 
              let ce1 = ce + 1
              let getMax b e = (if (me - mb > e - b) then           
                                   (mb, me, b, e) else (b, e, b, e))
              if   (ch = ch1) then (getMax cb ce1, cs, ch0, ch1)
              elif (ch = ch0) then (getMax cb ce1, ce, ch1, ch )
              else                 (getMax cs ce1, ce, ch1, ch ))
            ((0, 0, 0, 0), 0, '\000', '\000') s
        s.Substring(mb, me - mb)
Designed by Don Syme
When 2005
Platform .NET
Inspired by ML-family (OCaml)
Paradigm Functional/OOP hybrid
Typing Static, type inferred
Nesting Indentation
TIOBE rank 15
Fun fact Most OCaml code is valid F#

  • Type inference (Hindley-Milner)
  • Type providers, computation expressions, active patterns, etc, etc
  • Seamless .NET integration (including WPF etc)
  • Tools support (VisualStudio, Xamarin)
  • Both OOP and "classical" FP, ML-style
  • It's Microsoft :)

  • It's Microsoft :(
  • Steep learning curve
  • Extra complexity
  • Type system may be too restrictive

  • XBox Live TrueSkill
  • Financial applications
  • Scientific computations
  • Game servers

Code

(ns challenge06 (:gen-class))
(defn longest2UniqueCharSubstring [s]
  (let [ 
    [[mb me _ _] _ _ _]
    (reduce (fn [[[mb me cb ce] cs ch0 ch1], ch]
              (let [ ce+    (inc ce)
                     getMax (fn [b e] (if (> (- me mb) (- e b))
                                          [mb me b e] [b e b e]))]
                (cond
                  (= ch ch1) [(getMax cb ce+) cs ch0 ch1]
                  (= ch ch0) [(getMax cb ce+) ce ch1 ch ]
                  :else      [(getMax cs ce+) ce ch1 ch ])))
      [[0 0 0 0] 0 \o0 \o0] s)]
    (subs s mb me)))
Designed by Rich Hickey
When 2007
Platform JVM, .NET, JavaScript
Inspired by Lisp family, Erlang, Haskell
Paradigm Functional
Typing Dynamic
Nesting Parentheses!!!
TIOBE rank 50
Fun fact Rich Hickey has a lot of hair, also on the chest

  • Minimalist syntax (it's a Lisp, but with less parens)
  • Java ecosystem
  • Immutability
  • Multimethods
  • Macros
  • Concurrency

  • Prefix notation
  • Parentheses
  • It's slow

  • Powers many web-services
  • Boeing, Citi Group et al
  • 4clojure

Code

module Challenge06 where

longest2UniqueCharSubstring s = take (me - mb) . drop mb $ s
  where 
    ((mb, me, _, _), _, _, _) = foldl f init s
    init = ((0, 0, 0, 0), 0, '\0', '\0')
    f = \ ((mb, me, cb, ce), cs, ch0, ch1) ch ->
      let ce1 = ce + 1
      let getMax = \b e -> if (me - mb) > (e - b)
                              then (mb, me, b, e) else (b, e, b, e)
        in case () of _ | ch1 == ch -> (getMax cb ce1, cs, ch0, ch1)
                        | ch0 == ch -> (getMax cb ce1, ce, ch1, ch )
                        | otherwise -> (getMax cs ce1, ce, ch1, ch )
Designed by Simon Peyton Jones
When 1990
Platform Compiled (GHC) or interpreted (GHCi)
Inspired by Clean, Miranda, ML, Scheme
Paradigm Functional
Typing Static
Nesting Indentation
TIOBE rank 36
Fun fact Was originally created as a research language

  • Expressive static type system
  • Type inference
  • Lazyness
  • It's PURE!!!
  • Compact syntax
  • Software Transactional Memory
  • "...isn't for everyone. But everyone should give it a try."

  • Hard to learn
  • It's pure
  • Low chance of becoming mainstream

  • Siemens for space applications
  • The New York Times
  • Credit Suisse
  • Bank of America Merril Lynch
  • AT&T
  • a lot of financial applications

Code

longest2UniqueCharSubstring = (s) ->
    [[mb, me]] = s.split('').reduce(
        ([[mb, me, cb, ce], cs, ch0, ch1], ch) ->
            getMax = (b, e) ->
                if (me - mb > e - b) then [mb, me, b, e] 
                                     else [b, e, b, e]
            switch ch
                when ch1 then [getMax(cb, ce + 1), cs, ch0, ch1]
                when ch0 then [getMax(cb, ce + 1), ce, ch1, ch ]
                else          [getMax(cs, ce + 1), ce, ch1, ch ]
        ,[[0, 0, 0, 0], 0, '', ''])
    s.substring(mb, me);
Designed by Jeremy Ashkenas
When 2009
Platform JavaScript
Inspired by Ruby, Python, Haskell
Paradigm Functional, prototype-based
Typing Dynamic
Nesting Indentation
TIOBE rank 170
Fun fact First CoffeeScript compiler was written in Ruby

  • Syntax - much less cruft comparing to JavaScript
  • Fixes some JavaScript mistakes (e.g. global variables)
  • Literate CoffeeScript

  • Debugging
  • "It's just a JavaScript with different syntax"
  • Shorter does not mean better
  • Significant whitespace
  • The syntax has its own "bad parts"

  • Multiple open source web-frameworks

JavaScript (generated)

// Generated by CoffeeScript 1.7.1
    (function() {
      var longest2UniqueCharSubstring;

      longest2UniqueCharSubstring = function(s) {
        var mb, me, _ref;
        _ref = s.split('').reduce(function(_arg, ch) {
          var cs, cb, ce, ch0, ch1, getMax, mb, me, _ref;
          (_ref = _arg[0], mb = _ref[0], me = _ref[1], cb = _ref[2], ce = _ref[3]), cs = _arg[1], ch0 = _arg[2], ch1 = _arg[3];
          getMax = function(b, e) {
            if (me - mb > e - b) { return [mb, me, b, e]; } 
            else { return [b, e, b, e]; }
          };
          switch (ch) {
            case ch1:
              return [getMax(cb, ce + 1), cs, ch0, ch1];
            case ch0:
              return [getMax(cb, ce + 1), ce, ch1, ch];
            default:
              return [getMax(cs, ce + 1), ce, ch1, ch];
          }
        }, [[0, 0, 0, 0], 0, '', ''])[0], mb = _ref[0], me = _ref[1];
        return s.substring(mb, me);
      };

    }).call(this);
    

Code

longest2UniqueCharSubstring = (s) ->
    [[mb, me]] = s.split('').reduce(
        ([[mb, me, cb, ce], cs, ch0, ch1], ch) ->
            getMax = (b, e) ->
                if (me - mb > e - b) then [mb, me, b, e] else [b, e, b, e]
            switch ch
                when ch1 then [getMax(cb, ce + 1), cs, ch0, ch1]
                when ch0 then [getMax(cb, ce + 1), ce, ch1, ch ]
                else          [getMax(cs, ce + 1), ce, ch1, ch ]
        ,[[0, 0, 0, 0], 0, '', ''])
    s.substring(mb, me);

Code

class Challenge06 {
    static def longest2UniqueCharSubstring(String s) {
        def (mpos) = s.inject([[0, 0, 0, 0], 0, '\0', '\0']) {
            acc, ch ->
                def (pos, cs, ch0, ch1) = acc
                def (mb, me, cb, ce) = pos
                def ce1 = ce + 1
                def getMax = {b, e -> (me - mb > e - b) ?
                        [mb, me, b, e] : [b, e, b, e]}
                switch (ch) {
                    case ch1: [getMax(cb, ce1), cs, ch0, ch1]; break
                    case ch0: [getMax(cb, ce1), ce, ch1, ch ]; break
                    default:  [getMax(cs, ce1), ce, ch1, ch ]
            }
        }
        s.substring(mpos[0], mpos[1])
    }
}
Designed by James Strachan
When 2003
Platform JVM
Inspired by Python, Ruby, Perl, Smalltalk
Paradigm OOP, with functional features
Typing Dynamic, partially static
Nesting Curly braces
TIOBE rank 44
Fun fact Most Java code is valid Groovy

  • Dynamic typing
  • Java ecosystem
  • Syntax far more compact than Java
  • Native support for various markup languages
  • Metaprogramming/DSL
  • Smooth learning curve

  • Dynamic typing
  • "Scala is a long-term replacement"
  • Pretty slow
  • Debugging

  • IntelliJ IDEA
  • DSLs in major financial applications
  • Grails
  • LinkedIn
  • Wired.com
  • Sky.com

Code

object Challenge06 {
  def longest2UniqueCharSubstring(s:String) = {
    val ((mb, me, _, _), _, _, _) = s.foldLeft(((0, 0, 0, 0), 0, '\0', '\0'))({
      case (((mb, me, cb, ce), cs, ch0, ch1), ch) => {
        val ce1 = ce + 1
        val Ch1 = ch1
        val Ch0 = ch0
        val getMax = {(b:Int, e:Int) =>
          if (me - mb > e - b) (mb, me, b, e) else (b, e, b, e)}
        ch match {
          case Ch1 => (getMax (cb, c1), cs, ch0, ch1)
          case Ch0 => (getMax (cb, c1), ce, ch1, ch )
          case _   => (getMax (cs, c1), ce, ch1, ch )
        }
      }
    })
    s substring (mb, me)
  }
}
Designed by Martin Odersky
When 2003
Platform JVM
Inspired by Eiffel, Erlang, Haskell, Java, Lisp, OCaml et al
Paradigm Both Functional and Object-Oriented
Typing Statically typed, with type inference
Nesting Curly braces, significant whitespace
TIOBE rank 41
Fun fact Scala's Actor framework (Akka) was backported to Java

  • Has "all the nice language features"
  • Endorses FP, but can do OOP as well
  • Java ecosystem
  • Good performance
  • Concurrency, Actor library (Akka)
  • Native XML support

  • Too many language features
  • Many ways of doing things
  • Slow compiler
  • Steep learning curve
  • Bloated libraries
  • Easy to write cryptic code

  • Walmart
  • LinkedIn
  • Twitter
  • Sony
  • Siemens
  • Other guys

Code

-module(challenge06).
-export([longest2UniqueCharSubstring/1]).

longest2UniqueCharSubstring(S) ->
  {{Mb, Me, _, _}, _, _, _} = lists:foldl(
    fun (Ch, {{Mb, Me, Cb, Ce}, Cs, Ch0, Ch1}) ->
      Ce1 = Ce + 1,
      GetMax = fun
        (B, E) when (Me - Mb > E - B) -> {Mb, Me, B, E};
        (B, E) -> {B, E, B, E}
      end,
      if Ch =:= Ch1 -> {GetMax(Cb, Ce1), Cs, Ch0, Ch1};
         Ch =:= Ch0 -> {GetMax(Cb, Ce1), Ce, Ch1, Ch };
         true       -> {GetMax(Cs, Ce1), Ce, Ch1, Ch }
      end
    end,
    {{1, 1, 1, 1}, 1, $0, $0}, S),
  string:sub_string(S, Mb, Me - 1).
Designed by Joe Armstrong
When 1986
Platform Erlang Virtual Machine (BEAM)
Inspired by Prolog, Lisp
Paradigm Functional
Typing Dynamic
Nesting Indentation
TIOBE rank 47
Fun fact Initially was created for Telecom

  • Immutability
  • Actor model
  • Error recovery
  • Live updating
  • Abstractions for building highly-scalable, fault-tolerant systems
  • "Nine nines" reliability (99.9999999%)
  • Ericsson claims productivity boost 9-25 times

  • Data structures (string, records)
  • Boilerplate and duplication (language "too simple")
  • Bad at managing memory
  • JIT is not state of the art
  • Dated syntax
  • Not quite a general purpose language

  • Decades of Telecom
  • WhatsApp
  • Facebook
  • MMO Games servers
  • OSS: CouchDB, Riak, Wings3D etc.

Running time comparison

  • "Alice's Adventures in Wonderland"
  • Whitespace removed, lowercase, ~4Mb
  • Win7 64-bit, Intel i7 (4 cores), 6Gb RAM
  • Recent software versions:
    • Scala v2.11, jvm 1.8
    • Haskell GHC v7.6.3 (-O2 flag)
    • F# v2.0, .NET 4, VS 2010
    • Groovy v2.2.2, jvm 1.8
    • Clojure v1.5.1, jvm 1.8
    • Erlang OTP v17
    • CoffeeScript v1.7.1, Node.js v0.10.28

Running time, ms

Observations

  • Scala beats everyone (24ms)
  • Nothing can beat C (15ms), but can be pretty close!
  • JVM IL size: Scala < Groovy < Clojure
  • F# translates neatly into C#-esque IL
  • JIT performance matters. JVM better than BEAM
  • Erlang strings are abysmal. Cache misses matter.
  • JVM 1.8 vs 1.6: Groovy worse, Scala better, Clojure same

Age, years

Popularity - TIOBE rank

OSS projects - GitHub rank

Ohloh - monthly commits

Ohloh - monthly contributors

Ohloh - monthly lines of code

Ohloh - monthly projects

Ohloh - monthly projects (vs C++)

Other FP buzzwords

  • Referential transparency
  • Currying and partial application
  • Recursion and tail call optimization
  • Monads/computations
  • Lazy evaluation

Some more FP buzzwords

  • Algebraic data types
  • Pattern matching
  • List comprehension
  • Memoization
  • Parametric polymorphism and multimethods

Functional vs. OOP decomposition

Why?..

  • All normal languages were already taken
  • To appear smart!
  • "7 languages in 7 weeks" by Bruce A. Tate
  • Because it's fun(ctional)!

Why (seriously)?..

  • ~30 FP talks at the NDC Oslo 2014
  • Parallel programming
  • More accessible these days
  • Hardware is cheaper than programmer time (?..)
  • Production quality languages/success stories

Why (again)?..

  • Testing (e.g. QuickCheck)
  • Reasoning (it's like math)
  • REPL
  • Pleasant to program
  • Code size matters

Why (some more)?..

  • It's like math
  • Understanding may make you a better developer
  • Twisting your brain
  • JavaScript is a functional language
  • Also, LINQ, SQL, Excel...
  • Job market niche

Fallacies

  • Compact does not mean elegant
  • "Elegant" does not mean readable
  • ...and does not mean efficient
  • "Success stories" are often fakes
  • Abusing FP idioms where they don't belong
  • Immutability is not a panacea
  • Golden Hammer syndrome

The future

  • A niche?..
  • ...but getting wider because of parallelism
  • Scala, F# getting more popular
  • Adopted by businesses (mostly startups)
  • Mainstream languages are getting functional features
  • New languages appear (e.g. Swift)
  • Will be always good for brain

In the end of the day, what matters is maintainable code

/*!
 *  The function checks if functional programming is really cool 
 *  @param none
 *  @return true if functional programming is cool and false otherwise
 */
virtual bool doTheCheckThatFunctionalProgrammingIsCool() const 
{
    //  call the function that will tell if functional programming is cool 
    //  and remember the returned value
    bool theFunctionalProgrammingIsCool = isFunctionalProgrammingCoolOrMaybeNot();        
    //  check if functional programming is cool and return true. 
    //  Otherwise, return false
    if (theFunctionalProgrammingIsCool) 
    {
        // return true because functional programming is cool
        return true;
    }
    else
    {
        // the functional programming is maybe not cool, so return false
        return false;
    }
}

??? Questions ???