About programming languages

After just being hired by a group that thinks SML is the only true language, I should probably not be writing this… but here we go.

Today I participated in another programming contest. Nothing official, just for the fun of it. And I decided to challenge myself and solve the first problem (a really easy one) in a different language. My language of choice for quick and dirty coding has always been C++. It can easily read from and write to standard I/O, has a bunch of libraries and data structures available, and you can compile and get an executable with a simple g++ code.cpp. No weird keywords or classes, no linking of funny libraries, no console, just a plain binary file ready to be run, and, most importantly, one that runs fast!

But there are all these fancy languages around, so I decided to go ahead and see how they would perform (or better, how I would perform with them). The problem in question was a simple one. Given a very big number (up to 10¹⁶), we had to transform it in a funny way (reverse and rotate the digits, e.g. a 6 becomes a 9) and check whether the original and the transformed numbers are both prime [1]. The primality test is the first thing that pops out. Maybe I need to implement a fast primality check algorithm? That’s not the case for this problem: the silly one that checks for divisors between 2 and the square root of the number should work just fine (the time limit is 2s). The problem there is the size of the number. Remember that a 32-bit integer can only hold up to 2,147,483,647, a 10-digit number. So we just need to use a 64-bit integer and it should work just fine. Same algorithm.

My first attempt was to implement it in my English of programming languages (not my native language, but one in which I am quite fluent): OCaml. The code looked nice, apart from some weird conversions from int to float and then back to int because, apparently, sqrt and exponentiation only work on floats (what happened to polymorphic functions? wait… it gets worse). After implementing all my recursive functions beautifully, I tested on a few cases and made sure it was working. Then I tested on the biggest possible input and it worked. Great! Submit… runtime error. 🙁 Turns out that, although my machine is 64 bits, the server is 32, so the integers there were overflowing. All I needed to do was use OCaml’s “long int”. I found two libraries: Int64 and Big_int. Since Big_int had more operations (like sqrt and exponentiation), I went for that one. The problem was that, since I was no longer using int, I was no longer allowed to use + or - or /, no no no. I had to use add_big_int, sub_big_int, div_big_int, and so on. My pretty, concise functions turned ugly 🙁 . With that came a million type errors; I had to add conversions everywhere and link an extra library when compiling. The whole thing was just horrendous.

This was more than one hour into the contest. So I decided to try a different thing.

Integers of arbitrary precision? Let’s take a chance on my German of programming languages (I cannot quite speak it but really wish I could): Python. It took me an hour to write a 50-line program, since I had to kind of learn everything from scratch, but I had a working program. In the meantime I learned the hard way that Python and recursion do not go together. It was working for the biggest possible input, and it is Python, so if it needed a longer integer it would switch at runtime. Great! Submit… memory limit exceeded. What? Reading here and there, I decided that range() was the thing to blame (in Python 2 it builds the whole list in memory), so I switched to a strange islice thing. Submit… time limit exceeded 🙁 Come on!

By the time the contest ended I had two beautiful pieces of code in two different languages that just did not work, for purely technical reasons. Don’t get me wrong, I am sure these languages have their place and are really good for some things. But for coding something easy, fast and efficiently, I’ll stick with good old C++.

Out of curiosity I checked what languages the other people in the contest were using. Note that these are all people at least 5 years younger than me, possibly 10, still undergrads in CS. And what were they using? C++!! It is quite impressive, since I am pretty sure they learn programming in Python first and then move on to C or ML. They could choose any of the fancy modern languages, but they decide to stick with C++. And to show that I do not have a biased sample, just check the statistics of websites like Codeforces, SPOJ or UVa.

Considering that I learned how to program in Java (it was the latest thing back then) and nowadays kids are learning how to program in Python (it is the latest thing now), the persistent use of C++ says something. A colleague once referred to C as an honest language. I think C++ fits this description too. No fancy stuff: it is the thinnest layer between you and the assembly code, and it does the job pretty well.

[1] Problem K here.

About Scala

Some reasons why I dislike Scala and all its claimed “smartness”:

Implicits
If there is something I really hate, it is when people try to hide things from me. Maybe that’s why I find Scala’s implicit stuff so annoying. They try to make things easier by, for example, allowing you to declare implicit type converters. So imagine that you have some class ChocolateCake and then you declare an implicit converter to class Cake. This means that a conversion from ChocolateCake to Cake will be made implicitly whenever necessary. But what happens is that sometimes the compiler is able to infer this conversion, and sometimes not! And when it is not capable, it will throw a type error at compilation. And then you realize that you wanted to use Cake the whole time… But you notice that the objects have type ChocolateCake in lots of places. Then you stop and think: what the fuck? And you start searching the code around and find this mini-method ChocolateCakeToCake. And you search the project for places where this method is used, but you find none! So nothing more natural than commenting out these three lines of code that are not called anywhere, but then what happens? Your code stops compiling!! Giving errors in places you have never touched… What the fuck, again?? You look one more time and notice the “implicit” word in front of this method. Then you google it around and find the explanation in Scala’s documentation.
After reading it you understand, but c’mon!!! Was this really necessary? In my opinion, that is not really improving the readability of the code or anything… People need to realize that hiding some implementation details sometimes makes sense, but then hunting for errors in something hidden is *extra* difficult. Either hide stuff that really does not matter and should not be the programmer’s concern, or just make it explicit!
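To make the cake story concrete, here is a minimal sketch of the kind of hidden helper I mean (serve and the exact shape of the classes are made up for illustration):

import scala.language.implicitConversions

object CakeDemo extends App {
  class Cake(val name: String)
  class ChocolateCake { val name = "chocolate" }

  // The innocent-looking helper that nobody seems to call by name.
  // Comment it out and serve(new ChocolateCake) below stops compiling.
  implicit def chocolateCakeToCake(c: ChocolateCake): Cake = new Cake(c.name)

  def serve(cake: Cake): Unit = println(s"Serving a ${cake.name} cake")

  serve(new ChocolateCake) // compiles only because of the implicit conversion
}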
PS: The same holds for the implicit factories… nothing intuitive about them! I prefer all the code duplication, really.

Map[Int, Map[Int, A]]
You want to declare a smart matrix of some type A that grows on demand, what do you do? Map[Int, Map[Int, A]], of course, just like with map in good old C++. Then you want to assign some value to a position: m(i)(j) = x, right? Wrong! Scala will give an error that m(i) does not exist… Of course it does not, I am declaring its existence with this very assignment. Of all the robust implementations of the various (millions, really!) datatypes in Scala, they could not make this simple and intuitive thing work? Such a shame… In C++ it works like a charm!
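Just to make it concrete, here is a minimal sketch of the dance you have to do instead, with getOrElseUpdate doing by hand what C++’s operator[] gives you for free:

import scala.collection.mutable

object MatrixDemo extends App {
  val m = mutable.Map.empty[Int, mutable.Map[Int, Int]]

  // m(3)(7) = 42  // blows up at runtime: key not found: 3
  m.getOrElseUpdate(3, mutable.Map.empty)(7) = 42 // create the inner map by hand

  println(m(3)(7)) // 42
}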

Data Types
Scala users are very proud of the number of libraries and datatypes that are already built into the language. It puzzles me why… I don’t think that having 4 ways to represent a list is particularly good. On top of that, each of these 4 ways has tiny little annoying differences, such as different operators for concatenation. Also, some support assignment and others don’t. And because of these small differences, changing your code from using List to using Seq can be quite a headache.
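A tiny example of the kind of difference I mean, just for the concatenation part:

object ConcatDemo extends App {
  val a = List(1, 2) ::: List(3)   // ::: exists only on List
  val b = Seq(1, 2) ++ Seq(3)      // ++ works on any Seq
  // val c = Seq(1, 2) ::: Seq(3)  // does not compile: Seq has no :::
  println(a == b)                  // true: same elements, different spellings
}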

Mutable and Immutable
Why have sooooo many different datatypes?? Just pick mutable or immutable objects and stick with them! This makes C pointers look like a piece of cake. And because of all the types, we never know which one Scala is automatically importing for you…
This is actually a real error I got:

type mismatch;
 found   : scala.collection.Map[bla bla bla]
 required: scala.collection.mutable.Map[same bla bla bla]

And guess what? Importing scala.collection.mutable.Map does not help… So what is going on? I was using a map method that returned a collection.Map (even though it was applied to a collection.mutable.Map). And the weird thing is, the method is available in both types (mutable or not), but it only returns the non-mutable Map. For me this really does not make sense.
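Here is a minimal sketch of the same kind of mismatch (not my original code): once a value has the static type scala.collection.Map, no amount of importing will turn it back into a mutable one.

import scala.collection.mutable

object MismatchDemo extends App {
  def bump(counts: mutable.Map[String, Int]): Unit =
    counts("x") = counts.getOrElse("x", 0) + 1

  val counts = mutable.Map("x" -> 1)

  // Several methods hand you back the read-only supertype; once the static
  // type is scala.collection.Map, importing mutable.Map changes nothing.
  val readOnly: scala.collection.Map[String, Int] = counts

  // bump(readOnly) // type mismatch: found collection.Map, required mutable.Map
  bump(counts)
  println(counts("x")) // 2
}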

Well… there’s a lot more. Maybe if you use this language carefully, some good might come out of it. I just haven’t found it yet. We all know that programming languages, like operating systems, are like soccer and religion: do not start a discussion about them! But I really needed to get this off my chest, and it’s good for other anti-Scala people to realize they are not alone in the world.

About call-by-value, call-by-name and thunking

This is something nice I learned during the winter school in Marseille, and used right afterwards in a program I was implementing in OCaml. This week I finally found the time to write it down nicely, with an example that hopefully makes things easier to understand =)

So in programming we often use functions, right? These functions have arguments (which can themselves be functions). Call-by-value and call-by-name are two different strategies for dealing with these arguments (and please do not confuse this with passing parameters by value or by reference… those are actually two ways of passing parameters within the call-by-value strategy, if I understand correctly). Here’s a brief explanation of both:

Call-by-value
The arguments of a function are evaluated and then their results are passed to the function.

Call-by-name
The arguments of a function are substituted directly into the function’s body and evaluated only when necessary.

Take, for example, the following function:

let f x = x + x

and apply it to (print_string "hi!!"; 4).
If f (print_string "hi!!"; 4) is evaluated using a call-by-value strategy, the argument will be evaluated once, hi!! will be printed once and the return value will be 8.
If f (print_string "hi!!"; 4) is evaluated using a call-by-name strategy, the argument will be copied into the body of the function, which becomes: (print_string "hi!!"; 4) + (print_string "hi!!"; 4). Then each side of the sum will be evaluated, hi!! will be printed twice and the return value will be 8.

Now, as far as I know, most languages use the call-by-value strategy, maybe because of efficiency, since the repeated use of an argument would cause it to be evaluated several times under call-by-name. But sometimes we actually want a call-by-name strategy. This happened to me when I was implementing something in continuation-passing style. Basically, my function had two arguments, which were functions themselves, called success and fail. Some computation was done and, depending on the result, the function would continue with success or fail. Now, I did not want these functions to be executed right away, of course, but only after I had computed something and decided which was the case. This was implemented in OCaml, which uses the call-by-value strategy. So, in order to simulate call-by-name, I had to use something called thunking.

Thunk
Thunks are functions of a type such as () -> A (unit -> 'a in OCaml) which are used to delay evaluation, i.e., if a parameter of a function has a thunk type, the suspended computation is not evaluated until the thunk is applied, in a call-by-name style. Thunks simulate CBN in a CBV language.

This is easier to see with an example:

(* Quick OCaml example of call-by-value, call-by-name and thunking *)

(* The "expensive" argument: prints a line when it is actually evaluated. *)
let arg () =
  print_endline "Function evaluated."
;;

(* Call-by-value: the argument is evaluated before the call,
   even though the function never uses it. *)
let cbv _f =
  print_endline "=> Not using the argument, evaluated anyway."
;;

(* Call-by-name, simulated with a thunk: the argument has type unit -> unit,
   so nothing runs unless the thunk is applied. *)
let cbn _f =
  print_endline "=> Not using the argument, not evaluated."
;;

let _ =
  print_endline "Call-by-value:";
  cbv (arg ());
  print_endline "Call-by-name:";
  cbn (fun () -> arg ())
;;

You can try this by saving it to a file and executing it with ocaml. Cool, isn’t it?