In this note I want to compare performance of the following languages:

  • C++
  • Kotlin
  • Python

Update: Rust was also compared in follow-up note: Language Performance (Part 2): Rust

One could argue that (obviously) C++ is the fastest and python is the slowest. But I think it would be interesting to see some concrete numbers based on realistic benchmark, so here we go. Besides it turned out there are some interesting details that could slow down or speed up computations in certain cases.

Short Summary

Methodology

For benchmark I used subproblem from the ICFPC-2017 contest where you need to do BFS on a graph. There graphs were ranging from 10 to 10000 edges. I filtered tiny ones and got a following list of 13 maps:

0: randomSparse.txt      	V=86	E=123
1: randomMedium.txt      	V=97	E=187
2: tube.txt              	V=301	E=386
3: boston-sparse.txt     	V=488	E=945
4: edinburgh-sparse.txt  	V=961	E=1751
5: oxford-center-sparse.txt	V=1425	E=2020
6: nara-sparse.txt       	V=1560	E=2197
7: gothenburg-sparse.txt 	V=1175	E=2234
8: icfp-coauthors-pj.txt 	V=892	E=3425
9: van-city-sparse.txt   	V=1986	E=3601
10: oxford2-sparse-2.txt  	V=2389	E=3632
11: edinburgh-10000.txt   	V=7357	E=10000
12: oxford-10000.txt      	V=6658	E=10000

All languages implemented the same algorithm:

  1. Read the graph from the file
  2. From each node run BFS and save results
  3. Compute graph “hash”:
for v in all_nodes:
  tmp = 0
  for u in all_nodes:
    d = distance(v, u) # 0 if u is unreachable from v
    tmp = tmp ^ (d * d) # xor
  hash += tmp
return hash

Step 1 wasn’t measured in total time to focus only on pure CPU work comparison (and avoid measuring IO and serialization/deserialization costs).

Languages compared

  1. cpp: C++ with -O3 optimization
  2. python: python 2.7
  3. python3: python 3.6
  4. cython_full: cython with having both steps 2 and 3 implemented in C++
  5. cython_bfs: cython with only bfs implemented in C++ (step 2)
  6. kotlin: single run of Kotlin
  7. kotlin_jit_5: run 5 times Kotlin program in a loop and measure the last run
  8. kotlin_jit: run 100 times and measure last run

Scripts

There are pre-requisites in being able to run scripts

# install cython:
pip install Cython

 # install kotlin:
brew install kotlin

Some of the languages require compilation before running, so the following command will prepare it:

$ compile_all.sh

Then the config in bench.py contains specification how to run each language

RUN_SCRIPTS = {
    "cpp": "./cpp/run.sh",
    "python": "./python/run.sh",
    "python3": "./python/run_python3.sh",
    "kotlin": "./kotlin/run.sh",
    "kotlin_jit": "./kotlin/run_jit.sh",
    "kotlin_jit_5": "./kotlin/run_jit_5.sh",
    "cython_bfs": "./cython/run.sh",
    "cython_full": "./cython/run_full.sh",
}

This scripts will run benchmarks over available maps and append results into separate file results.txt:

{"map": "vancouver.txt", "edges": 3601, "graph_hash": "3987248", "time": "123.39", "nodes": 1986, "name": "cpp"}
...

This file then could be visualized using the following command:

python visualize.py

This will produce the graph of relative timings compared to c++:

And also with logarithmic scale:

Results

Absolute times (ms)



Relative times