ICFPC 2017: Ticket to Ride
This year's ICFPC was essentially about writing an AI for a modified version of "Ticket to Ride".
The results are not available yet, and there was no live scoreboard, so we have to wait until the final results are announced to get an idea of how we did this year. In short, though, I think we didn't do well: we developed some cool algorithms, but due to bugs we didn't fully fix during the contest they performed worse than our greedy baseline solution, so we had to submit that instead.
[UPDATE] Final results
Rank | Team | Points | Score |
---|---|---|---|
1. | Frictionless Bananas | 1053 | 230901103 |
2. | DiamondPrincess | 924 | 104459039 |
3. | JODY HiGHROLLERS | 854 | 56500318 |
4. | The Blind Hen | 827 | 118433150 |
5. | GennAI | 740 | 17913994 |
6. | Adlersprung | 734 | 32813674 |
7. | Piggybank Software | 694 | 98547270 |
8. | Wile E. | 660 | 80268070 |
9. | Begot | 614 | 24063869 |
10. | trup16 | 581 | 20340988 |
11. | A Storm Of Minds | 576 | 23677727 |
12. | code-o-matic | 544 | 9528083 |
13. | Love and Lies | 423 | 9101316 |
14. | AIM Tech | 384 | 3963862 |
15. | Lambding Snakes vs Coding Monkeys | 356 | 92927 |
16. | Olympia | 328 | 2420547 |
17. | FromRedmondWithLove | 170 | 1037753 |
My Thoughts
- In spirit this year resembled 2012 (the one with an AI for a Supaplex-like game). The organizers also promised to change the task description, and similarly there was no live scoreboard. These are the details I didn't like about this contest.
- Because of the changing constraints and the addition of new features, it felt a lot like a software engineering contest: you needed a flexible solution that allowed quick iterations and easy modifications in anticipation of changes.
- The task itself was fairly simple and easy to understand: no VM specification or other obfuscations. I consider this a downside; for a 3-day marathon I expected something more challenging.
- It would be interesting to see what other teams developed and whether there was anything beyond optimized search with heuristics.
Lessons learned
- Develop testing infrastructure earlier. We only got an arena that allowed verifying algorithms at scale 36 hours into the contest. Before that we relied on ad-hoc testing with large variance.
- Implement algorithms incrementally. We had 2 fairly sophisticated algorithms, but we only tested them at scale during the final round, which revealed that on large maps they lose to the greedy solution.
- Don't postpone end-to-end integration. We underestimated the complexity involved and didn't try to run offline mode until 6 hours before the end.
Our Algorithms
ChaosPunter
A very primitive solver that keeps track of unclaimed edges and picks one of them at random.
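A minimal sketch of that behavior (the names are illustrative, not our actual code):

```python
import random

def chaos_move(all_edges, claimed_edges):
    # Pick any river nobody has claimed yet, uniformly at random.
    free_edges = [e for e in all_edges if e not in claimed_edges]
    return random.choice(free_edges)
```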
GreedyPunter
This is a fairly straightforward greedy approach. At each move we consider all available edges and take the one that maximizes the score (a rough sketch follows the list):
- Copy the player's graph
- Add the new edge to the graph
- Run BFS from each mine on the player's graph to find which cities are connected
- Accumulate the score using precomputed distances from the original graph
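A rough sketch of one such greedy step (the adjacency-dict representation, the `dist` table, and the helper names are illustrative, not our actual code):

```python
import copy
from collections import deque

def bfs_reachable(adj, start):
    # All vertices reachable from `start` in the player's claimed graph.
    seen = set([start])
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def score(adj, mines, dist):
    # dist[mine][city]: shortest distance in the *original* map, precomputed
    # during setup; the score is the sum of d^2 over cities connected to each mine.
    total = 0
    for mine in mines:
        for city in bfs_reachable(adj, mine):
            total += dist[mine][city] ** 2
    return total

def greedy_move(adj, free_edges, mines, dist):
    best_edge, best_score = None, -1
    for (u, v) in free_edges:
        g = copy.deepcopy(adj)          # copy the player's graph
        g.setdefault(u, set()).add(v)   # add the candidate river
        g.setdefault(v, set()).add(u)
        s = score(g, mines, dist)       # BFS from each mine + precomputed distances
        if s > best_score:
            best_edge, best_score = (u, v), s
    return best_edge
```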
FastGreedyPunter
The same idea as `GreedyPunter`, but it uses better data structures to avoid doing M BFS-s (M is the number of mines), which was quite slow on large maps.
More specifically, we implemented a class `ComponentsList` to maintain connected components for a particular player, with the following API:
- `void union(c1, c2)` - unite 2 canonical connected components `c1` and `c2`
- `int component(v)` - return the canonical component for vertex `v`
- `list<int> nodes(c)` - return all nodes of component `c`
- `void start_transaction()` - start a transaction covering all subsequent `union`-s
- `void rollback_transaction()` - cancel all `union`-s done since the start of the latest transaction
Internally `ComponentsList` uses the following data structures:
- a `list<int>` of size n to keep track of the canonical component of every vertex
- a `list<set<int>>` of size n with the nodes that belong to each component
On `union(c1, c2)` we move each node `v` from `c2` to `c1` and push `(v, c2)` onto a transactions stack, which allows us to reverse the operation later. With the simple optimization of always moving nodes from the smaller component, both `union` and `rollback_transaction` have complexity O(min(|c1|, |c2|)).
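A minimal Python sketch of such a structure, following the API above (the implementation details are illustrative, not our exact code):

```python
class ComponentsList(object):
    def __init__(self, n):
        self.comp = list(range(n))                   # canonical component of each vertex
        self.members = [set([v]) for v in range(n)]  # vertices in each component
        self.trans = []                              # stack of (v, old_component) for rollback

    def component(self, v):
        return self.comp[v]

    def nodes(self, c):
        return list(self.members[c])

    def start_transaction(self):
        self.trans = []

    def union(self, c1, c2):
        # Always move the smaller component so union/rollback cost O(min(|c1|, |c2|)).
        if len(self.members[c1]) < len(self.members[c2]):
            c1, c2 = c2, c1
        for v in list(self.members[c2]):
            self.trans.append((v, c2))
            self.comp[v] = c1
            self.members[c1].add(v)
        self.members[c2].clear()

    def rollback_transaction(self):
        # Undo all unions since the last start_transaction, in reverse order.
        while self.trans:
            v, c_old = self.trans.pop()
            self.members[self.comp[v]].discard(v)
            self.members[c_old].add(v)
            self.comp[v] = c_old
```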
FastGreedyStochasticBridgesVerticesMaxPunter
This is a modification of `FastGreedyPunter` that uses as its objective function a weighted combination of 4 features:
- `score_gain` - the actual increase in our score.
- `bridge_gain` - find edges in the graph that belong to many shortest routes between vertices and incentivize claiming those.
- `vertex_gain` - for each `u` compute the sum of `-d(u,v)` over all `v` in the graph and use this as a score. The motivation is to claim vertices that are "central" to the graph.
- `stochastic_gain` - do random moves and use the score from the achieved positions.

We then defined the "initial" phase of the game by the condition `current_move < 3.0 * math.sqrt(num_vertices)`. In the "initial" phase we enabled `bridge_gain` and `vertex_gain` and disabled `stochastic_gain`.
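A hedged sketch of how the weighted objective and the phase switch could fit together (the weights and per-feature gain values are assumed to be computed elsewhere; names are illustrative):

```python
import math

def in_initial_phase(current_move, num_vertices):
    # The "initial" phase condition from above.
    return current_move < 3.0 * math.sqrt(num_vertices)

def combined_gain(gains, current_move, num_vertices, weights):
    # gains / weights: dicts keyed by 'score', 'bridge', 'vertex', 'stochastic'.
    total = weights['score'] * gains['score']
    if in_initial_phase(current_move, num_vertices):
        total += weights['bridge'] * gains['bridge']
        total += weights['vertex'] * gains['vertex']
    else:
        total += weights['stochastic'] * gains['stochastic']
    return total
```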
MTCSSolver
MTCS stands for "Monte Carlo Tree Search", but its author mostly referred to it as `VladSolver`. The idea is fairly simple: simulate a number of games and then take the move that has the highest probability of winning.
Due to the huge problem space this approach doesn't work as-is, so we implemented a number of heuristics to narrow down the number of possibilities.
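A bare-bones sketch of the underlying idea, flat Monte Carlo move selection rather than a full tree search (the `apply_move` and `simulate_random_game` helpers are assumptions, not our actual code):

```python
def monte_carlo_move(state, legal_moves, apply_move, simulate_random_game, n_playouts=100):
    # For each candidate move, play out `n_playouts` random games and keep the
    # move with the highest observed win rate.  `apply_move(state, move)` returns
    # the state after the move; `simulate_random_game(state)` returns True on a win.
    best_move, best_rate = None, -1.0
    for move in legal_moves:
        wins = sum(1 for _ in range(n_playouts)
                   if simulate_random_game(apply_move(state, move)))
        rate = float(wins) / n_playouts
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move
```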
TODO: explain heuristics
Infrastructure
tcp client
In the first iteration we made a simple wrapper that communicates with the game server over TCP/IP and calls the corresponding API of classes implementing `PunterIf`:
class PunterIf:
def get_handshake(self): pass
def process_setup(self, data): pass
def process_move(self, data): pass
Using sockets was quite easy in Python. Here is sample code:
```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((SERVER, PORT))   # connect() takes a single (host, port) tuple
data = s.recv(BUFFER_SIZE)
```
The only tricky part was that `recv` might not return the whole message, so we first had to read the stream until `":"`, accumulating the result into a buffer, and then keep reading until we had the expected number of bytes.
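A sketch of that framing logic, assuming the `N:{json}` message format (names are illustrative):

```python
def recv_message(sock, buffer_size=4096):
    # Messages look like "123:{...json...}".  Read up to ':' to get the length,
    # then keep calling recv until that many bytes have arrived.
    header = b""
    while not header.endswith(b":"):
        header += sock.recv(1)
    size = int(header[:-1])
    body = b""
    while len(body) < size:
        body += sock.recv(min(buffer_size, size - len(body)))
    return body
```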
visualizer
We reused the JS part of the graph UI from the organizers and implemented our own server in NodeJS. We also added an option to replay logs, which helped with debugging AI behavior.
Screenshot of UI:
local server
We implemented a server in Python that connects 2 or more players and manages a game between them. Players have a well-defined Python API, so there was no need for them to communicate via TCP/IP, which simplified integration a lot.
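A rough sketch of the in-process game loop this enables, driving `PunterIf` instances directly (the payload shape and names are simplified assumptions, not our exact server code):

```python
def run_game(punters, setup_messages, num_rivers):
    # punters: list of PunterIf instances; setup_messages: per-player setup dicts.
    for punter, setup in zip(punters, setup_messages):
        punter.process_setup(setup)

    moves = []
    for turn in range(num_rivers):
        punter = punters[turn % len(punters)]
        # Hand the punter the moves so far and collect its move in return.
        move = punter.process_move({"move": {"moves": moves}})
        moves.append(move)
    return moves
```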
arena
To test algorithms at scale we implemented an arena (`q3arena.py`) on top of the local server. It continuously runs matches between the registered bots and records the results.
We had a file `q3maps.py` with pre-defined maps:
def m(map_name, **kwargs):
return ("maps/{}".format(map_name), kwargs)
MAPS = [
m("sample.json", n=2),
m("lambda.json", n=4),
m("circle.json", n=4),
m("Sierpinski-triangle.json", n=3),
m("tube.json", n=8),
m("randomMedium.json", n=8),
m("randomSparse.json", n=8),
m("boston-sparse.json", n=8),
m("edinburgh-sparse.json", n=16),
m("gothenburg-sparse.json", n=16),
m("nara-sparse.json", n=16),
m("oxford-center-sparse.json", n=16),
m("oxford2-sparse-2.json", n=16),
m("van-city-sparse.json", n=16),
]
and `q3robots.py` with the registered algorithms:
from all_solvers import *
def create_all_robots():
return [
create_punter(ChaosPunter, name="ChaosSolver"),
create_punter(FastGreedyPunter, name="FastGreedyPunter"),
create_punter(FastGreedyStochasticPunter, name="FastGreedyStochasticPunter"),
create_punter(FastGreedyStochasticBridgesMaxPunter, name="FastGreedyStochasticBridgesMaxPunter"),
create_punter(VladSolver2, name="Vlad-MCTS-2j"),
create_punter(VladSolver3, name="Vlad-MG-3b", timeout=0.8),
create_punter(VladSolver4, name="Vlad-MG-4b", timeout=0.8),
create_punter(FastGreedyStochasticBridgesVerticesMaxPunter, name="FastGreedyStochasticBridgesVerticesMaxPunter"),
create_punter(FastGreedyBridgesVerticesMaxPunter, name="FastGreedyBridgesVerticesMaxPunter"),
create_punter(FastGreedyStochasticVerticesMaxPunter, name="FastGreedyStochasticVerticesMaxPunter"),
]
The arena stored logs in 2 places:
- `arena/results` has raw results with metadata that could be aggregated or analyzed later
- `arena/battle_logs` has the actual moves, in the same format as the visualizer, for step-by-step analysis of matches.
To get results from the arena faster, we used a server with 32 cores and ran it in parallel using the following hacky bash commands.
Start arena:
for i in {1..30}; do python q3arena.py & done
Stop arena:
ps aux |grep q3arena |awk '{print $2}' |xargs kill -9
offline mode
In our solution, to be runnable offline, the solver itself needs to implement 2 methods:
- `get_state()` that returns a tuple of internal state
- `set_state(state)` that restores it from the tuple provided

Then we glued this all together using the `cPickle` and `base64` modules:
```python
import base64
import cPickle  # Python 2; on Python 3 this would be the pickle module

# return state
state = self.get_state()
pickled_state = cPickle.dumps(state)
encoded_state = base64.b64encode(pickled_state)
output_json["state"] = encoded_state

# restore from state
encoded_state = input_json["state"]
pickled_state = base64.b64decode(encoded_state)
state = cPickle.loads(pickled_state)
self.set_state(state)
```
Final submission
After evaluating all of our bots in a tournament on all maps provided by the organizers, we found that `FastGreedyPunter` performed the best (after the contest we found a bug in `FastGreedyStochastic`; with it fixed, that bot started beating `FastGreedyPunter`, but it was too late).
We used this as our submission, with the following handling of the extensions:
- Futures: just ignore them
- Splurges: always make our moves as splurges of size 1, to trick opponents who don't handle them properly
- Options: add options to the list of considered rivers with an arbitrary threshold of 5, meaning use an option only if the score it gives is better than the best regular score by 5 or more (sketched below)
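A small sketch of how that option threshold could look when choosing between the best regular claim and the best option (the threshold value is from above; the function and argument names are illustrative):

```python
OPTION_THRESHOLD = 5  # the arbitrary threshold mentioned above

def pick_claim(best_regular, best_option, options_left):
    # best_regular / best_option: (score, river) for the best unclaimed river
    # and the best already-claimed river we could take via an option.
    # Spend an option only if it beats the best regular claim by the threshold.
    if (options_left > 0 and best_option is not None
            and best_option[0] >= best_regular[0] + OPTION_THRESHOLD):
        return ("option", best_option[1])
    return ("claim", best_regular[1])
```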