Pallas Benchmark results for Miniwulf
Update: Now that Mini has four nodes, I ran the Pallas
benchmark again on all four, running MPICH on a 10bT hub.
The results are available here.
I ran Miniwulf in four configurations to gather benchmark data:
- 10bT hub and MPICH
- 10bT hub and LAM-MPI
- 100bTX switch and MPICH
- 100bTX switch and LAM-MPI
Since the NICs in Miniwulf's nodes are 10bT running half duplex, the
cluster couldn't take full advantage of the fast Ethernet switch.
Nonetheless, the switch did cut down on collisions, though it added some
latency at lower bandwidths. Here are some of the benchmark results in
graphical form:
The interesting thing here is the step discontinuity at or around 8K bytes.
This shows up in all the benchmark tests. Below this size, LAM-MPI running
on a hub seems the fastest; above it, MPICH on the switch is the clear winner.
I'm guessing the cause is that messages smaller than 8K may be able to fit
in a single packet, and the hub communicates this the fastest. Above 8K,
collisions become a problem, and here the switch helps relieve the
bottleneck to a certain degree.
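As a rough way to check the single-packet guess, the sketch below counts how
many Ethernet frames a message of a given size would need, and how long the
bytes alone would take on a 10 Mbit/s link. It assumes a standard 1500-byte
Ethernet MTU and MPI traffic carried over TCP/IP with 40 bytes of headers per
frame; neither value was measured on Miniwulf, and the function names are made
up for illustration. Under those assumptions an 8K message already spans
several frames, so packet size may not be the whole story behind the step.

```python
import math

# Assumptions, not measured on Miniwulf:
MTU = 1500                       # standard Ethernet MTU in bytes
IP_TCP_HEADERS = 40              # 20-byte IP + 20-byte TCP header, no options
LINK_BITS_PER_SEC = 10_000_000   # 10 Mbit/s NICs


def frames_needed(message_bytes):
    """Number of Ethernet frames needed to carry one message payload."""
    payload_per_frame = MTU - IP_TCP_HEADERS
    return max(1, math.ceil(message_bytes / payload_per_frame))


def min_wire_time_us(message_bytes):
    """Lower bound on transfer time in microseconds.

    Counts only the payload bits; ignores headers, latency and collisions.
    """
    return message_bytes * 8 / LINK_BITS_PER_SEC * 1e6


if __name__ == "__main__":
    for size in (1024, 4096, 8192, 16384):
        print(size, frames_needed(size), round(min_wire_time_us(size)))
```

Comparing the frame count just below and just above 8K against the measured
times would show whether the step lines up with a change in the number of
frames or with something else in the MPI library or the network.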
I should also note that all these results were from two nodes running the
tests with a third waiting in the barrier state. For some reason, LAM-MPI
experienced memory errors when trying to run the tests on three or more
nodes at once.
For those interested, the raw output of the benchmark runs is here: