The Factorio Benchmark Website

test-000055 : Regression testing Factorio

The TLDR

No significant regressions were identified, but cross-version migrations (such as recharting the map) can cause a performance hit when a save is loaded in a newer version.

The Question

Factorio has had its fair share of updates over the years, and many of those versions have brought optimizations. This test aims to categorize the performance optimizations and potential regressions across a variety of saves.

These updates have often carried recipe changes as well, which makes accurate testing more difficult. Fortunately, recipes have not changed since 0.17.60 (0.18.45 is current at the time of writing). This gives us a good window in which we can expect approximate 1:1 equivalence between the versions.

The Test

For this test we'll want a fairly good collection of maps that stress a reasonable variety of playstyles. A small slice of all possible playstyles is catalogued here. It is from this list that we will select a few maps from various versions.

This test ran each of these maps in the version the savefile was created in (or the nearest available version), and in every later Factorio version available at the time of testing.
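As a rough illustration of that selection rule, here is a minimal Python sketch. It is not taken from Factorio-benchmark-helper; the function names `parse_version` and `versions_to_test` are hypothetical, and it assumes that "nearest available version" means the oldest available version at or after the one the save was created in.

```python
def parse_version(text):
    """Turn '0.17.79' into (0, 17, 79) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def versions_to_test(save_version, available_versions):
    """Return the versions a save should be benchmarked in, oldest first.

    Assumption: a save is tested in its own version if available,
    otherwise starting from the nearest later one, and then in every
    later version on the list.
    """
    start = parse_version(save_version)
    candidates = sorted(available_versions, key=parse_version)
    return [v for v in candidates if parse_version(v) >= start]


# Example with a save created in 0.17.60 and four available builds.
print(versions_to_test("0.17.60", ["0.18.22", "0.17.79", "0.16.51", "0.18.0"]))
```

Comparing parsed tuples rather than raw strings matters here, since a plain string comparison would sort "0.18.9" after "0.18.22".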

For each map in each version of Factorio, a verbose benchmark was run for 100 ticks, repeated over 10 runs. Factorio-benchmark-helper was used to automate this process, using the --regression-test flag (Linux only). Each version of Factorio used was the headless version, as it had the greatest availability and ease of use. The average of the 10 runs was taken, so minor differences should not necessarily be treated as genuine change.
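One way to decide whether a difference between two versions is "minor" is to compare it against the run-to-run spread. The sketch below is only an illustration of that idea and is not how this test was evaluated; the helper names, the threshold factor `k`, and the sample timings are all made up for the example.

```python
from statistics import mean, stdev


def summarize(runs):
    """Return (mean, sample standard deviation) of per-run timings."""
    return mean(runs), stdev(runs)


def is_meaningful_change(old_runs, new_runs, k=2.0):
    """Flag a change only if the means differ by more than k times the
    larger of the two run-to-run standard deviations.

    k=2.0 is an arbitrary, hypothetical threshold, not a value from the test.
    """
    old_mean, old_sd = summarize(old_runs)
    new_mean, new_sd = summarize(new_runs)
    noise = k * max(old_sd, new_sd)
    return abs(new_mean - old_mean) > noise


# Made-up timings in milliseconds per tick, for illustration only.
old = [10.0, 10.2, 9.8, 10.1, 9.9]
new_close = [10.1, 10.3, 9.9, 10.0, 10.2]
new_far = [8.0, 8.1, 7.9, 8.2, 8.0]

print(is_meaningful_change(old, new_close))  # within the noise
print(is_meaningful_change(old, new_far))    # well outside the noise
```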

The Data

Based on this data, we can see a few interesting things.

For one, performance seemed to be greatly improved by version 0.18.22. As a matter of fact, this is the version in which this bug was fixed. The fix only applies to Linux, but it's nice to see nonetheless.

Also of note is that the "other" section seemed to jump up after 0.17.79. The overwhelming majority of the "other" section in fact belongs to chartUpdate, as a migration that recharts the map was introduced.

Another interesting thing to point out is that the fluids update time effectively appeared to disappear after 0.17.79. This is because fluids are updated in parallel with the electric network from 0.18.0 onwards. This does appear to have made the electric network time longer, but the combined cost was reduced.
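Because the fluid work is now reported under the electric network, the two categories are best compared as a sum rather than individually. The snippet below sketches that comparison; the category keys and the timings are hypothetical placeholders, not measurements from this test.

```python
def combined_cost(timings, categories=("fluids", "electric_network")):
    """Sum the selected categories; a category missing from the
    output counts as zero."""
    return sum(timings.get(name, 0.0) for name in categories)


# Placeholder numbers in milliseconds per tick, for illustration only.
before = {"fluids": 0.30, "electric_network": 0.20}  # fluids reported separately
after = {"electric_network": 0.35}                   # fluids folded into electric network

print(combined_cost(before), combined_cost(after))
```

Looking at the electric network alone would suggest a regression in this made-up example, while the combined figure shows an improvement.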

Closing

No significant unexplained performance regression was observed in the maps and versions tested. At the same time, only two updates appeared to bring a significant performance uplift: 0.18.0 and 0.18.22. Considering that many versions contain only trivial changes, this is hardly a surprise.