Probably because Twitter's permission granularity is very coarse. Last time I checked, the only options were read access to the whole account or read/write access to the whole account.
So, to create private lists on your account, they need write permission.
I will ask the original creator of the benchmark suite to run the tests again. My development environment is very different, so adding results for Gin myself would mean re-running and replacing all the existing results.
Just a tip: to compare Martini with Gin, you can run this: