Sure, you can make it 10x faster by optimizing the complexity, but I can improve throughput by 1000x by putting it on 1000 machines.
In practice, it’s a bit of both. Paying attention to Big-O and understanding the gross relative cost of various operations goes a long way, but being able to run an arbitrarily large number of them can get you out of a lot of jams.