Notes about Python performance at Scale
So first, what is fast? Great question. This guy really sums it up nicely — fast response time (~200 ms), ~200 requests per second (throughput), and ~2000 users on one quad-core machine at 50% CPU load. Scale-factor is NOT performance (you can scale lousy code, and that doesn't make it production-ready).
Write TESTS for benchmarking — otherwise profiling is pointless (there are great tools for this, like pytest).
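A minimal sketch of what a benchmark-style test could look like, using only the stdlib `timeit` module (the `checksum` function and the 50 ms budget are made-up placeholders, not from the talk):

```python
import timeit


def checksum(n):
    # hypothetical function under test
    return sum(i * i for i in range(n))


def test_checksum_is_fast(budget_s=0.05):
    # repeat and take the best run -- the least noisy measurement
    best = min(timeit.repeat(lambda: checksum(10_000), number=10, repeat=5))
    per_call = best / 10
    assert per_call < budget_s, f"too slow: {per_call:.6f}s per call"


test_checksum_is_fast()
```

A test like this fails loudly when someone makes the hot path slower, which is the whole point of benchmarking in your test suite rather than profiling ad hoc.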
time.time() — just measures whatever you wrap it around. Crap, because a single measurement doesn't represent reality.
timeit module — “python -m timeit …”
cProfile — built in profiling tool
More — line_profiler, GreenletProfiler, memory_profiler.
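Here's a quick sketch of the two stdlib tools above used programmatically — `timeit` for "how long?" and `cProfile`/`pstats` for "where does the time go?" (the `slow_sort` function is a made-up workload for illustration):

```python
import cProfile
import io
import pstats
import timeit


def slow_sort(data):
    # deliberately naive bubble sort, so the profiler has something to show
    data = list(data)
    for i in range(len(data)):
        for j in range(len(data) - 1 - i):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data


sample = list(range(200, 0, -1))

# timeit: repeat and take the minimum of several runs to reduce noise
best = min(timeit.repeat(lambda: slow_sort(sample), number=20, repeat=3))
print(f"best run: {best / 20:.6f}s per call")

# cProfile: per-function breakdown of where the time actually went
profiler = cProfile.Profile()
profiler.enable()
slow_sort(sample)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

The command-line equivalents are `python -m timeit ...` and `python -m cProfile myscript.py`.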
Offline profilers slow your code down, so don't run them in production.
You need to know what’s going on in production.
Between the lines, I think Mahmoud is most excited about async scaling (because Python workloads split into CPU-bound and I/O-bound, and async helps with the latter).
A Side note about garbage collection
Garbage collection debugging — Python has a GC API (the gc module) we can use to see which objects aren't being cleaned up — a great way to debug memory leaks.
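A small sketch of that GC-debugging idea, using the stdlib `gc` module's `DEBUG_SAVEALL` flag to catch objects stuck in a reference cycle (the `Node` class is a made-up example of a leak):

```python
import gc


class Node:
    # a self-referencing cycle that plain refcounting can't free
    def __init__(self):
        self.ref = self


def leak():
    Node()  # becomes unreachable garbage as soon as we return


leak()

gc.set_debug(gc.DEBUG_SAVEALL)  # keep collected objects in gc.garbage for inspection
found = gc.collect()
leaked = [obj for obj in gc.garbage if isinstance(obj, Node)]
print(f"collector found {found} unreachable object(s); {len(leaked)} leaked Node(s)")

# clean up so the debug flag doesn't affect the rest of the process
gc.set_debug(0)
gc.garbage.clear()
```

In a real leak hunt you'd inspect `gc.garbage` (or walk `gc.get_objects()`) to see *which* types are accumulating and what still references them.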