This is my 'high-level' approach to profiling and speeding up code in Python. This is by no means an exhaustive approach and will most likely only point out quick and easy places to speed up your code.
The approach is pretty standard for profiling any code: start at the top-level and methodically work your way down the codebase to find the bottlenecks.
This approach is designed to find the most obvious performance bottlenecks quickly and allow you to get back to coding. Fortunately, it's easy enough to run on your code periodically and find some pain points sooner rather than later. 1
The first step is to run the script/application with the built-in profiler and make sure to stress the slower or more interesting code sections:
python -m cProfile -o stats <my_awesome_code>.py
Next, exit the application and use this little script to see where most of the running time was spent:
This script will output all the function calls your application performed and sort them by cumulative time.
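A minimal sketch of such a script, assuming the profile was written to a file named stats as in the command above, can be built on the standard-library pstats module. The function name show_hotspots and its limit parameter are illustrative choices, not part of any library:

```python
import pstats


def show_hotspots(stats_path="stats", limit=None):
    """Print profiled calls sorted by cumulative time.

    stats_path: file written by `python -m cProfile -o <file> ...`
    limit: optional cap on the number of rows (None prints every call)
    """
    stats = pstats.Stats(stats_path)
    # strip_dirs() shortens file paths so the table is easier to read.
    stats.strip_dirs().sort_stats("cumulative")
    if limit is None:
        stats.print_stats()
    else:
        stats.print_stats(limit)
    return stats


# Typical use after a profiling run:
# show_hotspots("stats", limit=25)
```

Passing a limit keeps the output manageable for larger applications; leaving it out prints every recorded call.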
This profile output gives an idea of where the majority of the processing time was spent. This may or may not be useful at this point, but it can yield a few clues.
It's possible that steps one and two did not yield any useful performance increases. So, now is the time to drop down a level and narrow down a single function to focus on.
Take another look at the profiling output from step one. Find the function with the largest value in the percall column. Now, profile each line in this function to see if there is something slow that can be easily refactored.
For this step a few additional profiling tools are needed: the line_profiler module and its kernprof.py script.
You can find some documentation on these here. In general, I've found these are a little awkward to use and documentation is a bit lacking. Luckily, this simple approach to profiling doesn't need too much documentation.
The first and probably most awkward step is to place the @profile decorator on the function you're interested in profiling 2. Don't worry, there is nothing to import for @profile because it's magic 3.
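As a rough sketch of what a decorated function might look like, the example below marks a hypothetical find_duplicates function for line profiling. The no-op fallback is an addition for illustration, so that the same file still runs under plain python when kernprof has not injected profile into builtins:

```python
import builtins

# kernprof -l injects `profile` into builtins. When the script is run
# without kernprof, define a do-nothing decorator with the same name.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func


@profile
def find_duplicates(items):
    """Hypothetical hot function: return values that appear more than once."""
    seen = []
    duplicates = []
    for item in items:
        if item in seen:  # linear scan of a list on every iteration
            duplicates.append(item)
        else:
            seen.append(item)
    return duplicates


if __name__ == "__main__":
    data = list(range(2000)) * 2
    print(len(find_duplicates(data)))
```

With the fallback in place, the decorator can stay in the code between profiling sessions without breaking normal runs.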
Next, run the kernprof.py script with your script/application as the argument:
kernprof.py -v -l <script> <your_script_args>
Now perform the operations that will cause the profiled function to run.
Finally, we can use the line_profiler module to look at the results. The above invocation of the kernprof.py script created a profiler data file, named after your script with an .lprof extension. Feed this data file to the line_profiler module, and it will print the timing of the function broken down by line:
python -m line_profiler <data_file>
At this point, you will probably see a few lines stand out as far as the % Time column goes. In my experience, the slowness at this level tends to be something like building a list with a lot of elements, constantly checking for the existence of an item in a big list, and other operations that deal with large iterables.
Now the easy part is finished. It's time to think about the algorithm and data structures. The process of improving your slow-performing code is a bit outside the scope of this post, since that process is usually very specific to the code itself.
This is a pretty classic case, but you would be surprised how often it shows up.
For example, consider a snippet of code that spends a lot of time looking for existence of an object in a large list within a tight loop. This could be a great place to use a dictionary instead.
The dictionary will greatly improve your look up time but will waste more memory. This is the classic trade off and only you can decide if the increased memory usage is worth it in the context of your application.
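The trade-off can be illustrated with a small, self-contained comparison. This is only a sketch: the data size, the probe values, and the function names are made up for the example, and the timings will vary by machine. Note that sys.getsizeof reports the size of the container itself, not of the objects it holds:

```python
import sys
import timeit

N = 20_000
items = list(range(N))
# Same values stored as dictionary keys, so lookups use hashing.
index = dict.fromkeys(items, True)

# Probe values near the end of the list, the slow case for a linear scan.
probes = range(N - 200, N)


def count_hits_list():
    return sum(1 for p in probes if p in items)   # scans the list each time


def count_hits_dict():
    return sum(1 for p in probes if p in index)   # hash lookup each time


list_time = timeit.timeit(count_hits_list, number=5)
dict_time = timeit.timeit(count_hits_dict, number=5)

print(f"list lookups: {list_time:.4f}s")
print(f"dict lookups: {dict_time:.4f}s")
print(f"list container: {sys.getsizeof(items)} bytes")
print(f"dict container: {sys.getsizeof(index)} bytes")
```

If only membership matters and no values need to be stored, a set gives the same kind of lookup speed-up.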
As I mentioned in the beginning, this approach might leave you needing more speed. Luckily, there are still several options.
Some bottlenecks, like storage I/O and network I/O, are not going to show up easily with this type of profiling. This approach is a quick way to profile and fix CPU-bound tasks.
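One rough way to check whether a slow section is computing or just waiting is to compare wall-clock time with CPU time. The sketch below is an illustration only: wall_and_cpu is a hypothetical helper, and time.sleep stands in for a real disk or network wait:

```python
import time


def wall_and_cpu(func, *args):
    """Run func and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def simulated_io():
    time.sleep(0.2)  # stand-in for waiting on disk or network


def cpu_work():
    return sum(i * i for i in range(200_000))


for task in (simulated_io, cpu_work):
    _, wall, cpu = wall_and_cpu(task)
    print(f"{task.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

When wall time is much larger than CPU time, the code is mostly waiting, and restructuring the algorithm is unlikely to help much.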
Profiling and optimization is a very complicated topic. So, my simple approach is barely scratching the surface. This is a topic you will definitely want to learn more about if you are interested in becoming a better programmer. Luckily, there are some really great talks on this subject to help you learn more from real experts.
2 I've seen problems trying to use @profile on more than one function at a time.
3 I wish this was done in a different way. The magic of inserting this into __builtins__ really bothers me philosophically.