Ambiguous documentation

pstats does not behave as the documentation says it does. I have reached the stage where you can compare the output of Lib/test/test_profile.py (in the Python distribution) with the output of that very same file with an import hprofile as profile at the top instead of the import profile. My output is good, very good I'd even dare to say. However, it is not sorted in the same way!

The output is supposed to be sorted by "stdname". The description of this sort key in the documentation is as follows:

The subtle distinction between 'nfl' and 'stdname' is that the standard name is a sort of the name as printed, which means that the embedded line numbers get compared in an odd way. For example, lines 3, 20, and 40 would (if the file names were the same) appear in the string order 20, 3 and 40. In contrast, 'nfl' does a numeric compare of the line numbers. In fact, sort_stats('nfl') is the same as sort_stats('name', 'file', 'line').

But it appears they actually sort the data with the criteria in a different order than explained above! They seem to sort on 'name', 'line', 'file' when using "stdname".

Notice, however, that they also say "the standard name is a sort of the name as printed". That makes more sense, and it is what they actually do. I don't know why they explain it as the same as "nfl" though. It had me confused for a while (I admit, until halfway through this post! That is why it can be useful to blog about your problems!).
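To convince myself, here is a small sketch of the two orderings using the line numbers from the documentation's example. The entries and the printed_name() helper are made up for illustration; I am assuming the printed form looks roughly like file:line(name), and this is not the actual pstats code.

```python
# Hypothetical (file, line, name) entries, like a profiler might record.
entries = [
    ("foo.py", 3, "f"),
    ("foo.py", 20, "f"),
    ("foo.py", 40, "f"),
]


def printed_name(entry):
    """Format an entry roughly as it is printed: file:line(name)."""
    filename, line, name = entry
    return "%s:%d(%s)" % (filename, line, name)


# 'stdname'-style: plain string comparison of the printed name,
# so the line numbers are compared character by character.
by_stdname = sorted(entries, key=printed_name)

# 'nfl'-style: name, then file, then line compared as a number.
by_nfl = sorted(entries, key=lambda e: (e[2], e[0], e[1]))

stdname_lines = [e[1] for e in by_stdname]
nfl_lines = [e[1] for e in by_nfl]

print(stdname_lines)  # [20, 3, 40]
print(nfl_lines)      # [3, 20, 40]
```

The string sort puts 20 before 3 because "2" compares lower than "3" as a character, which is exactly the "odd" comparison the documentation mentions.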

On another note, work is progressing nicely. I only need to implement a couple more Stats methods before I'm done. They are a bit harder again though:

The .print_callers() and .print_callees() methods. I'll need to add data into the hstats module before I can do this. But hopefully that shouldn't become too difficult.
Support for loading more than one profiling file. The hard part is not the merging of the data; the problem is that there is also a .dump_stats() method which can save all the data. Since I cannot join two hotshot files (not without considerable hacking in _hotshot, and I try to keep the delta on that file as small as possible; besides, I'm running out of time and need to do uni work next week), I am currently thinking of just pickling the data. Then to load I can just try one of the formats (the pickle or the hotshot file) and, if it fails, try the second.
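The "try one format, then the other" idea could look roughly like the sketch below. This is only an outline under assumptions: load_stats() and load_other_format() are hypothetical names, and load_other_format() is a stand-in that merely reads the raw bytes, since the real hotshot reader is not shown here.

```python
import pickle


def load_pickle(path):
    """Load stats that were saved earlier with pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)


def load_other_format(path):
    """Hypothetical stand-in for the hotshot reader: returns the raw bytes."""
    with open(path, "rb") as f:
        return {"raw": f.read()}


def load_stats(path, loaders):
    """Try each loader in turn and return the first result that loads."""
    errors = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as exc:  # a real version should catch narrower errors
            errors.append((loader.__name__, exc))
    raise ValueError("no loader could read %r: %r" % (path, errors))
```

Calling load_stats(filename, [load_pickle, load_other_format]) would then return the pickled data when the file is a pickle and fall back to the second loader otherwise. Trying the pickle first is an arbitrary choice in this sketch; either order should work as long as each loader fails cleanly on the other format.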

After that there are just bits and bobs to do left and right. Like writing some quick comparison script that looks at my speed increase etc and generally making sure I meet all requirements. ;-)

devork

E pur si muove
