devork

E pur si muove

No feedback?

Friday, August 26, 2005

In the beginning of this week I emailed my mentor to know what he thought of my project, if he was happy with the state etc. I do know from his blog that he's fairly busy these days, but still would enjoy some feedback. Also just in case my email got lost somewhere I'm sure he'll read this. So this was basically just a ping to him, sorry to have bothered everyone else!

One FIXME down...

Tuesday, August 23, 2005

Profile.calibrate() now uses the correct timer instead of just the default timer (time.time()). In fact it still uses this timer when the user did not specify a specific timer (and why really would anyone want this? Appart form making me work one or two weeks more in the last two months.), since both _hotshot and time.time() use gettimeofday() for this. Only when the user requested a timer this timer is used now. Which is fair enough.

The last FIXME is the one where the line events from _hotshot are not used when reading the log file in hstats. Since this functionality is not needed in hpstats to emulate pstats I skipped it. And I still plan on skipping it (in favour of that damn uni work). I still removed the FIXME though. I agree that's quite cheecky, but I made a line event raise a NotImplementedError and documented this as such in the docstring. Hope that covers it.

Magic

Monday, August 22, 2005

As promised I worked on a proper installble package of hprof. That went quite smooth, I had some trouble figuring out how to create a .pth file with distutils, but after some googling I found some setup keyword that worked. Not that is it documented in the official documentation. The result is now that some magic makes that profile and pstats are used when available, but when not hprofile and hpstats will be loaded as profile and pstats. Also cared for is that the _hotshot supplied with hprof is used when using hprofile so that the user supplied timer works.

Since I made a release you can grab it. Unfortunately not at Savannah since I am currently parted from my GPG key for a while.

So my current TODO list:

  • Make a proper patch for _hotshot.c and try to see if upstream want's it. It would be nice and make life easier.
  • Look if it is doable to get rid of these last two FIXME's
  • Maybe make a Debian package for the module, I'll need to think about this a bit frist though.
  • Maybe something else I forgot here. I'm still a bit anxious that I didn't find some bovious large thing I missed... ;-)
  • Do the uni work that has a deadline for 2 September. :-(

  • Small update

    Monday, August 22, 2005

    This weeked I've been helping at home again building the patio outside. It's quite impressive to see things changing from a big pit of dirt in the morning to a nice surfaced area in the evening. If I had a digital camera I'd show you nice before and after pictures, however your imagination will have to do.

    Today I did create a small script that shows you how much fater hstats is compared to hotshot.stats. It simply automates what I've bragged about earlier this summer. I lost a litle bit of speed (last time I bragged it didn't collect the callers of every function, but that was needed for pstats emulation) but not significantly. Intrestingly enough it seems that the speed gain is smaller on the 500MHz Pentium III coppermine based laptop than on the 450MHz Ultra Sparc IIi Sarbre Sun box. But it's still significant, on the last tests it was about 30% gain on the Sparc and 25% gain on the i386. Also curious is the Intel outperforming the Sun in absolute figures, 228s to 96s and 157s to 72s. But I don't dare to generalise anything here!

    Tomorrow I'll look at making a proper installable package from my work. A setup.py exists already, but I only used the build target to test the _hotshot module. A MANIFEST.in file still needs to be created for examle as well as some other administrativa files.

    That's it for tonight!

    (Almost) Done!

    Thursday, August 18, 2005

    Just completed hpstats module, it's all in CVS. This means that theoretically the project is finished! Altho I expect to code another few days on it before the end of August, nothing can be perfect... leave alone the first time round!

    Along the list of TODO's is making a little script that shows that my stats analysis is faster then the old hotshot one. Other then that I'll have to go and dig up the email in which my mentor mentions what the requirements where for the project. But I'm sure I'll have a hand full of things to do. Also I guess I'll figure out tomorrow that I forgot something obvious, that's the way things go.

    But for now some ephoria. Some random statistics: I wrote 112 unit tests, all of which pass. A grep FIXME *.c *.py only retruns me tow hits, one of which is non-essential, the other also not really ;-). SLOCCount tells me I wrote about 1833 lines of Python, did this in about 4.53 months and the total estimated cost to develop is apparently $51,046. And this does not include my modifications to _hotshot.c since I did not write a significant amount of the source there.

    That makes me feel good :-)

    Sleep is good

    Thursday, August 18, 2005

    Last night I got some problems with a function call. As far as I could see it was right but it's unit test kept failing. After a while I called it a day and went to sleep. Wake up this morning and have a look. Only to discover it was a trivial error in the test! Aaarch.

    And for today hopefully the last stretch of road to complete. Let's get started!

    Nice progress

    Wednesday, August 17, 2005

    Work has been progressing nicely the last few days. All the hacking has been going smoothly, which does not mean bug free - far from it! But I seem to have finally reached a good cycle of writing unit tests, implementing the features and refining that all. The scaffolding is in place, the design proven to be good enough (I did have to go and modify other bits that I wrote earlier this summer to achieve new functionality), it's almost a joy to work!

    Last night I thought I had printing finished, a bit premature as I discovered this morning. I completely forgot about the filtering of the output. But that's corrected now. Also the modifications needed in hstats for the support of hpstats.Stats.print_callers(); hpstats.Stats.print_callees() is in place so I can now go and knock that down. This should be finished today so I can work on the multiple profiles support tomorrow. Future looks bright (for the moment at least)!

    Ambiguous documentation

    Tuesday, August 16, 2005

    pstats does not behave as the documentation says it does. I got to the stage that you can look through the output of Lib/test/test_profile.py (in Python distribution) and the output of that very same file but with a import hprofile as profile at the top instead of the import profile. My output is good, very good I'd even dare to say. However it is not sorted in the same way!

    The output is suppsed to be sorted by "stdname". The description of this output in the documentation is as follows:

    The subtle distinction between 'nfl' and 'stdname' is that the standard name is a sort of the name as printed, which means that the embedded line numbers get compared in an odd way. For example, lines 3, 20, and 40 would (if the file names were the same) appear in the string order 20, 3 and 40. In contrast, 'nfl' does a numeric compare of the line numbers. In fact, sort_stats('nfl') is the same as sort_stats('name', 'file', 'line').
    But, it appears they actually sort the data with the criteria in a different order as explained above! They seem to sort on 'name', 'line', 'file' when using "stdname".

    Notice however how they also say that "the standard name is a sort of the name as printed". And this would make more sense, that is what they actually do. Don't know why they explain it as the same as "nfl" though. Got me confused for a while (I admit, untill halfway this post! Why it can be usefull to blog about your problems!).

    On another note, work is progressing nicely. I only need to implement a couple more Stats methods before I'm done. They are a bit harder again though:

  • The .print_callers() and .print_callees() methods. I'll need to add data into the hstats module before I can do this. But hopefully that shouldn't become to difficult.
  • Support for loading more then one profiling file. The hard part is not the merging of the data, the problem is that there is also a .dump_stats() method which can save all the data. Since I can not join two hotshot files (not withouth considerable hacking in _hotshot and I try to keep the delta on that file as small as possible, besides, I'm running out of time, need to do uni work next week) I am currently thinking of just pickeling the data. Then to load I can just try one of the formats (the pickle or the hotshot file) and if it fails try the second.
  • After that there are just bits and bobs to do left and right. Like writing some quick comparison script that looks at my speed increase etc and generally making sure I meet all requirements. ;-)

    Clarifying the hstats module

    Monday, August 15, 2005

    Some confusion seems to exist about the stats modules I'm writing. About a month and a bit ago I wrote hstats as a module to analyse hotshot profiling data. It was never my intention to make it compatible with pstats at all. It aimed at being usable to read profiling statistics from hotshot in an efficient way.

    Currently I am working on the hpstats module, which will be api compatible with pstats. This module does use hstats so it only needs to handle higher level stuff. As an effect of this I'm now sometimes putting in new functionality into hstats but I hope this is a good thing.

    So to say the essense again: hstats is not API compatible with pstats and is not meant to be.

    Why one should have a separate /boot partition: lessons learned

    Monday, August 15, 2005
    Background

    The box (SPARC box to make matters more intresting) has two disks /dev/hda and /dev/hdb, they each have the same partion table appart from some free space at the end since two disks are never of the same size. All partitions are of type "RAID autodetec" and run indeed a RAID1 mirror. Several partitions exist:

  • /dev/md0 -> /
  • /dev/md1 -> swap
  • /dev/md2 -> /home
  • /dev/md3 -> /var
  • /dev/md4 -> /home
  • As disks are still disks we started having serious trouble with /dev/hda on the /var partition. Lots of errors, it started with I/O errors on /dev/hda that resulted in a breaking mirror. After investigation with SMART monitoring tools it seemed we first had 40, then 200-something unrecoverable errors. Trying to force SMART to repair these errors failed miserably, not sure why, maybe the spare sectors where all used (altho only 8 are reported to be used!). Anyway the disk is not in a healty condition.

    Being cautious we decide to play it safe, / had enough space so we move /var there. Next we try to build a new filesystem with a bad block scan on /dev/md3. No way, Linux software RAID just doesn't like this and fails the /dev/hda partition. At that point we decide we'll leave /var just on the root filesystem for now untill we get round to buying a new hard drive.

    Updating /etc/fstab and stopping the /dev/md3 RAID device, also zeroing the superblocs so mdadm doesn't try to assemble it at boot time is the next step. Now a simple reboot so we can be 100% sure everything is still fine.

    The problem

    After rebooting we only get SI from the SILO boot loader. WTF?? Boot from a Debian installation CD. Boot: rescue root=/dev/hda0 No luck. Whatever we try, no rescue boot works. So we get out the disk, attach it to another box, and run silo -r on it. Put it back and everything is fine.

    What happened

    When copying /var (a large chunk) to the root partition the filesytem driver or SMART will have decided that it is more efficient to move some of the already existing files. So it very funnily moved our /boot/second.b file around. That wasted about 4 hours of my time.

    So the lesson learn is to always make a separate partition for /boot so problems like this don't occur. Now we only need to fiddle with the usable sectors of /dev/md3 to make a smaller partition with no errors on and move /boot there. But that won't be for today!

    Cross platform developing

    Sunday, August 14, 2005

    Even when using Python this is easier said then done. One should be very carefull as details lies in tiny details. Wich I discovered to my shame.

    Normally I work on a desktop, this box listenst to the name of Ultra 10, so it is a UltraSparc architecture as developed by Sun Microsystems. Sometimes however, like the last few days, I do use a Compaq Armada M700, which is a Intel i386 architecture. So today I started to work again on the U10, looking where I am I run the unit test suite test_hpstats.py. Normally I'll have one test failing, this then is the feature I was working on. Two tests failed however.

    So why did that test I've forgotten about two days ago sudenly fail? After jotting in some print statements it became apparent. In the test I was comparing two values in a dictionarry to be equal. The value compared I extracted as follows: d.values()[-1][-1]. Very fine, but in one case d.values() looked like [[10, 11, 12], [0, 1, 2]] and in the second it was [[0, 1, 2]]. However as it seemed, when I changed from architecture the order in wich d.values() is sorted in is changed!

    Cheeky. One should be very carefull not to run into traps like this. With reflection I say, "Of course, dictionaries are not sorted!" But the error is made so easily!

    Cleaning up?

    Thursday, August 11, 2005

    So last time I said I wasn't happy with hstats. Maybe that was a bit impulsive, but still valid. Kent Johnson kindly pointed out for me that was the wrong attitude. Maybe he's right (read: yeah, of course he's right), but it is still painfull for me to follow his advice.

    But again, I can't decide. However I have some text file somewhere (under control of an application I don't really like, gjots2) which does have some ideas about what should maybe change in hstats. This should stop me from forgetting what I was thinking that needed doing. It is only so hard to create The Right Design(tm). That was my motivation to postpone it, I hope to get a better idea of what it should look like while I'm using it. However now I'm confronted with Kent's statement, which I fear it true:

    Fix it now. There's never time to go back and clean it up later.

    The argument that nothing depends on it yet weights too of course. I guess I'll have to get my mind round it and get on with fixing it. Only I fear of getting the same situation again next month and think it's a bad solution again.

    Or maybe all this ranting of mine is because I'm having a bad day. The work on hpstats hasn't really been moving along the last few days due to various reasons, that doens't help either. Guess I'll see tomorrow what I'll do.

    I wrote crap

    Wednesday, August 10, 2005

    hstats is crap. And I only wrote it a month ago...

    The good part of that observation is that it shows that I'm learning. ;-)

    So why really? Basically just it's desing. It has one single monolithic class. And within this class everything just happens like one would wirte precedural or functial old style C. Quite useless really.

    So for building the pstats wrapper hpstats, I'll use it anyway. I guess it's more important to move on and use it. Later on I can go back, and improve hstats. Which in turn will break hpstats. That's what writing software is for you.

    Refactor mercilessly.

    Never ending quest

    Wednesday, August 10, 2005

    On my never ending quest to find the best testing strategies I stumbled across this article. It is not too long and well worth reading. Explains some stuff about using mock objects and related testing strategies. All of this is of course XP influenced.

    hotshot accepts user timer function

    Friday, August 05, 2005

    After a long day it all works. It's checked in to CVS in the "hotshot_timer-branch". The way it works is that the object returned by _hotshot.profiler(filename) now has an additional method settimer(timer). The timer is any callable object that returns a number or a sequence of numbers who's sum is the time as per the library reference text. It has more overhead then the buildin timer, certainly when a sequence is returned. But that's the price you pay for flexibility.

    When used form the hprofile module (wich is the replacement for profile) this is not a problem however since that is what calibrate() is for. The overhead is still supposed to be reasonably fixed.

    Once more the writing of unit tests did help to fix a lot of initial bugs. Nice.

    Just some tweaking and testing is left to do with hprofile and it's tests now. Then the wrapper will be (hopefully) complete. So next week I should, with a bit of luck, be able to start on the pstats wrapper wich would, if all is ok, also be the last part of my project to finish. Appart from cleaning up of course...all the fixme's etc need to disappear!

    Pointer to funcion

    Thursday, August 04, 2005

    Great fun these things. Certainly when the pointer to the function is stored inside a struc. The result is that my version of _hotshot now uses a void pointer in ProfilerObject as a function to get the time difference since last time it was called. For now it only goes and runs the old (buildin) timer function but it opens up the possibility to set the timer function by a user.

    The setting of the timer function is sheduled for tomorrow, I guess it will be another day until another timer function can be used in reality but it doesn't look bad. This is after all the last bit that will finish off the profiler module wrapper. So I have the feeling I'm making (good) progress.

    Also to mention is that I figured out how to use branches with CVS. I decided not to add the above things in HEAD as it could have gone very wrong and things could (and can) change drastically. I also wasn't too sure about the change in package layout with this stuff added. Although I do have the impression I got it the first time right (the package layout that is).

    Programming can be simple

    Thursday, August 04, 2005

    It really is clear that it has been over 10 months since I last wrote C. As I mentioned in my last post I got segmentation faults when I changed the version string in _hotshot.c. Today I started by figuring out that the version string gets indeed used at import time. That wasn't too hard. Then I looked at my modification. But wait...I'm trying to strcat() on a string that isn't terminated yet! Move the strcat() one line down and joy all around.

    And that while I thought I'd need a debugger...

    Double Ouch

    Tuesday, August 02, 2005
    One

    To get a user supplied timer function in hotshot I need to start hacking on _hotshot.c. Being cautious I pull the file from python CVS, put it in my package tree, write a setup script, build the stuff before modifying everything -just to test build environment- and finally run all tests with the fresh compiled _hotshot.so.

    So far so good. Yesterday I scribbled down roughly what changes I need to do to _hotshot.c for my needs. Still being cautious I decided to start with something trivial. Modifying the version number so I recognise profiles made by my modified _hotshot module. A simple unit test for this was written a few minutes later so I could get along. There is a function char * get_version_string(void); that parses the CVS revision keyword. I simply modify the function to concatenate the string "-hprof" to the string returned by that function. Merely one line modified and one added.

    So lets try it then! Build goes fine. Then running the test. Segmentation fault. eh???? Really? Python segfaults at module import time! Let's make this clear, before even the get_version_string() function gets called!

    Ouch

    Pretty unproductive day. Coz that's where I'm stuck. I can think of what I want. I'm absolutely clueless about what how and when. Guess tomorrow(*) I'll have to dig into gdb.

    Two

    I started writing this blog text a few hours ago on my laptop. Then I got interupted by dinner and then I had to go and play driving instructor for my sister. Thing was I was supposed to be able to have some free time at the place we where going to as my sister had some things to do there but I didn't, so I suspended the laptop (with this blog text half finished). Suspend normally works. But me saying normally in the last sentence says it all. It never woke up again. It did partially, then got a blank screen.

    Ouch

    Nothing would help. Once I got over the fact that I'd lose it all and I resetted the box I started trying to find what went wrong. Maybe it was the fact that it normally wakes up at the lid open event but I also pressed the sleep button? No. After trying out all combinations I still don't know. It works perfectly in every situation I can think off. But it didn't when I needed it to work. Guess that's life for me.

    (*) Tomorrow I'll actually be helping with the building of the house due to logistic reasons. So it will really be the day after.

    Subscribe to: Posts (Atom)