Kevin's Weblog 2007-01-09

Blosxom Performance

2007-01-09 22:50 in /meta
I run my blosxom installation in a hybrid mode, attempting to strike a balance between dynamicism and performance. Since I want comments to show up immediately, individual entries are fully dynamic. The index pages are static files, though. Every 15 minutes, I pull from Subversion and do a partial regeneration of the static pages, which recreates the pages that are obviously related to any new entries: the front page, and the appropriate category and date pages. However, this leaves things in a slightly inconsistent state across the site, since things like the displayed numbers of comments and numbers of postings in each category will not be completely up to date. Also, the picture of the day is not always the same on all pages. To clean this up, once a day I run a complete regeneration of all pages.

Recently, I started getting cron emails saying that the nightly full-rebuild process was being killed. I tried running it manually and the same thing happened. The return code indicated a ‘kill -9’ so I started to suspect that Dreamhost is running some automated process reaper, despite the fact that they claim to have no hard CPU usage limits and that they will contact you if something you are running is causing a problem. Furthermore, trying out different levels of nicing the process suggested a particularly lame scheme, where any process that uses more than 1 minute of CPU time gets killed.

A support request confirmed that there is a CPU monitoring process, but claimed that it’s more sophisticated than my conjecture and based on “sustained high CPU usage”. Of course, even a niced process can use lots of CPU if there’s nothing else trying to run at a higher priority. So, given that I was still getting reaped, their suggestion was to throw in some sleeps, which is lame.

At this point, I started firing up the profiler, while the support representative escalated the request to tier 2. What I found was fairly interesting. Running a request for a single index page under DProf came up with this:
```
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 22.1   0.115  0.307      1   0.1147 0.3068  File::Find::_find_dir
 21.9   0.114  0.192   3531   0.0000 0.0001  entries_index::__ANON__(fb)
 11.5   0.060  0.099     10   0.0060 0.0099  blosxom::BEGIN
 9.64   0.050  0.085   3282   0.0000 0.0000  File::stat::stat
 6.75   0.035  0.035   3282   0.0000 0.0000  File::stat::new
```
What this implies is that the entries subroutine was examining 3531 files when it crawled my data directory. Since I only have 361 blog posts, this was a little odd. On investigation, it turned out that Subversion was to blame. While CVS creates 3 book-keeping files per directory, SVN has many more. Adding a preprocessing callback to File::Find that pruned .svn directories yielded a vast improvement in the number of files considered. But, sadly, not enough improvement to let the big rebuild finish. Which is not really that surprising, since generating many pages is likely to have different bottlenecks than generating just one.

Fortunately, DProf dumps profiling information as it runs, so even though I was getting a ‘kill -9’, I could still gather data on as much of the process as completed. Thus, I found the big killer, my own damn fault:
```
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 38.1   36.51 48.720   3198   0.0114 0.0152  readinglist::filter
```
When I first wrote readinglist, I didn’t bother to do any caching. And, at the time that was okay because filter routines were only run once per execution of blosxom. However, we changed that in 2.0.2 and started running filter for each page that gets generated so that certain plugins would work properly in static mode. Unfortunately, a consequence of that is that inefficiencies in filter subroutines are now much more of a concern, as seen above. Fortunately, implementing a simple cache was, well, simple, and I quickly lopped off that 40%. (I’ve been intending to release a new version of readinglist with some other improvements; all the more reason now.)

At this point, I was still getting killed by the watcher process, though; when I got an email from the tier 2 engineer. He said he’d looked at the logs and looked at the source code, and discovered that in the latest version, the code that granted leniency to niced processes had been removed inadvertantly. It’s been fixed on my hosting machine, and should get rolled out to all the machines shortly.

Although my immediate problem was thus resolved, I may continue these investigations at some point. The current profile looks like this:
```
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 26.9   16.48 62.890   3198   0.0052 0.0197  blosxom::generate
 19.0   11.66 11.660   3199   0.0036 0.0036  Storable::pretrieve
 9.24   5.650  8.350 419544   0.0000 0.0000  blosxom::nice_date
 5.69   3.480  3.480   3198   0.0011 0.0011  readinglist::cache_valid
 4.48   2.740  2.740 168676   0.0000 0.0000  IO::File::open
 4.41   2.700  2.700 419544   0.0000 0.0000  Time::localtime::ctime
 3.70   2.260 14.320   3198   0.0007 0.0045  categories::prime_cache
 3.43   2.100  4.330  71198   0.0000 0.0001  flavourdir::__ANON__(102)
 3.07   1.880  1.880   1754   0.0011 0.0011  blosxom::__ANON__(11c)
 1.23   0.750  0.820   4626   0.0002 0.0002  seemore::story
 1.21   0.740  0.740   3198   0.0002 0.0002  potd::cache_valid
 1.06   0.650  0.650 375042   0.0000 0.0000  UNIVERSAL::can
 1.01   0.620  4.130   3198   0.0002 0.0013  readinglist::filter
 0.64   0.390  8.190   5262   0.0001 0.0016  readinglist::display
 0.47   0.290  0.000   3199   0.0001 0.0000  Storable::_retrieve
```
Clearly, there’s some low hanging fruit here. We’re spending 15% of our time formatting dates. Recall that I only have 361 posts, so for some reason we’re doing this about a thousand times more than we need to. The calls to Storable also seem excessive. I think the categories plugin is to blame for that; although the same issue with 2.0.2 and filter subroutines may apply here. I’m also a little uncertain why readinglist::cache_valid is taking so much longer than potd::cache_valid when the code is basically identical. But, these are all questions for another day.
permalink · comments

Leave a comment

Please use plain text only. No HTML tags are allowed.

About

Kevin's Weblog

2007-01-09

Blosxom Performance

Leave a comment

About