PHP Advent Calendar Day 14

14 Dec 2007

Today's entry, provided by David Sklar, is entitled Timing and Profiling.

David Sklar

Name
David Sklar
Blog
sklar.com/blog/
Biography
David Sklar is a Software Architect at Ning, author of Learning PHP 5 (O'Reilly), PHP Cookbook (O'Reilly), and Essential PHP Tools (Apress), and a fan of half-sour pickles.
Location
New York, New York

You probably want your programs to run as fast as possible. This thrilling PHP Advent Calendar entry talks about ways to time and profile your code, so you can figure out what parts are slow and therefore deserving of your optimization efforts.

microtime() is a simple and direct way to track how long something takes, since it gives you a timestamp that includes milliseconds. (The actual precision varies based on the floating point representation on your system.) Just call microtime() before and after the code you want to time:

  1. <?php
  2.  
  3. $start = microtime(TRUE);
  4. preg_match('@^[a-z]+\.(php|html|js)$@', $filename);
  5. $elapsed = microtime(TRUE) - $start;
  6. printf("The regex match took %.06f seconds.\n", $elapsed);
  7.  
  8. ?>

This produces something like:

The regex match took 0.000169 seconds.

Not bad, but only really useful when compared to something else. So, does the regular expression get faster or slower (or is there no change) if the subpattern is changed to non-capturing?

  1. <?php
  2.  
  3. $start = microtime(TRUE);
  4. preg_match('@^[a-z]+\.(php|html|js)$@', $filename);
  5. $elapsed = microtime(TRUE) - $start;
  6. printf("The regex match took %.06f seconds.\n", $elapsed);
  7.  
  8. $start = microtime(TRUE);
  9. preg_match('@^[a-z]+\.(?:php|html|js)$@', $filename);
  10. $elapsed = microtime(TRUE) - $start;
  11. printf("The non-capturing regex match took %.06f seconds.\n", $elapsed);
  12.  
  13. ?>

The result looks something like:

The regex match took 0.000170 seconds.
The non-capturing regex match took 0.000017 seconds.

One sample each doesn't provide much confidence, so better to run a loop of many iterations of each:

  1. <?php
  2.  
  3. $iterations = 1000;
  4.  
  5. $elapsed = 0;
  6. for ($i = 0; $i < $iterations; $i++) {
  7.     $start = microtime(TRUE);
  8.     preg_match('@^[a-z]+\.(php|html|js)$@', $filename);
  9.     $elapsed += microtime(TRUE) - $start;
  10. }
  11.  
  12. printf("capturing: %d iter in %.06f secs = %.06f iter/sec\n",
  13.        $iterations, $elapsed, $iterations / $elapsed);
  14.  
  15. $elapsed = 0;
  16. for ($i = 0; $i < $iterations; $i++) {
  17.     $start = microtime(TRUE);
  18.     preg_match('@^[a-z]+\.(?:php|html|js)$@', $filename);
  19.     $elapsed += microtime(TRUE) - $start;
  20. }
  21.  
  22. printf("non-capturing: %d iter in %.06f secs = %.06f iter/sec\n",
  23.        $iterations, $elapsed, $iterations / $elapsed);
  24.  
  25. ?>

This shows that the non-capturing version is faster, but not by that much:

capturing: 1000 iter in 0.003977 secs = 251472.150609 iter/sec
non-capturing: 1000 iter in 0.003459 secs = 289082.914053 iter/sec

You can enhance this in other ways, but the basic idea is to use microtime() to keep track of how long a bit of code takes to run, run that code a number of times, and keep track of the total elapsed time. This makes it easy to compare the speed of different solutions to a problem.

Critical to keep in mind when doing timing like this, however, is how much user-visible gain the your performance improvements will actually show. You don't want to make things intentionally slower with no benefit, but often a slower-running but easier-to-read solution is a better choice, especially when the performance difference may be minimal. If a regex match can be run 250,000 times a second or 290,000 times a second, choosing one regex over another isn't going to make much of a difference in practice.

A profiler such as Xdebug is very helpful for giving you the information you need to direct your performance enhancement efforts in the right direction. With Xdebug, you can identify what parts of your code run most frequently, or take up the biggest chunk of the processing time of a page. That way, you can work on speeding up what will actually make the biggest difference in total runtime.

Xdebug is a binary extension to PHP. You can download a library for Windows from xdebug.org. On other platforms, you can install it with the pecl tool. The full documentation on Xdebug's profiling capabilities is at xdebug.org/docs/profiler. Once you enable it in your PHP configuration, it observes PHP programs that execute, writing profiling data to individual text files (which can grow quite large). Then, you can load these files into a tool such as KCacheGrind or WinCacheGrind. These tools give you a pretty graphical view of the profiling data.

Profiling data is essential for effective performance optimization. It might seem bountiful to change a preg_match() call into some plain string manipulation functions, but if that code is executed infrequently, you won't see much speedup. Xdebug can show you which areas deserve your attention.