PHP Advent Calendar Day 22

22 Dec 2007

Today's entry is provided by Derick Rethans. Today also happens to be Derick's birthday, so I hope you'll join me in wishing him a very happy birthday. (Because I'm a little late posting this, and Derick lives in Norway, I'm afraid this is a belated birthday wish. Sorry, Derick!)

Derick Rethans

Derick Rethans has contributed in a number of ways to the PHP project, including the mcrypt, date, and filter extensions; bug fixes; additions; and leading the QA team. He now works as project leader for the eZ components project for eZ systems A.S. In his spare time he likes to work on Xdebug, watch movies, travel, and practice photography.
Skien, Telemark, Norway

This might not seem like a useful gem to most of you, but it has a coolness factor that I hope you appreciate. I am a geek and engineer, and I like to know how things work. Because I deal with lots of PHP things, I want to know how PHP works. So, I spend lots of time figuring this out while working on Xdebug, but that doesn't always go deep enough.

So, some years ago, I started hacking on a little tool called VLD. The original goal was to turn this into an encoder, but as I don't really care about encoding PHP files, it never made it that far.

What does it do? For each script, function, class, and method, this extension shows the internal execution units (opcodes) that represent your PHP code. There are a couple of things that you have to do before VLD shows you any output. First, of course, you have to install it: follow the installation instructions, and enable the extension in your php.ini file. If all is well, there should be a VLD section in your phpinfo() output. Because VLD writes all of the opcodes to standard error, it's not really useful to run it through Apache; it is better suited to the command line.

Let's see what it does for the following script:

  <?php

  $a = 42;

  if ($a < 50) {
      for ($b = 0; $b < $a; $b++) {
          echo sqrt($b), "\n";
      }
  } else {
      echo "The value $a is too high.\n";
  }

  ?>

After running this with the following command:

  php -dvld.verbosity=0 example1.php

You see output like this:

  filename:       /tmp/example1.php
  function name:  (null)
  number of ops:  23
  compiled vars:  !0 = $a, !1 = $b
  line     #  op            fetch  ext  return  operands
  --------------------------------------------------------------------
     3     0  ASSIGN                            !0, 42
     5     1  IS_SMALLER                ~1      !0, 50
           2  JMPZ                              ~1, ->15
     6     3  ASSIGN                            !1, 0
           4  IS_SMALLER                ~3      !1, !0
           5  JMPZNZ               9            ~3, ->14
           6* POST_INC                  ~4      !1
           7* FREE                              ~4
           8* JMP                               ->4
     7     9  SEND_VAR                          !1
          10  DO_FCALL             1            'sqrt'
          11  ECHO                              $5
          12  ECHO                              '%0A'
     8    13  JMP                               ->6
     9    14  JMP                               ->21
    10    15* INIT_STRING               ~6
          16* ADD_STRING                ~6      ~6, 'The+value+'
          17* ADD_VAR                   ~6      ~6, !0
          18* ADD_STRING                ~6      ~6, '+is+too+high.%0A'
          19* PRINT                     ~7      ~6
          20* FREE                              ~7
    13    21* RETURN                            1

For every executable unit (script, function, method), it generates this type of output, showing the filename and function/method name, the number of opcodes, the IDs of compiled variables, and the opcodes (execution units) themselves.

By changing the verbosity, you can control what kind of information VLD displays. A verbosity of 1 adds code path analysis to the output, showing which parts of the code can be executed (indicated by the * after the opcode #). A verbosity of 4 shows all the information VLD can gather about the execution units. You can also instruct VLD not to execute the script you feed to PHP. Simply add the -dvld.execute=0 option to the command line.
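As a rough sketch of how these options combine on the command line, the wrapper below asks for verbosity 1, turns execution off, and redirects standard error (where VLD writes its dump) into a file. The function name dump_opcodes and the output file name are made up for illustration; only the two vld.* settings come from the text above.

```shell
# Hypothetical helper: dump the opcodes of a script to a file without running it.
# Usage: dump_opcodes <script.php> <output-file>
dump_opcodes() {
    # VLD writes its dump to standard error, so redirect stream 2 to the file.
    php -dvld.verbosity=1 -dvld.execute=0 "$1" 2> "$2"
}

# Example call (assumes the php CLI with VLD loaded is on your PATH):
# dump_opcodes example1.php example1.opcodes
```

Capturing the dump in a file makes it easier to compare the output of different verbosity levels side by side.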

Interpreting this data is non-trivial, but I am sure you can figure it out. As a hint, !0 is a compiled variable, ~1 is a temporary value, and ->15 is a jump to opcode number 15. In case you have questions, feel free to send me an email.
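To make the hint concrete, here is a small sketch that sorts operand tokens from a dump into the three kinds named above, going purely by their prefix. The classify_operand helper is hypothetical and not part of VLD; it only recognizes the notations mentioned in the hint and lumps everything else together.

```shell
# Hypothetical helper: classify one operand token from a VLD dump by its prefix.
classify_operand() {
    case "$1" in
        '!'[0-9]*)  echo "compiled variable" ;;
        '~'[0-9]*)  echo "temporary value" ;;
        '->'[0-9]*) echo "jump target" ;;
        *)          echo "other" ;;   # literals and anything the hint does not cover
    esac
}

# Try it on a few tokens taken from the example dump.
for token in '!0' '~1' '->15' '42'; do
    printf '%s\t%s\n' "$token" "$(classify_operand "$token")"
done
```

Anything that does not match one of the three prefixes, such as the literal 42 or the $5 that appears after ECHO, ends up in the "other" bucket.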

One last warning (just in case the big red warning on the site is not enough): VLD cannot be used to decode encoded files. Please do not ask me questions about this.