Working with .phar files

It is possible to pack an entire PHP web application into one single file and run it without unpacking it. These files usually have a .phar extension, which is an acronym for PHp ARchive, loosely based on jar (Java ARchive).

The PEAR installer has been distributed for ages as a single .phar file, thanks to the PHP_Archive package.

Since PHP 5.3.0, the Phar extension has been an official part of PHP. Shipping your applications as a Phar is thus safe, since PHP 5.2 has already reached its end of life.

Pros and Cons

Distributing an application as a Phar is not all sunshine; some things need to be considered:

Plus

Minus

Conclusion

For me, Phar archives are a nice way to try out new software with minimal setup issues.

Until the Linux distributions have strong Phar support, you should not rely on Phar exclusively to distribute your web application.

Tools to work with .phar files

While .phar files can also be saved in .zip and .tar format and opened with a normal compression utility, adding or extracting the meta data and the index file stub is impossible without special tools.

PHP's phar

PHP's source distribution ships with a phar executable that provides a comprehensive interface to Phar files:

$ phar help-list
add compress delete extract help help-list info list
meta-del meta-get meta-set pack sign stub-get stub-set
tree version

With its command line interface, you can create new Phar files, extract files from existing ones or repack, compress, sign and change their meta data and index stub.

Unfortunately, neither Debian nor Ubuntu ship that tool with their PHP packages.
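Where the phar executable is missing, the Phar class that comes with the extension can do the packing from a small PHP script. The following is only a minimal sketch, not the way any of the tools here work internally: the function name buildPhar(), the source directory layout and the index.php entry point are assumptions made for illustration. Writing Phar files only works when phar.readonly is disabled in php.ini, which is why the function checks Phar::canWrite() first.

```php
<?php
/**
 * Pack a directory into a .phar file (hypothetical helper).
 *
 * Returns false when PHP is configured with phar.readonly=1,
 * in which case no archive can be written.
 */
function buildPhar(string $sourceDir, string $pharPath, string $indexFile = 'index.php'): bool
{
    if (!Phar::canWrite()) {
        return false;
    }

    // The target file name has to contain the ".phar" extension.
    $phar = new Phar($pharPath);

    // Add every file below $sourceDir, keeping relative paths.
    $phar->buildFromDirectory($sourceDir);

    // Default stub: runs $indexFile when the archive is executed.
    $phar->setStub(Phar::createDefaultStub($indexFile));

    return true;
}
```

A script could call it as buildPhar(__DIR__ . '/src', __DIR__ . '/app.phar') and would typically be started with "php -d phar.readonly=0 build.php". The resulting file can then be run with "php app.phar", and single files inside it can be included through the phar:// stream wrapper.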

phar-util

Krzysztof Kotowicz's phar-util tool has been written for

building, signing and verifying Phar archives with OpenSSL public/private keys

Either clone the git repository or install it from its PEAR channel:

$ pear channel-discover pear.kotowicz.net
$ pear install kotowicz/PharUtil-beta

Phing's <pharpackage> task

Phing, my favorite build tool, is able to create Phar archives natively with its <pharpackage> task.

I'm using it to generate the SemanticScuttle Phar release file on deployment automatically.

Things to consider

Offline documentation

Since everything is in one big file, accessing the README and INSTALL files inside the Phar archive is hard for most users.

Making it available through the .phar in the browser is not the best option, because this requires that the user has a web server running and has already set up your application. Also, having the README with the version number available from outside gives potential attackers important version information.

Alternatively you can offer CLI commands to extract the whole documentation or parts of it. This requires the user to know that it's possible.

SemanticScuttle's .phar offers several CLI commands:

 [options] [args]

Options:
  -h, --help     show this help message and exit
  -v, --version  show the program version and exit

Commands:
  list      (alias: l)
  extract   (alias: x)
  run       (alias: r)

The user can get a list of certain files - only the ones they will need, like documentation, the default configuration file template and the database schema files - and extract them.

There is also a way to execute tool scripts inside the phar, e.g. the avahi export or upgrade scripts.

I used PEAR's awesome Console_CommandLine package to handle input arguments and options. It also generates the help screen automatically.
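The idea behind the list and extract commands can be sketched with the Phar class alone. This is not SemanticScuttle's actual implementation; the whitelist entries, the constant name and both function names are hypothetical. The functions take an already opened archive object (Phar or PharData), so the calling CLI script decides which archive is meant.

```php
<?php
// Hypothetical whitelist: the only files users may list or extract.
const EXTRACTABLE_FILES = [
    'doc/README',
    'doc/INSTALL',
    'data/config.php.dist',
];

/**
 * Return the whitelisted files that really exist in the archive.
 *
 * @param Phar|PharData $archive   Opened archive
 * @param string[]      $whitelist Paths relative to the archive root
 *
 * @return string[]
 */
function listExtractable($archive, array $whitelist = EXTRACTABLE_FILES): array
{
    $found = [];
    foreach ($whitelist as $path) {
        // Phar implements ArrayAccess; isset() checks for the entry.
        if (isset($archive[$path])) {
            $found[] = $path;
        }
    }
    return $found;
}

/**
 * Extract one whitelisted file below $targetDir.
 * Returns false for files that are not whitelisted or do not exist.
 *
 * @param Phar|PharData $archive Opened archive
 */
function extractFile($archive, string $path, string $targetDir, array $whitelist = EXTRACTABLE_FILES): bool
{
    if (!in_array($path, $whitelist, true) || !isset($archive[$path])) {
        return false;
    }
    // Third parameter: overwrite a file that already exists.
    return $archive->extractTo($targetDir, $path, true);
}
```

Inside a running Phar, the archive object could be created with new Phar(Phar::running(false)). Reading and extracting does not require phar.readonly to be disabled.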

Configuration files

Your application probably needs to be configured by the user. Normal web apps ship a config distribution file that the user copies and adapts - easy. The application also knows where that file is and can load it without problems.

With a Phar, things are different. First, the user needs to get the config file template from somewhere, preferably from the phar itself. As seen above, listing the files and extracting them is possible via the CLI interface.

The user probably does not know how to do this at the beginning, so the application should detect that the file is missing and tell the user how to extract the file and where to save it. Don't expect the directory to be writable.

As for the configuration file location: I chose $nameOfThePhar.config.php for SemanticScuttle, because it is possible to have several installations beside each other this way. It also makes clear that the phar and the config file belong together.
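The lookup and the hint for the user could look roughly like this. It is a sketch under assumptions: both function names are made up, and the naming scheme is read as "full phar file name plus .config.php", which may differ in detail from what SemanticScuttle does. The message only refers to the list and extract commands shown above and leaves their exact arguments open.

```php
<?php
/**
 * Config file path that belongs to a phar (assumed naming scheme:
 * /path/to/app.phar becomes /path/to/app.phar.config.php).
 */
function configPathFor(string $pharPath): string
{
    return $pharPath . '.config.php';
}

/**
 * Return null when the config file is readable, otherwise a message
 * that tells the user what to do. Nothing is written to disk, because
 * the directory may not be writable.
 */
function missingConfigMessage(string $pharPath): ?string
{
    $configPath = configPathFor($pharPath);
    if (is_readable($configPath)) {
        return null;
    }

    $name = basename($pharPath);
    return "Configuration file not found: $configPath\n"
        . "Run 'php $name list' to see the bundled files,\n"
        . "extract the configuration template with 'php $name extract',\n"
        . "adapt it and save it as $configPath\n";
}
```

Inside the phar, the path would come from Phar::running(false), which returns the file system path of the currently executing archive.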

Benchmark

To get some hard data to talk about, I did some benchmarks comparing delivery of normal files vs. files inside the .phar.

I used ApacheBench, Version 2.3 $Revision: 655654 $, Apache/2.2.17, mod_php 5.3.5-1ubuntu7.2 and apc 3.1.3p1-2.

All URLs have been fetched 1000 times, with a concurrency of 20.

Greg, Phar's father, benchmarked phpMyAdmin in 2008 and measured nearly identical performance.

Static file delivery

Most applications have static files - CSS, images, JavaScript.

Static file: themes/default/scuttle.css

Delivery                 Total time   Requests/second   Transfer rate
Direct                   0.275 s      3636.72           39709.15 Kb/s
readfile() with APC      0.464 s      2156.37           23357.88 Kb/s
readfile() without APC   0.386 s      2589.13           28045.55 Kb/s
Phar with APC            6.298 s       158.78            1723.30 Kb/s
Phar without APC         6.259 s       159.78            1734.13 Kb/s

Unsurprisingly, static files are delivered fastest when Apache serves them directly without asking PHP. Delivery from the Phar is roughly 23 times slower than direct delivery, and it makes no difference whether the bytecode cache is on or off.
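The scripts used for the benchmark are not shown here. Presumably the readfile() rows refer to a small PHP script that passes the file through, along the lines of the following sketch. The function name serveStaticFile() and the 404 handling are assumptions added for illustration.

```php
<?php
/**
 * Send one static file through PHP and return the number of bytes
 * written (0 when the file cannot be read).
 */
function serveStaticFile(string $path, string $contentType): int
{
    if (!is_file($path) || !is_readable($path)) {
        if (!headers_sent()) {
            http_response_code(404);
        }
        return 0;
    }

    if (!headers_sent()) {
        header('Content-Type: ' . $contentType);
        header('Content-Length: ' . filesize($path));
    }

    // readfile() writes the file to the output buffer and
    // returns the number of bytes read, or false on failure.
    return (int) readfile($path);
}
```

A call such as serveStaticFile(__DIR__ . '/themes/default/scuttle.css', 'text/css') would deliver the style sheet. Since readfile() also accepts phar:// paths, the same function could in principle read a file from inside an archive.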

PHP page

Here are the numbers for the SemanticScuttle index page - some SQL is executed, application caching is disabled.

PHP file: SemanticScuttle index.php

Delivery                      Total time   Requests/second   Transfer rate
Direct without APC            35.534 s     28.14             349.06 Kb/s
Phar without APC              30.614 s     32.67             466.57 Kb/s
Direct with APC               22.377 s     44.69             554.20 Kb/s
Phar with APC                 21.613 s     46.27             660.74 Kb/s
Direct with APC, apc.stat=0   22.731 s     43.99             545.86 Kb/s
Phar with APC, apc.stat=0     32.931 s     30.37             433.84 Kb/s

This was a bit of a surprise for me: The pages were delivered fastest when the Phar was used. The reason is probably that APC is spared the file time lookups it normally does to check whether the bytecode cache is stale.

To rule out the cache check performance, I set apc.stat=0 and ran the tests again. Now what? The application was slower! The Phar was even slower than without APC! I guess this is because apc.stat set to 0, combined with relative includes (which SemanticScuttle uses everywhere), is a really bad combination.
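To put these differences into relative terms, the requests/second values from the index.php table can be compared directly. The helper percentChange() is a name introduced here for illustration; the numbers are copied from the table.

```php
<?php
/**
 * Relative change of $value compared to $baseline, in percent.
 */
function percentChange(float $baseline, float $value): float
{
    return ($value - $baseline) / $baseline * 100.0;
}

// Requests/second from the SemanticScuttle index.php benchmark.
$directApc      = 44.69;
$pharApc        = 46.27;
$pharNoApc      = 32.67;
$pharApcNoStat  = 30.37;

// Phar compared to direct delivery, both with APC
printf("%+.1f%%\n", percentChange($directApc, $pharApc));      // +3.5%
// Phar with apc.stat=0 compared to Phar with APC
printf("%+.1f%%\n", percentChange($pharApc, $pharApcNoStat));  // -34.4%
// Phar with apc.stat=0 compared to Phar without APC
printf("%+.1f%%\n", percentChange($pharNoApc, $pharApcNoStat)); // -7.0%
```

So the Phar's advantage with APC is small, while setting apc.stat=0 costs the Phar about a third of its throughput.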

Written by Christian Weiske.
