It is possible to pack an entire PHP web application into a single file and run it without unpacking it. These files usually have a .phar extension, short for PHp ARchive, a name loosely based on jar (Java ARchive).
Since PHP 5.3.0, the Phar extension has been an official part of PHP. Because PHP 5.2 has already reached its end of life, shipping your application as a Phar is safe.
Pros and Cons
Distributing an application as a Phar is not all sunshine; some things need to be considered.
Pros:
- The full application - preferably with all dependencies - is contained in one file.
- No unpacking needed. You drop it into your web server's document directory and it runs.
- Upgrades are easy, at least for the casual user. Download the new version, use it.
- The application's code cannot easily be changed by attackers.
- Since all dependencies are included, setup is painless and you can run several versions in parallel.
Cons:
- Incremental updates are not possible. You always have to download the full new version.
- Upgrading is a manual process unless automated otherwise. If the web app is distributed via a PEAR package, upgrading is much easier for admins.
- Looking inside the application and modifying files to add your own changes is hard.
- Accessing the README file or upgrade instructions is harder than with "normally" distributed PHP applications, where you see the README and can open it in an editor.
- Most web servers do not recognize .phar files, so some initial administrative work is needed until the situation gets fixed.
For me, Phar archives are a nice way to try out new software with minimal setup issues.
Until the Linux distributions have strong Phar support, you should not rely on Phar exclusively to distribute your web application.
Tools to work with .phar files
While .phar files can be saved in zip or tar format and opened with a normal compression utility, adding or extracting the metadata and the index file stub is impossible without special tools.
PHP's source distribution ships with a phar executable that provides a comprehensive interface to Phar files:
$ phar help-list
add compress delete extract help help-list info list meta-del meta-get meta-set pack sign stub-get stub-set tree version
With its command line interface, you can create new Phar files, extract files from existing ones or repack, compress, sign and change their meta data and index stub.
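The same kind of archive can also be built from PHP code with the Phar class. The following is a minimal sketch, not the way the phar tool itself works: buildPhar() and its parameter names are hypothetical, and writing archives only works when the phar.readonly ini setting is switched off.

```php
<?php
// Sketch: build a Phar from a source directory with PHP's Phar class.
// buildPhar() and its parameters are illustrative assumptions.

function buildPhar(string $pharFile, string $sourceDir, string $cliIndex, string $webIndex): bool
{
    // Phar::canWrite() is false while phar.readonly=1 (the default).
    if (!Phar::canWrite()) {
        return false;
    }
    if (is_file($pharFile)) {
        unlink($pharFile);
    }
    $phar = new Phar($pharFile);
    // Add every file below $sourceDir, with paths relative to it.
    $phar->buildFromDirectory($sourceDir);
    // The stub is the bootstrap code that runs when the archive is executed.
    $phar->setStub(Phar::createDefaultStub($cliIndex, $webIndex));
    return true;
}
```

A build script using this function would be started with something like php -d phar.readonly=0 build.php, because the setting cannot be disabled at runtime.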
PharUtil provides tools for building, signing and verifying Phar archives with OpenSSL public/private keys.
Either clone the git repository or install it from its PEAR channel:
$ pear channel-discover pear.kotowicz.net
$ pear install kotowicz/PharUtil-beta
Phing's <pharpackage> task
Phing, my favorite build tool, is able to create Phar archives natively with its <pharpackage> task.
I'm using it to generate the SemanticScuttle Phar release file on deployment automatically.
Things to consider
Since everything is in one big file, accessing the README and INSTALL files inside the Phar archive is hard for most users.
Making it available through the .phar in the browser is not the best option because it requires that the user has a web server running and already has your application set up. Also, having the README with the version number available from outside gives potential attackers important version information.
Alternatively you can offer CLI commands to extract the whole documentation or parts of it. This requires the user to know that it's possible.
SemanticScuttle's .phar offers several CLI commands:
[options] [args]

Options:
  -h, --help     show this help message and exit
  -v, --version  show the program version and exit

Commands:
  list     (alias: l)
  extract  (alias: x)
  run      (alias: r)
The user can get a list of certain files - only the ones they will need, like documentation, the default configuration file template and database schema files - and extract them.
There is also a way to execute tool scripts inside the phar, e.g. the avahi export or upgrade scripts.
I used PEAR's awesome Console_CommandLine package to handle input arguments and options. It also generates the help screen automatically.
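The list and extract commands can be sketched roughly as follows. This is not SemanticScuttle's actual code and does not use Console_CommandLine; the function names and the directory prefixes in the usage note are hypothetical. Reading from an archive goes through the phar:// stream wrapper and needs no special ini settings.

```php
<?php
// Sketch: offer only user-relevant files for listing and extraction.
// Function names are hypothetical, not SemanticScuttle's code.

/** Return the archive-internal paths that start with one of the allowed prefixes. */
function filterExtractable(array $paths, array $allowedPrefixes): array
{
    $matches = array();
    foreach ($paths as $path) {
        foreach ($allowedPrefixes as $prefix) {
            if (strpos($path, $prefix) === 0) {
                $matches[] = $path;
                break;
            }
        }
    }
    return $matches;
}

/** Copy one file out of the archive through the phar:// stream wrapper. */
function extractFromPhar(string $pharFile, string $innerPath, string $targetDir): bool
{
    $source = 'phar://' . $pharFile . '/' . ltrim($innerPath, '/');
    $target = rtrim($targetDir, '/') . '/' . basename($innerPath);
    return is_file($source) && copy($source, $target);
}
```

With prefixes such as 'doc/' and 'data/', a call like filterExtractable($allPaths, array('doc/', 'data/')) hides the application code from the listing, and extractFromPhar() writes the chosen file into a directory the user picks.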
Your application probably needs to be configured by the user. Normal web apps ship a distribution config file that the user copies and edits, which is easy. The application also knows where the file is and can load it without problems.
With a Phar, things are different. First, the user needs to get the config file template from somewhere, preferably from the phar itself. As seen above, listing the files and extracting them is possible via the CLI.
The user probably does not know how to do this at the beginning, so the application should detect that the file is missing and give the user instructions on how to extract the file and where to save it. Don't expect the directory to be writable.
As for the configuration file location: I chose $nameOfThePhar.config.php for SemanticScuttle, because it is possible to have several installations beside each other this way. It also makes clear that the phar and the config file belong together.
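A minimal sketch of this lookup, under the assumption that ".config.php" is appended to the archive's full file name (one possible reading of the naming scheme). The function names and the message text are hypothetical.

```php
<?php
// Sketch: locate the configuration file that belongs to a given Phar.
// Assumption: ".config.php" is appended to the archive's full file name.

function configPathForPhar(string $pharPath): string
{
    return $pharPath . '.config.php';
}

/** Return null when the config exists, otherwise instructions for the user. */
function missingConfigHint(string $pharPath): ?string
{
    $config = configPathForPhar($pharPath);
    if (is_file($config)) {
        return null;
    }
    return "Configuration file not found.\n"
        . "Extract the template from " . basename($pharPath) . " and save it as:\n"
        . $config . "\n";
}

// Inside a running archive, Phar::running(false) returns its path on disk;
// outside of one it returns an empty string.
$running = Phar::running(false);
if ($running !== '' && ($hint = missingConfigHint($running)) !== null) {
    echo $hint;
}
```

The sketch only prints instructions and never tries to write the file itself, since the directory may not be writable.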
To get some hard data to talk about, I did some benchmarks comparing delivery of normal files vs. files inside the .phar.
I used ApacheBench, Version 2.3 $Revision: 655654 $, Apache/2.2.17, mod_php 5.3.5-1ubuntu7.2 and apc 3.1.3p1-2.
All URLs have been fetched 1000 times, with a concurrency of 20.
Static file delivery
| Variant | Total time | Requests/second | Transfer rate |
|---|---|---|---|
| Direct | 0.275 s | 3636.72 | 39709.15 Kb/s |
| readfile() with APC | 0.464 s | 2156.37 | 23357.88 Kb/s |
| readfile() without APC | 0.386 s | 2589.13 | 28045.55 Kb/s |
| Phar with APC | 6.298 s | 158.78 | 1723.30 Kb/s |
| Phar without APC | 6.259 s | 159.78 | 1734.13 Kb/s |
Unsurprisingly, static files are delivered really really fast when Apache delivers them directly without asking PHP. Delivery times of static files from the Phar do not differ when the bytecode cache is off or on.
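For readers unfamiliar with the readfile() variant: it routes a static file through a PHP script instead of letting Apache serve it. The benchmarked script is not shown here, so the following is only a guess at its general shape, with a hypothetical helper name.

```php
<?php
// Sketch: deliver a static file through PHP instead of letting Apache serve it.
// deliverStaticFile() is a hypothetical helper, not the benchmarked script.

function deliverStaticFile(string $path, string $mimeType): bool
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }
    if (!headers_sent()) {
        header('Content-Type: ' . $mimeType);
        header('Content-Length: ' . filesize($path));
    }
    // readfile() writes the file contents to the output buffer.
    return readfile($path) !== false;
}
```

Since the phar:// stream wrapper supports the usual file functions, the same helper should also accept a path inside an archive, which is roughly what happens when a static file is delivered from the Phar.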
Here are the numbers for the SemanticScuttle index page - some SQL is executed, application caching is disabled.
| Variant | Total time | Requests/second | Transfer rate |
|---|---|---|---|
| Direct without APC | 35.534 s | 28.14 | 349.06 Kb/s |
| Phar without APC | 30.614 s | 32.67 | 466.57 Kb/s |
| Direct with APC | 22.377 s | 44.69 | 554.20 Kb/s |
| Phar with APC | 21.613 s | 46.27 | 660.74 Kb/s |
| Direct with APC, apc.stat=0 | 22.731 s | 43.99 | 545.86 Kb/s |
| Phar with APC, apc.stat=0 | 32.931 s | 30.37 | 433.84 Kb/s |
This was a bit of a surprise for me: the pages are delivered fastest when the Phar is used. The reason is probably that the Phar saves the file modification time lookups APC does to check whether its bytecode cache is stale.
To rule out the cache check performance, I set apc.stat=0 and ran the tests again. Now what? The application was slower! The Phar was even slower than without APC! I guess this is because apc.stat=0 and relative includes (which SemanticScuttle uses everywhere) make a really bad combination.