WebFinger - a way to discover information about people by just their e-mail address - has changed quite a bit since I wrote the first version of Net_WebFinger, a PHP library for doing such discoveries.
The now 13th iteration of the spec got rid of RFC 6415, requiring only a single HTTP request to fetch the information:
http://example.org/.well-known/webfinger?resource=acct:bob@example.org
The default serialization format now is JRD, the JSON version of XRD.
CORS support is now mandatory, so that web applications can fetch the files, too.
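Under the new draft, the lookup URL can be derived from the address alone. A minimal sketch of that derivation (buildWebfingerUrl() is an illustrative helper, not part of Net_WebFinger's API):

```php
<?php
// Build the single-request WebFinger URL for an acct: identifier.
// Illustrative helper only - Net_WebFinger does this internally.
function buildWebfingerUrl($address)
{
    // the host part of "user@host" determines where to ask
    list(, $host) = explode('@', $address, 2);
    return 'http://' . $host
        . '/.well-known/webfinger?resource='
        . urlencode('acct:' . $address);
}

echo buildWebfingerUrl('bob@example.org') . "\n";
// → http://example.org/.well-known/webfinger?resource=acct%3Abob%40example.org
?>
```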
To accommodate these changes, I released version 0.3.0 of Net_WebFinger, together with version 0.3.0 of XML_XRD that is used to parse the underlying XRD/JRD files.
I also took the time to update Net_WebFinger's and XML_XRD's documentation.
Net_WebFinger now supports the new WebFinger draft, but is still able to fall back to the old system - many providers, Google among them, haven't made the switch yet.
XML_XRD fully supports reading and writing JRD files now.
Happy discovery.
Published on 2013-08-09 in pear, peardoc, php, web
I'm implementing OpenID for SemanticScuttle, your self-hosted social bookmark manager. To log in with OpenID, you need to know your OpenID URL, which many people do not know, and don't want to know. Most know their email address, and thanks to WebFinger, this is all you have to know!
WebFinger enables applications to discover information about people by just their e-mail address - for example their OpenID URL!
I didn't find a single standalone WebFinger library for PHP, so I asked on StackOverflow, but did not get any responses. Having failed to stand on the shoulders of giants, I went the hard way and implemented it all myself: Net_WebFinger, based on XML_XRD.
WebFinger weaves together RFC 6415 (Web Host Metadata) and LRDD, both of which use XRD files.
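That old-style flow needs two requests: first fetch the domain's host-meta file, then fill the account URI into the LRDD template found there. The template step could be sketched like this (applyLrddTemplate() is a made-up name for illustration, not part of Net_WebFinger's public API):

```php
<?php
// Old-style LRDD step: the host-meta file provides a URL template
// with a {uri} placeholder for the percent-encoded account URI.
// Illustrative helper, not part of Net_WebFinger's public API.
function applyLrddTemplate($template, $address)
{
    return str_replace('{uri}', urlencode('acct:' . $address), $template);
}

// A template as it might appear in http://example.org/.well-known/host-meta
$template = 'http://example.org/describe?q={uri}';
echo applyLrddTemplate($template, 'bob@example.org') . "\n";
// → http://example.org/describe?q=acct%3Abob%40example.org
?>
```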
Thus the first step was to build a clean XRD library for PHP, with an intuitive API and 100% unit test coverage. I proposed the XML_XRD package on 2012-02-01 and called for votes 8 days later. It was accepted with 11 votes. Extensive documentation also exists now.
After the foundation was laid, I proposed the Net_WebFinger package. It was accepted as a new PEAR package tonight, and just some minutes ago it got its first official release and a lot of documentation.
So, discovery is easy now! First, install the PEAR package:
$ pear install net_webfinger-alpha
Now the PHP code:
<?php
require_once 'Net/WebFinger.php';
$wf = new Net_WebFinger();
$react = $wf->finger('user@example.org');
if ($react->openid !== null) {
    echo 'OpenID provider found: ' . $react->openid . "\n";
}
// list all other links:
foreach ($react as $link) {
    echo 'Link: ' . $link->rel . ' to ' . $link->href . "\n";
}
?>
Net_WebFinger ships with a command line client that you can use to try it out. Find it with
$ pear list-files net_webfinger|grep cli
doc /usr/share/php/docs/Net_WebFinger/examples/webfinger-cli.php
Yahoo and Google already support WebFinger. Distributed social networks like StatusNet (which powers identi.ca) and Diaspora use WebFinger to distribute public encryption keys, OStatus and Salmon URLs. You can try one of those user addresses, too.
$ php /usr/share/php/docs/Net_WebFinger/examples/webfinger-cli.php klimpong@gmail.com
Discovering klimpong@gmail.com
Information secure? false
OpenID provider: http://www.google.com/profiles/klimpong
Link: http://portablecontacts.net/spec/1.0: http://www-opensocial.googleusercontent.com/api/people/
Link: http://portablecontacts.net/spec/1.0#me: http://www-opensocial.googleusercontent.com/api/people/102024993121974049099/
Link: http://webfinger.net/rel/profile-page: http://www.google.com/profiles/klimpong
Link: http://microformats.org/profile/hcard: http://www.google.com/profiles/klimpong
Link: http://gmpg.org/xfn/11: http://www.google.com/profiles/klimpong
Link: http://specs.openid.net/auth/2.0/provider: http://www.google.com/profiles/klimpong
Link: describedby: http://www.google.com/profiles/klimpong
Link: describedby: http://www.google.com/s2/webfinger/?q=acct%3Aklimpong%40gmail.com&fmt=foaf
Link: http://schemas.google.com/g/2010#updates-from: https://www.googleapis.com/buzz/v1/activities/102024993121974049099/@public
$ php /usr/share/php/docs/Net_WebFinger/examples/webfinger-cli.php singpolyma@identi.ca
Discovering singpolyma@identi.ca
Information secure? false
OpenID provider: http://identi.ca/singpolyma
Link: http://webfinger.net/rel/profile-page: http://identi.ca/singpolyma
Link: http://gmpg.org/xfn/11: http://identi.ca/singpolyma
Link: describedby: http://identi.ca/singpolyma/foaf
Link: http://apinamespace.org/atom: http://identi.ca/api/statusnet/app/service/singpolyma.xml
Link: http://apinamespace.org/twitter: https://identi.ca/api/
Link: http://schemas.google.com/g/2010#updates-from: http://identi.ca/api/statuses/user_timeline/15779.atom
Link: salmon: http://identi.ca/main/salmon/user/15779
Link: http://salmon-protocol.org/ns/salmon-replies: http://identi.ca/main/salmon/user/15779
Link: http://salmon-protocol.org/ns/salmon-mention: http://identi.ca/main/salmon/user/15779
Link: magic-public-key: data:application/magic-public-key,RSA.jylO6IUdOFhUadS0bkvq4Vkx_fh...
Link: http://ostatus.org/schema/1.0/subscribe: http://identi.ca/main/ostatussub?profile={uri}
Link: http://specs.openid.net/auth/2.0/provider: http://identi.ca/singpolyma
Published on 2012-02-24 in pear, peardoc, php, semanticscuttle, web
My last weeks have mostly been spent - besides work and normal housekeeping tasks - on porting PEAR's documentation system (peardoc) to PhD, PHP's very own DocBook rendering system.
PhD, initiated as a $evilsearchenginename Summer of Code project, is a fully PHP-based tool that converts PHP's documentation, written in DocBook 5, into XHTML, PDF and man pages. The reason for PhD to exist was that the previously used DSSSL-based system was slow: a full build (all formats and all languages) took 24 hours to complete. Further, the tools the system was based on were old and rusty, and nobody understood why they broke on some machines but worked on others. Having a PHP-based system for PHP ensures that there is always someone around who can fix it when it's broken. This wasn't the case with the old documentation build system.
In PEAR and peardoc, we built on the same tools. The structure is a bit different, as were the styles, but the foundation was the same. This didn't bother me much until several months ago when - all of a sudden - peardoc wouldn't build for me anymore either. We had had some reports that people couldn't get it working on their machines, but for most of us it just worked. Until now.
For someone feeling responsible for PEAR's documentation, not being able to build the docs is a serious problem. So after having followed the dribbling, lonely mailing list posts about peardoc and PhD over the last year, I finally took the time to fully convert peardoc to shiny new PhD.
The first "issue" to solve was getting PhD actually working on my system. PhD releases are installable via its very own PEAR channel, but version 0.2 was too outdated compared to the state in CVS. The CVS version needed PHP 5.3 - yet unreleased - so I had to install that from CVS HEAD as well. It wasn't hard, and PhD worked on phpdoc.
The most far-reaching feature of PhD is that it works on DocBook 5 files only. phpdoc had already been converted from DocBook 4 to 5, and now I had to do the same with peardoc. docbook.org offers a db4-upgrade.xsl upgrade script, but it has several flaws.
Luckily, Brett had already written a script that took the single XML files, escaped entities into comments, did the same for CDATA sections and piped the prepared data through the conversion script. It had a flaw: pages with multiple CDATA sections kept only the contents of the first one after transformation, but that was easily fixed. It also failed to convert character sets, so at first I ended up with mixed UTF-8 and ISO-8859-1 characters in the same file.
I spent three days tweaking the XSL script until the converted files satisfied xmllint. Unfortunately, my newly written configure.php told me that there were still more subtle errors that broke validation against the DocBook 5 DTD. We were using lines like
<parameter>$mode = &true;</parameter>
a lot. &true;, &false; and &null; were replaced with <constant>(true|false|null)</constant> - so we had a <constant> tag inside <parameter>, which is not allowed by DocBook 5. Since the entities should be kept, using XSLT to transform them away was not an option. I had to add fixes to the conversion script, which slowly grew into a small monster. After spending a day working full time on the conversion, the English version validated fully against the DTD.
Now that the XML was shiny, too, it was time to actually use PhD on it. The numbers were amazing: while a build of one format in one language took around 40 minutes on my system (a dual-core MacBook with 2 GHz and 2 GiB RAM) before, building the same with PhD takes 45 seconds!
Having a fast build system is essential, if not crucial: when a newbie translator/documentor writes his first manual page, he doesn't know much about DocBook and its tags. But he wants to see something, even if it's only a clear message about what he did wrong. Since the old DSSSL build system was so hard to set up, people committed files that had not been tested at all - the build broke, and if the commit happened on a Saturday or Sunday morning, the weekly manual rebuild on the live server was broken.
While I really hope that the new build system lowers the entry barrier for package developers to write nice documentation, the experiences of the phpdoc people are disenchanting: no new documentors appeared, and some old ones have not even been seen since.
PhD itself just swallows a huge XML file and spits out the desired format and theme, be it chunked XHTML, a big PDF or files for pearweb. The translation system in peardoc (and phpdoc) works with entities: there is one large chapters.ent file that contains entities for all files in peardoc. When you want a different language, the entities need to reference different files, for example ja/package/mail.xml instead of en/package/mail.xml.
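In a sketch (the entity name and exact file layout here are illustrative, not the actual peardoc ones), the language switch looks like this:

```xml
<!-- Illustrative sketch of the entity-based language switch;
     the real entity names in peardoc's chapters.ent may differ. -->

<!-- English build: -->
<!ENTITY package.mail SYSTEM "en/package/mail.xml">

<!-- A Japanese build defines the same entity with another target: -->
<!ENTITY package.mail SYSTEM "ja/package/mail.xml">

<!-- The master document only ever references &package.mail; -->
```

The master document stays identical across languages; only the entity definitions change per build.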
This is one of the things PhD does not do itself (although it's planned as PhDsetup), so we need a config script that does it. While it started as a file of 11 lines of PHP code, it has grown to 216 lines at the time of writing, with full command help, docblocks and all.
configure.php currently does three tasks.
So after language selection was implemented, I could validate the translations, which went relatively smoothly. Then the moment had come: I fully cleared peardoc's CVS module (after tagging the old state, of course) and committed the DocBook 5 based files.
Now that we could tan ourselves in the reflections of shiny PEARhd (peardoc + PhD), thoughts drifted and I remembered the problems of the old XML structure. One of the biggest was that every package category was its own chapter, and the packages themselves had only section tags available to use.
This is a real problem, since one could not properly structure a package's documentation. Also, integration of external documentation was nearly impossible. For example, Laurent Laville wrote TDGs for his packages - full <book>s in DocBook format. But since they were books and not sections, there was no way to include them in peardoc.
So the XML files themselves needed to be restructured: every package should get its own <book> tag. This transition was really daunting. It took a whole week with three new conversion scripts and a lot of manual fixing. I did what I could, but without the help of David and Ken, the French and Japanese translations still would not be done now.
While working to get the translations to build, I came across a problem that phpdoc solved with entity files: package category pages in translations are often not updated when a new package has been documented, leaving the translated documents without even the English version of the package manual. Now we have $category-entities.xml files for each category. They contain the list of package entities for that category and are shared between all translations. We should do the same thing in the package docs themselves, but that's yet to be done.
There are still many things left to do. The manual itself needs restructuring to make it easier to find answers to questions like "What is PEAR?" or "Do I need to recompile PHP to use PEAR?". Currently, the manual starts with the developers' guide, which is not what most people expect.
Another task is thinking about renaming IDs. xml:id attributes in sections, chapters and books determine their names in the chunked (multi-file) versions of the rendered manual. It would be cool to have all classes available under pear.php.net/manual/class.$classname.php, and packages as package.$packagename.php - without all the category fuzz. The category structure is needed internally, but not for the generated files.
We also need to find a way to put examples into their own files so we don't need to copy&paste them into the XML. This would allow us to easily run the examples without extracting them first, and even to e.g. automatically pull package examples from the manual into package releases. Including external files can be done using XInclude, and I've already done this in php-gtk-doc.
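Such an include could look roughly like this (the file path is made up; parse="text" pulls the file in verbatim rather than parsing it as XML):

```xml
<!-- Hypothetical example: embed a standalone PHP example file
     in a DocBook page instead of pasting its contents. -->
<programlisting role="php">
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
              href="examples/mail-basic.php" parse="text"/>
</programlisting>
```

Because the example lives in its own file, it can be executed and tested directly, and the manual always shows the current version.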
Also on the TODO list is to make it easy to link to the API docs. The manual should give an overview about a package, show examples and explain how things work. It is not the place to tell the user about function parameters and return values but instead should link to that API doc files. Currently, linking to methods or classes is really hard, and that needs to be made really simple.
While the things I wrote here may suggest that everything is done and we only need to polish a bit, that's wrong. PhD renders only the tags that are used in phpdoc, and it generates HTML that is neither XHTML nor valid. It does TOCs in peardoc wrong. I've already been fixing things, but there's quite a bunch of work left. We also need to get the new build system set up on pear.php.net, which currently does not update the manual anymore. Our documentation coverage tool needs to be updated. We need to get CHM compilation working. And ...
No, we're not done yet. But peardoc did a great leap forward, and we're steadily getting closer to 100%.
Ah and many thanks to Hannes for all his help on #php.doc!
Published on 2008-10-15 in pear, peardoc, phd, php