Adding the source URL to an image's meta data

Sometimes I download images from The Internet™ for later use. For reference I'd like to store some meta data inside the image itself:

URL of the image I downloaded
URL of the website that linked to the image

The question is now: Which meta data field should I use for those URLs?

Finding the right property

EXIF

Basic meta data can be stored in the image's EXIF data, but there is no URL field:

$ exiftool -list -EXIF:All|grep -i url
$

The Exif 2.3 metadata for XMP document also does not list a single URL field, so exiftool was right.

XMP

The XMP standard is another way to store meta data in files. XMP Specification Part 1 defines multiple fields that one could use:

Possible XMP properties
Property	Type	Description
`dc:relation`	Unordered array of Text	A related resource.
`dc:source`	Text	A related resource from which the described resource is derived.

Unfortunately there are no "real" URL fields.

IPTC

There is another vocabulary, the IPTC Photo Metadata Standard.

Possible IPTC properties
Property	Type	Description
`Iptc4xmpCore:Source`	Text	The name of a person or party who has a role in the content supply chain.
`Iptc4xmpCore:CreatorContactInfoCiUrlWork`	URL, multiple	The creator's contact information provides all necessary information to get in contact with the creator of this item and comprises a set of sub-properties for proper addressing. The contact information web address part. Multiple addresses can be given, separated by a comma.
`plus:ImageSupplier`	Seq ImageSupplierDetail	Identifies the most recent supplier of the item, who is not necessarily its owner or creator. For identifying the supplier either a well known and/or registered company name or a URL of the company's web site may be used.

Only Iptc4xmpCore:CreatorContactInfoCiUrlWork sounds like it could be used to identify the web site that linked to the image, but I think it is meant to link directly to the creator's homepage - and not to a random URL that just contains an image tag.

Metadata Working Group

The Metadata Working Group published the Guidelines for Handling Image Metadata spec in 2010, and it contains a tag that actually matches my idea of "URL of website that linked to the image":

Possible MWG properties
Property	Type	Description
`mwg-coll:CollectionURI`	URI	URI describing the collection resource.

A "collection" in MWG speak is a group of images that this specific image is part of. And a website that links to the image can be seen as such a group.

Conclusion

I'm not satisfied with the available properties I found. But instead of inventing my own namespace with source and website properties, I'll simply use the Dublin Core XMP properties:

XMP properties I am using now
Property	Usage
~~`dc:source`~~ `rdf:about`	URL of image that was downloaded
`dc:relation`	URL of website that linked to the image

exiftool

Let's say that I visited http://cweiske.de/bdrem.htm and downloaded the image http://cweiske.de/graphics/bdrem/html.png. Now I want to add the website URL and image URL to its meta data.

Embedding the URLs in the downloaded image is easy with exiftool:

$ wget http://cweiske.de/graphics/bdrem/html.png -O bdrem-html.png
$ exiftool -source=http://cweiske.de/graphics/bdrem/html.png\
           -relation=http://cweiske.de/bdrem.htm bdrem-html.png
Warning: [minor] IPTC:Source exceeds length limit (truncated)
    1 image files updated

Despite the warning, the full source URL is stored in the image file. But on JPG files the source is really truncated:

$ exiftool -S -source -relation bdrem-html.jpg
Source: http://cweiske.de/graphics/bdrem
Relation: http://cweiske.de/bdrem.htm

To work around this issue, we force exiftool to use the XMP source property instead of the IPTC source property:

$ exiftool -XMP:source=http://cweiske.de/graphics/bdrem/html.png\
           -XMP:relation=http://cweiske.de/bdrem.htm bdrem-html.jpg
    1 image files updated
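
Download and tagging can be wrapped in one small shell function. This is only a sketch, assuming wget and exiftool are installed; the function name fetch_tagged and the DRY_RUN switch are made up for illustration and are not part of either tool:

```sh
# Hypothetical helper (not part of exiftool): download an image and store
# both URLs in its XMP data, using the same tags as the commands above.
# With DRY_RUN=1 the commands are only printed, not executed.
fetch_tagged() (
    img_url=$1    # URL of the image itself
    page_url=$2   # URL of the page that linked to the image
    out=$3        # local file name
    run=
    if [ -n "${DRY_RUN:-}" ]; then
        run=echo
    fi
    $run wget -q "$img_url" -O "$out" &&
    $run exiftool "-XMP:source=$img_url" "-XMP:relation=$page_url" "$out"
)
```

Usage would look like this: fetch_tagged http://cweiske.de/graphics/bdrem/html.png http://cweiske.de/bdrem.htm bdrem-html.png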

Extracting the data is also possible:

$ exiftool bdrem-html.png
ExifTool Version Number         : 9.46
File Name                       : bdrem-html.png
...
MIME Type                       : image/png
Image Width                     : 463
Image Height                    : 122
...
Software                        : Shutter
Source                          : http://cweiske.de/graphics/bdrem/html.png
XMP Toolkit                     : Image::ExifTool 9.46
Relation                        : http://cweiske.de/bdrem.htm
Image Size                      : 463x122

$ exiftool -S -source -relation bdrem-html.png
Source: http://cweiske.de/graphics/bdrem/html.png
Relation: http://cweiske.de/bdrem.htm

Browsers

Adding the meta data manually is possible, but it would be best if they were added automatically when saving-by-right-clicking the image in my browser. ~~Unfortunately, no browser supports this.~~

MacOS

On MacOS, downloaded files have the download source in the "Where from" file information. Safari, Chrome and Firefox (bug, commit) support this.

It is stored as the extended attribute com.apple.metadata:kMDItemWhereFroms in the file system, so it is not tied to the file itself (but also does not modify the file, and works for all types of files).

2024-01: Kelvin Thompson sent me an e-mail explaining that exiftool can read this attribute and copy it into a different tag in the file itself:

$ exiftool '-XMP:source<MDItemWhereFroms' filename.jpg

curl

XDG defines a list of Common Extended Attributes, among them user.xdg.origin.url. It is meant to be stored as an extended file system attribute, similar to what MacOS does.

curl supports writing this file system attribute:

$ curl --xattr --output html.png http://cweiske.de/graphics/bdrem/html.png
$ getfattr --dump html.png
# file: html.png
user.mime_type="image/png"
user.xdg.origin.url="http://cweiske.de/graphics/bdrem/html.png"

Chromium once supported it, but removed it in 2019 because metadata doesn't provide any security guarantees on Linux, and is a privacy risk.

wget

wget, just like curl, supports the --xattr option:

$ wget --xattr http://cweiske.de/graphics/bdrem/html.png
$ getfattr --dump html.png
# file: html.png
user.xdg.origin.url="http://cweiske.de/graphics/bdrem/html.png"
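
The attribute written by curl or wget can be copied into the image itself. A minimal sketch, assuming the getfattr --dump output format shown above; the function name origin_from_dump is made up for illustration:

```sh
# Hypothetical helper: read `getfattr --dump` output on stdin and print
# the value of user.xdg.origin.url, if present.
origin_from_dump() {
    sed -n 's/^user\.xdg\.origin\.url="\(.*\)"$/\1/p'
}

# Usage (assumes getfattr and exiftool are installed):
#   url=$(getfattr --dump html.png | origin_from_dump)
#   [ -n "$url" ] && exiftool "-XMP:source=$url" html.png
```

Parsing the dump instead of the file system keeps the function easy to try out with pasted text. It only handles plain ASCII values, which should be fine for URLs.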

Firefox

Firefox has a feature request open since 2011 for supporting user.xdg.origin.url.

But it does write the origin URL to gnome gvfs meta data:

$ gio info --attributes=metadata:: html.png
[...]
attributes:
  metadata::download-uri: http://cweiske.de/graphics/bdrem/html.png

Update 2020-04: rdf:about

Daniel Aleksandersen suggested that we already have a property that means "URL of the thing we talk about", and that is rdf:about:

[...] goes on to identify the resource the statement is about (the subject of the statement) using the rdf:about attribute to specify the URIref of the subject resource.

Published on 2016-07-05 in photos, tools

Claws Mail: Open .eml file

My wife forwarded an e-mail as attachment to me, and I could not open it in Claws Mail. The attached e-mail is stored as .eml attachment, and Claws has no way to open it.

Another user had the same problem in 2004, and the answer then was to use the eml2mbox tool. This Ruby-based tool was written in 2004, and it uses a parsedate library function that has been removed in Ruby 1.9. In 2023 we have Ruby 3.1, and eml2mbox does not work anymore.

In the end I used the Python 3 based emlToMbox, which converted the .eml to a .mbox that I could import into Claws.

It worked, but it still sucks that Claws does not open .eml files natively. Forward-as-attachment is a standard procedure.

Published on 2023-10-08 in mail, tools

Printing a large image on multiple pages

The Solution: Use PosteRazor.

The task
How to print?
Other tools

The task

We needed some table decoration for a birthday. Since the subject to party with is a hobby photographer, we thought of a roll of film with pictures of the person that would lie in the middle of the tables - printed on normal paper of course, since negatives are not so easy to look at. There were around 12 meters of table to decorate, and we had A4 paper to print on.

So at first we collected/scanned all the images we wanted to appear on the "roll of film" and gathered around 45 of them. The next part was the easy one: Generating an image that looks like a film roll with all the images in it. Gimp, the GNU Image Manipulation Program, already has a plugin for that: Filters > Combine > Filmstrip. I adjusted the height to 2048px and the color of the numbers and selected all images. After some minutes, the film strip was generated. A whopping 53432x2048 pixels in size! Here a scaled down version of the first part of the image:

Now how does one print such a huge image on a normal printer? The idea was to print on landscaped A4 paper with some seam on each page so the individual pages could be glued together afterwards. Quick check:

Size of A4 (portrait): 21.0cm x 29.7cm
Landscape: 29.7cm x 21.0cm
Size of image: 53432px x 2048px
$ 2048/210*297
2896
$ 53432/2896
18.45

Each paper gets 2896 of the 53k pixels; that's 19 pages of landscaped A4 sheets!
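
The same arithmetic as a small shell function, in case the image or paper size changes. This is a sketch only; the name pages_needed and its argument order are my own choice:

```sh
# Hypothetical helper: how many pixels of image width fit on one sheet,
# and how many sheets are needed?
# Arguments: image width (px), image height (px),
#            sheet width (mm), sheet height (mm)
pages_needed() (
    img_w=$1
    img_h=$2
    sheet_w=$3
    sheet_h=$4
    # Scale so that the image height fills the sheet height. Integer
    # division rounds down, which gives the 2896 used in the text.
    px_per_page=$(( img_h * sheet_w / sheet_h ))
    # Round up: a partly filled last sheet still has to be printed.
    pages=$(( (img_w + px_per_page - 1) / px_per_page ))
    echo "$px_per_page $pages"
)

pages_needed 53432 2048 297 210   # prints "2896 19"
```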

How to print?

Still the same problem: How to print? Gimp itself does not have a "print across multiple sheets" functionality. There are several threads on The Internet™ about printing on multiple pages on Linux, but apart from suggesting the poster application for Postscript files, no automated solutions were given.

PosteRazor

PosteRazor is the tool you want. It takes an image, lets you configure your desired poster size and creates a PDF file with as many sheets as it takes to reach it. Printing that file is easy!

It has predefined paper sheet sizes (A4, Letter) and supports custom ones. You may define your printer margins. Setup overlapping. Specify the alignment of the pictures that are not filled. Just everything you need to print that image on a poster.

Had I known it a year ago, things would have been easier and this blog entry would not exist.

If you want to reset the settings in PosteRazor to their default values, you have to remove the configuration file that is located in ~/.fltk/CasaPortale.de/PosteRazor.prefs.

kprinter

kprinter, the KDE printing dialog, already supports scaling onto multiple sheets, but only to "normal" sizes such as from A4 to A3, A2 or A0, not to special sizes like that of my image. It's on the "Poster" tab in the printer settings.

OpenOffice

OpenOffice.org Draw also has the ability to print large drawings and images on several pages.

Set the sheet size as large as your result shall be, i.e. 5 x A4.
Click File and Print
In the lower left corner, click "Options" button (German: "Zusätze")
Page options: "Tile pages"
Print

KIPI-Plugins

Any application making use of the KIPI library can be used to print images on several sheets. In Showfoto, the special printer dialog was not available but in Gwenview, you can find it at
Plugins → Images → Print images → Options → Image settings → Multiple pages

Unfortunately it is not really intuitive and you have to try a lot with the PDF printing to get what you want. Currently I cannot recommend it.

My solution: convert

So kprinter could not be used in my case. The tip that Scribus would support wallpaper printing cannot be confirmed by me; I did not find such an option there.

So what's left? Yep, the command line tools. The ImageMagick suite contains the Swiss Army knife for CLI image manipulation, convert. convert lets you crop a region out of an image, which is what we needed:

convert kombi.jpg +repage -crop 2896x2048+0+0 single-1.jpg
convert kombi.jpg +repage -crop 2896x2048+2896+0 single-2.jpg

These two commands crop the first two A4 landscape sized sheets from the large kombi.jpg image file and save them as single-1.jpg and single-2.jpg, respectively.

Now that needed to be done 19 times; not something I wanted to do manually. Luckily for me, bash supports for loops:

i=0
for (( x=0; x<53432; x=x+2896 ))
do
    i=$((i + 1))
    echo $i
    convert kombi.jpg +repage -crop 2896x2048+$x+0 single-$i.jpg
done

You can type that with semicolons instead of newlines: i=0; for (( x=0; x<53432; x=x+2896 )); do i=$((i + 1)); echo $i; convert kombi.jpg +repage -crop 2896x2048+$x+0 single-$i.jpg; done
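
One thing to watch out for: with an unpadded counter, a shell glob sorts single-10.jpg before single-2.jpg, so the sheets leave the printer out of order. The following sketch separates the tile calculation from the cropping and pads the counter. The name tile_jobs is made up for illustration:

```sh
# Hypothetical helper: print one "file name, crop geometry" pair per sheet.
# Arguments: image width, image height, tile width (all in pixels)
tile_jobs() (
    img_w=$1
    img_h=$2
    tile_w=$3
    i=0
    x=0
    while [ "$x" -lt "$img_w" ]; do
        i=$(( i + 1 ))
        # %02d pads the counter: single-02.jpg sorts before single-10.jpg
        printf 'single-%02d.jpg %sx%s+%s+0\n' "$i" "$tile_w" "$img_h" "$x"
        x=$(( x + tile_w ))
    done
)

# Usage (assumes ImageMagick's convert, as above):
#   tile_jobs 53432 2048 2896 | while read -r name geometry; do
#       convert kombi.jpg +repage -crop "$geometry" "$name"
#   done
```

For the film strip this prints 19 lines, from "single-01.jpg 2896x2048+0+0" to "single-19.jpg 2896x2048+52128+0".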

No two minutes later, I had 19 single images to print out! Opening them in some graphics program just for printing is nonsense, so I went the CLI way again:

lp -d Laser -o media=a4 -o landscape -o fitplot single-1.jpg

or in a loop:

for i in single-*.jpg; do lp -d Laser -o media=a4 -o landscape -o fitplot "$i"; done

Our printer has a 0.5cm margin on each side, so there was no need to print an extra glue margin. After cutting the paper a bit and gluing everything, we have a 6m x 0.21m photo strip as table decoration!

Other tools

2022-08: pdfposter

Published on 2010-02-28 in linux, tools

Generating Matroska tag files from TMDb

I recently ripped my "James Bond" DVD collection with Handbrake to Matroska .mkv files. To complete the process, I wanted to have all possible meta data inside the video files: Title, summary, cover image, director and actors.

Matroska has extensive support for tags. They can be specified as XML file, which I already created from Kdenlive projects when cutting my own videos.

Unfortunately there were no tools that created such a mkvtags.xml file by just specifying the movie title. (Creating them myself for the 24 Bond movies was not feasible.)

It turns out that the popular Amazon-owned IMDb has no open API; you have to request a license key by e-mail and pay for it. The Movie Database on the other hand has a usable API that one can get an API key for by just registering an account.

With the TMDb API in hand, I wrote a small script that takes a 2-letter language code and the movie title as parameter, and then generates the mkv tags XML file and downloads the cover and backdrop images. Those can then be used with mkvtoolnix-gui to generate a full-featured .mkv video file.

The code for tmdb2mkvtags is available at git.cweiske.de/tmdb2mkvtags.git, with a mirror at GitHub.

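Example mkv tags XML file

<Tags>
 <Tag>
  <Targets>
   <TargetTypeValue>70</TargetTypeValue>
  </Targets>
  <Simple>
   <Name>TITLE</Name>
   <String>James Bond Filmreihe</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
 </Tag>
 <Tag>
  <Targets>
   <TargetTypeValue>50</TargetTypeValue>
  </Targets>
  <Simple>
   <Name>TITLE</Name>
   <String>James Bond 007 - Casino Royale</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>SUBTITLE</Name>
   <String>Jeder hat eine Vergangenheit - Jede Legende einen Anfang.</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>SYNOPSIS</Name>
   <String>Sein erster Auftrag, nachdem er die Lizenz zum Töten erhalten hat, führt den MI6-Agenten James Bond nach Madagaskar, wo er auf den Terroristen Mollaka angesetzt wird. Zwar verläuft nicht alles nach Plan, doch als Bond auf eigene Faust weiter ermittelt kommt er auf die Spur von Le Chiffre, dem Bankier einer weltweit operierenden Terror-Organisation. Dieser plant, das Vermögen seiner Organisation durch ein illegales Pokerspiel im „Casino Royale“ von Montenegro um ein vielfaches zu erhöhen, wofür natürlich auch ein hoher Einsatz nötig ist. Der MI6 sieht daher die Chance, die Terroristen in den finanziellen Ruin zu treiben und beauftragt James Bond, die Pläne von Le Chiffre zunichte zu machen.</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>DATE_RELEASED</Name>
   <String>2006-11-14</String>
  </Simple>
  <Simple>
   <Name>GENRE</Name>
   <String>Abenteuer</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>GENRE</Name>
   <String>Action</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>GENRE</Name>
   <String>Thriller</String>
   <TagLanguage>de</TagLanguage>
  </Simple>
  <Simple>
   <Name>RATING</Name>
   <String>3.75</String>
  </Simple>
  <Simple>
   <Name>TMDB</Name>
   <String>movie/36557</String>
  </Simple>
  <Simple>
   <Name>IMDB</Name>
   <String>tt0381061</String>
  </Simple>
  <Simple>
   <Name>ORIGINAL</Name>
   <Simple>
    <Name>TITLE</Name>
    <String>Casino Royale</String>
    <TagLanguage>en</TagLanguage>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Daniel Craig</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>James Bond</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Eva Green</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Vesper Lynd</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Mads Mikkelsen</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Le Chiffre</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Judi Dench</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>M</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Jeffrey Wright</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Felix Leiter</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Giancarlo Giannini</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>René Mathis</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Caterina Murino</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Solange Dimitrios</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Simon Abkarian</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Alex Dimitrios</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Isaach De Bankolé</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Steven Obanno</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Jesper Christensen</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Mr. White</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Ivana Miličević</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Valenka</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Tobias Menzies</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Villiers</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Claudio Santamaria</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Carlos</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Sébastien Foucan</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Mollaka</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Malcolm Sinclair</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Dryden</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Richard Sammel</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Adolph Gettler</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Ludger Pistor</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Mendel</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Joseph Millson</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Carter</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Darwin Shaw</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Fisher</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Clemens Schick</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Kratt</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Emmanuel Avena</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Leo</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Tom Chadbon</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Stockbroker</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Ade</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Infante</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Urbano Barberini</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Tomelli</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Tsai Chin</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Madame Wu</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Lazar Ristovski</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Kaminofsky</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Veruschka von Lehndorff</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Gräfin von Wallenstein</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Charlie Levi Leroy</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Gallardo</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Tom So</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Fukutu</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Andreas Daniel</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Dealer</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Carlos Leal</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Tournament Director</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Christina Cole</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Ocean Club Receptionist</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Jürgen Tarrach</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Schultz</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>John Gold</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Card Players</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Diane Hartford</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Card Players</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Leo Stransky</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Tall Man</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Paul Bhattacharjee</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Hot Room Doctors</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Crispin Bonham-Carter</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Hot Room Doctors</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Rebecca Gethings</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Hot Room Technicians</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Peter Brooke</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Airport Policemen</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Robert G. Slade</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Pilot</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Félicité Du Jeu</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>French News Reporter</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Michaela Ochotská</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Shop Assistant</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Michael G. Wilson</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Chief of Police</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Valentine Nonyela</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Nambutu Embassy Official</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Phil Meheux</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Treasury Bureaucrat</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Alessandra Ambrosio</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Tennis Girls</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Vlastina Svátková</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Waitress</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Ivan G'Vera</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Venice Hotel Concierge</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Richard Branson</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Man at Airport Security (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Martin Campbell</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Airport Worker (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Tara Cardinal</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Woman in Casino (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Ben Cooke</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>MI6 Agent (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Simona Roman</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Dossier Girl (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ACTOR</Name>
   <String>Greg Bennett</String>
   <Simple>
    <Name>CHARACTER</Name>
    <String>Airport Driver, Miami (uncredited)</String>
   </Simple>
  </Simple>
  <Simple>
   <Name>ART_DIRECTOR</Name>
   <String>Michael Lamont</String>
  </Simple>
  <Simple>
   <Name>WRITTEN_BY</Name>
   <String>Paul Haggis</String>
  </Simple>
  <Simple>
   <Name>EDITED_BY</Name>
   <String>Stuart Baird</String>
  </Simple>
  <Simple>
   <Name>WRITTEN_BY</Name>
   <String>Ian Fleming</String>
  </Simple>
  <Simple>
   <Name>ART_DIRECTOR</Name>
   <String>Peter Francis</String>
  </Simple>
  <Simple>
   <Name>COSTUME_DESIGNER</Name>
   <String>Lindy Hemming</String>
  </Simple>
  <Simple>
   <Name>ART_DIRECTOR</Name>
   <String>Steven Lawrence</String>
  </Simple>
  <Simple>
   <Name>PRODUCER</Name>
   <String>Barbara Broccoli</String>
  </Simple>
  <Simple>
   <Name>DIRECTOR</Name>
   <String>Martin Campbell</String>
  </Simple>
  <Simple>
   <Name>DIRECTOR_OF_PHOTOGRAPHY</Name>
   <String>Phil Meheux</String>
  </Simple>
  <Simple>
   <Name>WRITTEN_BY</Name>
   <String>Robert Wade</String>
  </Simple>
  <Simple>
   <Name>WRITTEN_BY</Name>
   <String>Neal Purvis</String>
  </Simple>
  <Simple>
   <Name>PRODUCER</Name>
   <String>Michael G. Wilson</String>
  </Simple>
  <Simple>
   <Name>LEAD_PERFORMER</Name>
   <String>Chris Cornell</String>
  </Simple>
  <Simple>
   <Name>ART_DIRECTOR</Name>
   <String>James Hambidge</String>
  </Simple>
  <Simple>
   <Name>ART_DIRECTOR</Name>
   <String>Dominic Masters</String>
  </Simple>
 </Tag>
</Tags>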
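
Each entry in such a tag file is a Simple element with a Name, a String and an optional TagLanguage. As a rough sketch of how a script could emit one entry from shell: the helper names xml_escape and mkv_simple are made up here, and this is not how tmdb2mkvtags itself is implemented.

```sh
# Hypothetical helper: escape the characters that are special in XML text.
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Hypothetical helper: print one <Simple> element of a Matroska tag file.
# Arguments: tag name, value, optional 2-letter language code
mkv_simple() (
    name=$1
    value=$2
    lang=${3:-}
    printf '  <Simple>\n'
    printf '   <Name>%s</Name>\n' "$name"
    printf '   <String>%s</String>\n' "$(printf '%s' "$value" | xml_escape)"
    if [ -n "$lang" ]; then
        printf '   <TagLanguage>%s</TagLanguage>\n' "$lang"
    fi
    printf '  </Simple>\n'
)

mkv_simple TITLE 'James Bond 007 - Casino Royale' de
```

Escaping matters as soon as a title or synopsis contains an ampersand or angle brackets. As far as I know, a finished tag file can also be applied to an existing file with mkvpropedit movie.mkv --tags all:mkvtags.xml instead of going through mkvtoolnix-gui.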

Published on 2021-05-02 in tools, video

Lyrics in ogg/vorbis and mp3 files

My music library consists of mostly ogg/vorbis files, and I wanted to store the text that is sung directly in the files.

This is mostly uncharted territory - at least for .ogg files. It took three weeks of research to learn about the possibilities, solutions in other file formats and problems.

Lyrics formats

Lyrics can be synchronized or unsynchronized:

unsynchronized lyrics is just the song text as you see it in the booklet of a CD

It's plain text with newlines.
synchronized lyrics is the text plus timing information, which allows media players to show the exact line that is currently being sung. Karaoke depends on this.

In addition to the text lines, timing information is needed - be it for lines or even single words or syllables.

Standalone lyrics file formats

There is a huge number of different formats that allow you to store lyrics/subtitles in separate files.

LRC: Easy to read and easy to write text-only format: One line of text per lyrics line. Seems to be standard today for music.
srt: Very popular for movies. More verbose than LRC, but still plaintext.
MP3+G: Binary format used in Karaoke CDs. The karaoke text is ripped into .cdg files which must have the same name as the audio files.
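
Because LRC is plain text with one time stamp per line, synchronized lyrics can be turned into unsynchronized ones with a single sed call. A minimal sketch, assuming the common [mm:ss.xx] time stamps and lowercase ID tags like [ti:...]; the function name lrc_to_plain is made up:

```sh
# Hypothetical helper: turn LRC on stdin into plain, unsynchronized text.
lrc_to_plain() {
    # 1st expression: drop ID tag lines such as [ti:Title] or [ar:Artist]
    # 2nd expression: remove every [mm:ss] or [mm:ss.xx] time stamp
    sed -E -e '/^\[[a-z]+:.*\]$/d' \
           -e 's/\[[0-9]+:[0-9]+(\.[0-9]+)?\]//g'
}

printf '[ti:Countdown]\n[00:00.00]ten\n[00:01.00]nine\n' | lrc_to_plain
# prints:
# ten
# nine
```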

MP3

Let's have a look at MP3 first because tool support is better and there is an official standard for both synchronized and unsynchronized lyrics.

"Official" is not really correct here; the MP3 standard does not define how to store meta data in a file:

The MP3 standards do not define tag formats for MP3 files, nor is there a standard container format that would support metadata and obviate the need for tags.

However, several de facto standards for tag formats exist. As of 2010, the most widespread are ID3v1 and ID3v2, and the more recently introduced APEv2.

Wikipedia: MP3 Metadata

Data storage

The ID3v2 standard has meta tags for both synced and unsynced lyrics:

USLT: Contains the encoding of the text, the language, a description and the actual song text. Multiple entries are allowed, which means you can have lyrics in several languages.
SYLT: Also contains encoding, language, the time stamp format (MPEG frames or milliseconds), type and time-stamped text in a defined format.

Tools to add lyrics

On Linux, there is kid3 and SYLTeditor (which did not install on Ubuntu 14.04).

For Windows, you have MiniLyrics, SYLTeditor and there once was Windows Media Player 11 which could add synchronized lyrics to mp3 files.

kid3-cli can be used on the shell to add lyrics stored in .lrc files to mp3:

$ kid3-cli -c "set SYLT:'countdown.lrc' ''" countdown.mp3

Extracting the lyrics via CLI is also possible:

$ kid3-cli -c "get SYLT:/dev/stdout" countdown.mp3

Displaying lyrics

~~On Linux, there is not a single application that displays synchronized lyrics stored in the file when playing an music track.~~

As of 2021, only Lollypop supports displaying synchronized lyrics stored in the file when playing a music track. All other players that claim lyrics support either rely on local .lrc files or try to download the lyrics from somewhere on the internet.

On Windows you may use MiniLyrics which reads and displays SYLT data in mp3 files without problems.

Musique and Rhythmbox at least show embedded unsynchronized lyrics.

ogg/vorbis

ogg/vorbis is a patent-free audio codec and thus my preferred choice over mp3.

Tool support for lyrics is nearly non-existent.

Data storage

A small list of meta data tags for ogg/vorbis is given in the Vorbis I specification: 5.2.2. Content vector format, and standalone in the comment field and header specification. In addition, the xiph wiki lists proposed field names and links to three websites that propose some additional ones (1, 2, 3). None of them mentions lyrics.

The topic was mentioned on the vorbis mailing list in 2006 and 2008 with the conclusion that there is no standard, one should try CMML. I did not find any tools, example files or supporting clients for it, though.

OggKate can be used for synchronized lyrics, and kid3 uses the LYRICS tag to store unsynced lyrics in ogg files.

OggText was brainstormed in 2008 as well, but was never brought any further.

OggKate

In 2008 OggKate, a format for synchronized lyrics in ogg/vorbis files, was invented and implemented.

Kate is an overlay codec, originally designed for karaoke and text, that can be multiplexed in Ogg.

It does not only provide plain text lyrics but also animations, different colors, fonts and other styles.

Stream

It is not some data in a meta data tag field, but a separate stream inside an .ogg file, accompanying the music stream (just like movie files have both video and audio streams):

Lyrics in meta data

+----------+--------------------------+
| Metadata | Audio stream             |
+----------+--------------------------+

Lyrics as stream

+----------+--------------------------+
| Metadata | A u d i o   s t r e a m  |
|          | L y r i c s   s t r e a m|
+----------+--------------------------+

This has several implications:

✔ When playing the file, the text is available when it needs to be displayed.

No need to allocate extra memory for the lyrics, and no need for additional timing information in the lyrics data.
✔ Interweaving the audio and lyrics data makes it suitable for live audio streams that simply cannot have all the lyrics available at the beginning.
✘ Adding lyrics is not simply setting a tag but breaking up the whole file and re-assembling the whole file.
✘ Extracting lyrics requires reading the whole file and re-assembling the single pieces.

Players will need to do this if they not only want to display the current line or syllable, but also provide a glimpse of the following ones.

The 99% standard use case - lyrics for music files of max. 10 minutes length - does not benefit from this format. Even operas with a length of several hours only have about 100kiB data of timed lyrics (LRC), which is nothing for yesterday's hardware.

Another argument for streaming was made on the Xiph MIME Types and File Extensions talk page:

If you want it without the timing, you have to store it in headers, as streaming it will get you the text only as its presentation time is reached.

You could do that if you were loading from a file, but that's only a special case, so it's best to leave that text interleaved with other streams.

A player wanting to display the entirety of the lyrics at once would have to, if possible (eg, if not streaming), scan the entire file to recover the text. Parsing Ogg packets is relatively fast, so a threaded player could do this while starting streaming a file and have the text ready in under a second for a typical song I suppose.

I don't believe the "special case" argument; even when you're playing the middle of an ogg/vorbis file via HTTP you still have to fetch the beginning of the file to get information about the header and general structure.

Tool support

Creating and reading OggKate streams is done via kateenc and katedec. oggenc's --lyrics only works if kate support is compiled in, which is not the case on Ubuntu 14.04.

The only tool able to display OggKate lyrics streams in audio files I found is VLC 2.2.2, if you start playing and then select the subtitle. (2.1.6 does not detect the subtitle stream at all).

Creating OggKate streams

kateenc from libkate-tools on Ubuntu 14.04 allows one to convert .lrc and .srt into kate streams, which then can be muxed into ogg audio files with oggz:

$ kateenc -o lyrics.ogg -t lrc -l en countdown.lrc
$ oggz-merge -o countdown-with-lyrics.ogg countdown.ogg lyrics.ogg

Extracting the lyrics as LRC is possible with katedec:

$ katedec -t lrc oggz-kateenc-countdown.ogg

Playing files with Kate streams

Playing kate'd ogg/vorbis files works fine on VLC, mplayer, amarok, xine, audacious and ogg123 (even if they do not show the lyrics).

GStreamer-based applications like Rhythmbox and Totem open a new window that looks for extensions that understand "Kate decoder", but fail. Totem plays the file nonetheless, but Rhythmbox does not. Amarok 2.8 sometimes opened that window, but also plays the file.

Patches

The OggKate author - who never left his name anywhere by the way - provided OggKate patches for several media players:

xine

The mailing list discussion did not lead to anything as it seems. Xine does not display kate lyrics.

mplayer

The patch was rejected; the mplayer authors didn't want to support another subtitle format and preferred ASS.

gstreamer

The patch got included in 2009 into gst-plugins-bad 0.10.

Unfortunately it is not included in gstreamer 1.0 (yet), which explains the installation popup.

Thoggen (DVD-Ripper)

The feature request is still open for Thoggen.

Unsynced lyrics in .ogg

There is no officially endorsed vorbis tag/field for unsynchronized lyrics. The only tag in the wild I found was LYRICS which is created by the kid3 editor and lyrico.

We can add it with vorbiscomment from a lyrics text file:

$ vorbiscomment -a file.ogg -t LYRICS="$(cat lyrics.txt)"
$ vorbiscomment --list  countdown-unsync-vorbiscomment.ogg
title=Countdown aligned
artist=cweiske
date=2016
genre=Speech
encoder=Lavf53.21.1
LYRICS=ten
nine
eight
seven
six
5
4
3
2
1
0
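
Reading the lyrics back is harder than writing them, because a multi-line value cannot be told apart from several one-line comments in the plain --list output. vorbiscomment has an --escapes (-e) option that prints every comment on exactly one line. The following sketch builds on that; the function name lyrics_from_list is made up:

```sh
# Hypothetical helper: read `vorbiscomment --list --escapes` output on
# stdin and print the first LYRICS value with its line breaks restored.
lyrics_from_list() (
    # Vorbis field names are case-insensitive, so match any spelling.
    escaped=$(sed -n 's/^[Ll][Yy][Rr][Ii][Cc][Ss]=//p' | head -n 1)
    if [ -n "$escaped" ]; then
        # %b expands the \n and \\ escapes written by --escapes
        printf '%b\n' "$escaped"
    fi
)

# Usage (assumes vorbiscomment from vorbis-tools):
#   vorbiscomment --list --escapes file.ogg | lyrics_from_list
```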

oggz-comment can also be used to add unsynchronized lyrics to .ogg files:

$ oggz-comment file.ogg -o file-with-lyrics.ogg LYRICS="$(cat lyrics.txt)"
$ oggz-comment -l file-with-lyrics.ogg
Vorbis: serialno 0033215013
        Vendor: Lavf53.21.1
        title: Countdown aligned
        artist: cweiske
        date: 2016
        genre: Speech
        encoder: Lavf53.21.1
        LYRICS: ten
nine
eight
seven
six
5
4
3
2
1
0

Both Lollypop and Rhythmbox display them.

WMA

Windows Media Audio files support synced and unsynced lyrics in their meta tags. The ASF specification unfortunately does not define meta data tag names and thus also no meta data lyrics formats.

kid3 shows the synchronized lyrics in the .wma file with a tag name of WM/Lyrics_Synchronized.

Tools to add lyrics

Windows Explorer on Windows XP is able to add unsynchronized lyrics; Windows Media Player 11 could write both synchronized and unsynchronized lyrics.

Displaying lyrics

No tool I tested is able to display the lyrics embedded in .wma files while playing music.

exiftool at least shows them on cli, and kid3 notices that there are synchronized lyrics inside - but does not show them.

AAC

The Advanced Audio Coding format is mostly used by Apple's iTunes. I saw screenshots of the iTunes meta data editor that had an (unsynchronized) lyrics field. I did not follow this any further.

MiniLyrics claims to support lyrics in .m4a/.aac files.

Tools

The list of tools I found that could work with lyrics embedded in audio files, alphabetically sorted:

Clementine

Clementine is a music player that is able to show unsynchronized lyrics from mp3 files. It supports neither ogg lyrics nor synchronized text.

eyeD3

eyeD3 is a Linux command line application that reads and writes mp3 tags, supporting ID3v1, ID3v2.3 and ID3v2.4.

Unfortunately it is not able to extract a single tag in a way that's readable by other programs, and is also not able to extract synchronized lyrics:

<Title/songname/content description (TIT2): Countdown aligned>
[...]
<Software/Hardware and settings used for encoding (TSSE): Lavf53.21.1>

It is able to add unsynchronized lyrics to mp3 files:

$ eyeD3 --lyrics="eng::ten
nine
eight
seven
six
5
4
3
2
1
0" countdown.mp3

exiftool

exiftool is the Swiss Army knife of meta tags. It's able to extract data from most image, document and audio file types.

It shows synchronized lyrics from mp3 and wma files, and unsynchronized from mp3, ogg and wma:

$ exiftool countdown-sync-kid3-id3v2.4.mp3 |grep -i lyr
Synchronized Lyrics Type        : Lyrics
Synchronized Lyrics Description :
Synchronized Lyrics Text        : [00:00.00].ten, [00:01.00].nine, [00:02.00].eight, [00:03.00].seven, [00:04.00].six, [00:05.00].5, [00:06.00].4, [00:07.00].3, [00:08.00].2, [00:09.00].1, [00:10.00].0, [00:11.00].

$ exiftool countdown-sync-wmp11.wma |grep -i lyr
Lyrics                          : ten..nine..eight..seven..six..5..4..3..2..1..0
Lyrics Synchronised             : ..vsynchronized lyricstennine�.eight�.seven�.six�.5�.4�.3p.2X.1@.0(#

It does not show OggKate texts.

Foobar2000: show lyrics 3 plugin

The Windows-only Foobar2000 supports plugins, and one of the plugins/"components" is Lyrics Show Panel 3 (foo_uie_lyrics3).

While it claims to be able to read both synced and unsynced lyrics from ogg and mp3 files, it does so by using its own (configurable) meta tag, storing LRC in the tag for synced text. It does not support SYLT/USLT tags.

kid3

The meta data editor kid3 runs on Linux and supports mp3, ogg and wma.

It has a usable synced lyrics editor and documentation for it. Importing LRC is supported.

kid3-qt was the only software that actually supported storing unsynced lyrics inside ogg/vorbis files by setting the LYRICS tag.

kid3-cli is a command line tool that can be used to read and write .lrc files from and to mp3 files.

Lollypop

The GNOME application Lollypop supports SYLT in mp3 files, at least those that have been added by SYLT Editor, but not those by kid3.

It also reads .lrc files.

lyrico

lyrico automatically finds and downloads unsynchronized lyrics and is able to embed them into mp3, ogg, wma and m4a files.

It does not support synchronized text.

MiniLyrics

MiniLyrics is not available for Linux.

It reads synchronized and unsynchronized lyrics from mp3 meta data, but does not support the WMP11 generated lyrics in wma files, nor does it support .ogg. It claims to support M4A (AAC) files, but I did not test that.

It includes an editor for synchronized lyrics and can import LRC files.

Musique

Musique version 1.4 is able to read and display unsynchronized lyrics from mp3 files.

oggz-comment

oggz-comment from oggz-tools is able to read and write ogg/vorbis meta tags.

See the example how to use it to add unsynchronized lyrics.

Rhythmbox

Gnome's Rhythmbox displays unsynchronized lyrics in .ogg files because I made a patch for it.

SYLT Editor

SYLT Editor is freeware for Windows and Linux, but it did not install correctly on Ubuntu 14.04.

On Windows it worked, but it turned out not to support SYLT lyrics embedded by other editors. Once it even failed to read SYLT from a file that it had written itself.

I can't recommend it.

VLC - VideoLan Client

VLC was the only application I found able to display OggKate lyrics in .ogg files - when enabling audio visualization, the text is printed on top of that. (version 2.2.2 worked, 2.1.6 did not)

It does not support SYLT, but you can load .srt subtitle files when playing music with audio visualization enabled.

vorbiscomment

vorbiscomment from vorbis-tools is also able to read and write ogg/vorbis meta tags.

See the example how to use it to add unsynchronized lyrics.

Windows Explorer

Windows XP's Explorer is able to show and edit meta data of .wma files - and it allows you to add unsynchronized lyrics. You cannot set the lyrics language as you could with Windows Media Player 11, though.

Windows Media Player 11

Windows Media Player 11 was available on Windows XP and Windows Vista. To my knowledge it is the only application able to add synchronized lyrics to .wma files via its Advanced Tag Editor.

These tag editing capabilities were removed in Windows Media Player 12, much to the dismay of users.

Windows Media Player was not able to actually display the synchronized lyrics, which may have been the reason for the removal. But removing the lyrics editing feature from WMP means that Microsoft has officially given up on lyrics in .wma, and maybe even on .wma itself.

If you ever try to install Windows Media Player 11 on an old copy of Windows XP, you'll notice that it fails because it cannot verify the XP installation as "genuine" anymore.

You can work around this by manually extracting wmp11-windowsxp-x86-DE-DE.exe with 7-zip and manually executing all the wm*.exe files.

Hardware

List of hardware music players with lyrics support:

iPod classic
iPod nano (4th gen): Clicking 3 times on the wheel button shows the lyrics. source

Demo files

During the course of my research I created a number of audio files with the various tools. As base I used countdown.ogg from Corsica_S's countdown.wav but had to align it to the exact full seconds with Audacity. Then I created a lyrics .lrc file and used the various tools to add lyrics to it.

Audio files with embedded lyrics
Tool	File	Note
Source files
ffmpeg	countdown.aac
ffmpeg	countdown.flac
ffmpeg	countdown.mp3
unknown	countdown.ogg
ffmpeg	countdown.wav
ffmpeg	countdown.wma
Synchronized lyrics
kid3	countdown-sync-kid3-id3v2.3.mp3	ID3 v2.3
kid3	countdown-sync-kid3-id3v2.4.mp3	ID3 v2.4
MiniLyrics	countdown-sync-minilyrics.mp3
oggz-merge	countdown-sync-oggz.ogg	OggKate
SYLT editor	countdown-sync-sylteditor.mp3
Windows Media Player 11	countdown-sync+unsync-wmp11.mp3	Both synchronized and unsynchronized
	countdown-sync-wmp11.mp3
	countdown-sync-wmp11.wma
Windows Explorer	countdown-unsync-explorer.wma
Unsynchronized lyrics
eyeD3	countdown-unsync-eyeD3.mp3
Foobar2000	countdown-unsync-foobar2000.mp3
Foobar2000	countdown-unsync-foobar2000.ogg
kid3	countdown-unsync-kid3.flac
	countdown-unsync-kid3.mp3
	countdown-unsync-kid3.ogg
	countdown-unsync-kid3.wma
MiniLyrics	countdown-unsync-minilyrics.mp3
oggz-comment	countdown-unsync-oggz-comment.ogg
vorbiscomment	countdown-unsync-vorbiscomment.ogg
Windows Media Player 11	countdown-unsync-wmp11.mp3
Other
Foobar2000: Show lyrics 3	countdown-sync-foobar2000-show_lyrics_3.mp3	Custom tag name
Foobar2000: Show lyrics 3	countdown-sync-foobar2000-show_lyrics_3.ogg	Custom tag name

Apps missing embedded lyrics support

The following applications miss support for embedded lyrics:

Amarok: Bug 184325: amarok wish: fetch lyrics from the ID3 lyrics tag since 2009
EasyTAG: Bug 769201: Add lyrics tag support since 2016
eyeD3: Issue #103: Display synchronized lyrics since 2016
GStreamer: Bug 784634: unsynchronized lyrics not extracted from mp3 files since 2017
music player daemon: #3100: Add support for lyrics tag since 2010
Musique: Support for synchronized lyrics in MP3 files since 2016; Read lyrics from ogg/vorbis files since 2016
Quod Libet: #1989: synchronized lyrics plugin: Support embedded lyrics since 2016
Rhythmbox: Bug 463978: Lyrics reading embedded tags since 2007. I added support for ogg/vorbis lyrics tag support in 2017.
VLC: #17207: Display mp3 synchronized lyrics SYLT tag as subtitle since 2016

Conclusion

I'm dissatisfied with the state of embedded lyrics support on Linux.

For ogg/vorbis files - of which my music library consists mostly - I did not find a satisfying solution. OggKate is too complex for most needs; something like SYLT for vorbis would be better suited - or even simply embedding LRC into a SYNCLYRICS tag.
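To illustrate how little a plain-text approach would need, here is a minimal sketch that parses LRC lines as they might be stored in such a tag. SYNCLYRICS is only the tag name suggested above and not an existing standard; parse_lrc and the sample value are made up for illustration.

```python
import re

# One LRC line looks like "[mm:ss.xx]text".
LRC_LINE = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\](.*)$")


def parse_lrc(text):
    """Return a list of (seconds, lyric) tuples from LRC text.

    Lines that do not start with a timestamp (e.g. "[ar:Artist]"
    metadata or blank lines) are skipped.
    """
    entries = []
    for line in text.splitlines():
        match = LRC_LINE.match(line.strip())
        if match is None:
            continue
        minutes, seconds, lyric = match.groups()
        entries.append((int(minutes) * 60 + float(seconds), lyric.strip()))
    return entries


# Hypothetical value of a SYNCLYRICS vorbis comment
tag_value = "[00:00.00]ten\n[00:01.00]nine\n[00:02.00]eight"
for seconds, lyric in parse_lrc(tag_value):
    print(f"{seconds:5.2f}s  {lyric}")
```

A player would only have to look up the last entry whose timestamp is not later than the current playback position.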

The ID3v2 SYLT format is binary and would have to be base64 encoded, as it is done with album art in vorbis files, too.

And then, tool support would again be missing :(

Published on 2016-07-26 in music, tools

Getting an LDAP address book into the FritzBox

After 10 years I retired my ISDN telephone system, an Auerswald COMpact 3000 ISDN, and replaced it with a FritzBox 7390 (bought used for 15€). I connected two of the three phones directly to the FritzBox's built-in DECT base station, and the third one via the ISDN port. The doorbell is still attached to the analog phone port.

To make the phones show the names for numbers stored in the central LDAP address book again, I had to get the data onto the FritzBox somehow. Although the list of phone book projects is long, none of them could extract the data from LDAP.

So I hacked one together myself in 30 minutes: ldap2fbxml. The script queries the LDAP server and generates an XML file that can then be uploaded manually in the phone book web interface of the FritzBox ("Telefonbuch wiederherstellen", i.e. "restore phone book").

A regular automatic sync is not necessary, since I want to switch from LDAP to CardDAV at home soon anyway.

Published on 2021-02-04 in hardware, ldap, tools

wget -O empties file on error

wget can be used to fetch a file via HTTP, and it supports -O filename.html to force the downloaded file to have that exact name. If the file exists, it is simply overwritten with the new file content.

If the request fails, the local file is empty - but that's often not desired.

The reason for this was already explained in 2006:

-O, as currently implemented, is simply a way to specify redirection. You can think of it as analogous to "command > file" in the shell.

Hrvoje Niksic, wget@sunsite.dk mailing list

The solution to this problem is to not use wget but curl:

$ curl -f http://nonexistent/file.jpg -o localfile.jpg

It will keep the current file contents if the request fails, and overwrite it if the request succeeds.
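If you need the same "keep the old file on failure" behaviour inside a script, it can be sketched in Python: fetch first, write to a temporary file, and only then replace the target. This is a minimal sketch, not what curl does internally; fetch_url and save_if_successful are names I made up for the example.

```python
import os
import tempfile
import urllib.request


def fetch_url(url):
    """Download url and return the body; raises on HTTP errors such as 404."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_if_successful(target, fetch):
    """Write the result of fetch() to target.

    If fetch() raises, target is never opened and keeps its old content.
    """
    data = fetch()
    directory = os.path.dirname(os.path.abspath(target))
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, target)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

It could be called as save_if_successful("localfile.jpg", lambda: fetch_url("http://nonexistent/file.jpg")). The difference to a shell-style redirection is that the target file is only touched after the download has succeeded.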

Published on 2019-01-10 in http, tools

Avoid OpenShot

I spent most evenings of the last two months creating a wedding movie from about 150 single videos recorded with our camcorder. The non-linear video editor I chose was OpenShot, version 1.4.0 with libmlt 0.7.6. I still regret it, even though we finally managed to finish the video.

I'll list ~~all~~some of the bugs that we came across in the 2 months, and a better tool.

Frequent crashes
Transition problem
Audio problems
General slowness
For short movies only
Craftsmanship
Binary blobs are in
Various other issues
It's the lib!
An alternative
Other people

Frequent crashes

We had more than 350 crashes in the two months. The last days were so bad that you could do about 1 action and save before OpenShot crashed - e.g. add clip, save, crash. Start OpenShot again, move clip, save, preview, crash.

Transition problem

The worst bug of all, even worse than the crashes: Fade in/out glitches

When adding a transition (e.g. fade) between two video clips, now and then the target clip is fully visible at the beginning of the transition.

If that happens, the track is "tainted" and all following clip transitions will have the same problem. All of them. Every single one.

The only workaround is to add a new track and move all remaining clips onto that. For our movie, with a length of not quite an hour, we ended up with 22 tracks. Together with the frequent crashes, this was a total disaster.

Audio problems

On some scenes we added some ambient music with 25% or 30% loudness. When playing the preview (and the final rendered video), the music was missing.

The bug for that is "Decreasing audio volume of a clip doesn't work", and it was fixed in 2009. Unfortunately, my clock shows 2011 and the bug is there - again.

The problem here is localization (and the OpenShot developers not being aware of it): The English decimal point is a dot ".", while the German one is a comma ",". OpenShot (or MLT) expresses the volume as a floating point number from 0.0 (no sound) to 1.0 (100% volume) or higher. When OpenShot generates the MLT project file under a German locale, a volume lower than 100% is written as a number like "0,25". The MLT parser expects a dot as decimal point and throws away everything it does not understand or expect, which here is the comma and everything after it - so the volume ends up as 0.

To make it work, we had to start OpenShot with the English-style "C" locale:

$ LC_ALL=C openshot
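The effect can be reproduced with a small simulation. This is not OpenShot or MLT code; format_volume and lenient_parse are stand-ins I wrote to mimic a locale-aware writer and a parser that stops at the first character it does not expect.

```python
import re


def format_volume(volume, decimal_point):
    """Format a volume the way a locale-aware writer would (simulated)."""
    return f"{volume:.2f}".replace(".", decimal_point)


def lenient_parse(text):
    """Parse the longest leading number that uses '.' as decimal point.

    Everything after the first unexpected character is dropped, which
    mimics the parser behaviour described above.
    """
    match = re.match(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else 0.0


for name, point in (("English", "."), ("German", ",")):
    written = format_volume(0.25, point)
    print(f"{name}: wrote {written!r}, parser reads {lenient_parse(written)}")

# English: wrote '0.25', parser reads 0.25
# German: wrote '0,25', parser reads 0.0
```

With the German decimal comma, a 25% volume silently becomes 0, which matches the missing music.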

Update: Jonathan Thomas wrote that they are aware of locales, know about the problems and are sure that the problem does not exist:

OpenShot works fine in every culture we've localized it for, including many that use commas.

I still wonder why I clearly have the problems.

General slowness

The first problem we had when beginning our project was the bug "adding many files blocks interface".

That problem is not as bad as the next one: "Clip/Videos properties window is too slow". To change video and audio transition settings, or the loudness of a clip, you need that window. It takes three seconds(!) to close that window, which interrupts every workflow.

And that one is not as bad as "Video preview keeps going for some time after pausing". Imagine you are running the preview and want to continue working. The pause button only reacts 15 seconds after you pressed it. Unbearable.

On my 4 CPU system, only one CPU was utilized. Yay. Did I already tell you that OpenShot uses 100% CPU when I do nothing?? The devs say "behavior is by design"...

For short movies only

The timeline view in OpenShot is your main work tool to arrange videos, music and transitions. It has a zoom setting which allows you to determine the resolution you want to see: Let 5 seconds of the timeline fill the screen? That's ideal for fine-tuning transitions and clip alignments. Let 30 minutes fill the screen? Good to get an overview and jump quickly to a specific place.

The bug "Timeline only shows 320 times the zoom slider setting" breaks everything. It means that you have the full-detail zoom only for the first 10 minutes, and need to use the 12 seconds setting to be able to access minutes 50-60. With that bug, only coarse clip alignment is possible after the first 10 minutes.
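The numbers can be checked with a bit of arithmetic. This sketch assumes that the zoom setting is given in seconds and that the reachable timeline is exactly 320 times that value, as the bug title says; the 2-second value is only my guess at the finest setting, chosen because it roughly matches the "first 10 minutes".

```python
TIMELINE_FACTOR = 320  # taken from the bug title


def reachable_minutes(zoom_seconds):
    """Length of the usable timeline in minutes for a zoom setting."""
    return TIMELINE_FACTOR * zoom_seconds / 60


for zoom in (2, 12):
    print(f"zoom {zoom:>2} s -> first {reachable_minutes(zoom):.1f} minutes reachable")

# zoom  2 s -> first 10.7 minutes reachable
# zoom 12 s -> first 64.0 minutes reachable
```

So the 12 seconds setting is the first one in this example that reaches the end of a one hour movie.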

Craftsmanship

So now you have 30 images and want to arrange them on the timeline sequentially, and add a fade between each of them - a classic slideshow. Sounds simple? ~~It is - but not in OpenShot.~~

~~You have to do it manually. Add each of them on the timeline. Set the length for each of them (remember the 3 seconds properties window problem). Add a transition for each of them.~~

~~That's why~~ The feature request "Applying 'Effects' to a 'Group' of clips" should be implemented, but isn't yet.

Update: Jonathan Thomas replied that OpenShot has this feature; I did not read the manual properly: In the file list, select many clips and right-click them. Select "Add to timeline" and a new window will pop up. Here you can add transitions that will be applied to all of them. Unfortunately, the length is fixed to 5 seconds per picture, which is not always what I need.

Binary blobs are in

Having no batch mode would not be that bad if I could modify the project files by hand through writing some XML. ~~Unfortunately, the OpenShot developers decided to make it as hard as possible for the users to use additional tools and use a binary project file format.~~

I hope that the feature request "Use a text-based project file format" gets implemented some day.

Update: Jonathan Thomas wrote that OpenShot saves its files in a text-based format, but unfortunately the tools I used (less, gedit and file) told me it's binary. Maybe it's because I started the project with OpenShot 1.3 - anyway, I had the problem.

Various other issues

As if the bad bugs listed so far were not enough, I also encountered many small usability issues. Listed in no particular order:

It's the lib!

I got a mail from Jonathan Thomas (OpenShot developer) telling me my blog article is unconstructive and that most of the problems I experienced are not OpenShot's fault but that of MLT, the video library that is currently used.

While I can understand that technically, it is a reasoning that does not make OpenShot better or more usable. OpenShot crashes frequently, be it OpenShot itself or an underlying library - I do not care. It just doesn't work.

An alternative

Some days ago I got some feedback from Mark Emerson:

I'm running on Debian and having virtually all of the major problems you describe. I'm deep into a 1-hour, 60-file project now, and getting to the "1 edit, save, crash" stage. I must decide whether to abandon my editing and start over in another video editor. What editor do you recommend?

After finishing this one movie, I ditched OpenShot and have been using Kdenlive for the next two movies without any major problems. If you are looking for a tool that lets you finish your movie, try it.

Other people

The state of video editing on Linux tells you that, two years later, most of the bugs I experienced are still there and OpenShot is alpha-quality at best.

Tomasz Borek has a totally different experience; he had no problems whatsoever in mid 2013.

Another user, this time mid 2014:

Subject: Re: Avoid OpenShot
Date: Wed, 23 Jul 2014 20:55:16 -0700
Dear Sir: I just wanted to relate my problems with Openshot. I'm using v. 1.43 and didn't have any problems until I tried to open my project after saving it. Crashes the program every time. I'm using Ubuntu 14.04_64. I also tried to open my project with Mint 16-same result.

In the beginning of 2016, Louigi Verona made a video Video editing on Linux: Openshot in which he visually describes some of the problems with OpenShot.

Mentioned in "Why it is ok to criticize FLOSS".

Published on 2011-12-01 in bigsuck, tools, video

Transcript for Logbuch Netzpolitik #232

One of the podcasts I like to listen to is Logbuch Netzpolitik, and I consider its content so important that I would like to have a written record (transcript) of the conversations - that improves findability by search engines enormously.

In 2015 I already made an attempt to transcribe individual episodes via crowdsourcing, using a web service that is no longer available, but not enough people joined in - not even one episode got finished.

The topic did not let go of me, and two months ago I looked into it once more. It quickly became clear that this time I wanted to approach it with technical support, since speech recognition is widespread by now.

Speech to text

For speech recognition I tried Google Cloud Speech, which costs 0.6 ct per minute; that was not a problem, though, because you get US$300 of credit when you first sign up.

I looked at a few tools (transcribe_audio, podcast-transcriber, transcribe-podcast) and then put together my own script.

The LNP episodes are available as .opus files, and according to the documentation Google Cloud Speech supports this format .. but it did not work in 4 attempts. Support said that Opus support is still experimental.

I converted the 25 MiB .opus file into a 500 MiB .flac file (mono!) and uploaded it. A few hours later I had a JSON file with words and their position in time. An example:

There are several problems:

No sentence endings.
No speaker segmentation. The text spoken by different people is one single big block.
Faulty recognition. Recognition is ok for common words, but there are words like "Netzpolitik" that were always recognized incorrectly. Sometimes complete sentences were wrong.
The transcript is word-for-word, i.e. it includes filler sounds like "ähm" and repeated words. That does not read well.

All in all it was still better than having to type everything myself. "Only" correcting :)

Speaker segmentation

I did not want to split the sentences among the different speakers by hand either. During my search I found spokendata.com, which does not support German but has a pretty usable speaker recognition ("diarization") - and for free!

You can throw the URL of the .opus file at their web interface and get a mail an hour later saying that the XML file is ready.

For LNP episode 232 (which has 2 speakers) a total of 38 speakers were detected, but with the transcription program transcriber I could reduce that to 2 pretty quickly. It lets you replace speakers completely.

Merging

Now I had the text in the .json file from Google and the speaker segmentation in the XML file. Being a developer, I put together a small script that combines the two.

I got the best results when using the word end times.
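The idea can be sketched as follows. This is not the original script: the tuple layouts stand in for the real JSON and XML structures, and assign_speakers, to_turns and the speaker names are hypothetical. Each word is attached to the last speaker segment that starts at or before the word's end time.

```python
from bisect import bisect_right


def assign_speakers(words, segments):
    """Attach a speaker to every word using the word's end time.

    words:    list of (text, start, end) tuples, times in seconds
    segments: list of (start, speaker) tuples sorted by start time;
              a segment lasts until the next one begins
    """
    starts = [start for start, _ in segments]
    result = []
    for text, _start, end in words:
        # Last segment that begins at or before the word's end time.
        index = max(bisect_right(starts, end) - 1, 0)
        result.append((segments[index][1], text))
    return result


def to_turns(tagged_words):
    """Join consecutive words of the same speaker into one turn."""
    turns = []
    for speaker, text in tagged_words:
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)
        else:
            turns.append((speaker, [text]))
    return [(speaker, " ".join(texts)) for speaker, texts in turns]


words = [("hello", 0.0, 0.4), ("and", 0.5, 0.7),
         ("welcome", 0.7, 1.3), ("thanks", 1.2, 1.6)]
segments = [(0.0, "Speaker A"), (1.5, "Speaker B")]
print(to_turns(assign_speakers(words, segments)))
# [('Speaker A', 'hello and welcome'), ('Speaker B', 'thanks')]
```

In the sample data the word "thanks" starts while Speaker A's segment is still running, but ends in Speaker B's segment and is therefore attributed to Speaker B.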

Transcriber

For fear-of-the-update reasons my laptop still runs Ubuntu 14.04, and it has a usable audio transcription program: transcriber. It is pretty old (Tcl/Tk!), but usable nonetheless.

After converting the audio file to .wav and writing a conversion script from the segmented XML to the .trs format supported by transcriber, I could finally start.

Along the way I noticed that transcriber inserts empty segments when consecutive segments do not line up to the millisecond. In addition, the speaker reduction had left many consecutive segments with the same speaker. To avoid fixing all of that manually in the program, I built another script that compacts .trs files.
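The compacting step could look roughly like this. It is a sketch under assumptions, not the original script: it works on plain tuples instead of the .trs XML, and compact is a name I made up. Neighbouring segments of the same speaker are merged, and each segment is stretched to end exactly where the next one begins.

```python
def compact(segments):
    """Merge neighbouring segments of the same speaker and close gaps.

    segments: list of (start, end, speaker, text) tuples sorted by start.
    """
    merged = []
    for start, end, speaker, text in segments:
        if merged and merged[-1][2] == speaker:
            # Same speaker as before: extend the previous segment.
            prev_start, _prev_end, _speaker, prev_text = merged[-1]
            merged[-1] = (prev_start, end, speaker, f"{prev_text} {text}".strip())
            continue
        if merged:
            # Let the previous segment end exactly where this one starts,
            # so no empty filler segment is needed in between.
            prev = merged[-1]
            merged[-1] = (prev[0], start, prev[2], prev[3])
        merged.append((start, end, speaker, text))
    return merged


segments = [
    (0.0, 1.0, "A", "first"),
    (1.002, 2.0, "A", "second"),
    (2.004, 3.0, "B", "third"),
]
print(compact(segments))
# [(0.0, 2.004, 'A', 'first second'), (2.004, 3.0, 'B', 'third')]
```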

LNP episode 232 was 1h17m long, and the pure correction of the compacted transcript took me about 3h, a ratio of a bit more than 2:1.

HTML

In the end the whole thing should go online, so I needed it as an .html file. Yes, another script, but this time XSL instead of PHP: trs2html.xsl.

An audio player is embedded in the HTML, and the play button in front of each sentence lets you jump to exactly that position in the podcast!

You can see the result here:

Transcript of Logbuch Netzpolitik #232: Der böse Kleber aus Deutschland.

Links

During my research I came across some interesting blog posts, services and tools. Here is the list of links, without comments:

Speech Recognition Services

Google Cloud
Google Speech API
Bing
IBM Bluemix
Wit-AI8
Sphinx

Transcription services

spokendata.com
- free automatic transcription
- paid human transcription
- currently no support for German
deepgram.com
- free automatic transcription
- does not support German.
speechlogger
- play podcast in media player, use pavucontrol to change record source of browser to media player
- takes as long as the podcast.
- no newlines..
swiftscribe

Blogposts

Formats

Examples

Code

transcribe_audio (see blogpost, works offline!), multiple clouds, hardcoded filename/language
podcast-transcriber (google cloud)
transcribe-podcast - (google cloud)
http://otranscribe.com/ - web app to manually transcribe audio
- source code
video-transcriber - webapp to transcribe
cweiske podcast-transcriptions - my own things

Published on 2017-11-15 in podcast, politik, tools

Validating an Atom feed locally

The Atom feed format was invented in 2005. I prefer Atom over the four incompatible-with-each-other RSS formats because it is properly standardized.

After making an Atom feed, it is important to validate it to see if it's correct and every feed reader is able to understand it.

Online services

There are two web services to validate feeds:

feedvalidator.org
validator.w3.org/feed/, which validates by URL and copy&paste input.

Offline validation

At the time of writing, feedvalidator.org was broken and could not be used. Also during development, the feed most often is not available at a publicly accessible URL and thus validation by URL does not work. And copy&pasting is cumbersome. Validating the atom feed on your own machine without network requirements is to be preferred.

Atom feeds have to be validated on two levels:

XML well-formedness
Schema validity

Well-formedness

To check if your feed complies with the XML rules, simply check if it is well-formed:

$ xmllint --noout /path/to/feed.atom

If you get no output, all is fine and the feed is well-formed XML (e.g. its tags are properly nested).
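The same check can be done without xmllint, for example in a test suite. This is a minimal sketch using Python's standard library parser; is_well_formed is a helper name I made up, and the two sample documents are invented.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) or (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, str(error)
    return True, None


good = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>'
bad = '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</feed>'

for name, document in (("good", good), ("bad", bad)):
    ok, message = is_well_formed(document)
    print(name, "is well-formed" if ok else f"is broken: {message}")
```

Like the xmllint call, this only checks the XML syntax and says nothing about whether the document is a valid Atom feed.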

Schema validity

Apart from following the XML rules, Atom feeds also have to adhere to the rules that RFC 4287 defines. The RFC even contains a machine-readable Atom feed schema in appendix B: RELAX NG Compact Schema.

Unfortunately xmllint is not able to work with RELAX NG compact files, but trang can be used to convert .rnc to "normal" .rng files:

$ trang -I rnc -O rng atom.rnc atom.rng

Now we can use the atom.rng schema file to validate our feed:

$ xmllint --noout --relaxng atom.rng http://cweiske.de/tagebuch/feed/
http://cweiske.de/tagebuch/feed/ validates

XML schema

At the time of writing in 2017, I do not know of a single working XML schema file for the Atom feed specification.

www.kbcafe.com/rss/atom.xsd.xml does not even detect a missing <id> tag and thus cannot be trusted.
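A quick way to test whether a schema catches such an error is to know what must be there: RFC 4287 (section 4.1.1) requires exactly one atom:id, atom:title and atom:updated in a feed element. The following sketch checks only that and is in no way a replacement for real schema validation; the function name and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
# RFC 4287, section 4.1.1: a feed needs exactly one of each of these.
REQUIRED_FEED_CHILDREN = ("id", "title", "updated")


def wrong_feed_elements(xml_text):
    """Return the required feed children that do not occur exactly once."""
    root = ET.fromstring(xml_text)
    if root.tag != ATOM_NS + "feed":
        raise ValueError(f"root element is {root.tag!r}, not an Atom feed")
    return [name for name in REQUIRED_FEED_CHILDREN
            if len(root.findall(ATOM_NS + name)) != 1]


# Sample feed without an <id> element
feed_without_id = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>Example</title>"
    "<updated>2017-10-20T00:00:00Z</updated>"
    "</feed>"
)
print(wrong_feed_elements(feed_without_id))
# ['id']
```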

The OASIS CMIS atom feed schema is broken; xmllint reports an error when I try to use it:

complex type 'atomPersonConstruct': The content model is not determinist.

Simply use the atom.rng file linked above instead.

Published on 2017-10-20 in offline, tools, web, xml

Christians Tagebuch: tools
