(also: KItinerary + Browser Integration = <3)
KDE Itinerary is a project to get your travel itinerary presented to you in a unified, well-structured, and always up-to-date fashion, by extracting structured data from emails, boarding passes, and other sources. I successfully traveled the world with it!
Step 1: The crazy idea, is it viable?
Since I’m always looking for innovative new features to add to Plasma Browser Integration, having KItinerary look not only at your emails but also at websites seemed like a natural evolution. During the Nürnberg Megasprint™ in June I pitched the idea to Volker Krause and he talked me through how all of this structured data and boarding pass magic worked. I then wrote a quick and dirty browser extension that scanned your open tabs for any such annotations, so we could get a sense of how common they actually are on the open web.
It turns out: very. It is of course in a website’s own interest to add those attributes for higher ranking in search engines as well as richer and more useful search results. This got me very excited, but unfortunately I didn’t get around to working on it any further until Akademy. Also check out Volker’s blog post for more information on our little research effort.
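To give an idea of what such a scan looks for, here is a minimal sketch in Python. It assumes the annotations are embedded as schema.org JSON-LD inside script elements, which is only one of the forms structured data can take; the class and function names are made up for illustration and this is not the code of the actual extension.

```python
import json
from html.parser import HTMLParser


class JsonLdScanner(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.entities = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # ignore malformed annotations
            # A block may hold a single object or a list of them.
            self.entities.extend(parsed if isinstance(parsed, list) else [parsed])


def find_annotations(html):
    """Return all JSON-LD entities found in the given HTML string."""
    scanner = JsonLdScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.entities
```

Feeding it the HTML of a page and counting the results per `@type` would already give a rough picture of how widespread these annotations are.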
Step 2: KItinerary groundwork
Fast forward two months and during Akademy Volker and I met again. This time he showed me how KItinerary builds a sanitized data structure from the various sources it can process, such as PDFs, HTML, pkpass files, and so on. Now that Plasma Browser Integration gained a toolbar popup in its latest release, my idea was to add some useful actions and details there about the contents of the currently viewed website. Volker suggested focusing on hotels, restaurants, and events.
In order to keep KItinerary from becoming a giant over-engineered mess, entity types and attributes are added only as they are actually encountered in the wild. This meant that at this point it could only deal with reservations (e.g. a boarding pass is a “flight reservation”, which is typically what you get in an email) but not actual establishments like hotels or restaurants. Once that was fixed, I checked random restaurants and hotels in the vicinity to get some actual test cases. Thanks to Carl Schwan, our website guru, we also got machine-readable event details on our own Akademy 2019 page!
Step 3: Making it happen!
KItinerary comes with an external extractor binary, which was added so third-party projects can use it without having to link against it. When a user clicks on the toolbar popup, the HTML of the currently viewed website is sent to the extractor process, which then does its magic and returns a normalized JSON structure that I can pour into a (for now pretty crude) user interface. Depending on the type of entity and the data available, I show things like the start and end date of an event, the address of the venue, as well as contact information. The most useful part would of course be using KDE Connect to send the dates to your KDE Itinerary Android app (get it on F-Droid or directly from us), giving the organizers a call, or just adding the event to your calendar.
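The round trip to the extractor can be sketched roughly like this in Python. The extractor's actual command line is not spelled out in this post, so the sketch takes it as a parameter (`extractor_cmd`, a hypothetical name) and further assumes that the process reads the HTML on standard input and prints JSON on standard output.

```python
import json
import subprocess


def extract_from_html(html, extractor_cmd):
    """Pipe a page's HTML into an extractor process and parse its JSON output.

    extractor_cmd is the full command line as a list of strings. The real
    binary name and its options are an assumption left to the caller.
    """
    result = subprocess.run(
        extractor_cmd,
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=30,   # don't hang the popup on a stuck process
        check=True,   # raise CalledProcessError on a non-zero exit code
    )
    output = result.stdout.decode("utf-8").strip()
    # Treat empty output as "nothing found" rather than as an error.
    return json.loads(output) if output else []
```

Running the extraction in a separate process also means that a crash while parsing a weird website cannot take the caller down with it.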
A website can even add custom “potential actions”, such as a direct link to a booking page, or their own internal search page. Once we get a date and time picker control in Kirigami we could then send the restaurant information to Itinerary and let you enter when you’re actually going there.
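As an illustration of how such actions could be picked out of the extracted data, here is a small Python sketch. It assumes the entity is a dictionary following the schema.org vocabulary (`potentialAction`, `target`, `urlTemplate`); the helper name `potential_actions` is hypothetical and the normalized output of the extractor may be shaped differently.

```python
def potential_actions(entity):
    """Return (action type, target URL) pairs declared on an entity."""
    actions = entity.get("potentialAction", [])
    if isinstance(actions, dict):  # a single action may be given as a bare object
        actions = [actions]
    pairs = []
    for action in actions:
        target = action.get("target")
        if isinstance(target, dict):  # an EntryPoint object wrapping the URL
            target = target.get("urlTemplate") or target.get("url")
        if target:
            pairs.append((action.get("@type", "Action"), target))
    return pairs
```

Each pair could then become a button in the popup, for instance a “Book a table” link for a `ReserveAction`.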
For now it’s just a crude badge using a fitting Emoji, including the one for “Hotel” which looks more like a hospital :-)
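Picking the badge boils down to a lookup by entity type, roughly like the following Python sketch. The mapping is a made-up example (only the hotel emoji is mentioned above) and it assumes the type is stored under an `@type` key.

```python
# Hypothetical mapping from entity type to badge emoji; apart from the
# hotel, the extension's actual choices are not listed in this post.
BADGE_EMOJI = {
    "Hotel": "\U0001F3E8",       # hotel building, easily mistaken for a hospital
    "Restaurant": "\U0001F374",  # fork and knife
    "Event": "\U0001F4C5",       # calendar
}


def badge_for(entity, default=""):
    """Return the badge emoji for an entity, or default for unknown types."""
    entity_type = entity.get("@type")
    if not isinstance(entity_type, str):
        return default
    return BADGE_EMOJI.get(entity_type, default)
```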
As an additional gem I wanted to be able to process tickets viewed in the browser’s internal PDF viewer as well, since if you get your boarding passes as a PDF download the browser will probably just display them internally. Moreover, if you then download such a file, it will also offer to add it to your KDE Itinerary.
Step 4: You, giving it a try!
If you want to give it a try, you need a recent git master build of the kitinerary library. See also the troubleshooting guide, which comes with instructions on how to build the optional image recognition feature needed for reading barcodes inside PDF files using the zxing library.
For the browser side of things, check out the itinerary branch of the plasma-browser-integration repository and follow the “How to install” instructions for building from source code on its wiki page. Once up and running, visit your favorite event, restaurant, and hotel sites and see if they have any structured data we can process!
You might also want to build the kitinerary-workbench repository, which contains a developer tool that lets you inspect the extractor output. Just paste the URL of the website you’re interested in, change the “source type” to HTML, and compare what the “Extractor” tab says to what the “Post-processed” tab says. The contents of the latter are what the extension ends up showing in the popup.