WTF is DIME again? - writing a scanning tool for my HP LaserJet
TL;DR: I got pissed off with HPLIP not working, and then with the HP Smart app requiring account registration, so I reverse engineered the network communication and the HP Smart app and developed a tool called HPSimpleScan, written in Go, that can be used to scan from this printer (and not only this one). In the process I’ve written a Kaitai Struct definition of the long-forgotten DIME format and contributed it to the Kaitai Struct formats repo.
As a Linux user, I can safely say that printing on Linux is awesome. It really is. 98% of the time, you just connect the printer somehow, doesn’t matter if through USB or a network, and it just works. No driver installing or anything: thanks to CUPS (or similar) and widely supported PDLs, it works out of the box. Problems may arise once you try to scan, however, and even more so when scanning over the network. There are a few “widish-ly” supported standards (like eSCL, WSD etc.), but often only newer or certain scanners support them, and a lot of the time they still need proprietary software. This brings me to my printer.
HP LaserJet 100 colorMFP M175nw
It’s quite old but has everything you could ever want from a printer/scanner all-in-one. It prints colour, has an Ethernet port, an ADF, and no BS online printing services if you don’t want them - just perfect. Printing was never a problem with this baby (PCL 6 ftw), but scanning… oh boy… scanning. According to the docs, it supports TWAIN and Windows Image Acquisition (WIA), both of which are AFAIK USB only. So how do you scan via network? On Windows, you just download the correct driver from the HP website and you’re good to go. On Linux, well…
HPLIP is HP’s partly open-source, partly proprietary Linux driver for HP devices. That’s great when it works, but sucks bad when it doesn’t. It worked for me for quite a long time; it was always a huge hassle to set up in the beginning, but then it worked okay. Until I reinstalled my laptop in early 2019. After that, I never managed to get it working again. I tried manual discovery, I tried different protocols, I tried different versions, I tried sudo systemctl stop firewalld, I tried different SANE drivers, like sane-airscan - nothing helped. When I needed to scan something, I would spend a full evening tinkering with HPLIP, then give up at like 2:00 AM and scan either from Windows or the mobile HP Smart app. After like 5-6 of those evenings I gave up completely and used the app right away and then transferred the file to my computer. Which worked flawlessly until…
“What if we make them create an account?”
I can hear some clever head at an HP business strategy meeting asking that question. Now you had to register to use the app, to use YOUR printer. Along with, you guessed it, agreeing to every possible use of your juicy personal data in the world. That pissed me off immensely. Now it was on, this was personal. And I thought to myself: how hard could it be to write my own driver?
Developing my own scanning tool
PCAPs, JADX & chill
I downloaded an older version of the Android app from apkmirror. Loaded it up in an emulator, fired up Wireshark, started scanning and soon enough:
It was a SOAP API, which as always with SOAP was on the one hand disappointing, but on the other a relief, because it could have been something much worse (in the 90s-2000s people^W Microsoft experimented with all sorts of things). The request was clearly getting ScannerElements, presumably the possible scanner options, and the printer returned them! From the capture, I reconstructed how the app requests and retrieves the scan (in the non-ADF mode):
The app also interleaves those requests with GetScannerElements, probably just to check that everything is happening as it should. The GetJobInfo requests and responses looked very clear as well. The RetrieveImageRequestResponse, however, was weird. It was clearly some binary format combining XML and JPEG into one response? What the heck?
This bugged me for a long while; at the time I completely missed the Content-Type: application/dime header. So instead I decompiled the app with JADX to see how it was parsed. After an hour or so I found the code that seemed responsible for saving the image to disk:
If we skip the unimportant parts: it gets the JobID, then feeds it to the b function, which probably returns an object representing that job. This object and a file path ending with /dime_message are fed to function a, which returns another object that is then checked for being binary. Then its file property is somehow iterated over, and if the type equals image/jpeg, that part gets saved to a file <CURRENTMILLIS>.jpeg. Now it finally occurred to me to google dime_response and I found out about…
Do you know MIME? It is a way of embedding multiple files of different Content-Types into a single file. It was developed in the 90s for email, where you often want to embed text, HTML, pictures or other attachments into the single text message that you actually send. At the beginning of the message, a boundary string is specified that separates the parts, and before each part there is a header specifying that part’s Content-Type (and how it is encoded etc.). Simple and plaintext:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=frontier

This is a message with multiple parts in MIME format.
--frontier
Content-Type: text/plain

This is the body of the message.
--frontier
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64

PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--frontier--
DIME is Microsoft’s early-2000s attempt at making MIME more efficient by making it binary. For example: instead of having Content-Type: before each part, you just specify the type to be at a given offset and of a given length, which saves space and bandwidth. It never even made it to an RFC and never got past the draft stage.
Okay, so that explains it. The response is multipart: part XML, part JPEG. The JPEG part is big, so it’s split into multiple chunks with metadata in between. The question now was how to get the JPEG out of it. Because DIME is truly an obscure thing, the only parsers I could find were in Perl, Java or PHP. No Golang :( So what to do now?
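For a feel of what such a parser has to do, here is a hand-written sketch in Go (not the Kaitai-generated code the tool actually uses). It assumes the record layout as I read it in the DIME draft: a 12-byte big-endian header with a 5-bit version, the MB/ME/CF flags, a 4-bit type format and four length fields, followed by options, ID, type and data, each padded to 4 bytes. The names Record, Payload, ParseRecords and Payloads, as well as the tiny demo message, are made up for the illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Record is a single DIME record; field names loosely follow the draft.
type Record struct {
	First, Last, Chunked bool  // MB, ME and CF flags
	TypeFormat           uint8 // TYPE_T (1 = media type, 0 = unchanged)
	ID, Type             string
	Data                 []byte
}

// Payload is one logical part after its chunks have been joined.
type Payload struct {
	Type string
	Data []byte
}

// ParseRecords walks a DIME message record by record.
func ParseRecords(msg []byte) ([]Record, error) {
	// take returns the next n bytes and skips the padding to a 4-byte boundary.
	take := func(n uint32) ([]byte, error) {
		if uint64(n) > uint64(len(msg)) {
			return nil, errors.New("dime: truncated record")
		}
		field := msg[:n]
		skip := (uint64(n) + 3) &^ 3
		if skip > uint64(len(msg)) {
			skip = uint64(len(msg)) // tolerate missing padding at the very end
		}
		msg = msg[skip:]
		return field, nil
	}
	var recs []Record
	for len(msg) > 0 {
		if len(msg) < 12 {
			return nil, errors.New("dime: truncated header")
		}
		h := msg[:12]
		msg = msg[12:]
		if v := h[0] >> 3; v != 1 {
			return nil, fmt.Errorf("dime: unsupported version %d", v)
		}
		r := Record{First: h[0]&4 != 0, Last: h[0]&2 != 0, Chunked: h[0]&1 != 0, TypeFormat: h[1] >> 4}
		lens := []uint32{
			uint32(binary.BigEndian.Uint16(h[2:4])), // OPTIONS_LENGTH
			uint32(binary.BigEndian.Uint16(h[4:6])), // ID_LENGTH
			uint32(binary.BigEndian.Uint16(h[6:8])), // TYPE_LENGTH
			binary.BigEndian.Uint32(h[8:12]),        // DATA_LENGTH
		}
		var fields [4][]byte
		for i, n := range lens {
			f, err := take(n)
			if err != nil {
				return nil, err
			}
			fields[i] = f
		}
		r.ID, r.Type, r.Data = string(fields[1]), string(fields[2]), fields[3]
		recs = append(recs, r)
	}
	return recs, nil
}

// Payloads joins chunked records: a record continues the previous payload
// when the record before it had the CF flag set.
func Payloads(recs []Record) []Payload {
	var out []Payload
	continuing := false
	for _, r := range recs {
		if continuing {
			last := &out[len(out)-1]
			last.Data = append(last.Data, r.Data...)
		} else {
			out = append(out, Payload{Type: r.Type, Data: append([]byte(nil), r.Data...)})
		}
		continuing = r.Chunked
	}
	return out
}

// demo is a made-up three-record message: an XML record (MB set), then a
// fake "JPEG" split into a first chunk (CF set) and a last chunk (ME set).
var demo = []byte("\x0c\x10\x00\x00\x00\x00\x00\x08\x00\x00\x00\x04text/xml<a/>" +
	"\x09\x10\x00\x00\x00\x00\x00\x0a\x00\x00\x00\x03image/jpeg\x00\x00\xff\xd8\xff\x00" +
	"\x0a\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\xe0\x00\x00\x00")

func main() {
	recs, err := ParseRecords(demo)
	if err != nil {
		panic(err)
	}
	for _, p := range Payloads(recs) {
		fmt.Printf("%s: %d bytes\n", p.Type, len(p.Data))
	}
}
```

Getting the JPEG out is then just a matter of picking the payload whose type is image/jpeg and writing its data to a file. A real response would of course carry far larger chunks than this toy message.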
To quote from the official website:
Kaitai Struct is a declarative language used to describe various binary data structures, laid out in files or in memory: i.e. binary file formats, network stream packet formats, etc.
The main idea is that a particular format is described in Kaitai Struct language (.ksy file) and then can be compiled with ksc into source files in one of the supported programming languages. These modules will include a generated code for a parser that can read the described data structure from a file or stream and give access to it in a nice, easy-to-comprehend API.
This seemed like the ideal tool for the job, so with the help of the draft, this Microsoft article from 2002 and this very helpful article by Imran Nazar, which helped me understand the format more quickly, I described the format in Kaitai Struct and it worked like a charm!
And now for the “easy part”. Using this awesome tool I converted all the XML to Go structs. Because this is SOAP, I also had to write a prepare method for each struct that sets the static attributes such as schemas, XSD URLs, encoding styles etc. Then I used the Kaitai Struct compiler to compile the .ksy definition to Go. Then all that was left was some basic CLI code on top of it :)
sijisu@ThinkSUSE ~ $ hpsimplescan
NAME:
   HPSimpleScan - simple scanning for some older HP printers/scanners, especially the HP LaserJet 100 colorMFP M175nw

USAGE:
   hpsimplescan [global options] command [command options] [arguments...]

VERSION:
   v0.2

COMMANDS:
   status, i    get the current scanner/printer status
   scan, s      scan from the scanner platen to file
   scanadf, sa  scan from the scanner ADF to folder
   help, h      Shows a list of commands or help for one command

GLOBAL OPTIONS:
   -i IP          IP or hostname of the scanner/printer to connect to (default: "192.168.1.3")
   -p port        port of the SOAP API on scanner/printer (default: 8289)
   --debug, -d    debug output (default: false)
   --verbose, -v  verbose output (default: false)
   --help, -h     show help (default: false)
   --version, -V  print the version (default: false)
Recently I’ve added support for the ADF (because I needed it lol), but many features are still missing. We will see if they will ever make it.
While writing this article I found that the SOAP API is in fact almost identical to WSD (the kinda-standard for scanning from the beginning, remember?). CreateScanJobRequest and GetScannerElementsRequest are identical, GetJobInfo is missing and RetrieveImageResponse explicitly specifies MIME as the response format. So maybe my printer has some early, unfinished prototype of WSD?
I think sometimes it’s okay to get angry at things, because it can force you into making them better and, if nothing else, into learning something in the process. If I had just kept using an older version of the app (or just used Windows :P), I would have been fine, but I would never have learned about the intricate world of scanning and obscure formats.
You can find the project on my GitLab.