DjVu File Handling on the macOS Command Line with DjVuLibre
About DjVu
DjVu is a web-centric format and software platform for distributing documents and images. DjVu can advantageously replace PDF, PS, TIFF, JPEG, and GIF for distributing scanned documents, digital documents, or high-resolution pictures. DjVu content downloads faster, displays and renders faster, looks nicer on a screen, and consumes fewer client resources than competing formats.
DjVuLibre
DjVuLibre is a popular open-source library and set of tools for working with DjVu files. It provides command-line utilities that allow you to perform various operations on DjVu files.
The DjVuLibre documentation is at times hard to follow, so here is an overview of the commands I use most.
Install DjVuLibre on macOS
Prerequisite: Homebrew. We will install DjVuLibre with the Homebrew package manager.
- Open Terminal.
- Run `brew install djvulibre`
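To confirm the install worked, a small check along these lines can help. It is only a sketch: `check_tools` is a helper name made up for this example, not part of DjVuLibre, and it only tests whether the commands are on your PATH.

```shell
# check_tools is a hypothetical helper, not a DjVuLibre command.
# It prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_tools djvudump djvutxt` should print a `found:` line for each tool if Homebrew linked them correctly.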
Structure and metadata with djvudump
This command displays information about the structure and metadata of a DjVu file.
`djvudump <input.djvu>`
For example, if you have a DjVu file called “document.djvu” and you want the structure and metadata, the command would look like this:
djvudump document.djvu
Instead of typing the path, you can drag the file from Finder into the Terminal window to insert its full path.
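File paths often contain spaces, so it pays to quote them once you start scripting. Below is a minimal sketch that saves the `djvudump` report of each file next to the file itself. The function name `dump_structure` and the `.dump.txt` suffix are invented here for illustration.

```shell
# dump_structure is a hypothetical helper, not a DjVuLibre command.
# For each file given, save the djvudump report to <name>.dump.txt.
dump_structure() {
  for f in "$@"; do
    out="${f%.djvu}.dump.txt"   # document.djvu -> document.dump.txt
    # Quoting "$f" and "$out" keeps paths with spaces intact.
    djvudump "$f" > "$out" && echo "wrote $out"
  done
}
```

For example, `dump_structure "My Scans/document.djvu"` would write the report to `My Scans/document.dump.txt`.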
Extract text from a DjVu file
To extract text from a DjVu file, you can use the `djvutxt` command. Here's the basic usage:
`djvutxt <input.djvu> <output.txt>`
For example, if you have a DjVu file called “document.djvu” and you want to extract the text layer into a file called “text.txt”, the command would look like this:
djvutxt document.djvu text.txt
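If you have many files, or only want part of one, a couple of small wrappers can save typing. This is a sketch, not the only way to do it: `extract_text` and `extract_page` are hypothetical helper names, and the `-page=` option is used as described in the djvutxt man page, so check `man djvutxt` on your install. As far as I know, djvutxt prints to the terminal when no output file is given, which is what the second helper assumes.

```shell
# Hypothetical helpers around djvutxt; the names are made up for this example.

# Extract the text layer of each given file into a matching .txt file.
extract_text() {
  for f in "$@"; do
    djvutxt "$f" "${f%.djvu}.txt"   # document.djvu -> document.txt
  done
}

# Print the text of selected pages to the terminal.
# Usage: extract_page <input.djvu> <pagespec>
extract_page() {
  djvutxt -page="$2" "$1"
}
```

For example, `extract_text *.djvu` would convert every DjVu file in the current folder, and `extract_page document.djvu 3` would show the text of page 3.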