Introducing Representations: Transform your documents and images to fit your needs
At Box, one of our goals is to help you get more value out of all the content you store in Box. Beyond just providing secure, compliant cloud storage, Box offers a number of value-added services on top of basic storage that developers can use to build content-centric apps in the cloud.
Whenever a file is uploaded to Box, it goes through a process that generates various digital representations of that file. In our web application, we use these representations for several purposes: thumbnail images that we display in the grid view, single-page PDF representations that power our Preview functionality, and extracted text that we index so users can search within text documents. Today, we’re enabling you to take advantage of these assets with a new service in our APIs called Representations. You can learn more about Representations by reading on, visiting our reference documentation, or checking out Representations in the Box API Navigator tool. You can also check out this tutorial for using Representations to fetch watermarked PDFs from Box.
With Representations, you can retrieve the digital assets that Box generates automatically when a file is uploaded and use them in your own way. For example, you could upload a document (a Word document, PowerPoint presentation, or Excel spreadsheet) to Box via the API and then, with a simple GET request, retrieve the extracted text from that document to store elsewhere. You might even put that extracted text to work by sending it to a text translation service like the Google Cloud Translation API and receiving a translated version of the document in response.
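To make the "simple GET request" concrete, here is a minimal Python sketch of that flow. It rests on assumptions that this post does not spell out, so check them against the reference documentation: that representations are listed by requesting the `representations` field on a file, that the desired representation is named in an `X-Rep-Hints` header, and that each returned entry carries a `content.url_template` with an `{+asset_path}` placeholder. The function names, the file ID, and the token are hypothetical.

```python
import json
import urllib.request

# Assumed base URL for the Box content API.
API_BASE = "https://api.box.com/2.0"


def representations_request(file_id, access_token, hints="[extracted_text]"):
    """Return (url, headers) for a GET that lists a file's representations.

    The X-Rep-Hints header name and the bracketed hint syntax are
    assumptions, not taken from this post.
    """
    url = f"{API_BASE}/files/{file_id}?fields=representations"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "X-Rep-Hints": hints,
    }
    return url, headers


def content_url(entry, asset_path=""):
    """Fill the assumed {+asset_path} placeholder in an entry's URL template."""
    return entry["content"]["url_template"].replace("{+asset_path}", asset_path)


def fetch_extracted_text(file_id, access_token):
    """Perform the two network calls: list representations, then download."""
    url, headers = representations_request(file_id, access_token)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as resp:
        listing = json.load(resp)

    # Assumes the first entry is the text representation and is ready.
    entry = listing["representations"]["entries"][0]
    download = urllib.request.Request(
        content_url(entry),
        headers={"Authorization": headers["Authorization"]},
    )
    with urllib.request.urlopen(download) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_extracted_text("12345", "YOUR_ACCESS_TOKEN")` would return the document's text as a string, which you could then store or pass to a translation service. A production client would also confirm that the representation has finished generating before downloading it, and would handle an empty list of entries.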
The following Representations are available via the Box API:
- PDF Representations for all document file types, including presentations and spreadsheets. If the document has a watermark object applied to it, you can retrieve a watermarked PDF.
- Thumbnail Representations for all document and image file types. Thumbnails are available as JPGs in 32x32, 94x94, 160x160, 320x320, 1024x1024, and 2048x2048 sizes and as PNGs in 1024x1024 and 2048x2048 sizes. You can specify one or more of these formats and sizes in your request.
- Single-page Image Representations for all document and image file types. You can request a PNG (1024x1024 or 2048x2048) of a specific page of a file. You can think of these as full-resolution thumbnails for each page of a document.
- Text Representations for all document file types, including presentations and spreadsheets. Text Representations provide you with unformatted text contained in a document as a .txt file.
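Because a request can name one or more formats and sizes, it can help to build and validate that selection in code before sending it. The sketch below checks thumbnail requests against the sizes listed above and joins several selections into one string. The bracketed syntax (for example `[jpg?dimensions=320x320]`) and the `pdf` and `extracted_text` names are assumptions about how the API expects representations to be named, and the helper names are hypothetical.

```python
# Sizes taken from the list above; the hint syntax itself is an assumption.
JPG_SIZES = {"32x32", "94x94", "160x160", "320x320", "1024x1024", "2048x2048"}
PNG_SIZES = {"1024x1024", "2048x2048"}
ALLOWED_SIZES = {"jpg": JPG_SIZES, "png": PNG_SIZES}


def image_hint(fmt, size):
    """Return a hint for one thumbnail, rejecting unlisted format/size pairs."""
    if fmt not in ALLOWED_SIZES:
        raise ValueError(f"unsupported thumbnail format: {fmt}")
    if size not in ALLOWED_SIZES[fmt]:
        raise ValueError(f"{size} is not available for {fmt}")
    return f"[{fmt}?dimensions={size}]"


def rep_hints(*hints):
    """Concatenate individual hints into a single request value."""
    return "".join(hints)


# Ask for a small JPG thumbnail, a PDF, and the extracted text together.
combined = rep_hints(image_hint("jpg", "320x320"), "[pdf]", "[extracted_text]")
print(combined)
```

Validating locally means a request for, say, a 32x32 PNG fails fast with a clear error instead of coming back from the API without the asset you expected.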