Production Media Storage, Security & Workflow
A Guide to Collecting Media from the Field
No Big Deal.
I like being presented with a challenge. One day, I was asked to take thousands of photographs from a bunch of different sources and catalog each one using a spreadsheet.
My first thought was, “There has got to be a better way!”
So I ran through all the reasons I was being asked to do this thing:
- The photos were taken on a series of days by multiple photographers during multiple shoots at various locations across the country, and all of this information was relevant.
- The photos had personalities, objects, and events in them that needed to be referenced.
- The photos represented a large investment on the part of the production company and had to be kept secure from unauthorized use.
Two of these goals can be achieved by filling out a spreadsheet, of course, but what if a filename changes? What if the spreadsheet is lost?
To address all three goals I made things a bit more complicated.
Stay with me here.
The Alpha and the Meta
Cameras encode image files with extra information. This photo of my bonsai, for example:
Camera: LGE Nexus 5
Lens: 4 mm (max aperture f/2.4), shot wide open
Exposure: Auto exposure, Program AE, 1/36 sec, f/2.4, ISO 323
Flash: Off, Did not fire
Date: June 9, 2015 12:10:09PM (timezone not specified)
Location: camera pointing northeast
File: 3,200 × 2,368 JPEG (7.6 megapixels) 2,446,340 bytes (2.3 megabytes)
Color Encoding: Embedded color profile: “sRGB”
All of the data you see above was extracted from the JPEG file that is my beautiful bonsai. This extra information is called metadata, and for images, it comes in a few different flavours:
The International Press Telecommunications Council Information Interchange Model (IPTC IIM) was developed in the 1990s to describe images as well as other types of media. This encoding format became widely accepted by photographers and is used heavily by European news agencies.
The Exchangeable Image File Format (Exif) is used for images and sounds recorded by digital cameras and scanners. The Japan Electronic Industries Development Association produced the first version of Exif, and the metadata tags defined by this standard are extensive.
The Extensible Metadata Platform (XMP) is an ISO standard created by Adobe Systems Inc. that applies to a wide variety of digital documents and data sets. This encoding format is largely superseding the others as the working standard for images and more.
That’s most of your organizational infrastructure already in place. Let’s look at my beautiful bonsai again:
This image is located in my Downloads folder on my Mac, so it has been automatically indexed. We can see this indexed information by selecting the file and pressing ⌘ + I to get info on it.
Under the ‘More Info’ section we can even see that macOS has added additional information to the file: the web address it was downloaded from.
Indexing is seen in action when we type “nexus” into the search field of a Finder window. The file is not named “nexus”, but an instance of “nexus” does appear in the metadata encoded as part of that file, so the file ends up in our search results.
Leveraging the power of metadata, we have the beginnings of a system for categorizing and organizing all the media we want to, including images.
For the modern photo catalog, metadata is hugely important. When we need resources to create fresh content, we need those resources now. The archivist makes this possible, and Adobe Systems Inc. has got your archivist’s back.
We search by date, time, place, photographer, camera, lens, people, objects, actions; descriptors of all kinds. Keeping it all in order and available for indexing is what software like Adobe Lightroom can do.
Lightroom comes with a powerful metadata editor. This allows the archivist to select individual files or entire series and apply meta information to them. The archivist corrects for things such as date and time, and will also add location and content specific information.
Images can also be tagged by countless keywords, and producers and directors may have specific requests for the archivist in this regard.
Other, free metadata editors exist as well; I’ve used Lightroom here because it’s the simplest and most convenient example for me.
Roping in Media
Each photographer had their own way of getting their data to the production office. Some out-of-province photographers would use file sharing services like Dropbox while others would travel to the office with hard drives.
These methods work, but they are neither secure nor efficient. File-sharing outfits (like Dropbox) are subscription services that cost a pretty penny, especially if you want any substantial amount of storage space.
It’s also not a very pleasant idea to know that some Internet company has all of your photos in their possession before you do (insert NSA joke here). Accounts like these have also been compromised often enough that I discourage the use of such services for anything other than cat memes.
Having photographers bring their media into the production office won’t always be possible if they’re shooting in another region of the world, and even if they’re not, hard drives can get lost, damaged or stolen on the trip from point A to point B.
It’s also quite expensive to have to provide hard drives to everyone from whom you want to collect data.
To overcome these hurdles we need a streamlined solution for moving information from point-to-point without exposing our data to theft or degradation. At the risk of sounding technical, this means we need a node, several bins, and a connection to the Internet.
A node, simply put, is a thing that connects to a network. Some examples of nodes include routers, switches, hubs, bridges, printers, telephones and desktop computers.
In the context of collecting media from your crew, our kind of node is a dedicated server. The idea is to have a single gateway through which all of the media you want to collect will travel to its final storage location, and our node is our gatekeeper. Our gateway offers us a measure of security because we have full control over access to the system.
If you haven’t been living on Mars, in a cave, with your fingers stuck in your ears for the past few decades, you’ve probably already encountered this kind of setup. SFTP (the SSH File Transfer Protocol) is a protocol for transferring files securely over the Internet.
With SFTP, we can have our field agents push their content directly to the production office without having to worry about possible theft or data corruption, and without having our media copied to some weird corner of the Internet where who knows what’ll happen to it.
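Authentication for those field agents is best handled with SSH keys rather than passwords. A sketch of generating a keypair for one photographer — the file names and comment are hypothetical, and in real use you’d protect the private key with a passphrase:

```shell
# Generate an ed25519 keypair for a field agent.
# -N "" skips the passphrase to keep this example hands-free;
# for a real agent, use a passphrase.
ssh-keygen -t ed25519 -N "" -f field-agent-key -C "photographer-01"

# The public half (field-agent-key.pub) gets appended to that agent's
# ~/.ssh/authorized_keys on the server; the private half stays with
# the photographer and never travels anywhere.
ls -l field-agent-key field-agent-key.pub
```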
The Power of macOS
Apple’s famous operating system, macOS, which I’m sure you’re familiar with, is jam-packed with all of the tools we need to achieve our goal of creating a server to collect media from the field.
It can also run Adobe Lightroom, with which we’ll fully integrate our media into the production workflow. This allows our archivist to work on the same machine that collects the data they are to work with. This dual-function serves us well in maintaining a streamlined, inexpensive approach to our solution.
There are countless online tutorials that spell out the process of setting up a server in this way, and I will likely write all of those steps into a fresh, new article (The steps are numerous and quite complicated). For now, let’s assume that it’s all up and running. ; )
For our remote, Internet-connected agents, there’s a tool that runs on both Mac and Windows platforms called Cyberduck.
This handy, dandy, open-source tool allows anyone running a Mac or Windows machine to use the SFTP protocol and to drag and drop their content to upload it. There’s no mussin’, nor indeed any fussin’, with this beauty.
These two sets of tools are all that we’re going to need to facilitate our secure transfer of media. The server accepts and secures incoming connections from the Internet via SSH; the incoming user is restricted to a single folder and to the SFTP protocol; our user dumps their images into the incoming bin, then logs out. Done.
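That “single folder, SFTP only” restriction maps onto just a few lines of SSH server configuration. A sketch of the relevant fragment — the group name `sftponly` and the path `/srv/incoming` are placeholders of my choosing, not a standard:

```
# /etc/ssh/sshd_config (fragment)
# Members of the sftponly group are jailed to the incoming bin
# and can only run the built-in SFTP subsystem — no shell access.
Match Group sftponly
    ChrootDirectory /srv/incoming
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
```

Note that OpenSSH requires the chroot directory itself to be owned by root and not writable by the user, so agents actually upload into a writable subfolder (say, `/srv/incoming/uploads`) inside it.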
Bins, Bins, Bins
A bin is just somewhere you put something. It can be abstract, like the memory location of a particular data structure, or it can be a literal bin, like the kind you chuck things into. The premise remains the same: we want to chuck things into our bin.
When our crew members connect to our server, what they’re really doing is connecting to a bin where they intend to dump the media that they are being paid to produce. The process is quite simple, no need for extra hard drives, or trips to and from the office, or countless hours of syncing to the cloud.
For our purposes, we’re going to need three different bins for three very specific reasons.
An incoming bin — A storage space used to capture incoming media from the field. This type of storage location is accessible only through a secure (encrypted) connection. Files received here are then moved to another secure location, out of reach of unauthorized access.
A working bin — The active storage location for media that is “in production” or referred to frequently. This bin physically resides within the production environment. In our case, an external drive attached to our server is where our files are going to end up. This makes it easy to pick up and move our media to any location we want.
This bin is the ‘working’ bin because it will be used by the archivist to encode the incoming media with information appropriate to production, including keywords and other metadata.
The working bin is also a part of the production workflow. Our creative professionals will connect to this location to find what they are looking for.
When data from the working bin is copied to other workstations to be used in production, every file access is logged to monitor use. This important security feature can offer us some metrics in terms of how our media is being accessed, as well as offer a measure of security against access that has not been authorized.
Files in the working bin are set up as ‘read-only’ to all users except the archivist. This creates an environment of non-destructive editing and locks all original media away from accidental modification or deletion.
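That read-only arrangement is plain Unix file permissions. A minimal sketch, assuming the archivist’s account owns the files (the file name here is a stand-in, not from any real shoot):

```shell
# Stand-in for a real image in the working bin.
touch working-bin-photo.jpg

# Owner (the archivist) may read and write; everyone else is read-only.
chmod 644 working-bin-photo.jpg

# To lock an original away from accidental edits entirely,
# drop the write bit for the owner as well.
chmod 444 working-bin-photo.jpg
ls -l working-bin-photo.jpg
```

In practice you’d apply this recursively across the bin and rely on group membership to decide who counts as “everyone else.”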
A backup bin — An archival storage location that is synced directly with the working bin. The backup bin is usually housed in a facility other than that used for production in case of fire, and is only accessible through the syncing function.
This separation of functionality between bins contributes to a more secure and efficient use of bandwidth across both internal and external networks.
Moving Things Around
The flow of information goes from the world-wide-interweb (our field agents), to the incoming bin (our secure server), to the working bin (an external hard drive), and finally the working bin is regularly backed up to an off-site location.
Our archivist is only interested in the contents of the working bin, so we automate the process of moving content from the incoming bin to the working bin. This is achieved using the program rsync. Rsync can also traverse the Internet and sync our working bin directly with a backup server using a securely encrypted connection. This tool comes with macOS and is ready to go when you are.
You Get The Gist
That’s pretty much it. This inexpensive, secure, efficient model is a perfect solution for collecting and organizing media. It leverages the power of the existing functionality of macOS, protocols such as SSH, and software like Adobe Lightroom to streamline the process of collecting, managing, and distributing large stores of media.