The Technical Revolution of Photography | Towards AI
Part one of this three-part series reviewed the 190-year history of photography and its technical evolution through the early digital transformation. Part two covered a range of recent technical developments in photography and computer vision.
In this final part of the series, we’ll cover some of the challenges presented by the new technology, efforts to mitigate them, and possible solutions, ending with thoughts on future opportunities.
Many of the technologies mentioned in part two present capabilities that aren’t necessarily new. Humans have been able to use software or analog methods to accomplish many of the same tasks for years, but the time-intensive nature of arranging pixels manually has restricted the output. Now that machines can perform, at their famously fast pace, tasks that only humans could do previously, we’re seeing a huge increase in output for both legitimate and questionable purposes. The issue is largely one of scale: the volume of fake imagery is exploding, with obvious implications for the integrity of our media.
The Subjective Eye
Imagery has never been a literal representation of reality and therein lies the challenge of detecting fakes. Every visual capture device and medium of representation has its inherent biases and departures from reality. We each see the same thing from a slightly different perspective filtered through our perception, physiology and life experience. Ansel Adams was famous for stating, “The [film] negative is comparable to the composer’s score and the print to its performance. Each performance differs in subtle ways.”
Photographers take many liberties with their imagery to clarify a point of view, from simply over- or underexposing an image to removing distracting elements. Even in this digital era, each brand of camera’s RAW data will be slightly different, and the software that renders the data has a further impact on its representation. More powerful hardware and software algorithms are capturing an increasing amount of high-fidelity data, yet that data is subject to increasingly powerful editing tools that enable subjective interpretations.
Values play a role in this conundrum as well. The distinction between malicious fakes and artistic edits is not always clear, and the determination can vary from person to person. Fashion photography, for example, is retouched to create a more ideal image by some standards, yet others see the result as an unrealistic and damaging portrayal of beauty. Relatedly, concern over the iPhone’s new front-facing selfie camera and its tendency to smooth facial features has been labeled “Beautygate,” though Apple states the result is due to advanced algorithms and “Smart HDR.” Even unaltered imagery can misrepresent when it’s accompanied by a caption that implies a context supporting a political viewpoint.
We’re on The Highway To…
Suffice it to say that imagery has always had a subjective nature that we’ve learned to filter into our interpretations. But the new technology is testing the limits of our trust and of our ability to compensate for what in the past have been subtle alterations of reality. It has the potential to present a dramatically different yet believable depiction, one that entertains us in movies but horrifies us in politics and the social sphere.
Citing the risk of governmental misuse, concern for civil liberties, and erroneous results, San Francisco banned the use of facial recognition by city agencies, as have Somerville, Massachusetts and Oakland, California. It’s a technology that is proliferating quickly, and while the ban is controversial, it’s also forward-thinking. Once the genie is freed, it’s hard to put it back in the bottle, and this is far from the first time technology has moved faster than policy. An independent evaluation of the London police’s facial recognition system found an error rate of 81% and determined the results wouldn’t hold up in a court challenge. Accuracy will certainly improve, but bans and questions over facial recognition will hopefully force a healthy policy debate over its implementation.
DeepNude, software that used neural networks to produce realistic-looking images of women with their clothing removed, made a brief appearance until the developer agreed to remove access, recognizing how quickly it was proliferating and acknowledging its likely misuse and harm. Unfortunately, it’s not likely that this or similar software has disappeared permanently. There could be useful applications for a feature of this sort in medicine, say for reconstructive surgery, but the distance between proper use and improper or malicious use is short, which illustrates the challenge of managing software technology.
Rescue from Fakery?
The alteration of imagery isn’t new. One of the earliest examples is an iconic photo of Abraham Lincoln from 1860 where his head was placed on another person’s body. Stalin was known for airbrushing his enemies out of the picture.
But with the dissemination of fake media, including Russian interference in the 2016 US presidential election and an increase in revenge porn and personal character attacks, efforts to identify suspect media have escalated. Detection of altered imagery first became a bigger focus around 2004 and has intensified since, using various methodologies. Many companies, including Facebook, are developing tools to detect image and video fakes, realizing the potential damage and risk and accepting some degree of responsibility. Even early versions of scanning software and Photoshop had tools to prevent the replication of paper currency.
Adobe and UC Berkeley shared initial efforts to detect altered faces. The US government is fostering a detection platform through DARPA’s Media Forensics unit. Other organizations are popping up to defend against AI run amok. The AI Foundation is creating Reality Defender, a browser plugin that alerts users to suspected fakes, and a couple of UC Berkeley undergrads built SurfSafe, a Chrome plugin that compares an image to 100+ trusted sites.
The New York Times in partnership with IBM Garage is experimenting with securing image and video metadata via blockchain technology in The News Provenance Project. The approach may enable readers to determine an image or video’s source and whether the media was altered after publication.
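To make the provenance idea concrete, here is a minimal sketch of a hash-chained ledger in Python. It is not how The News Provenance Project is built; the function names, the record fields, and the in-memory list standing in for a blockchain are all assumptions made for illustration. Each record stores a fingerprint of the published bytes plus the hash of the previous record, so rewriting an earlier record breaks the chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def fingerprint(media_bytes: bytes) -> str:
    """SHA-256 fingerprint of the exact published bytes."""
    return hashlib.sha256(media_bytes).hexdigest()


def _entry_hash(entry: dict) -> str:
    """Hash the record's fields (hypothetical schema) in a stable order."""
    payload = {k: entry[k] for k in ("media_hash", "source", "published", "prev_hash")}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(ledger: list, media_bytes: bytes, source: str, published: str) -> dict:
    """Add a provenance record that points back at the previous record."""
    entry = {
        "media_hash": fingerprint(media_bytes),
        "source": source,
        "published": published,
        "prev_hash": ledger[-1]["entry_hash"] if ledger else GENESIS,
    }
    entry["entry_hash"] = _entry_hash(entry)
    ledger.append(entry)
    return entry


def ledger_is_intact(ledger: list) -> bool:
    """Recompute every hash; any rewritten record breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != _entry_hash(entry):
            return False
        prev_hash = entry["entry_hash"]
    return True


def find_source(ledger: list, media_bytes: bytes):
    """Return the recorded source if these exact bytes were published, else None."""
    wanted = fingerprint(media_bytes)
    for entry in ledger:
        if entry["media_hash"] == wanted:
            return entry["source"]
    return None
```

A reader-side tool could call `find_source` on a downloaded file: a match means the bytes are identical to what was published, while `None` means the file was altered or never recorded. A cryptographic hash changes on any re-encoding or resizing, so a practical system would likely need something more tolerant than an exact byte match.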
Battling fakery can’t succeed entirely through robot police patrolling the internet. There must be some ownership and accountability over the creation and distribution of content production tools. Architects of physical spaces recognize how structure affects the nature and quality of human interaction and community. A big part of architecture is devoted to designing buildings and homes that improve the quality of life. Today, software architects and engineers are able to leverage powerful open source software and produce finished products with easy access to global distribution at virtually no cost. There’s no reason why concerns applied to physical spaces can’t be applied to software in a similar fashion. Some level of self-regulation would be wise, both for social good and to forestall the cumbersome external regulation and licensing faced by other professions that have a large impact on public welfare, such as architects, builders, lawyers, and medical professionals. The recent emphasis on “empathy” as a value in software design is encouraging but is only a small step forward.
Anticipating misuse of a valuable tool can be challenging. At the time of their introduction, it’s doubtful anyone could have envisioned what Facebook, Twitter, and Instagram would become today, with all their benefits and challenges. Yet software is a tool, and like any tool it can be used for good or ill. Thoughtful consideration of the potential for harm should be exercised throughout the design, development, and distribution of new software. And in some cases, automated monitoring of tools in use may have value.
With respect to media and content authoring, one approach may be for content software to write a history of edits and dates into an image or video file to enumerate changes, metadata that can be read by open source tools as a means of evaluation and validation. It’s not a foolproof approach but it may stem the tide and offer a level of transparency.
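As a rough sketch of what such an edit history might look like, the Python below keeps a list of edit records that an authoring tool would append to on every save. The field names, the tool name `ExampleEditor`, and the idea of storing a hash of each result are assumptions for illustration, not an existing standard or any vendor’s format. Embedding the list into an actual image file is left out.

```python
import hashlib
from datetime import datetime, timezone


def content_hash(data: bytes) -> str:
    """SHA-256 of the media bytes as they stand after an edit."""
    return hashlib.sha256(data).hexdigest()


def record_edit(history: list, tool: str, operation: str,
                result_bytes: bytes, when: datetime = None) -> list:
    """Append one edit record (hypothetical schema) to the history."""
    when = when or datetime.now(timezone.utc)
    history.append({
        "timestamp": when.isoformat(),
        "tool": tool,
        "operation": operation,
        "result_sha256": content_hash(result_bytes),
    })
    return history


def matches_last_entry(history: list, data: bytes) -> bool:
    """True only if the file is byte-identical to the last recorded state."""
    return bool(history) and history[-1]["result_sha256"] == content_hash(data)


def describe(history: list) -> list:
    """Human-readable summary an open source viewer could display."""
    return [f'{e["timestamp"]} {e["tool"]}: {e["operation"]}' for e in history]
```

For example, after `record_edit(history, "ExampleEditor 1.0", "crop", cropped_bytes)`, a viewer could call `matches_last_entry(history, file_bytes)` to flag files that were changed without a corresponding entry. Nothing here stops someone from stripping or rewriting the history, which is why the approach offers transparency rather than proof.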
Detection is a cat-and-mouse game that likely has no end as hackers find new ways to disrupt. The ultimate solution will require multifaceted approaches combining author accountability, policy, law, specialized tools and a healthy dose of skepticism.
What, Me Worry?
Civil society depends on accurate information. There’s good reason the First Amendment of the US Constitution is a centerpiece of our democratic society: it ensures everyone has a voice to express truth. Various legal challenges over time have further defined the boundaries of this right, enforcing limits on speech that harms, defrauds or defames. Yet the introduction of machines that operate near light speed, in potentially anonymous online venues, presents significant challenges, especially as fake imagery and video become so convincing. Some believe the current political polarization in the US and in the UK, particularly with regard to Brexit, is due in some part to outside forces manipulating information in social and news channels. At scale, fake news and imagery become very serious problems.
The human brain has evolved to process imagery at a deep emotional level. Various studies demonstrate the impact of comforting or threatening imagery on physiological measures like heart rate and blood pressure. We choose to have artwork on the walls of our homes for its influence on our mood and well-being. Imagery is very evocative, and it’s challenging for us to override our emotional response with an intellectual one, such as doubt over its authenticity.
It behooves academia and industry to instill a sense of social responsibility in the software development disciplines, along with an awareness of the potential damage to our social and political fabric. The status quo of allowing our fascination with technology to drive its dissemination, with little regard for its impact, has undermined our well-being in many ways. The creation of new roles for social and ethical governance, complementing those already focused on privacy and security, would help in this regard.
Keeping Eyes Out
From the looks of it, photography isn’t dead; it appears quite vibrant. Some of the trends are concerning, and photography as a whole may be harder to define because the discipline encompasses more than it has in the past.
Smartphones have led to broader use of photography for documentation and communication of personal experiences. In a time-constrained era, static imagery and video have become the de facto means to transmit a lot of info quickly though in many cases, less precisely than words.
In the creative realm, photography and imaging benefit from consideration as a performance. Marshaling resources, location, timing, editing and the objective behind a shot are all part of the performance and provide context for understanding. The moment of exposure is more about capturing data, whereas everything before and after it, up to and including publishing, contributes to its meaning.
What’s next for photography and imaging? Looking into the crystal ball offers a few clues.
- Curved image sensors will lead to significant improvements in image quality. They approximate the curvature of a lens and are not far from commercialization. They promise to solve some of the distortion and the loss of resolution and light that grow with distance from the axis of the lens, commonly seen in the corners of today’s imagery.
- The opportunity behind computational photography has only just been tapped. Neural network training and algorithms will continue to improve, enabling better-quality output, which is also a dependency for self-driving vehicles.
- Going forward, we’ll see more data incorporated into algorithms to create smarter imaging platforms and new insights that reach beyond image data. Merging imagery with spatial data will enable dimensional awareness of rooms, places, and terrain, which offers many direct benefits and enables additional capabilities, AR and VR being obvious uses.
- Medical applications offer big opportunities, especially when large datasets of photographic information are combined with genetic data, medical treatment and outcome data, and other indicators such as lab results and existing imaging like thermal, CT and MRI scans. Neural networks will uncover new associations and indicators of health and disease. Applied longitudinally, this has the potential to yield medical insights at a personalized level, along with earlier identification of risk and detection of disease.
To those of you following along, thank you for reading! It’s certain our collective imagination and learnings will bring other interesting developments to impact not only medicine but entertainment, art, commerce, design, and the environment as well. Interesting times. Stay tuned!