In 2001, the digitisation of a photo archive made international headlines. Bill Gates’ stock photo company Corbis announced that it would relocate the world-renowned Bettmann photo archive, which contained around eleven million images, from a site in New York City to a ‘strange underworld.’ To protect the collection from ‘earthquakes, hurricanes, tornados, vandals, nuclear blasts and the ravages of time,’ Corbis rented an old mine where the pictures could be stored free from humidity and in sub-zero temperatures. David Levi Strauss noted that the transfer would probably ‘preserve the images,’ while making them ‘totally inaccessible’ at the same time. Although it planned to sell scans of the images, Corbis had digitised a mere 2% of the collection (around 225,000 negatives) in the six preceding years. At that rate, Strauss noted, it would take another ‘453 years to digitize the entire archive.’ Did Gates acquire the rights to show the photographs or to ‘bury them in a tomb?’1
Connected to concerns over accessibility, other scholars worried that the digitisation of Bettmann would entail a fatal loss of context. Gail Buckland argued that the loss of the paper archive, the proper context of the collection in her eyes, would result in far fewer serendipitous finds: ‘You stumble on things you would never find on your computer.’ Pointing to a connected process, she noted that digitisation would surely make popular Bettmann photographs, such as Einstein sticking out his tongue, even more popular. While these ‘money-making 20th-century icons’ had already been digitised, many other photographic gems, not in line to be scanned anytime soon, would be condemned to be preserved for eternity without ever being seen again.2
This special issue of TMG Journal for Media History examines what happens to pictures of the news and the way(s) we can see them when they are moved from the analogue to the digital realm. What do researchers of photojournalism gain and lose when the photographs we study are made accessible in a digital form? What kind of questions can we ask in and of the digital photojournalistic archive? Can computational techniques provide new kinds of access to and identify visual patterns in digitised photojournalistic collections? How does this change our understanding of the history of photojournalism?
Next to introducing the contributions to this special issue, this introduction observes that digitisation and the use of computational methods have resulted in three different levels on which pictures of the news can be studied: as (small collections of) material artefacts, as sets of metadata points, and, finally, as large collections of unstructured visual data. While researchers are often tempted to either unequivocally praise or denounce the digitisation of collections, we urge them to move past the rigid material-digital dichotomy. All three levels of research come with their own specific methodological benefits and pitfalls. Instead of digging in their heels at their preferred level of research, scholars of photojournalism should seek to bring different levels into fruitful contact. Applying theories and concepts from literary studies, we argue that this means that we must work in between the material and digital archive and across computational distant reading and traditional close analysis.3
Level one: close reading photographs as material artefacts
Traditionally, historians of photojournalism have applied several theoretical frameworks, such as iconography, iconology, visual semiotics, and visual rhetoric, to analyse a single picture, or a small collection of related images, such as a photo reportage, a series, a photo project, a publication, or an album.4 Next to these visually oriented frameworks, other work highlighted the interaction between images and words. Susan Sontag noted early on that a caption anchors a photograph in the plurality of meanings that it might convey.5 More recently, multimodality theory has fleshed out the different ways in which images and texts produce meaning in conjunction.6
While historians of photojournalism frequently neglect to mention which specific theoretical lens they use, close readings of individual pictures and photographers have led to exemplary work. For example, taking a forensic approach, Vincent Lavoie examined Robert Capa’s iconic 1936 photograph Falling Soldier. After meticulously reconstructing the controversies that arose around the photo’s authenticity in the 1970s, he builds on the work of Foucault to analyse the ‘regimes of truth’ that different parties in the debate mobilised to determine the veracity of the picture.7 In 2014, Capa’s Magnum colleague Henri Cartier-Bresson was the focus of a major retrospective at the Parisian Centre Pompidou. In the accompanying publication, the exhibition’s curator Clément Chéroux argued that Cartier-Bresson’s famous concept of le moment décisif (the decisive moment) had led to a narrow understanding of his work.8 He also highlighted the necessity of taking the material aspects of photographs into account. For the first time in decades, an exhibition of Cartier-Bresson’s work included vintage prints, where previously exhibitions were mainly composed of more recent exhibition prints from the so-called Master Collection, a set of 400 pictures that were selected and approved by the photographer himself.
Close reading photographs brings their materiality to the surface. Using theoretical concepts from anthropology, Edwards and Hart noted that photographs are ‘both images and physical objects that exist in time and space and thus in social and cultural experience.’9 They derive their meaning from ‘a symbiotic relationship between materiality, content and context.’10 The reverse sides of historical press photographs provide a perfect display of this symbiotic relationship as they include extensive annotations, such as captions, instructions for the designer and printer, and notes on the publication history of the image. Prints, negatives, colour slides, and contact sheets are the basic ingredients for the printed image in every publication and reflect the daily practice of photojournalism. Recent work on (the history of) photojournalism includes both the image and its carrier in its analysis.11 The development of photographic technology, printing and distribution systems all determined the specific shape of visual news. In a digital environment, these material aspects might be quickly overlooked or forgotten. However, the shape of born-digital visual news is, of course, also determined by material infrastructure, such as file formats, photo editing software, website interfaces, servers, and cloud storage.
The material approach to photojournalism has led to the concept of the ‘resourceful archive.’12 As Asser and Salvatori demonstrate in their contributions to this special issue, the archive shapes and directs the analysis of scholars through tangible aspects, such as prints, negatives, annotations, folders, boxes, and classification systems. In her ‘biography’ of the Spaarnestad Collection in the Dutch National Archives, Asser shows that the transformation of this archive from an editorial archive to a historical picture library reflects both the prevailing photographic practice at the Spaarnestad publishing company and developments in the international image industry. Salvatori similarly notes that the material and organisational structure of the archive of the artist-illustrator Ugo Matania embodies the creative methods he applied in his work for the Neapolitan illustrated press during the interbellum. Collecting thousands of press photographs, press clippings, sketches, and drawings, Matania created a large ‘toolbox,’ which he used to provide the magazine Il Mattino Illustrato with covers that referenced current events. The digitisation of these types of archives enables new kinds of research and can forge connections both within and between photo archives. However, as Asser and Salvatori argue, digitisation can also obscure important material aspects of photographs and the archives that store them, limiting idiosyncratic searches and serendipitous finds.
Close reading has provided valuable insights into individual photographers and the meaning of their work. However, partly because of this intense focus, the method has also led to selection bias in the study of photojournalism. Describing the concept of distant reading, literary scholar Franco Moretti noted that the method of close reading, intensively scrutinising a single or a small number of texts, had resulted in the formation of a canon, a small number of books that were (re)interpreted over and over again, and a non-canon: a comparatively large number of books that were all but forgotten.13 The same process can be observed in the study of photojournalism. Researchers have zoomed in on a small selection of photographers and pictures, ignoring vast swathes of photographic production. Analogous to Cohen’s concept of the ‘great unread’ for novels, we can describe these millions of pictures as the great unseen.14 The adjective ‘great’ not only references the great number of unseen pictures but also the fact that they can be as beautiful, interesting, or revealing as any iconic picture.
Next to possible selection bias, close reading individual photographs stands in a tense relation to the practice of photojournalism in the last hundred years. From the early 1920s, newspapers and magazines started to publish thousands of pictures of the news each day. In addition, as Allbeson and Colquhoun show in their contribution to this special issue, famous magazines like Picture Post were marked by overproduction, commissioning many more images than they published. In this sense, photojournalism has always been a numbers game, focused on quantity first and quality second. Rather than studying the flood of images in its entirety, researchers traditionally zoomed in on the work of a small number of star photographers and an even smaller number of their pictures, which are frequently described as ‘iconic.’15 This scholarly practice – elevating a small number of pictures from the daily image stream – has led to a specific type of circular scholarly reasoning: pictures are studied because they are special and special because they are studied.
The canonisation of a small number of photographers and photographs can also be connected to a specific professional ethos of photojournalism. In 1952, Cartier-Bresson coined the concept of the decisive moment to describe the ability of photographers to perfectly frame the essence of a fleeting event or moment. Instead of making an image, as painters had done for centuries, the genius of the photographer lay in instinctively knowing the right time to capture a scene. Cartier-Bresson used the concept to lift his own work from the day-to-day stream of images that he and his colleagues produced for newspapers and magazines. In their turn, researchers applied his concept retroactively, justifying their selection of a small number of images by pointing to the extraordinary combination of content and formal aspects of their sources. As the constant discussions surrounding photojournalism awards show, what makes a picture of the news special is subject to constant contention and change.16 Building on the work of Bourdieu (1965), researchers have argued that what constitutes a ‘good’ photograph, a perfect capture of a decisive moment, is mainly determined by cultural actors, such as star photographers, magazine editors, World Press Photo award jurors, and academics, who have the power to shape this category.17
Level two: metadata networks
How long does an image of the news retain monetary value? Since the start of the profession in the late-nineteenth century, photojournalists have made pictures of the news for news publications. However, these pictures did not necessarily lose their value after what they showed became yesterday’s news. Famous publications and publishers set up extensive image collections to sell pictures as many times as possible.18 These image banks devised elaborate index systems to provide possible buyers with easy access to images of specific photographers, topics, objects, scenes, and themes.
In this special issue, Asser and Allbeson and Colquhoun describe the formation and history of the Dutch Spaarnestad Collection and the Hulton Archive, the photo archive of Picture Post, respectively. Asser recounts how, in the 1980s, the editorial photo archive of De Spaarnestad, the largest publisher of illustrated magazines in the Netherlands, was transformed into an image bank, which has provided historical press photographs for a wide range of commercial and cultural purposes ever since. In contrast to similar ventures, the commercial exploitation of De Spaarnestad’s images always fully benefited the preservation of the physical photo archive. Connecting the two contributions in this issue, Asser notes that De Spaarnestad followed Edward Hulton’s way of setting up an image bank by creating a surplus of photographs through overproduction and the acquisition of other historic image archives. In their contribution, Allbeson and Colquhoun argue that interdisciplinary research methods and curatorial approaches can facilitate new routes into the vast Hulton Archive and other archival sources that represent the production, circulation, and consumption of the Picture Post. Resulting from a partnership between the Tom Hopkinson Centre for Media History and Amgueddfa Cymru/National Museum Wales, the project presents the origins, impact, and legacy of Picture Post in various outputs for different audiences, including an exhibition, a public program, and several publications.
In 1996, Getty Images, the largest for-profit international image bank, acquired the Hulton Archive. Because the margin on the sale of banked images has always been small, commercial collections merged into ever larger conglomerates to increase their profits. In addition to the Hulton Archive, Getty also acquired the distribution rights to Gates’ Corbis collections: world-renowned archives, such as Bettmann; illustrated publications, such as the French Paris Match; and agencies, like Gamma and Sygma. While the monetisation of pictures in image banks should not be seen as a direct result of the internet, its rapid proliferation certainly exacerbated market concentration.
Next to commercial parties, (semi-)public institutions, such as libraries and archives, have also started to digitise their photojournalistic collections. For example, the Dutch National Archives, which holds the largest collection of press photographs in the Netherlands, has made one million of its 15 million pictures available online, including c. 400,000 images as open data under a CC0 licence. Similarly, the American Library of Congress provides free access to several famous collections, such as the 39,744 digitised glass negatives (1900-1922) of Bain, one of America’s earliest news picture agencies, and 175,000 black-and-white negatives of the Farm Security Administration/Office of War Information, taken between 1935 and 1944. In France, Gallica, the digital portal of the Bibliothèque nationale de France (BnF), offers access to 24,845 digitised photographs of the famous photographer Nadar and 119,443 pictures of the Monde & Camera picture agency.
The digitisation of historical collections of pictures of the news is an ongoing process. In this special issue, Kraus and Retter describe the large-scale digitisation of the STERN Photo Archive. The publishers of the famous German magazine donated the collection to the Bayerische Staatsbibliothek in Munich under the condition that the 15 million press photos would be made available online for a broad public. In their contribution, Kraus and Retter explain how they tackle the various challenges involved in such a large-scale digitisation project. They also note how the transformation of an analogue archive to a structured digital image collection allows researchers to explore and examine photojournalistic collections in new ways. Digitisation always involves adding metadata to pictures. These metadata points can be used to study patterns in larger collections. Which photographers were given the most assignments? What were they told to photograph? What did they actually shoot? Which publications bought the images? What is the relation between photojournalism and commercial and artistic forms of photography? In her recent work on Magnum, reviewed in this special issue, Bair studies these questions to explain the continued renown of the agency and its members.19
Studying patterns in metadata, scholars of photojournalism can move past the traditional focus on the work of a small number of photographers and their pictures. In their overview of photojournalism, Thierry Gervais and Gaëlle Morel rightly emphasise the pivotal role of the picture editor.20 Bair takes an even broader view and includes ‘writers, spouses, secretaries, editors, darkroom assistants, publishers, corporate leaders, and museum curators’ in her analysis.21 She argues that this networked approach to photojournalism can serve as a critique of the pervasive professional ethos of the decisive moment. By examining photojournalism as a collective rather than an individual practice, it becomes clear that decisive connections rather than moments can explain the lasting relevance of particular photographers and pictures. Building on this type of work, research on photojournalism could surely benefit from a more direct engagement with concepts developed in STS and media studies, such as media ecology, actor-network theory, or media infrastructure, that explain historical change by pointing to the interaction between actors and technologies.
By studying archives and analysing metadata points instead of only looking at individual images, the work of Bair moves from a close to a distant reading of Magnum’s archive: a zoomed-out view of many pictures instead of a close-up study of a small number of images. By focusing on the large number of images that are marked by their ordinary nature rather than a couple of extraordinarily beautiful pictures, this kind of research chips away at the great unseen and enhances our understanding of photojournalism. Bair’s focus on the commercial work of Magnum photographers or, in this issue, Allbeson and Colquhoun’s focus on specific sets of images from Picture Post are good examples of this type of research.
While distant reading is often conceptualised as being directly linked to digitisation, Bair’s work shows that it can also emerge from a critique of prevailing close reading methods. Of course, digitisation of image collections and digital methods have made it easier to pursue zoomed-out studies of the photojournalistic archive. For example, in this special issue, Kraus and Retter note that researchers had to consult a complex set of different index systems to navigate the analogue archive of stern. Using OCR technology, the digitisation project included the development of a new metadata system that could be uniformly applied to the entire archive. In their turn, researchers can use this metadata to discover large-scale patterns in the STERN Photo Archive.
Level three: finding patterns in unstructured visual data
What distinguishes metadata from the data itself? Digital collections that are structured by metadata points still contain massive amounts of unstructured visual data: the images themselves. Researchers can extract metadata from pictures by simply examining and annotating them one by one. A computer, in contrast, cannot describe images in the same contextual way as humans. It does not see pictures but processes them as collections of numbers: the pixel values that make up a digital image. We know that these numbers must contain patterns because images show the same or similar things to us: the same persons, the same objects, the same scenes, or the same colours. Rapid developments in the field of computer vision, which uses computational means to gain a high-level understanding of digital images, provide researchers with new ways to extract all sorts of patterns from large collections of pictures.22
Pioneering a new way of thinking about collections of images as unstructured data, Lev Manovich proposed generating ‘numerical descriptions of various visual characteristics’ and visualising ‘the complete image set organized by these characteristics.’23 For his project TimeLine, he extracted the colours from all the covers of Time magazine and placed them on a timeline, making clear how different decades were marked by different colour use. Even without the extraction of visual features, computational visualisation of large collections of images can provide insight into a collection. Begley’s (2017) digital installation ‘The News is Breaking’ rapidly displays every front page of The New York Times (since 1852) in a one-minute clip, showing how the newspaper transformed from a text-heavy broadsheet into a colourful visual spectacle. These projects are good examples of distant reading, which seeks to identify patterns in the entire collection instead of looking at images one by one.
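As a toy illustration of the kind of feature extraction behind projects such as TimeLine, the sketch below reduces each ‘cover’ to its average colour and orders the results chronologically. All pixel values and years are invented stand-ins; in a real project the pixels would come from digitised scans:

```python
# Each "cover" is a flat list of (R, G, B) pixels -- a stand-in for a scan.
def mean_colour(pixels):
    """Average each channel over all pixels: one crude visual feature."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

covers = {  # hypothetical publication year -> pixel list
    1955: [(210, 40, 40), (190, 20, 20)],  # red-heavy cover
    1965: [(40, 40, 210), (20, 20, 190)],  # blue-heavy cover
}

# Order the extracted features chronologically: a minimal 'timeline' view.
timeline = sorted((year, mean_colour(px)) for year, px in covers.items())
for year, (r, g, b) in timeline:
    print(f"{year}: R={r:.0f} G={g:.0f} B={b:.0f}")
```

Scaled up to thousands of real covers and to richer features (hue histograms, learned embeddings), the same extract-and-sort logic yields the decade-by-decade colour shifts that Manovich visualises.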
New forms of computer vision, such as image classification, facial recognition, and object detection, enable researchers to extract more complex numerical features from large collections of images. For example, Wevers and Smits applied facial recognition to study the representation of gender in one million historical newspaper advertisements.24 Arnold and Tilton applied Facebook’s Detectron model to automatically identify objects in 1,610 pictures of the famous Farm Security Administration photographic division.25 In their contribution to this special issue, Wevers, Vriend and De Bruin use scene detection to enrich a collection of two million images of the Dutch De Boer press agency. Instead of using computer vision to identify objects, they show how scene detection can identify the environment in which these objects sit. For example, a car (object) in a parade (scene) or a chair (object) in a living room (scene).
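To make the gain from scene detection concrete, here is a minimal sketch of how such enriched records could be queried. The identifiers and label vocabulary are hypothetical, not taken from the De Boer project itself:

```python
# Hypothetical enrichment output: each photo carries object labels (what is
# in the frame) and a single scene label (where the frame is set).
records = [
    {"id": "photo_0001", "objects": ["car", "flag"], "scene": "parade"},
    {"id": "photo_0002", "objects": ["car"], "scene": "showroom"},
    {"id": "photo_0003", "objects": ["chair", "lamp"], "scene": "living_room"},
]

def find(records, obj, scene):
    """All photo ids showing a given object inside a given scene."""
    return [r["id"] for r in records
            if obj in r["objects"] and r["scene"] == scene]

print(find(records, "car", "parade"))  # a car in a parade, not in a showroom
```

Combining object and scene labels in this way allows researchers to pose questions that neither label type answers alone.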
Buckland claimed that digitisation would hamper the ability of researchers to ‘stumble upon’ interesting, unexpected, and unforeseen material. She is certainly right that by examining photographs in their material and three-dimensional form as well as their original and physical context, researchers can connect images in highly complex and contextual ways. However, computer vision techniques can also be used to explore archives, enabling novel serendipitous encounters with the great unseen. For example, Lee extracted visual content from around 16 million digitised newspaper pages from the Library of Congress and provided users with new ways to explore this visual side of the news.26 Similarly, the application PixPlot uses computer vision techniques to cluster visually similar images.27 In this project, the computer vision model does not look for a specific object or person but extracts visual patterns without supervision. Applied to around 24,000 digitised pictures of the Bain agency (1910-1912), PixPlot identified large clusters of boxers, baseball, cars, portraits, and pianos.
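The grouping step behind a PixPlot-style interface can be sketched in a few lines. The two-dimensional feature vectors below are invented stand-ins for the high-dimensional embeddings a neural network would extract, and a simple greedy threshold grouping stands in for the more sophisticated clustering and projection such tools actually use:

```python
import math

# Hypothetical feature vectors: visually similar pictures sit close together.
features = {
    "boxing_01": (0.90, 0.10), "boxing_02": (0.88, 0.12),
    "portrait_01": (0.10, 0.90), "portrait_02": (0.12, 0.88),
}

def cluster(features, threshold=0.2):
    """Greedy grouping: an image joins a cluster if it is close to its seed."""
    clusters = []  # list of (seed_vector, member_names)
    for name, vec in features.items():
        for seed, members in clusters:
            if math.dist(seed, vec) < threshold:
                members.append(name)
                break
        else:  # no nearby cluster found: start a new one
            clusters.append((vec, [name]))
    return [members for _, members in clusters]

print(cluster(features))
```

No label tells the algorithm what ‘boxing’ or ‘portrait’ means; the clusters emerge purely from proximity in feature space, which is what makes such tools useful for serendipitous exploration.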
Directly analogous to Moretti’s concept of distant reading, this type of research has sometimes been described as distant viewing.28 While computational techniques can identify patterns in the great unseen, a focus on these techniques might also lead to unexpected blind spots and new forms of canonisation. In her contribution to this special issue, Salvatori points out that many small or local photojournalistic collections will never be digitised. Other collections, even some famous ones, are only partially digitised. For example, a quick search for Bettmann on Getty’s website returns 205,732 hits, which is less than 2% of the entire collection. In this sense, Strauss’ prediction came true: Getty seems to have bought the right to bury the majority of the Bettmann archive in a mine while it keeps re-circulating the ever-more popular all-star hits of the collection.
Finally, the capabilities of computer vision techniques are often overestimated. Tech companies promote their models by noting that they can see better than humans. However, even the most sophisticated techniques are only able to recognise around 100 different objects, which shows that these models are far removed from the complex and contextualised understanding of images by humans.29 Discussing this gap between ‘how a person interprets an image and what can be automatically extracted [by computer vision],’ Van Noord surveyed the potential benefits and limitations of using machine learning models to study iconic photographs.30 While being positive about the possibilities of these techniques, he concludes that they are still far removed from being able to capture the plurality of meanings that iconic images have. In a more forceful critique of distant viewing techniques, Lewandowski’s contribution to this special issue points out that they can only be applied if the raw data is easily accessible. Relying on a qualitative coding of 240 digitally archived front pages of The New York Times, published between January 2000 and January 2020, she charts the aesthetic shifts of the paper. Analysing everyday news photography rather than iconic images, her article constitutes a ‘medium viewing’ of the formal qualities of the paper’s front page. In relation to digitisation, Lewandowski signals that large news organisations, such as The New York Times, are not inclined to invest in better archival preservation or open access because their archives primarily serve editorial and commercial interests.
Conclusion: rescuing the great unseen from the photographic morgue
This special issue of TMG resulted from the online symposium Open Up the Morgue! How Press Photo Archives are Enabling a New History of Photojournalism, which the Rijksmuseum in Amsterdam organised in collaboration with this journal on 2 July 2021.31 The title referred to the legendary photo archive of The New York Times: the building where photographs of the news were stored after their life on the newspaper page had ended. As the digital team of the newspaper recognised during the symposium, digitisation and digital techniques might be able to revive the pictures in the morgue and give them a second life. We hope that the symposium and this special issue provide examples of how researchers can use digitisation and apply digital methods to open the doors of the morgue and rescue some of the great unseen.
We hope that this introduction and the contributions to this special issue demonstrate that the analogue-digital dichotomy and the associated methodological close-distant reading divide unnecessarily limit research on photojournalism. Researchers should work in between the analogue and digital archive and across the three levels of research. Concerning the archive, we should always be aware of the possibilities and limits of the analogue and digital form. If researchers start from the analogue archive, they should consider if digital access to this or other archives (if available) might shed new light on their research question. Similarly, researchers that start in the digital realm should always consider whether examining the material contexts of (a selection of) their sources might strengthen or change their analysis. In both cases, scholars should move beyond canonical picture collections. By being aware of the blind spots that both analogue and digital access to collections entail, we can turn a new eye to the great unseen that lies hidden in the archive.
The same goes for the three levels of research. Close and distant reading are often presented as mutually exclusive but, as literary studies scholar Ted Underwood rightfully notes, these two different ways of examining (visual) sources could also inform and reinforce each other.32 Patterns derived from distant reading can lead to the close analysis of previously understudied material or new readings of canonical images. Conclusions derived from close reading could be scaled up and tested by applying distant reading methodologies. This combination allows us to study the shape of the great unseen without having to look at every image, and to find out whether iconic pictures adhere to wider trends.