ARTIDIGH 2021 Abstracts


Area 1 - ARTIDIGH

Full Papers
Paper Nr: 1
Title:

Scene Detection in De Boer Historical Photo Collection

Authors:

Melvin Wevers

Abstract: This paper demonstrates how transfer learning can be used to improve scene detection applied to a historical press photo collection. After applying transfer learning to a pre-trained Places-365 ResNet-50 model, we achieve a Top-1 accuracy of 0.68 and a Top-5 accuracy of 0.89 on our data set, which consists of 132 categories. In addition to describing our annotation and training strategy, we also reflect on the use of transfer learning and the evaluation of computer vision models for heritage institutes.
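The Top-1 and Top-5 accuracy figures quoted above can be illustrated with a minimal sketch; this is not the authors' code, only a generic implementation of the Top-k metric on toy scores:

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring
    categories. `scores` is a list of per-category score lists."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k categories with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)

# Toy example with 3 categories and 2 samples (hypothetical scores).
scores = [
    [0.1, 0.7, 0.2],   # highest score: class 1
    [0.5, 0.3, 0.2],   # highest score: class 0
]
labels = [1, 1]
print(top_k_accuracy(scores, labels, 1))  # 0.5 (only the first sample hits)
print(top_k_accuracy(scores, labels, 2))  # 1.0 (class 1 is in both top-2 sets)
```

With 132 categories, Top-5 accuracy is the more forgiving measure, which is why it sits well above Top-1 in the paper's results.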

Paper Nr: 2
Title:

Balancing Performance and Effort in Deep Learning via the Fusion of Real and Synthetic Cultural Heritage Photogrammetry Training Sets

Authors:

Eugene Ch’ng, Pinyuan Feng, Hongtao Yao, Zihao Zeng, Danzhao Cheng and Shengdan Cai

Abstract: Cultural heritage presents both challenges and opportunities for the adoption of deep learning in 3D digitisation and digitalisation endeavours. While the unique identifying features of artefacts can contribute to training performance in deep learning algorithms, obtaining adequate datasets remains laborious: a dataset must provide both a diversity of objects and a range of multi-facet images of each object. One solution, and perhaps an important step towards the broader applicability of deep learning in digital heritage, is the fusion of real and virtual datasets: diverse training sets covering multiple views of individual objects, across a range of varied objects, can be generated automatically from 3D objects produced by close-range photogrammetry. The open question is the ratio of real to synthetic images at which an inflection point occurs and performance degrades. In this research, we attempt to reduce the need for manual labour by leveraging the flexibility of automated data generation from close-range photogrammetry models, with a view to future deep-learning-facilitated cultural heritage activities such as digital identification, sorting, asset management and categorisation.
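The real-to-synthetic mixing ratio that the abstract describes can be sketched as a simple sampling step; this is an illustrative sketch, not the authors' pipeline, and the file names are hypothetical:

```python
import random

def mix_dataset(real, synthetic, synthetic_fraction, seed=0):
    """Build a training set of fixed size len(real) in which a given
    fraction of the samples comes from the synthetic (rendered) pool."""
    rng = random.Random(seed)
    n_syn = round(len(real) * synthetic_fraction)
    n_real = len(real) - n_syn
    return rng.sample(real, n_real) + rng.sample(synthetic, n_syn)

# Hypothetical pools: real photographs and photogrammetry-derived renders.
real = [f"photo_{i}.jpg" for i in range(100)]
synthetic = [f"render_{i}.png" for i in range(200)]

train = mix_dataset(real, synthetic, synthetic_fraction=0.4)
print(len(train))  # 100 (total size is held fixed while the ratio varies)
```

Sweeping `synthetic_fraction` from 0 to 1 while holding total set size fixed is one way to locate the inflection point the abstract refers to.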

Paper Nr: 5
Title:

Multi-modal Label Retrieval for the Visual Arts: The Case of Iconclass

Authors:

Nikolay Banar, Walter Daelemans and Mike Kestemont

Abstract: Iconclass is an iconographic classification system from the domain of cultural heritage which is used to annotate subjects represented in the visual arts. In this work, we investigate the feasibility of automatically assigning Iconclass codes to visual artworks using a cross-modal retrieval set-up. We explore the text and image branches of the cross-modal network. In addition, we describe a multi-modal architecture that can jointly capitalize on multiple feature sources: textual features, derived from the titles of these artworks (in multiple languages), and visual features, extracted from photographic reproductions of the artworks. We utilize Iconclass definitions in English as matching labels. We evaluate our approach on a publicly available dataset of artworks (containing English and Dutch titles). Our results demonstrate that, in isolation, textual features strongly outperform visual features, although visual features can still offer a useful complement to purely linguistic features. Moreover, we show the cross-lingual (Dutch-English) strategy to be on par with the monolingual approach (English-English), which opens important perspectives for applications of this approach beyond resource-rich languages.
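The retrieval step behind such a set-up can be sketched as nearest-neighbour search over embeddings; this is a generic illustration, not the authors' architecture, and the embedding vectors and Iconclass codes shown are hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, label_vecs):
    """Rank candidate (code, embedding) label pairs by similarity to the
    query embedding; the top-ranked code is the assigned label."""
    ranked = sorted(label_vecs, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [code for code, _ in ranked]

# A (toy) artwork embedding and two candidate Iconclass label embeddings.
artwork = [0.9, 0.1, 0.2]
labels = [("11H(JEROME)", [0.8, 0.2, 0.1]),
          ("25F23(LION)", [0.1, 0.9, 0.3])]
print(retrieve(artwork, labels)[0])  # 11H(JEROME)
```

In the paper's setting, the query embedding would come from the text or image branch (or their fusion), and the label embeddings from the English Iconclass definitions.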

Short Papers
Paper Nr: 4
Title:

Automatic Annotations and Enrichments for Audiovisual Archives

Authors:

Nanne Van Noord, Christian G. Olesen, Roeland Ordelman and Julia Noordegraaf

Abstract: The practical availability of Audiovisual Processing tools to media scholars and heritage institutions remains limited, despite all the technical advancements of recent years. In this article, we present the approach chosen in the CLARIAH project to increase this availability, discuss the challenges encountered, and introduce the technical solutions we are implementing. Through three use cases focused on the enrichment of AV archives, Pose Analysis, and Automatic Speech Recognition, we demonstrate the potential and breadth of using Audiovisual Processing for archives and Digital Humanities research.