Finnish postdoctoral researcher Tuomo Hiippala is part of a new generation of humanities researchers who use powerful computing resources to advance their work. With a PhD in English Literature and a strong interest in computer vision and machine learning, Hiippala is developing new ways to manipulate large collections of images.
“About 30 years ago, computers transformed the empirical study of language,” says Tuomo Hiippala.
“I’m working on using computer vision and machine learning to manipulate large collections of images.”
“Optical character recognition, a subfield of computer vision, made it possible to build massive collections of text (corpora) for study. Now, given the progress in image description, I think computer vision can be applied again to understand more forms of visual communication. I am working on using computer vision and machine learning to manipulate large collections of images and group them by similarity, to help visual communication researchers who work with the large volumes of visual data available on social networks and in historical archives.”
Tuomo Hiippala specializes in the study of how different modes of communication, such as language, graphic design and images, interact in magazines, newspapers and other printed or digital texts. This phenomenon is generally called “multimodality”. Hiippala has published a book on empirical research in multimodality, based on his doctoral dissertation, for which he compiled a collection of tourist brochures published by the City of Helsinki, Finland, between 1967 and 2008 in order to track their structure and how it changed over time.
Now Hiippala, who works at the Centre for Applied Language Studies at the University of Jyväskylä, is turning his attention to Helsinki again, using scientific computing resources provided by CSC, the Finnish IT Center for Science, and by Funet, the Finnish research and education network:
“The City of Helsinki gave me a grant to analyze the photographs that tourists share through the social network Instagram. The purpose is to discover what tourists want to show to their close and extended networks when visiting Helsinki.”
“I’m currently working on data cleansing, since Instagram is chock full of memes and screenshots, and on distinguishing between local users and tourists. Once that is finished, my plan is to identify the best algorithms for grouping similar images. The reason for working with photos taken by tourists is that we already have several theories about the photographic behavior of tourists. What I want to discover is whether we also have an empirical basis for these theories.”
“The obvious problem here is that there is simply too much data to be processed by a human analyst, which is why I am introducing computer vision and the infrastructure provided by CSC.”
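To make the idea of grouping similar images concrete, here is a toy sketch in Python. It is not Hiippala's method, which the article does not describe. It assumes each image has already been reduced to a small grayscale grid of equal size, and it uses a simple "average hash" fingerprint with a greedy grouping rule. The function names and the `max_distance` threshold are illustrative choices, not part of any particular library.

```python
from typing import List, Sequence

# Assumption: an image is a small grayscale grid (rows of 0-255 values),
# and all images have been downscaled to the same size beforehand.
Image = Sequence[Sequence[int]]


def average_hash(image: Image) -> int:
    """Fingerprint an image: one bit per pixel, set if brighter than the mean."""
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def group_similar(images: List[Image], max_distance: int = 2) -> List[List[int]]:
    """Greedy grouping: an image joins the first group whose representative
    fingerprint differs by at most max_distance bits, else starts a new group."""
    groups = []  # each entry: (representative fingerprint, member indices)
    for index, image in enumerate(images):
        fingerprint = average_hash(image)
        for representative, members in groups:
            if hamming_distance(representative, fingerprint) <= max_distance:
                members.append(index)
                break
        else:
            groups.append((fingerprint, [index]))
    return [members for _, members in groups]


if __name__ == "__main__":
    # Three hypothetical 2x2 "images": two with a bright right column,
    # one with a bright left column.
    a = [[10, 200], [10, 200]]
    b = [[20, 220], [15, 210]]
    c = [[200, 10], [200, 10]]
    print(group_similar([a, b, c], max_distance=1))
```

A real pipeline working on millions of Instagram photos would more likely rely on learned image features and a proper clustering algorithm, but the principle is the same: turn each image into a compact representation, then group the images whose representations are close.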
In the future, Tuomo Hiippala plans to explore how computer vision can be used to automatically identify the components of a page and partially automate the process of annotation (document description). This would significantly speed up the process of building corpora and facilitate empirical research on the complex phenomenon of multimodality.
Above: A photo from Finnish archives to which Tuomo Hiippala applied some basic computer vision techniques, i.e., the identification of keypoints and their matches across images.