Upcoming January 2021 talk – Elena Simperl

We have Elena Simperl, a Professor of Computer Science at King’s College London, talking to us in January. See details below. A Zoom link will be sent via the mailing list; alternatively, please email the Data Viz Organisers group at grp-viz-org@groups.bristol.ac.uk.


Title: Pie chart or pizza: identifying chart types and their virality on Twitter

Wednesday 13 January 2021, 13:00 – 14:00 (online event)

Abstract: We live in a world full of data, in which charts are routinely used to communicate complex insights more effectively than spreadsheets or reports. Twitter is no exception – tens of thousands of data visualisations on virtually any topic are shared every day. We aim to understand how data, rendered visually as charts or infographics, “travels” on social media. To do so we propose a neural network architecture that is trained to distinguish among different types of charts, for instance line graphs or scatter plots, and predict how much they will be shared. This poses significant challenges because of the varying format and quality of the charts that are posted, and the limitations in existing training data. To start with, our proposed system outperforms related work in chart type classification on the ReVision corpus, a benchmark from the literature. Furthermore, we use crowdsourcing to build a new corpus, more suitable to our aims, consisting of chart images shared by data journalists on Twitter. We evaluate the system on the second corpus with respect to both chart identification and virality prediction, with promising results.

Our system and findings could be used in different scenarios, from generating automatic text captions and recommending chart improvements in data visualisation tools to informing marketing strategies for brands that use data visuals to gauge customer engagement. In addition, our approach, including both the neural architecture and the method to create labelled data, could form the basis for the development of visual question answering solutions tailored to data visualisations, with applications in fact checking and misinformation online.


Biography: Elena Simperl is professor of computer science at King’s College London, a Fellow of the British Computer Society and former Turing fellow. According to AMiner, she is in the top 100 most influential scholars in knowledge engineering of the last decade, as well as in the Women in AI 2000 ranking. Before joining King’s College in early 2020, she held positions at the University of Southampton, as well as in Germany and Austria. She has contributed to more than 20 research projects, often as principal investigator or project lead. Currently, she is the PI of two grants: H2020 ACTION, where she develops human-AI methods to make participatory science thrive, and EPSRC Data Stories, where she works on frameworks and tools to make data more engaging for everyone. She has authored more than 200 peer-reviewed publications in knowledge engineering, semantic technologies, open and linked data, social computing, crowdsourcing and data-driven innovation. Over the years she has served as programme and general chair of several conferences, including the European and International Semantic Web Conference, the European Data Forum and the AAAI Conference on Human Computation and Crowdsourcing.

Upcoming Nov 2020 talk – Roy Ruddle

This event will be held via Zoom. Please sign up on Eventbrite: https://www.eventbrite.co.uk/e/visualizing-the-scale-complexity-of-data-quality-tickets-127981589379

Title: Visualising the Scale and Complexity of Data Quality

24 November 2020, 12:00 – 13:00 (online event)

Abstract: Descriptive statistics are typically presented as text, but that quickly becomes overwhelming when datasets contain many variables or analysts need to compare multiple datasets. In this seminar, I will describe visualization designs for three categories of descriptive statistic (cardinalities, distributions and patterns), which scale to more than 100 variables and use multiple channels to encode important semantic differences (e.g., zero vs. 1+ missing values). I will also describe a novel tool, which exploits set visualization techniques to allow users to explain patterns of missing values that involve many fields. The visualizations were evaluated using large (multi-million record) datasets of electronic health records (EHRs), and provided users with a variety of important insights.

Bio (from https://www.turing.ac.uk/people/researchers/roy-ruddle)

Roy Ruddle is a Professor of Computing at the University of Leeds, and Deputy Director (Research Technology) of the Leeds Institute for Data Analytics (LIDA). He has worked in both academia and industry, and researches visualization, visual analytics and human-computer interaction in spaces that range from high-dimensional data to virtual reality. In a 12-year collaboration with pathologists at the Leeds Teaching Hospitals NHS Trust (LTHT), he developed the Leeds Virtual Microscope (LVM) for visualizing tera-pixel image collections on Powerwall and ultra-high definition displays, leading to its use for pathology training in NHS hospitals and commercialisation by Roche.