Upcoming January 2021 talk – Elena Simperl

We have Elena Simperl, a Professor of Computer Science at King’s College London, talking to us in January. See details below. A Zoom link will be sent via the mailing list; alternatively, please email the Data Viz Organisers group at grp-viz-org@groups.bristol.ac.uk.

 

Title: Pie chart or pizza: identifying chart types and their virality on Twitter

Wednesday 13 January 2021, 13:00 – 14:00 (online event)

Abstract: We live in a world full of data, in which charts are routinely used to communicate complex insights more effectively than spreadsheets or reports. Twitter is no exception – tens of thousands of data visualisations on virtually any topic are shared every day. We aim to understand how data, rendered visually as charts or infographics, “travels” on social media. To do so we propose a neural network architecture that is trained to distinguish among different types of charts, for instance line graphs or scatter plots, and predict how much they will be shared. This poses significant challenges because of the varying format and quality of the charts that are posted, and the limitations in existing training data. To start with, our proposed system outperforms related work in chart type classification on the ReVision corpus, a benchmark from the literature. Furthermore, we use crowdsourcing to build a new corpus, more suitable to our aims, consisting of chart images shared by data journalists on Twitter. We evaluate the system on the second corpus with respect to both chart identification and virality prediction, with promising results.

Our system and findings could be used in different scenarios, from generating automatic text captions and recommending chart improvements in data visualisation tools to informing marketing strategies for brands that use data visuals to gauge customer engagement. In addition, our approach, including both the neural architecture and the method to create labelled data, could form the basis for the development of visual question answering solutions tailored to data visualisations, with applications in fact checking and misinformation online.

 

Biography: Elena Simperl is a professor of computer science at King’s College London, a Fellow of the British Computer Society and a former Turing fellow. According to AMiner, she is in the top 100 most influential scholars in knowledge engineering of the last decade, as well as in the Women in AI 2000 ranking. Before joining King’s College in early 2020, she held positions at the University of Southampton, as well as in Germany and Austria. She has contributed to more than 20 research projects, often as principal investigator or project lead. Currently, she is the PI of two grants: H2020 ACTION, where she develops human-AI methods to make participatory science thrive, and EPSRC Data Stories, where she works on frameworks and tools to make data more engaging for everyone. She has authored more than 200 peer-reviewed publications in knowledge engineering, semantic technologies, open and linked data, social computing, crowdsourcing and data-driven innovation. Over the years she has served as programme and general chair of several conferences, including the European and International Semantic Web Conference, the European Data Forum and the AAAI Conference on Human Computation and Crowdsourcing.

Upcoming Nov 2020 talk – Roy Ruddle

This event will be via Zoom, and you need to sign up using Eventbrite https://www.eventbrite.co.uk/e/visualizing-the-scale-complexity-of-data-quality-tickets-127981589379

Visualising the Scale and Complexity of Data Quality

24th November, 12:00-13:00

Abstract: Descriptive statistics are typically presented as text, but that quickly becomes overwhelming when datasets contain many variables or analysts need to compare multiple datasets. In this seminar, I will describe visualization designs for three categories of descriptive statistic (cardinalities, distributions and patterns), which scale to more than 100 variables and use multiple channels to encode important semantic differences (e.g., zero vs. 1+ missing values). I will also describe a novel tool, which exploits set visualization techniques to allow users to explain patterns of missing values that involve many fields. The visualizations were evaluated using large (multi-million record) datasets of electronic health records (EHRs), and provided users with a variety of important insights.

Bio (from https://www.turing.ac.uk/people/researchers/roy-ruddle)

Roy Ruddle is a Professor of Computing at the University of Leeds, and Deputy Director (Research Technology) of the Leeds Institute for Data Analytics (LIDA). He has worked in both academia and industry, and researches visualization, visual analytics and human-computer interaction in spaces that range from high-dimensional data to virtual reality. In a 12-year collaboration with pathologists at the Leeds Teaching Hospitals NHS Trust (LTHT), he developed the Leeds Virtual Microscope (LVM) for visualizing tera-pixel image collections on Powerwall and ultra-high definition displays, leading to its use for pathology training in NHS hospitals and commercialisation by Roche.

Online talks October 2020 – Network Visualisations

Our second set of online talks, on the theme of Network Visualisations, took place on 19th October 2020.

 

Talk 1: “MiRANA: Visualising networks in genetic epidemiology”, Chris Moreno-Stokoe

MiRANA is an upcoming visualisation tool intended to help genetic epidemiologists explore and evaluate network effects in their data. MiRANA arranges estimates for the effects of traits on each other (e.g., the effect of BMI on diabetes) to produce a public health network. The tool is aimed at Mendelian Randomisation research. Chris Moreno-Stokoe demonstrated its ease of use and showed output visualisations of network effects (including use in a data exploration game). Chris is a third-year PhD candidate studying genetic epidemiology and interactive data visualisation. MiRANA is in development for an official launch next year.
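As a rough illustration of what arranging pairwise effect estimates into a network can look like, here is a minimal Python sketch. It is not MiRANA's code: the data layout, the function names and the effect sizes are all assumptions made for this example, and the numbers are placeholders rather than real results.

```python
from collections import defaultdict, deque

# Hypothetical effect estimates as (exposure, outcome, effect size).
# The numbers are placeholders for illustration, not real results.
ESTIMATES = [
    ("BMI", "diabetes", 0.8),
    ("BMI", "blood pressure", 0.3),
    ("blood pressure", "heart disease", 0.5),
    ("diabetes", "heart disease", 0.4),
]


def build_network(estimates):
    """Return a directed adjacency map: exposure -> {outcome: effect}."""
    network = defaultdict(dict)
    for exposure, outcome, effect in estimates:
        network[exposure][outcome] = effect
    return dict(network)


def downstream_traits(network, start):
    """Breadth-first search for every trait reachable from `start`."""
    seen, queue = set(), deque([start])
    while queue:
        trait = queue.popleft()
        for outcome in network.get(trait, {}):
            if outcome not in seen:
                seen.add(outcome)
                queue.append(outcome)
    return seen


network = build_network(ESTIMATES)
reachable = downstream_traits(network, "BMI")
print(sorted(reachable))
```

Following the edges outward from one trait gives the set of traits it may influence directly or indirectly, which is the kind of question a network view makes easy to ask.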

Chris provided the following links for people to use the tool with an example dataset:

http://www.morenostok.io/mirana/

http://www.morenostok.io/mirana/exampleMRdata.csv

Chris’s slides can be downloaded here.

 

Talk 2: “A visualisation of Isambard Kingdom Brunel’s steamship social network”, Gareth Jones

The construction of the three great steamships, starting with the SS Great Western, involved many hundreds of people. In this project we built a visual social network using a d3 force graph to investigate the relationships between the key individuals involved in the construction of each ship. Working with Dr James Boyd at the Brunel Institute, the network was constructed based on the analysis of hundreds of letters of correspondence between Brunel and the engineers, architects and investors involved in each project. The network is still under development and is available at https://brunels-network.github.io/network/.

The simple force graph simulation example Gareth created using the d3 force library can be found here: https://github.com/gareth-j/d3-react-example
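To give a feel for how correspondence records might become input for a force graph, here is a minimal Python sketch. It is not the project's actual pipeline: the letter records, the names "Engineer A" and "Investor B", and the function name are hypothetical. The output uses the nodes/links layout that d3's link force can consume when an id accessor is set.

```python
import json
from collections import Counter

# Hypothetical letter records as (sender, recipient) pairs; placeholders only.
LETTERS = [
    ("Brunel", "Engineer A"),
    ("Engineer A", "Brunel"),
    ("Brunel", "Investor B"),
    ("Brunel", "Engineer A"),
]


def letters_to_graph(letters):
    """Count letters per pair of people and return a nodes/links dictionary."""
    # Treat correspondence as undirected: sorting each pair merges A->B and B->A.
    counts = Counter(tuple(sorted(pair)) for pair in letters)
    people = sorted({name for pair in counts for name in pair})
    return {
        "nodes": [{"id": name} for name in people],
        "links": [
            {"source": a, "target": b, "value": n}
            for (a, b), n in sorted(counts.items())
        ],
    }


graph = letters_to_graph(LETTERS)
print(json.dumps(graph, indent=2))
```

The "value" field holds the number of letters exchanged, which a front end could map to link thickness or link strength.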

Research Software Engineering run a mailbox, ask-rse@bristol.ac.uk, for queries about code design, testing and performance.

Gareth’s slides can be downloaded here.