Key Takeaways from the Data Visualization Society’s Outlier 2021 Conference

The Data Visualization Society’s (DVS) first conference, Outlier 2021, took place on the 4th, 5th, and 7th of February 2021. It was held online, with about 1000 participants joining from their computer screens all over the world. There were 41 main talks of about 20 minutes each, as well as dozens of smaller sessions.

Talks were distributed within a large time window, suitable (or not) for people in different time zones. I was only able to participate in full on Sunday the 7th. But because the presentations were prerecorded and made available as videos immediately after each talk, I was able to see every talk.

To profit the most from this event, and to process it in a structured way for myself, I briefly summarized the key takeaways from the talks. These summaries are listed here. The talks I found most interesting are summarized in more detail than the others. I’ll add further summaries as I process further talks. I’ll link directly to the corresponding videos once they are made available on YouTube.

To process the content in retrospect, it also made sense for me to regroup the talks into categories. The categories that emerged are: general methodology, tools, history, data art and experimental case studies, and case studies. My interest mainly lay in the methodological talks, followed by presentations of tools. The talks on data visualization history and data art provided some lighter content in between. The enormous breadth of case studies greatly contributed to the diverse and international atmosphere of this event. The talks that I most recommend watching in full are marked with an asterisk* below.

Data Visualization General Methodology

How to get your organization to value data visualization – and you! (Steve Wexler)*

Steve Wexler demonstrated how to convince people of the power of data visualization in company environments where people are still working with raw numbers in spreadsheets. By showing examples and asking questions, one can let people experience for themselves that data visualizations allow them to find answers much faster than tables. Dashboards can be made more attractive for people if they can see their own relative position in the data. Needless discussions about chart types and color choices can be avoided by having experiments at hand that demonstrate your point, such as estimating the relative sizes of bubbles/circles versus bars.

Soft Landing, Firm Impact: Practical Tips On How to Give and Receive Meaningful Data Visualization Feedback (Candra McRae)*

Candra McRae gave practical tips on how to give and receive feedback. When giving feedback, one should be aware of one’s tone, body language, and biases. Personal opinions should be voiced in the form of “I” and “me”. It is better to give feedback in a one-on-one setting than in a group. One should first seek to understand why things were done in a certain way. The feedback given should be clear and honest but also kind. The stances of dataviz experts (Tufte, Few) should not be used as arguments in a discussion. When receiving feedback, one shouldn’t shut down or become argumentative. One should ask engaging, open questions. It is also important to act upon the feedback received.

Side Projects (Jan Willem Tulp)*

Jan Willem Tulp elaborated on what makes good side projects in data visualization. Such projects serve to learn something and to show something. For data visualization designers starting out, such projects usually serve to fill the portfolio. But they also make sense for seasoned professionals, because they can lead to paid projects. Side projects provide the opportunity to fully do your own thing, with your ideas, creativity, and skills. It is recommended to keep a notebook/spreadsheet of ideas and interesting datasets. Good side projects are relevant and original. Relevance can be achieved by using a well-known dataset, treating a current event, and by allowing people to find themselves in the data. Originality can be achieved by collecting one’s own data, redesigning an existing visualization, trying a new visualization concept, visualizing uncommon questions, and by creating an engaging design that people spend more time with. Mr. Tulp then discussed how his own and other people’s side projects meet the criteria of relevance and originality.

My Statistic Enemy, or Why Difficulties Make Better Data Visualization (Julie Brunet)*

Julie Brunet explained how she cooperates with people with different skillsets. The basic idea is to manage that which you don’t know. People in the data visualization community have very different backgrounds. There is a temptation to try to learn to do everything by oneself. But a better approach is to cooperate with people that have the skills that one lacks for a project. People can thus alternately take the lead for different parts of a project.

Personal comment: The slides of this presentation were probably the most beautiful of the conference.

Data Viz, the UnEmpathetic Art (Mushon Zer-Aviv)*

Mushon Zer-Aviv discussed how empathy can be achieved in data visualizations. Humans easily empathize with individuals but not with masses. Research has shown that people are willing to donate more than double the amount to save an individual (identifiable life) than to save the many (statistical lives). Even when the statistics are merely shown alongside the individual fates, the donations go down. This is called statistical numbing. Daniel Kahneman wrote about two systems of thinking. System 1 is fast, automatic, and involuntary; System 2 is slow, effortful, and deliberate. Often System 2 rationalizes in retrospect what System 1 has perceived. Empathy can be located in System 1. Or, speaking in data visualization terms, it can be called a preattentive attribute that focuses our attention. A good approach to reaching empathy with data visualization is thus to start with the individual fate and then zoom out to the bigger picture. But it is not enough to simply rouse people; there must also be a specific call to action. Not just the status quo should be shown, but also the better situation that could be.

Personal comment: Especially in the Covid crisis, where statistical data represents thousands of deaths, this is a very pressing topic. Many great examples of empathic and unempathetic data visualizations have emerged in this context.

3 Languages, 3 Aesthetics, 1 Graphic: A Case Study of Visualization in a Multicultural Environment (Nilangika Fernando)*

Nilangika Fernando explained how she takes three different cultural aesthetics in Sri Lanka into account when designing data visualizations. The official languages of Sri Lanka are English, Sinhala, and Tamil. When she published data visualizations from an English context, translated into Sinhala, they would get little traction in Sinhala media. Looking at newspaper front pages, she noticed that each language and culture has its own look and feel. Newspapers try to make their front page as attractive as possible to the given audience, so they can be used to determine whether an audience has a different design aesthetic. These specific aesthetics could also be seen in online memes of the different cultures. To analyze an aesthetic one should look at the layout, color, font, images, and narrative. Icons need to match the cultural context. For instance, a savings box in the form of a pig would not be understood in Sri Lanka, or might even be considered offensive. Also, the hair and eye color of icons should be appropriate. Then she explained how to bridge this visual gap. She creates the infographic in the language of the primary audience, and then translates it into the others. She works with collaborators who are based in the different cultures. Finally, she explained how data visualization can be presented in a non-data culture. She advised serving infographics in small doses, delivering a finished product that is attractive to publish, and using storytelling.

Mind Games: The Psychology Behind Designing Beautiful, Effective and Impactful Data Viz (Amy Alberts)*

Amy Alberts talked about results of user research at Tableau. Using eye trackers, she analyzed how people perceive dashboards. Such eye tracking studies are in themselves data visualizations because the results are shown and analyzed as gazeplots, heatmaps, and gaze opacity maps. Given 10 seconds, people focused their attention especially on big numbers, high color contrast, pictures of humans, and maps. People also tended to read the dashboards starting in the upper left corner, moving right and down. When the viewing duration was increased, the viewing patterns remained largely the same. But when a specific task was given for viewing a dashboard, the patterns fell apart. So humans are on the one hand dumb monkeys, looking with little actual intent, but on the other hand also very intelligent in navigating systems to reach a goal. These results are in line with UX research. The mentioned attention-getters can be used purposefully for designing dashboards, notably by taking up corporate design elements. Priming can also be used to focus attention, by saying or writing something related to what you want people to focus on before showing the dashboard.

Are Your Data Visualizations Excluding People? (Larene Le Gassick, Sarah Fossheim, Frank Elavsky)

Iron Quest: Lessons from the Community (Sarah Bartlett)

Data Designer: A Self Portrait (Valentina d’Efilippo)

Labels Matter (Gaelan Smith)

Beyond Word Clouds: Visualizing the Linguistic Patterns of Political Speeches (Riva Quiroga)

Using Zipf’s Law to help Understand COVID-19 (Howard Wainer)

Data Visualization Tools

Going Beyond Matplotlib and Seaborn: A Survey of Python Data Visualization Tools (Stephanie Kirmer)*

Stephanie Kirmer provided an overview of six Python data visualization libraries. She included the older standard libraries Matplotlib (2003) and Seaborn (2012), and the newer libraries Bokeh (2012), Altair (2016), Plotnine (2017), and Plotly (2013). The target criteria she wanted libraries to meet are an easy learning curve, consistent grammar, flexibility, beautiful output, and interactivity. She tested each library with a set of standard charts, and then discussed how the target criteria were met. She advises against using the older libraries. In conclusion, she showed which of the four newer libraries should be used for each individual target criterion. For an easy learning curve: Plotnine or Altair. For consistent grammar: Plotnine or Altair. For flexibility: Plotnine or Bokeh. For beautiful images: Altair or Bokeh. For interactivity: Plotly or Bokeh. Generally, Altair is only suitable for small datasets.
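To make these recommendations easier to combine, here is a minimal sketch in plain Python that encodes the list above as a lookup table. The table contents come from my summary of the talk; the function name `candidates` and the idea of intersecting the recommendations are my own illustration, not something presented in the talk.

```python
# Recommendations per target criterion, as summarized above.
RECOMMENDATIONS = {
    "easy learning curve": {"Plotnine", "Altair"},
    "consistent grammar": {"Plotnine", "Altair"},
    "flexibility": {"Plotnine", "Bokeh"},
    "beautiful output": {"Altair", "Bokeh"},
    "interactivity": {"Plotly", "Bokeh"},
}


def candidates(*criteria):
    """Return the libraries recommended for ALL of the given criteria.

    Hypothetical helper for illustration: it simply intersects the
    recommendation sets and returns the result in alphabetical order.
    """
    sets = [RECOMMENDATIONS[criterion] for criterion in criteria]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


print(candidates("flexibility", "interactivity"))
print(candidates("easy learning curve", "beautiful output"))
print(candidates("consistent grammar", "interactivity"))
```

The last call returns an empty list: according to this summary, no single library was recommended for both consistent grammar and interactivity, so some compromise is needed there.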

Navigating the Wide World of Data Visualization Libraries (for the Web) (Krist Wongsuphasawat)*

Krist Wongsuphasawat explained a framework for choosing data visualization libraries for the web, mainly JavaScript libraries. He located libraries within a two-dimensional design space. The x-axis is the level of abstraction from 1 to 5. The y-axis shows different categories of API design. Level 1 is graphics libraries working on a low level. P5.js, Three.js, and Two.js fall into this category. Level 2 is low-level building blocks. D3, visx, cola, dagre, and others belong to this category. Level 3 is visualization grammars. Vega-Lite, Chart Parts, Muze, and G2 are part of this category. Level 4 is high-level building blocks. ECharts, Highcharts, Plotly, Victory, React-Vis, and Semiotic belong to this category. Level 5 is chart templates. Chart.js and Nivo are part of this category. The other dimension, API design, consists of the categories JSON, JSON with callback, plain JavaScript, and framework specific. He then showed where the different libraries are located within this dimension. Finally, he explained how to choose a library. It should allow you to create what you need (custom, rare, or common data visualizations) within the time you have. Familiarity with a specific library plays a role here. Technical aspects that can be considered are performance, the tech stack in use, and project lifespan (maintenance of the library in the long term).

ggplot Wizardry: My Favorite Tricks and Secrets for Beautiful Plots in R (Cédric Scherer)*

Cédric Scherer explained how he creates print-ready charts entirely programmed in R with the ggplot2 library and extensions. He refined his R skills mainly within the weekly TidyTuesday challenge. The R community shares extension packages for a wide variety of graphs and extra functionalities. He then demonstrated the capabilities of the extension packages he regularly uses in his work. The package ggtext provides improved text rendering. The package ggforce provides annotations. The package ggdist is useful for visualizing distributions and uncertainty. Then he showed several tips for improving charts within the ggplot2 library by changing default parameters. Plot titles and plot captions can be aligned with the outer margins. The legend can be placed at the top of the chart. The legend formatting can be improved. The axis labels can be placed closer to the axes. The clipping of elements that protrude beyond the borders of the chart, such as long labels, can be switched off. The outer margin between chart and border of the image can be enlarged. An image can be added to the plot to make it more illustrative. Finally, he showed how the patchwork package can be used to combine and arrange several plots.

Data Visualization History

Otto and Gerd in the Chauvet Caves (Nigel Holmes)*

Nigel Holmes explained how basic principles of information design can be traced back to early cave art. The earliest figurative cave art known to date is in Sulawesi and is 45 500 years old. Abstract marks from 70 000 to 100 000 years ago have been found in the Blombos cave. Such drawings might have been made by Homo sapiens or other early hominids. Many of the known pictures of cave art are reproduced drawings, not actual photos of the art itself. Jumping forward to modern times, in the 1920s Otto Neurath and Gerd Arntz developed the Isotype graphic language to display statistical information. Neurath urged the artists to find the essence of the depicted object. Objects are shown in profile, from the side as a silhouette, omitting surface details. At first, icons were cut out from black cardboard; later they were printed as linocuts to obtain this simple appearance. A basic mechanism used in Isotype is to combine two icons into one. For instance, a waiter can be represented as a person with a coffee cup. The same principles of depicting the essential outline in side view and combining basic elements into icons can be found in cave art. With combined elements, rhinos are shown woolly and with their summer coat. Thus it is valid to say that cave painters were the first information designers. “They were counting, recording, explaining, storytelling, while showing only the essentials.” Today the same principles can be found in road signs showing animal silhouettes, signs in airports, and emojis.

Florence Nightingale is a Design Hero (RJ Andrews)

Spotting Minard on the Corner Three (Senthil Natarajan)

Data Art and Experimental Case Studies

3D Geo Dataviz: From Insight to Data Art (Craig Taylor)*

Craig Taylor showed spectacular 3D visualizations of traffic data that he develops at the company Ito. These cinematic visualizations serve to gather insight and are used as marketing material. He presented the project transit in motion, which showed the change of patterns in public bus mobility during a Covid lockdown. He presented several possibilities of representing the data, some of which were quite experimental and artistic. Then he presented the project Europe’s quiet skies, which shows the reduction of airplane flights in Europe during the Covid crisis. In the Q&A session Craig Taylor explained that he uses QGIS and ESRI ArcMap for data preparation and visualizes the data using Houdini, Cinema 4D, and the Octane rendering engine.

Personal comment: This talk demonstrated the controversy around 3D data visualization and the use of animations very well. On the one hand, beautiful, spectacular images. On the other hand, a way of presenting data that makes it hard to derive deeper analytical insight.

Data Through Design: Creating a Data Art Exhibition (Sara Eichner)

Using Data in a Fine Art Practice (Wilma Wolf)

Loud Numbers: Telling Stories With Data and Music (Miriam Quick, Duncan Geere)

Step and Repeat: Visualizing Human Motion (Emma Margarite Erenst)

Coding with Fire: Cooking with Data (Ian Johnson, EJ Fox)

Data Visualization Case Studies

A Viral Map (Karim Douieb)*

Karim Douieb showed how he developed an animated visualization of the results of the U.S. presidential election of 2016. This animation went viral on social media. The animation visualizes the fact that land doesn’t vote, people do, by transitioning each state area to a bubble proportional to the population of the state. He presented a detailed walkthrough of how he developed this animation in JavaScript, using the Observable working environment and the D3 library. He used a D3 force layout to distribute the bubbles, and Flubber for the animated transitions. He published his result as a looping GIF on social media. The attention that his work received when posted by others exceeded that of his own posting. He noted that a watermark should be added, to avoid one’s work being shared widely without attribution.
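One detail worth spelling out is how a bubble is made "proportional" to a population. The talk's implementation was in JavaScript with D3; the following is only a minimal Python sketch of the usual approach, under my assumption that the bubble area (not the radius) encodes the population, so the radius grows with the square root of the value. The populations, the function name `bubble_radius`, and the maximum radius of 50 are made up for illustration.

```python
import math

# Hypothetical populations in millions, chosen only for illustration.
populations = {"State A": 40.0, "State B": 10.0, "State C": 2.5}


def bubble_radius(population, max_population, max_radius=50.0):
    """Return a radius such that the circle AREA is proportional to population.

    Area grows with the square of the radius, so the radius must scale
    with the square root of the population.
    """
    return max_radius * math.sqrt(population / max_population)


largest = max(populations.values())
radii = {name: bubble_radius(pop, largest) for name, pop in populations.items()}

for name, radius in radii.items():
    area = math.pi * radius ** 2
    print(f"{name}: radius={radius:.1f}, area={area:.0f}")
```

A state with a quarter of the population gets half the radius and therefore a quarter of the area. Scaling the radius linearly instead would exaggerate the differences between states.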

Mapping The Covid19 Research Landscape: The Power of Data Viz Over Black Boxes (Caroline Goulard)*

Caroline Goulard presented a tool for visualizing scientific papers about Covid. There currently exist more than 50 000 publications on this topic. This makes it very difficult for researchers to find the relevant ones. “Dark knowledge” is a big problem. 50 % of publications on Covid are not cited, 6 % are not in English. The currently available tools such as Pubmed, Scopus, and Google Scholar only display search results as paginated lists. It is not transparent how these ranked lists were generated. Also, users need to specify precisely what they are looking for. Caroline Goulard proposes spatial mapping as part of the solution. This helps to get a mental representation of the data, helps interaction, and helps memorization. Her team developed two approaches. The first approach is a citation network graph, implemented via a force-directed graph. The second approach is a dimensionality reduction map. Here publications that have similar keywords are located closer together in two-dimensional space. This replicates walking through a library and looking into the nearby shelves. This second approach was favored by interviewed users. Clusters of publications were created using hierarchical clustering. Each cluster was assigned a color. In the interface, colors can also be assigned to years of publication, fields of study, and keywords. The interface also allows users to look at the detailed metadata of each publication. In user testing it was found that people mainly use search functionalities, and then look at the map for confirmation. Users found using the tool a “disturbing experience”. So a sexy interface will not guarantee that a tool will actually be used next time, instead of the standard tools. In the Q&A session Caroline Goulard explained that the application was programmed with WebGL and the HDBSCAN library.
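To illustrate what "similar keywords" can mean in practice, here is a minimal Python sketch that measures the distance between publications by the overlap of their keyword sets (Jaccard distance). This is my own assumption for illustration; the talk did not specify which similarity measure or projection method the tool uses. The paper names and keywords are invented. A real map would feed such pairwise distances into a dimensionality reduction step to obtain 2D positions.

```python
def jaccard_distance(a, b):
    """Distance between two keyword sets: 0.0 = identical, 1.0 = no overlap."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Invented example publications with invented keywords.
papers = {
    "paper_1": {"vaccine", "mrna", "immunity"},
    "paper_2": {"vaccine", "immunity", "antibody"},
    "paper_3": {"lockdown", "mobility", "policy"},
}


def nearest_neighbor(name):
    """Return the name of the other paper with the smallest keyword distance."""
    others = (other for other in papers if other != name)
    return min(others, key=lambda other: jaccard_distance(papers[name], papers[other]))


print(jaccard_distance(papers["paper_1"], papers["paper_2"]))
print(jaccard_distance(papers["paper_1"], papers["paper_3"]))
print(nearest_neighbor("paper_1"))
```

Papers 1 and 2 share two of four distinct keywords, so they would end up close together on the map, while paper 3 shares none and would be placed far away, like a book on a distant shelf.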

How Do We Translate Cultural Experiences Into Data Stories? (Mick Yang, Isabella Chua)

Narrating a Nation Through Numbers – India in Pixels (Ashris Choudhury)

Data Viz for Non-Profit (Guillermina Sutter Schneider, Luis Ahumada)

Data Points Are People Too (Bronwen Robertson, Saja Hathman, Joachaim Mangalima, Zdenek Hynek)

#BlackInDataWeek: Connecting and Celebrating Black People in Data Fields (Rith Agbakoba, Jarrett C. Hurms, Simone Webb)

Visualizing the history of mass incarceration (Sarah Fawson)

Visualizing Transgender Day of Remembrance: Lessons in Bearing Witness Through Making Losses Visible and Visceral (Kelsey Campbell, Cathryn Ploehn)

Visualization of Violence in Colombia (Gustavo Ojeda)

Are We Fine With Global warming? The Role of Nuclear Power & Low Carbon Energy (Harim Jung)

Using DataViz to Re-sensitive the World to Animals (Karol Orzechowski)

An Odd Couple’s Journey Towards SciArt: Design Meets Science and Vice-Versa (Greta Carrete Vega, Estefania Casal)

Shaping Data Viz Through Student Newsrooms (Raeedah Wahid, Jessica Li)

Becoming a data driven learner (Aminah Aliu)

At the conference Jason Forrest and Mary Aviles announced that Nightingale, the online publication of the Data Visualization Society, will also appear as a printed magazine.

Thanks to DVS Events Director Mollie Pettit and the rest of the volunteering organization team for this event: Duncan Geere, Evelina Judeikyte, Gabby Merite, Lloyd Richards, Maxene Graze, Marília Ferreira da Cunha, Frederic Fery, Céline Genest, Katy Liang, Jennifer Li, Bill Tran, Yi Ning Wong, Isabella Chua, Akshit Aggarwal, Nöelle Rakotondravony, Naomi Smulders.