Australasian Mathematical Psychology Conference 2019

Analysing social media data: A mixed-methods framework combining computational and qualitative text analysis

Matthew Andreotta
School of Psychological Science, University of Western Australia
Robertus Nugroho
Data61, CSIRO
Soegijapranata Catholic University
Mark Hurlstone
School of Psychological Science, University of Western Australia
Fabio Boschetti
Ocean & Atmosphere, CSIRO
Simon Farrell
School of Psychological Science, University of Western Australia
Iain Walker
School of Psychology and Counselling, University of Canberra
Cecile Paris
Data61, CSIRO

For qualitative researchers, social media offers a novel opportunity to harvest a massive and diverse range of content without the need for intrusive or intensive data collection procedures. However, performing a qualitative analysis across a large social media data set is cumbersome and impractical. Instead, researchers often extract a subset of content to analyse, but a framework to guide this extraction is currently lacking. We present a four-phase framework for improving the extraction process, which blends the capacity of data science techniques to project large data sets into smaller spaces with the capacity of qualitative analysis to address research questions. We applied the framework to 201,506 Australian tweets on climate change from 2016. By combining the Non-Negative Matrix inter-joint Factorisation (Nugroho, Zhao, Yang, Paris, & Nepal, 2017) and Topic Alignment (Chuang et al., 2015) algorithms with the qualitative techniques of Thematic Analysis, we derived five overarching topics of climate change commentary. Our approach is useful for researchers seeking to perform qualitative analyses of social media, and for researchers wanting to supplement their quantitative models with a qualitative analysis of broader social context and meaning.
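To make the topic alignment idea concrete, the following is a minimal sketch of matching topics across two independent model runs by the cosine similarity of their word weights. It is a simplified illustration only, not the TopicCheck implementation of Chuang et al. (2015) and not the pipeline used in this work. The function names (`cosine`, `align_topics`), the similarity threshold of 0.5, the greedy matching rule, and the toy topics are all assumptions made for the example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two topics given as {word: weight} dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align_topics(run_a, run_b, threshold=0.5):
    """Greedily pair topics from two runs, most similar pairs first.

    Each topic is matched at most once; pairs whose similarity falls
    below `threshold` (an assumed cut-off) are left unmatched.
    Returns a list of (index_in_run_a, index_in_run_b, similarity).
    """
    candidates = sorted(
        (
            (cosine(topic_a, topic_b), i, j)
            for i, topic_a in enumerate(run_a)
            for j, topic_b in enumerate(run_b)
        ),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for similarity, i, j in candidates:
        if similarity < threshold:
            break  # remaining candidates are all less similar
        if i in used_a or j in used_b:
            continue  # one of the two topics is already paired
        used_a.add(i)
        used_b.add(j)
        matches.append((i, j, similarity))
    return matches


# Hypothetical toy topics (invented for illustration, not results from the study).
run_a = [
    {"climate": 0.6, "science": 0.3, "denial": 0.1},
    {"reef": 0.7, "bleaching": 0.3},
]
run_b = [
    {"reef": 0.6, "bleaching": 0.3, "coral": 0.1},
    {"climate": 0.5, "science": 0.4, "policy": 0.1},
    {"election": 0.8, "vote": 0.2},
]

matches = align_topics(run_a, run_b)
for i, j, similarity in matches:
    print(f"run A topic {i} <-> run B topic {j} (similarity {similarity:.2f})")
```

Topics that recur across runs, such as the two matched pairs here, are candidates for stable topics, while unmatched topics (the third topic of run B in this toy data) may reflect run-specific noise. In a mixed-methods workflow, stable topics would then be passed on for qualitative reading rather than interpreted from word weights alone.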

A preprint of this work is available at

Chuang, J., Roberts, M. E., Stewart, B. M., Weiss, R., Tingley, D., Grimmer, J., & Heer, J. (2015). TopicCheck: Interactive alignment for assessing topic model stability. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (pp. 175–184). Denver, Colorado: Association for Computational Linguistics.
Nugroho, R., Zhao, W., Yang, J., Paris, C., & Nepal, S. (2017). Using time-sensitive interactions to improve topic derivation in Twitter. World Wide Web, 20(1), 61–87.