9 Jan 2023 - 13 Jan 2023 11:00 - 16:00 (GMT) Daily Online


The Cambridge Digital Humanities Social Data School is an application-only online intensive teaching programme structured around the life-cycle of a digital research project, covering principles of research design, data collection, cleaning and preparation, methods of analysis and visualisation, and data management and preservation practices. The school welcomes applications from all backgrounds, including journalists, NGOs, activists, trade unionists and members of civil society organisations.

Visualising Data/Investigating Images

At the Social Data School, we will focus on the image as a place of inquiry for investigations. We will extract data from social networks, and multiple video and photographic sources, and learn how computer vision and automation can help in the investigation of public interest stories and projects. In short, we will develop a critical toolkit for interrogating the image-based cultures of the digital age.

Modules will cover the following content:

  • Methodology for digital investigations
  • Video and image data analysis
  • Geolocation and Open Source investigations
  • Introduction to Critical AI
  • Introduction to Social Media data mining, analysis and visualisation
  • Data spatialisation using Python and Blender (TBC)

Q&A Session

Do you have questions about how the school will be run, the content, and how to apply? Join us for this Q&A session with the school’s convenors.

When and Where

This school will be held online: 9 – 13 January 2023. Data School live sessions are timetabled daily from 11:00 – 16:00 (GMT). To convert this to your timezone you can use this time zone converter.

During the course, you will be provided with links to our virtual learning environment (Moodle) where we will publish course content and links to our online video delivery platform for teaching and social interactions.

This upcoming online version will run for one week, rather than the two weeks of previous online Social Data Schools held during the pandemic, but the amount of instruction is similar. An in-person Social Data School is also scheduled for March 2023, and those interested in attending under that format will be welcome to apply once the application process opens (though please note that the content will differ and fees will be significantly higher).


Cambridge Digital Humanities is committed to democratising access to digital methods and tools, and is offering subsidised participation fees to encourage applications from those who do not normally have access to this type of training. In addition, a small number of bursaries are available to those who can demonstrate financial need. You can apply for this on the application form.

The fees include all teaching costs.

  • £245 (Standard Rate)
  • £45* (Concessionary Rate)

*Concessionary rate is for students, unemployed, community projects, unfunded projects, and Global South residents.

The deadline for payment is two weeks before the start of the School.


The Cambridge Data Schools are competitive and application-only schools. Places are given to those who we feel can make best use of the classes.

No previous experience in coding is required and there are no specific academic requirements. However, the course content is broadly suitable for those with an undergraduate degree or equivalent professional experience. The School is taught in English. You will need a reliable internet connection to join in, and the ability to download free, open software for use during the School.

We accept applications from everyone, including active members of media organisations, freelance journalists, students, and so on. We are committed to facilitating participation by women and by black and minority ethnic candidates, as these groups have historically been under-represented in the technology and data science sector. We also welcome applications from outside the UK, provided applicants can attend the live workshop slots from 11:00 – 16:00 GMT. Sessions will not be recorded and therefore live attendance is required.


  • Anne Alexander (Director of Learning, CDH)
  • Irving Huerta (Data School Convenor, CDH)
  • Hugo Leal (Research Associate, Minderoo Centre for Technology and Democracy)
  • Members of Amnesty International’s Digital Verification Corps (Centre of Governance and Human Rights at Cambridge University)
  • Nicholas Masterton (Researcher, Forensic Architecture) (TBC)

Sessions will include live-taught instruction, demonstrations and discussions online, with access to self-paced study materials and support via email-based discussion groups between sessions. Participants will need a laptop or desktop computer and internet access to participate in the sessions. Some sessions will require software installation – full instructions will be provided but please ensure you have access rights to install software on the device you will be using.

How to apply

Fill in the application form by 11 December 2022. You will hear whether your application was successful or not by 14 December 2022.

The Social Data School is application-only with limited places. In your application, make best use of the free-text sections to explain your current experience and what you would get out of attending the School.



Contact us via dataschool@cdh.cam.ac.uk


Modules scheduled between 11:00 - 16:00 GMT

Methodology for digital investigations

Irving Huerta

This module addresses fundamental aspects of investigative practice in digital environments and stresses the importance of using a methodology (or methodologies) for data inquiry. Researchers who conduct investigations with Open Source Intelligence (OSINT) tools, collect and analyse data, or develop automated tools for investigations will benefit from this module. It critically reflects on the essential phases of digital investigations at large: Identification of a Problem (formulation of hypotheses), Information Gathering, Preservation, Verification, Analysis, and Dissemination.

By the end of the module, participants will have the principles to conduct investigations that effectively identify, prove, and strategically disseminate issues in the public interest, with fairness and rigour. Its scope is meant to be applied along with the rest of the tools and methods from SDS 2022 modules.

Video data analysis

Tom Kissock

Description coming soon.

Introduction to critical AI

Anne Alexander

The current generation of machine learning Artificial Intelligence systems is now widely deployed in contexts as diverse as recommender systems for online shopping and streaming music and video services, facial recognition and biometric systems used by state and private security agencies, and the analysis, summarisation and generation of texts and images. This module will present the technical fundamentals of machine learning systems, exploring the challenges of structural bias, lack of transparency and the impact that the design of contemporary AI has on communities and individuals who face structural discrimination. We will demonstrate web-based platforms for creating Machine Learning models and learn about experimental techniques for exploring their potential and limitations.

Social network analysis with digital data

Hugo Leal

‘Social network’ has become a catch-all term for the online spaces where we connect with other people and trade information in exchange for our personal data and attention. Considering the societal impacts of data-driven economics and politics, knowing how to reclaim and reappropriate these data to trace the form and content of online social networks is a vital skill for journalists, civil society and academics alike.

This module will provide a gentle introduction to the field of social network analysis (SNA) with digital data. Social Data School participants will be given the opportunity to “learn by doing” the process of digital data collection as well as the basics of social network visualisation and analysis. After being introduced to the fundamental concepts of SNA, the participants will explore all stages of a social network analysis project, including research design, data collection, data wrangling, graph visualisation, and analysis with essential network measures. The focus will be on the retrieval of electronic archival data (e.g., social media platforms) for non-programmers, and on practical examples of network analysis with specialised software (e.g., Gephi). At the end of the two sessions, participants will be equipped with the basic tools to perform meaningful visualisations and analyses of network data. Typical use cases of SNA range from investigative journalism to NGO monitoring and academic research.
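For readers who are curious what an "essential network measure" looks like in practice, here is a minimal sketch in Python. It is not part of the course materials, which are aimed at non-programmers. The names and the handful of interactions are invented for illustration. The sketch counts each person's degree (their number of connections) and writes the edges as a table with Source and Target columns, the layout Gephi's spreadsheet import accepts for edge lists.

```python
import csv
import io
from collections import Counter

# Hypothetical interactions (who mentioned whom); invented for illustration.
edges = [("ana", "ben"), ("ana", "caro"), ("ben", "caro"), ("dev", "ana")]


def degree(edge_list):
    """Count how many edges touch each node (undirected degree)."""
    counts = Counter()
    for source, target in edge_list:
        counts[source] += 1
        counts[target] += 1
    return counts


def to_edge_csv(edge_list):
    """Serialise edges as CSV text with Source and Target headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edge_list)
    return buffer.getvalue()


degrees = degree(edges)
edge_csv = to_edge_csv(edges)

# List nodes from most to least connected.
for node, count in degrees.most_common():
    print(node, count)
```

Saving `edge_csv` to a file gives a small edge table that can be loaded into network software for visualisation.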

Geolocation and open source investigations 

Amnesty International and Cambridge University’s Digital Verification Corps

This session will cover geolocation, a crucial stage of any open-source investigation. Geolocation seeks to answer a key question: where did the events depicted happen? We will explore the basic principles of geolocation and introduce participants to a range of tools and techniques. We will cover essential resources including Google Earth Pro and Mapillary, and highlight the advantages of different data sources in a platform-agnostic manner.

This workshop aims to encourage a reflexive and critical approach to open-source data, introducing practical skills while emphasising the importance of ethical and transparent research methods. Drawing from the human rights sphere, this methodology is useful for scholars and citizens using open source data such as social media content, online databases and satellite images.

By the end of this session, participants will be able to identify useful clues in online content, perform reverse image searches and combine satellite information with street-view data.

Data spatialisation using Python and Blender (TBC)

Session 1 Nicholas Masterton

In the first session we will look at the interface of Blender, and talk about the various workspaces and 3D tools available and how they relate to data visualisation and spatialisation. From here we will import a geographical shapefile using an add-on called Blender GIS and manipulate it to create a 3D height field. We will use a node-based shader to apply a gradient to it. Through this process we will develop an understanding of how to manipulate objects in 3D space and how to use colour and shading to communicate gradation within the data.

Session 2 Nicholas Masterton

In the second session we will look into the Blender text editor, the interactive console, and the system console to understand ways of working with Python. We will look at a script which is able to read a CSV (comma-separated values) file. We will look at the process of iterating through columns and rows of data, using Python to output a result into 3D space. This will allow us to develop a methodology for spatialising datasets which are bespoke, hand-crafted, or obscure.
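As a rough illustration of the kind of script described above (not the script used in the session), the sketch below reads CSV text, iterates through its rows and columns, and turns each numeric cell into a point in 3D space. The sample data, the function name `rows_to_points` and the mapping (column to x, row to y, value to z) are assumptions made for this example. The final step places a small cube at each point and only runs inside Blender, where the `bpy` module is available.

```python
import csv
import io

# Hypothetical sample data; a real dataset would be read from a file.
SAMPLE = "region,2021,2022\nnorth,1.0,2.5\nsouth,0.5,3.0\n"


def rows_to_points(csv_text, spacing=1.0, height_scale=1.0):
    """Map numeric cells to (x, y, z): column -> x, row -> y, value -> z."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    points = []
    for row_index, row in enumerate(reader):
        for col_index, cell in enumerate(row):
            try:
                value = float(cell)
            except ValueError:
                continue  # skip non-numeric cells such as row labels
            points.append(
                (col_index * spacing, row_index * spacing, value * height_scale)
            )
    return header, points


header, points = rows_to_points(SAMPLE)

try:
    import bpy  # only importable inside Blender's bundled Python
except ImportError:
    bpy = None

if bpy is not None:
    for x, y, z in points:
        # Add a small cube at each data point in the current scene.
        bpy.ops.mesh.primitive_cube_add(size=0.2, location=(x, y, z))
```

Outside Blender the script still runs and simply computes the coordinates, which makes it easy to check the mapping before pasting it into the Blender text editor.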

This programme may be subject to changes.




Tel: +44 1223 766886
Email enquiries@crassh.cam.ac.uk