Means Tests

Predicting the Future: Harnessing the Power of Probabilistic Judgements Through Forecasting Tournaments

April 29, 2025
by Christian Caballero. From the threat of nuclear war to rogue superintelligent AI to future pandemics and climate catastrophes, the world faces risks that are both urgent and deeply uncertain. These risks are where traditional data-driven models fall short: there’s often no historical precedent, no baseline data, and no clear way to simulate a future world. In cases like these, how can we anticipate the future? Forecasting tournaments offer one answer, harnessing the wisdom of crowds to generate probabilistic estimates of uncertain future events. By incentivizing accuracy through structured competition and deliberation, these tournaments have produced aggregate predictions of future events that outperform well-calibrated statistical models and teams of experts. As they continue to develop and expand into more domains, they also raise urgent questions about bias, access, and whose knowledge gets to shape our collective sensemaking of the future.

Sam Temlock

Data Science Fellow 2021-2022
School of Information

Sam (he/him) is a Master of Information and Data Science graduate student at the School of Information, with experience in Cybersecurity and Network Programming. He holds a BS in Computer Systems Engineering from Rensselaer Polytechnic Institute and has previous experience in consulting at Deloitte. He has experience with Python, R, SQL, machine learning, data analytics, statistical analysis, and research design.

Connor Haley

Data Science Fellow 2021-2022
Haas School of Business

Connor is an MBA/MEng student with an undergraduate degree in statistics. He spent the past three years in economic consulting, focused on designing competitive electric power markets to produce optimal outcomes. His technical background is in R, Excel, Visual Basic (VBA/macros), and statistical methods.

Frances Leung

Data Science Fellow 2021-2022
School of Information

Frances Leung is a master’s student at UC Berkeley School of Information where she focuses her studies in information and data science. She has a keen interest in leveraging data-driven insights to better understand consumer behaviors and the world around us. In her professional work as a management consultant, she advises retailers and consumer businesses on digital transformation and creating web/mobile experiences that delight consumers through a human-centered approach. Frances holds a Master in Business Administration from York University, Schulich School...

Enrique Valencia López

Data Science Fellow 2022-2023
Graduate School of Education

Enrique Valencia López is a PhD student in the Policy, Politics and Leadership cluster at the Graduate School of Education. His research interests relate to three broad areas: the stratification of education by gender, immigration status and ethnicity; the measurement of teacher working conditions and well-being; and education in Latin America.

Before coming to Berkeley, Enrique worked for Mexico’s National Institute for Educational Evaluation and Assessment (INEE) in both the Policy and Indicators area. During that time, he co-authored Mexico’s first report on the educational...

Sahiba Chopra

Data Science Fellow 2024-2025
Haas School of Business

I'm a PhD student in the Management and Organizations (Macro) group at Berkeley Haas. I have a diverse professional background, primarily as a data scientist across numerous industries, including fintech, cleantech, and media. I hold a BA in Economics from the University of Maryland, an MS in Applied Economics from the University of San Francisco, and an MS in Business Administration from UC Berkeley.

My research focuses on the intersection of inequality, technology, and the labor market. I am particularly interested in understanding how to reduce inequality in...

Jane (Mango) Angar

Data Science Fellow 2024-2025
Political Science

Hi! I am a PhD candidate in the Political Science Department at UC Berkeley. My dissertation traces the emergence of disability rights groups in Africa, focusing on Zambia and Malawi, and examines factors influencing their effectiveness. I use mixed methods, including archival work, field interviews, participant observation, and surveys for data collection.

My data analysis techniques include text analysis, social network analysis, means tests, and regressions. In my free time, I enjoy moderately difficult hikes, walks along the beach with my dog, Princess, and...

Causal Effect Estimation in Observational Field Studies of Thermal Comfort

April 1, 2025
by Ruiji Sun. We introduce and apply regression discontinuity to thermal comfort field studies, which are typically observational. The method utilizes policy thresholds in China, where the winter district heating policy is based on cities' geographical locations relative to the Huai River. Using the regression discontinuity method, we quantify the causal effects of the treatment (district heating) on the physical indoor environments and subjective responses of building occupants. In contrast, using conventional correlational analysis, we demonstrate that the correlation between indoor operative temperature and thermal sensation votes does not accurately reflect the causal relationship between the two. This highlights the importance of causal inference methods in thermal comfort field studies and other observational studies in building science where the regression discontinuity method might apply.
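
As a rough illustration of the regression discontinuity idea described above, here is a minimal sketch on synthetic data. It is not the author's analysis: the variable names (`distance_north`, `indoor_temp`), the bandwidth, the simulated effect size, and the choice of a sharp design with separate linear fits on each side of the cutoff are all assumptions made for the example.

```python
import numpy as np

def rd_estimate(running, outcome, cutoff=0.0, bandwidth=2.0):
    """Sharp regression discontinuity via local linear regression.

    Fits separate straight lines on each side of the cutoff, using only
    observations within `bandwidth` of it, and returns the estimated
    jump in the outcome at the cutoff.
    """
    x = np.asarray(running, dtype=float) - cutoff  # center at the cutoff
    y = np.asarray(outcome, dtype=float)

    # Keep only observations close to the threshold.
    window = np.abs(x) <= bandwidth
    x, y = x[window], y[window]

    treated = (x >= 0).astype(float)
    # Columns: intercept, treatment jump, slope, slope change when treated.
    design = np.column_stack([np.ones_like(x), treated, x, treated * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the treatment indicator


# Synthetic data (hypothetical): distance north of a policy boundary,
# where every location at or north of the line receives heating.
rng = np.random.default_rng(0)
n = 2000
distance_north = rng.uniform(-5, 5, n)
heated = distance_north >= 0
true_effect = 3.0  # assumed jump in indoor temperature, for illustration
indoor_temp = (
    18.0
    - 0.4 * distance_north        # smooth trend unrelated to the policy
    + true_effect * heated        # discontinuity caused by the policy
    + rng.normal(0, 1.0, n)       # noise
)

estimate = rd_estimate(distance_north, indoor_temp)
print(f"Estimated jump at the cutoff: {estimate:.2f}")
```

The estimate relies on the assumption that, apart from the policy, locations just on either side of the boundary are comparable, so the jump at the cutoff can be attributed to the treatment. A simple correlation across the whole sample would instead mix the policy effect with the underlying trend.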

Measuring Vowels Without Relying on Sex-Based Assumptions

April 8, 2025
by Amber Galvano. This tutorial builds on my previous post on Python for acoustic analysis, this time focusing on measuring vocal tract resonances without relying on sex-based assumptions. I demonstrate how to process audio files and vowel annotations using an adaptive method that optimizes the acoustic analysis across a recording. Instead of fixing parameters based on generalized vocal tract length correlations, this approach varies them within a defined range for greater accuracy. This not only enhances measurement precision but also avoids requiring (or assuming) speakers’ sex in data collection. Finally, I show how to filter for outliers and create high-quality vowel space visualizations.

María Martín López

Data Science Fellow 2023-2024
Psychology

María Martín López is a PhD student in the Cognition area within the Department of Psychology. Her research relates to cognitive computational and quantitative models of individual differences in behaviors, thoughts, and emotions. She is particularly interested in how we can create and leverage novel algorithms to understand, measure, and predict processes relating to externalizing psychopathology (e.g. impulsivity, aggression, substance use). She answers these questions using a range of computational and quantitative models including AI, NLP, SEM, time series analysis, multi-level...