Data science is a large and expanding field, and the issues it confronts vary greatly with each domain of application. One cannot understand the issues within an applied domain such as education simply by reading a book on education theory; it takes rich immersion in, and dialogue with, the empirical domain. In doing so, one quickly realizes that education is different from medicine, business, or the digital humanities. Not only are the problems of education different, but so are the ethical concerns, stakeholder interests, and the phenomena in question. To make data science of value to educational problems, there is a strong need for substantive immersion and dialogue across data science and education.
To this end, the Institute for Research in the Social Sciences’ Center for Computational Social Science hosted the Conference on Educational Data Science on September 18, 2020, facilitating an international conversation on bridging the fields of education research and data science, the merits of mixed methods research in this emerging field, and the insights gleaned from research in this area thus far. The conference also served as a satellite conference for the annual American Educational Research Association (AERA) conference, and was co-sponsored by the Stanford Graduate School of Education and Educational Testing Service (ETS). The conference organizers hailed from institutions across the country and are leaders in the field of education research and education data science:
Daniel A McFarland - Professor, Graduate School of Education, Stanford University
Ben Domingue - Assistant Professor, Graduate School of Education, Stanford University
Joanna Gorin - Vice President of Research, ETS
Felice Levine - Executive Director, AERA
Zach Pardos - Associate Professor, Graduate School of Education, UC Berkeley
Over 750 researchers, graduate students, policymakers, and practitioners signed up to participate in the conference. Many of the participants came from data science and education programs, but there were also many international attendees and professionals outside academia. As such, the conference acted much like a trading zone across fields and domains. Conference organizer and Center director Daniel McFarland commented: “The discussion sessions were brief but rich. Scholars not only discussed how data science perspectives and approaches could be of value to education research, but also how education research standards and concerns could greatly improve the quality of data science applications in the domain of education. Not only did we see how students and faculty are applying data science to educational problems, but how our understanding of education changes the way data science should be ethically, validly, and valuably performed on educational phenomena.”
All contributors recorded presentations on their topics and papers ahead of the conference so that participants could watch the recordings before the synchronous portion and come prepared with questions and comments. This approach has the benefit of limiting the duration of the synchronous portion of the conference and providing an online resource for those who wish to watch the presentations in the future. All presentations will remain available on the agenda page.
Graduate students and professors from across the country submitted high-caliber papers to present at the conference. Among the submissions, two papers won the award for Best Paper Submission:
Graduate Student Category
Content Analysis of Textbooks via Natural Language Processing: Novel Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks - Lucy Li, PhD Candidate, School of Information, UC Berkeley & Dora Demszky, PhD Candidate, Linguistics Department, Stanford University
The Bias-Variance Tradeoff: How Data Science Can Inform Educational Debates - Shayan Doroudi, Assistant Professor, School of Education, UC Irvine
The conference concluded with a poster session that reflected the innovative measures organizers are taking to approximate the personal interactions that take place at in-person academic conferences. Synchronous conversations on the poster topics took place on Discord and in Mozilla Hubs virtual rooms, allowing for both written and verbal communication. Ultimately, the conference mirrored the virtual environments educators across the world are currently using to teach, and provided lasting resources on educational data science to researchers, policymakers, and practitioners.