
Agenda: Conference on Educational Data Science

September 18, 2020

7:50am to 8:00am - Greeting and Introduction

8:00am to 8:40am - Session 1 - Data Science and Education: Broad Strokes

Professor of Education, Harvard University
Educational Data Science: Opportunities and Challenges
Associate Professor, Learning Sciences, Northwestern University
Moving from Theory to Systems: Infrastructuring Hyper-Local Opportunity Landscapes
DAVID WILLIAMSON SHAFFER IS THE VILAS DISTINGUISHED PROFESSOR OF LEARNING SCIENCES, UNIVERSITY OF WISCONSIN-MADISON
The Role of Meaning in Educational Data Science

In the age of big educational data, researchers have tools to find ever more subtle patterns in data about teaching and learning – and about teachers and students. But big data presents challenges to traditional research methods, both qualitative and quantitative: challenges to our understanding of utility, reliability, validity, replicability, interpretability, and even significance itself. This talk looks at the reasons to – and ways to – address these challenges by keeping the concept of meaning central to the emerging field of educational data science.

View talk

View paper

ASSOCIATE PROFESSOR OF LEARNING SCIENCES/EDUCATIONAL TECHNOLOGY, NYU
The Complementarity of Human Insight and Computational Power in Learning Analytics

Learning analytics is a technology for enabling better decision-making by teachers, students, and other educational stakeholders by providing them with timely and actionable information about learning-in-process on an ongoing basis. To be effective, learning analytics must thus be not only technically robust but also designed to support human use. While much has been said about the benefits that can be reaped by applying computational methods to educational big data, the role and importance of human insight in the creation and use of analytics is less clear. In this talk I'll present a variety of different ways that human insight can be incorporated into analytic design and interpretation to offer important and complementary value to that provided by algorithmic processing.

View talk

Professor, Johns Hopkins Bloomberg School of Public Health
Discussant

8:45am to 9:25am - Session 2 - Learning Analytics and Online Learning

ASSISTANT PROFESSOR, UC IRVINE
Creating Scalable Models of Collaborative Interaction Dynamics and Outcomes

In the current globalized world, innovation in science and technology is vital for economic competitiveness, quality of life, and national security. This trend is accelerating the increasing reliance on virtual teams and their collaborative effort to solve complex environmental, social and public health problems. To contend with these dynamic conditions, communication and collaborative problem-solving (CPS) competencies have taken a principal role in educational policy, research, and technology. Adaptive educational technologies provide a platform to deliver personalized training to improve learners’ CPS skills. However, for these systems to optimally tailor instruction, they must have key insights into learners’ interaction dynamics and team behaviors. We have been exploring these properties by employing Group Communication Analysis (GCA), a computational linguistics methodology for quantifying and characterizing the socio-cognitive processes between learners in online interactions. This talk will focus on recent studies where we have used GCA to gain a deeper understanding of role ecologies, learning and problem-solving, and issues of inclusivity in digitally-mediated group interactions. The scalability of GCA opens the door for future research efforts directed towards improving collaborative competencies and creating more inclusive online interactions.

View talk

ASSISTANT PROFESSOR OF COMPUTER SCIENCE, NORTHWESTERN UNIVERSITY
The Broader Impacts of Multimodal Learning Analytics
ASSISTANT PROFESSOR, CORNELL INFORMATION SCIENCE
Scaling Up Behavioral Science Interventions in Online Education: Part 1
ASSISTANT PROFESSOR IN THE COMPARATIVE MEDIA STUDIES/WRITING DEPARTMENT, MIT
Scaling Up Behavioral Science Interventions in Online Education: Part 2

Online education is rapidly expanding in response to rising demand for higher and continuing education, but many online students struggle to achieve their educational goals. Several behavioral science interventions have shown promise for aiding students’ persistence and completion in a handful of courses. In this study, we tested a set of behavioral interventions over two-and-a-half years, with ¼ million students, from nearly every country, across 248 online courses offered by Harvard, MIT, and Stanford.  Our iterative scientific process -- cyclically pre-registering new hypotheses in between waves of data collection -- enabled us to identify individual and contextual conditions under which the interventions can benefit students in developing countries in courses with an achievement gap between students in more and less developed countries. Our findings encourage funding agencies and researchers conducting large-scale field trials to reevaluate study guidelines that emphasize static investigations of average treatment effect over dynamic investigations of contextual heterogeneity.

View talk

ASSOCIATE PROFESSOR OF LEARNING SCIENCES/EDUCATIONAL TECHNOLOGY, NYU
Discussant

9:30am to 10:10am - Session 3 - Computer Science and AI Approaches to Education

Assistant Professor at Carleton College, Computer Science
Recognizing student strategies and misunderstanding using inverse planning

Online educational technologies provide opportunities for students to engage in complex tasks like games or virtual chemistry labs in which they may make many different choices. Tools from machine learning can be used to make sense of students' choices in these environments, and to use their choices and the way they execute these choices to make fine-grained inferences about their understanding and their strategy. In this talk, I will present an inverse planning framework for reasoning about students’ choices. This framework is based on inverse reinforcement learning and takes a Bayesian approach to combine information about common misunderstandings with the choices of a specific student. I will present behavioral experiments demonstrating the effectiveness of the framework, including applications both in education and psychology, and discuss how this framework might be used in future educational applications.

View talk

View poster

ASSISTANT PROFESSOR, GRADUATE SCHOOL OF EDUCATION, STANFORD UNIVERSITY
Learning artificial agents and cognitive models
ASSISTANT PROFESSOR OF COMPUTER SCIENCE EDUCATION, STANFORD UNIVERSITY
AI teaching assistants that don't need much student data to train
PROFESSOR OF EDUCATION AND INFORMATICS, U.C. IRVINE
Can conversational agents support children’s learning?

View Ying Xu's profile

With the development of natural language processing technologies, conversational agents can now be integrated into children’s media to support everyday learning through engaging children in meaningful conversation. We have explored this potential by developing and researching two educational applications of conversational agents: interactive audio storybooks to promote early language and literacy skills and interactive videos to foster scientific knowledge and curiosity. In these two applications, children listen to a story or watch a video while responding to questions asked by a conversational agent and receiving elaborative feedback on their responses. Through studies of preschool-aged children’s interactions with the conversational agents, we have demonstrated the effectiveness of conversational agents in enhancing children’s learning from and engagement with storybook reading or video watching.

View talk

 

Associate Professor, Graduate School of Education, UC Berkeley
Discussant

10:15am to 10:55am - Session 4 - Data Science and Educational Measurement

Charles William Eliot Professor of Education, Harvard Graduate School of Education
When does measurement error matter in educational data science?
ASSISTANT PROFESSOR, GRADUATE SCHOOL OF EDUCATION, STANFORD UNIVERSITY
Interplay between speed and accuracy: Novel empirical insights based on 1/4 billion item responses

Response time is an intriguing process data element, but relatively few large-scale empirical investigations have examined its implications for respondent behavior. We take advantage of a unique dataset (roughly 1/4 billion item responses from the NWEA MAP assessment) to shed light on two important test-taker behaviors. The first, response acceleration, is a reduction in response time for responses that occur relatively late on the assessment. Further, such reductions are heterogeneous as a function of estimated ability and may have implications for our understanding of ability estimates. The second, heterogeneous processing, suggests that response time has a different relationship with the ultimate response depending on the underlying difficulty of the particular item for an individual. This indicates different processes driving responses that could potentially be modeled. These empirical findings offer potential insight on how response times could be used to improve measurement processes.

View talk

ASSISTANT PROFESSOR, SCHOOL OF EDUCATION, UC IRVINE
The Bias-Variance Tradeoff: How Data Science Can Inform Educational Debates

**Best paper submission at the Conference on Educational Data Science 2020**

In addition to providing a set of techniques to analyze educational data, we claim that data science as a field can provide broader insights to education research. In particular, we show how the bias-variance tradeoff from machine learning can be formally generalized to be applicable to several prominent educational debates, including debates around learning theories and pedagogy. We further show how various data science techniques that have been proposed to navigate the bias-variance tradeoff can yield insights for productively navigating these educational debates.

View talk

View paper

Associate Professor, Graduate School of Education, UC Berkeley
Data-assistive course articulation using machine translation

Higher education at scale, such as in the California public post-secondary system, has promoted upward socioeconomic mobility by supporting student transfer from 2-year community colleges to 4-year degree granting universities. Among the barriers to transfer is earning the right credit at a 2-year institution that qualifies for degree credit in a 4-year program. Course articulation is defining how course credit earned outside of an institution maps to credit within the institution, and it is an intractable task when attempting to manually articulate all courses among the colleges and universities in a state. In this talk, I will present a methodology towards making tractable this process of defining and maintaining articulations by leveraging information contained within historic enrollment patterns and course catalog descriptions. Limitations of the approach and its future integration plans will be discussed.

View talk

View slides

Vice President of Research, ETS
Discussant

11:00am to 11:40am - Session 5 - Computational Linguistics Approaches

PH.D. CANDIDATE IN ECONOMICS OF EDUCATION, STANFORD UNIVERSITY
Opening the Black Box of College Counseling using Text-as-Data Methods

**Honorable mention for best paper submission at the Conference on Educational Data Science 2020**

Although many programs remotely disseminate information to students about the college application process, there is little evidence as to how students experience these programs. This paper examines a large-scale remote counseling program in which college counselors initiated interactions with 15,000 high school seniors via text message to support them through the college application process. Given the passive nature of text messaging, not all of the counselors' prompts elicited similar responses from students. I use text-as-data methods to measure which interactions lead to productive engagement between counselors and students, and which do not. I show that interactions about financial aid offers and financial aid applications are much more likely to generate productive engagement than interactions about college lists.

View talk

View paper

PH.D. candidate, UC IRVINE
Unsupervised Representations Predict Popularity of Peer-Shared Artifacts in Online Learning Environments

**Honorable mention for best paper submission at the Conference on Educational Data Science 2020**

In online collaborative learning environments, students create content and construct their own knowledge through complex interactions over time. To facilitate effective social learning and inclusive participation in this context, insights are needed into the correspondence between student-contributed artifacts and their subsequent popularity among peers. In this study, we represent student artifacts by their (a) contextual clickstream of interactions, (b) textual content, and (c) set of instructor-specified features, and use these representations to predict artifact popularity scores. Through a mixture of predictive analysis and visual exploration, we find that the neural embedding representation, learned from contextual clickstream, yields the strongest predictions of popularity, ahead of instructor's knowledge, which includes academic value and creativity ratings. Because this representation can be learnt without human labeling, it opens up possibilities for shaping student interactions on the fly to be more inclusive and pedagogically valuable.

View talk

PH.D. candidate, UC IRVINE
PH.D. CANDIDATE, UC IRVINE
In or Out of Sync: Federal Funding and Research in Early Childhood

**Honorable mention for best paper submission at the Conference on Educational Data Science 2020**

Researchers have examined the influence of federal investment on productivity in science and higher education research, but not in early childhood. In addition, existing research tends to use funding and citation metrics, as opposed to examining content shifts. This study applies text mining and fixed effect models to 44,337 articles and federal grant abstracts to examine trends of research areas in early childhood and the extent to which topics in grants map onto topics from previous publications and funding amount. First, we find significant changes in trends of research and grants in early childhood over time, with an increasing distribution of topics in education and care (i.e., teacher training, education technology, and parenting) and evaluation. Second, one-way fixed effect models indicate that funding from the previous year significantly predicts the extent to which a topic appears in subsequent grants. However, topic distribution in prior publications does not strongly predict topic distribution in grants. Third, there exists variation in the association between grant and publication topic distribution across disciplines. Understanding the relation between research and government agenda has implications for promoting scientific knowledge production in early childhood, a rapidly expanding policy area that requires complicated funding and administration.

View talk

PH.D. CANDIDATE, SCHOOL OF INFORMATION, UC BERKELEY
Content Analysis of Textbooks via Natural Language Processing: Novel Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks

**Best paper submission at the Conference on Educational Data Science 2020**

Cutting-edge data science techniques can shed new light on fundamental questions in educational research. We highlight the insights that can be gained about the representation of historically marginalized groups by applying natural language processing (NLP) to fifteen of the most widely used U.S. history textbooks in Texas between 2015 and 2017. First, Hispanic/Latinx people are rarely discussed, and the most common famous figures are nearly all white men. Second, lexicon-based approaches show that Black people perform actions with lower agency and power than others. Third, topic models and word embeddings reveal that women are described less diversely than men and are associated with domestic roles. We also find that more conservative counties tend to purchase textbooks with less representation of women and Black people. Building on a rich tradition of textbook analysis, we release our toolkit for computational analyses of textbooks to support new research directions.

View talk

View paper

Bluhm Family Assistant Professor of Data Science and Education
Discussant

11:45am to 12:45pm - Poster session on Discord

A Call for Critical and Constructive Data Science in Teacher Education
Using extreme gradient boosting to estimate community effects on school readiness
Using Natural Language Processing Methods to Assess Treatment Fidelity
Better together? Initial findings and implications from combining qualitative coding and computational methods to analyze classroom audiovisual data
Towards a Typology of School Choice: Characterizing Charter and Non-charter Public Schools in Texas
Describing inequality and collaboration in grant funding for 1/3 of federal research grant spending
Sorting schools: A computational analysis of charter school identities and stratification
Multimodal learning for classroom activity detection
Extracting Indicators for Education Research from Administrative Data Using Machine Learning Methods
A scientometric review of educational learning analytics research: Trends and visualization
Exploratory Cluster Analysis of U.S. Adults Characteristics in PIAAC data
Friendship Networks in College
Remote Tutoring and Digital Canvases: A text analysis
The Effects of Instructors’ Use of Online Discussions Strategies on Student Participation and Performance in University Online Introductory Mathematical Courses
Gendered Patterns in Online Collaborative Discourse Over Time
Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education
The Effect of Equalization Reform on Elite and Disadvantaged Elementary Schools: Evidence from the Text Mining of Social Media
Ethical Issues in Using Mobile/Wearable Technologies for Research: Opportunities and Challenges
Educational Vision via Data Science: Insights from Alumni Networks on LinkedIn
Modelling students’ social network structure from spatial-temporal network data
Can Learning Analytics Help Us Understand Differences in Behaviors and Achievement Among Diverse Learners? Results from an Online Chemistry Course
If you’re happy and you know it, post a tweet? A study of the sentiment of posts to the #NGSSchat hashtag on Twitter
A Taxonomy of Critical Dimensions in Learning Analytics: Some Key Elements for Interpretation of Data
Data-driven Insights in the Consideration of Noncognitive Attributes and Equity-promoting Practice in Selective Undergraduate Admissions
Understanding Students’ Behavioral Patterns on Interactive E-books using Doc2vec Embeddings
Emotional pulse of schools
Finding the Leaks in the Pipeline: A Random Forest Approach to Understanding Women’s Persistence in STEM
Towards a new measurement of MOOCs
Using Foil Analysis to Develop Pedagogically Valuable Analytics
Automated classification of preservice science teachers' written reflections
What do eye-tracking data say about the cognitive mechanisms underlying the pattern extension skills of young children?
Predicting College Success: What Data Are Useful and for Whom?