Seminar

Useful Variation in Clinical Practice under Uncertainty: Diversification and Learning - Charles Manski

Date
Mon March 10th 2014, 12:45pm
Event Sponsor
The Institute for Research in the Social Sciences (IRiSS) and the Graduate School of Business (GSB)
Location
Room M104 in the McClelland Building (part of the Knight Management Center) of Stanford's Graduate School of Business

Charles Manski, Professor of Economics at Northwestern University

Slides available here.

Abstract

Probabilistic topic models provide a suite of tools for analyzing large document collections. Topic modeling algorithms discover the latent themes that underlie the documents and identify how each document exhibits those themes. Topic modeling can be used to help explore, summarize, and form predictions about documents.

Traditional topic modeling algorithms take a document collection as input and analyze its texts to estimate the collection's latent thematic structure. However, for many collections, there is an additional type of data: how people use the documents. For example, readers click on articles on a newspaper website, scientists place articles in their personal libraries, and lawmakers vote on a collection of bills. User behavior data about documents is essential for building automatic recommendation systems and, further, gives new ways of understanding how a collection and its users are organized.

In this talk, Blei will review the basics of topic modeling and describe our recent research on collaborative topic models, which simultaneously analyze texts and corresponding user behavior data. We studied collaborative topic models on a large collection of 80,000 scientists' libraries and the 250,000 abstracts of the corresponding articles. With this analysis, we can build recommendation systems that point scientists to articles they will like and, further, organize the scientific literature according to the discovered patterns of readership. As examples, we can identify articles that are important within a field and articles that transcend disciplinary boundaries.

More broadly, topic modeling is a case study in the large field of applied probabilistic modeling. Finally, Blei will survey some recent advances in this field. He will show how modern probabilistic modeling gives data scientists a rich language for expressing statistical assumptions and scalable algorithms for uncovering hidden patterns in massive data.
