
Building a Community of Social Scientists with Big Data Skills: The ICOS Big Data Summer Camp


As the use of data science techniques continues to grow across disciplines, a group of University of Michigan researchers is working to build a community of social scientists with skills in Big Data through a week-long summer camp for faculty and graduate students.

Having recently completed its fourth annual session, the Big Data Summer Camp held by the Interdisciplinary Committee for Organizational Studies (ICOS) trains approximately 50 people each spring in skills and methods such as Python, SQL, and social media APIs. Campers split into several groups, each of which tries to answer a research question using these newly acquired skills.

Working with researchers from other fields is a key component of the camp, and of creating a Big Data social science community, said co-coordinator Todd Schifeling, a Research Fellow at the Erb Institute in the School of Natural Resources and Environment.

“Students meet from across social science disciplines who wouldn’t meet otherwise,” said Schifeling. “And every year we bring back more and more past campers to present on what they’ve been doing.”

Schifeling himself participated in the camp as a student before taking on the role of coordinator this year.

Teddy DeWitt, the other co-coordinator of the camp and a doctoral student at the Ross School of Business, added that the camp presents the curriculum in a way that is unique on campus.

“This set of material does not seem to be available in other parts of the university, at least … with an applied perspective in mind,” he said. “So we’re glad we have this set of resources that is both accessible and well-received by students.”

Participants range in skill from beginning to advanced, but even a relatively advanced student like Jeff Lockhart, a doctoral student in sociology and population studies who describes himself as “super-committed to computational social science,” said that it’s hard to find classes in computational methods in social science departments.

“[The ICOS camp] doesn’t expect a lot of prior knowledge, which I think is critical,” Lockhart said.

Lockhart, DeWitt, and Dylan Nelson, also a sociology doctoral student, are working on setting up a series of workshops in Computational Social Science for fall 2016 (contact Lockhart at jwlock@umich.edu for more information). Lockhart said it’s critical that social scientists learn Big Data skills.

“If we don’t have skills like this, there’s no way for us to enter into these fields of research that are going to be more and more important,” he said.

“A lot of the skills we’ve learned are sort of the on-ramp for doing data science,” DeWitt added.

The camp is co-sponsored by Advanced Research Computing (ARC).

Data science with social science data

This workshop covers the essential steps of data analysis in Python, using social science data as a case study. The workshop is divided into two parts. The first session includes an introduction to Python’s NumPy and Pandas data analysis libraries. This session requires no previous experience with Python. We will cover the common steps involved in any data analysis, from loading the data to running a regression and interpreting the outcomes.
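The first-session steps (loading data, inspecting it, and running a regression) might look roughly like the following minimal sketch. It is not the workshop's own material: the column names `years_education` and `hourly_wage` and all of the values are made up for illustration, and the data is read from an in-memory string rather than a real file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical survey extract; names and values are invented for illustration.
CSV_TEXT = """respondent,years_education,hourly_wage
1,10,12.0
2,12,15.0
3,12,14.0
4,14,18.0
5,16,21.0
6,16,22.0
7,18,25.0
8,20,29.0
"""

# Load the data. pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Inspect the data: summary statistics for every numeric column.
print(df.describe())

# Pull the two variables of interest out as NumPy arrays.
x = df["years_education"].to_numpy()
y = df["hourly_wage"].to_numpy()

# Fit hourly_wage = intercept + slope * years_education by least squares.
# For degree 1, np.polyfit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In this toy setup the slope would be read as the estimated change in hourly wage associated with one additional year of education.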

The second session requires the background knowledge of Python provided by the first session. It covers more advanced features, from various preprocessing steps to using scikit-learn’s machine learning tools to analyze the data. As in the first session, we will be using an example from the social sciences.
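As a rough illustration of what preprocessing followed by a scikit-learn model can look like, here is a minimal sketch. The dataset is synthetic, and the column names (`age`, `region`, `voted`) and the choice of logistic regression are assumptions made for this example, not details taken from the workshop.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic respondents; every column here is hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=n),
    "region": rng.choice(["north", "south", "west"], size=n),
})
# Synthetic outcome: the probability of voting rises with age.
prob = 1.0 / (1.0 + np.exp(-(df["age"] - 45) / 10.0))
df["voted"] = (rng.random(n) < prob).astype(int)

# Preprocessing: scale the numeric column, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),
])

# Chain preprocessing and the classifier so both are fit on training data only.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression()),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[["age", "region"]], df["voted"], test_size=0.25, random_state=0
)
model.fit(X_train, y_train)

# Mean accuracy on the held-out quarter of the data.
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the steps in a `Pipeline` keeps the scaler and encoder from seeing the test rows during fitting, which is one common way to avoid leaking information into the evaluation.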

The two sessions will be held in a computer lab and participants will be able to work either individually or in small groups on a few practice exercises.

 

Data Science with Social Science data: an introduction to Pandas and StatsModels in Python

This workshop introduces participants to Python’s NumPy, Pandas DataFrames, Matplotlib and StatsModels using an advertising dataset. Participants will use these tools to fit ordinary least squares (OLS) models of the association between advertising expenditures and product sales in example data. We will start with an introductory explanation of Anaconda and the Jupyter notebook environment (participants are not required to use these tools, but the instructor will). We will proceed with topics including: reading data files; creation, indexing and slicing of Pandas DataFrames; creation and handling of Matplotlib objects; and creation and interpretation of models using Python’s StatsModels. Although it is not required, we recommend that participants have a basic knowledge of Python.
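The DataFrame and OLS topics listed above can be sketched as follows. This is an illustrative example under stated assumptions, not the workshop's notebook: the advertising data is simulated, the column names `tv`, `radio`, and `sales` are hypothetical, and plotting with Matplotlib is left out to keep the example short.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated advertising data; the "true" coefficients below are assumptions
# chosen for this example, not values from any real dataset.
rng = np.random.default_rng(42)
n = 100
ads = pd.DataFrame({
    "tv": rng.uniform(0, 300, size=n),
    "radio": rng.uniform(0, 50, size=n),
})
ads["sales"] = 3.0 + 0.05 * ads["tv"] + 0.2 * ads["radio"] + rng.normal(0, 1, size=n)

# Indexing and slicing: label-based row slice with selected columns
# (.loc slices include the end label, so this shows rows 0 through 4).
print(ads.loc[:4, ["tv", "sales"]])

# Boolean indexing: keep only the rows with high TV spending.
high_tv = ads[ads["tv"] > 200]
print(len(high_tv), "rows with tv > 200")

# Fit an OLS model of sales on both advertising channels.
results = smf.ols("sales ~ tv + radio", data=ads).fit()

# Estimated intercept and slopes, then the share of variance explained.
print(results.params)
print(f"R-squared: {results.rsquared:.3f}")
```

Calling `results.summary()` would print the full regression table, including standard errors and confidence intervals, which is where interpretation of the model usually starts.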

Data-Intensive Social Science Challenge Symposium

Data-intensive social science is one of the research focus areas that MIDAS supports with its Challenge Awards. Our long-term goal is to support this research area more broadly, using the Challenge Award projects as the starting point to build a critical mass. This symposium offers a platform for all participants to explore collaboration opportunities and aims to attract more researchers to our hub. The two Challenge Award teams will give in-depth presentations, and all participants are encouraged to submit posters on research related to data-intensive social science.

Registration | Poster submission form (Due Monday, Sept. 10)

Preliminary Schedule:

9 am: Introduction

9:05 am to 11:35 am: Challenge Award presentations

11:35 am to 1 pm: lunch, poster session and networking (Please fill out this form to submit a poster; deadline is Monday, September 10)

1 to 2 pm: Panel discussion: the future of data-intensive social science research at U-M

  • Martha Bailey, Professor, Economics, University of Michigan
  • Sara Heller, Assistant Professor, Economics, University of Michigan
  • Matt Shapiro, Professor, Economics, University of Michigan
  • Lisa Singh, Professor, Computer Science, Georgetown University
  • Mike Traugott, Professor Emeritus, Communication Studies, Political Science, University of Michigan