Statistics: A Review

A one-day, intensive review of common statistical methods for the design, measurement, analysis, and presentation of scientific investigations.  The workshop is designed for any scholar engaged in quantitative research.

Statistics: A Review discusses answers to the following questions:

  • What should we measure?
  • What are the main design types; what are the comparative advantages of each?
  • How are the sample sizes determined?
  • What are the appropriate inference procedures?
  • What do standard error, p-value and confidence level mean?
  • What are some dangers we need to avoid?
  • How should we display our results?
  • What are the statistical software options?
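As a small illustration of the standard error, p-value, and confidence level question above, the sketch below computes all three for a single sample. It is only a minimal example under stated assumptions: the data and the null value `mu0` are hypothetical, the function name `summarize` is made up for this illustration, and it uses a large-sample normal approximation (a t-based procedure would be the usual choice for a sample this small).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def summarize(sample, mu0, level=0.95):
    """Mean, standard error, two-sided p-value, and confidence interval.

    Uses a normal (z) approximation; this is an assumption of the sketch,
    not a recommendation from the workshop.
    """
    n = len(sample)
    xbar = mean(sample)
    se = stdev(sample) / sqrt(n)           # standard error of the mean
    z = (xbar - mu0) / se                  # test statistic for H0: mean == mu0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return {
        "mean": xbar,
        "se": se,
        "p_value": p_value,
        "ci": (xbar - crit * se, xbar + crit * se),
    }


# Hypothetical measurements, for illustration only.
scores = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5]
print(summarize(scores, mu0=5.0))
```

With the same approximation used for both, the 95% interval excludes `mu0` exactly when the two-sided p-value falls below 0.05.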

Introduction to SPSS

Note: Topic order is subject to change.  Participants must sign up for the entire series.

Fundamentals

This portion introduces SPSS for Windows, the menu and the help systems, the three main types of files used, and printing from within SPSS.  It then addresses defining variables, attaching labels, defining missing values, and various ways to enter data into SPSS.  Finally, it covers a brief introduction to obtaining frequency distributions, descriptive statistics, and cross tabulations of variables.

Within-Case Transformations

This portion introduces data management capabilities, including recoding variables (manual and automatic), computing new variables using formulas, and counting occurrences of values within subjects.  Attention then turns to temporary transformations, conditional processing of transformations, and repetitive transformations.  SPSS syntax is also introduced.

Data Management with Multiple Files

This portion begins with a discussion of subsetting data files by drawing samples, selecting groups, and excluding groups from analysis.  Then, the two main methods of merging SPSS data files are covered: adding variables and adding cases.  Next, creating aggregated data sets and applying aggregated data to individuals are covered.  Lastly, importing and exporting data between SPSS and other programs (Excel, dBase, SAS) are demonstrated.

Basic Statistics and Graphics

This portion covers basic exploratory procedures, including obtaining percentiles, frequencies, descriptive statistics, and cross tabulations. Basic comparative procedures, including two-sample t-tests, paired t-tests, and one-way analysis of variance, are also covered.  Then, simple bivariate correlation analysis is introduced.  Participants are given a basic introduction to commonly used graphical procedures for displaying data, including scatter plots, bar graphs, histograms, and boxplots.

Introduction to Stata

Note: Topics are subject to change.  Participants must sign up for the entire series.

Fundamentals

This portion introduces Stata for Windows, including the menus, help systems, search systems, and main windows within the Stata environment. Entering data into Stata and defining variable attributes are introduced, in addition to the various methods of importing external data files into Stata. Using the menus to set up commands and procedures is contrasted with entering Stata commands interactively.

Data Management

This section introduces working with date and time variables, generating new variables and replacing values in existing variables, sorting data files, merging data files, computing dummy variables, keeping and dropping variables and cases, and using the menus to set up data management procedures. Methods for describing and comparing data sets are also introduced.

Basic Statistical Analysis

This portion introduces basic descriptive and summary analyses in Stata, including standard numerical summaries of continuous variables, frequency and cross-tabulation analysis, and hypothesis testing for means and proportions. Commands for common regression analysis procedures (including linear and logistic regression) are also introduced, along with methods for analyzing data from complex sample surveys. Methods for analyzing subsets of data and performing analyses stratified by a categorical variable are also covered. Finally, working with the values and estimates that are returned by Stata analysis commands is introduced.

Basic Graphing Tools

This portion provides participants with an introduction to commonly used graphing tools in Stata.

Programming

The workshop will end with a basic introduction to programming in Stata using .do files.

Determining Sufficient Sample Size

This workshop outlines how to calculate an appropriate sample size (n) to address the objectives of a research project. Participants will be led through essential steps for the design of a study: specifying the outcome variable, outlining hypothesis tests, estimating the variance or other “nuisance parameters,” determining power to detect particular differences, and balancing these considerations against cost to arrive at a final sample size.

Participants receive hands-on instruction in a computer classroom, computing sample sizes and plotting power curves with different sample size software: nQuery Advisor (available via virtual sites and presented mostly by the instructor) and the built-in sample size applications within Stata or SAS (used hands-on during the workshop).  (This workshop does not cover survey research designs.)

Sample size calculations to be covered include:

  • comparisons of two means or proportions
  • analysis of variance (ANOVA) designs
  • repeated measures designs
  • regression designs
  • case-control study designs

Special requests from those attending the workshop are welcome.
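
For the simplest case listed above, comparing two means, the calculation can be sketched in a few lines. This is a minimal illustration and not the workshop's own material: it assumes equal group sizes, a common known standard deviation `sigma`, a two-sided test, and the normal approximation; the function name `n_per_group` and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sample comparison of means.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation (a "nuisance parameter")
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for a two-sided test
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Detect a half-standard-deviation difference with 80% power at alpha = 0.05.
print(n_per_group(delta=0.5, sigma=1.0))  # 63
```

Dedicated sample size software typically bases this calculation on the t distribution, so it will tend to report a slightly larger n than this normal approximation.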

Introduction to Survey Design: Data Collection, Questionnaire Design and Response Processes

This workshop will present an overview of available modes and methods of survey data collection as well as an introduction to the survey response process and implications for questionnaire design.  Participants will gain an appreciation of the tradeoffs inherent in survey design decisions and how design can affect data quality and survey errors. Topics will include:

  • Survey errors, in particular measurement, coverage, and nonresponse error.
  • What to consider when selecting a data collection method for a particular research question.
  • Measurement (response) error and how to reduce it through question wording/format and questionnaire structure.
  • The role of the interviewer and interviewer effects.

Regression Analysis

This workshop will provide participants with an overview of commonly used methods in simple and multiple linear regression. There will be both lecture and hands-on computer work, using SPSS. Topics will include the basic regression model, model assumptions, interpretation of coefficients, significance testing, interactions between variables, and the use and interpretation of dummy variables. Model checking methods, including residual plots, collinearity diagnostics, and influence plots, will also be covered. Several methods for model selection, including all possible regressions and stepwise selection, will be included.

Applied Survival Analysis (Event History Analysis, Reliability Analysis)

This workshop, held over two days, covers basic concepts of and common analytical approaches for time-to-event data, known variously as survival analysis (in biological and medical sciences), event history analysis (in social sciences), or reliability analysis (in engineering).

The workshop will be held in a computer lab and the methods will be illustrated with hands-on exercises.

Exercises and examples will use SAS, R, SPSS, and/or Stata as necessary. This workshop covers:

  • Basic concepts associated with the analysis of censored data (survival function, hazard function)
  • Methods for estimating the survival function (Kaplan-Meier, Nelson-Aalen, and life-table analysis)
  • Two-sample tests with censored data (log-rank and Wilcoxon tests)
  • Regression analysis with censored data (Cox proportional hazards, Weibull, Aalen additive hazards), including time-varying covariates, correlated data, and stratified Cox models
  • Discrete models for censored data (logistic regression, Poisson regression)
  • Basics of power and sample size estimation for time-to-event studies.
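
To give a flavor of the first estimation method listed above, the Kaplan-Meier estimate of the survival function can be sketched in plain Python. This is an illustrative sketch only, not the workshop's SAS, R, SPSS, or Stata material; the function name `kaplan_meier` and the data are hypothetical. At each distinct event time the running survival probability is multiplied by one minus the fraction of at-risk subjects who had the event.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  observed follow-up time for each subject
    events: 1 if the event occurred at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)             # events plus censorings at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1 - d / at_risk      # subjects censored at t still count as at risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


# Hypothetical data: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

For these data the estimate steps down to 0.8, 0.6, and 0.3 at times 2, 3, and 5; the censored observation at time 8 does not change it.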

Applications of Hierarchical Linear Models

This workshop teaches the concepts and analysis of multilevel data through multilevel models (also known as hierarchical linear models or mixed models). With understanding of basic linear regression concepts as a prerequisite, the instructors will cover a wide range of topics including clustered data, longitudinal studies, and clustered longitudinal data. Participants will be introduced to the use of HLM 7.0 software. The workshop will consist of lively lectures and hands-on examples using HLM software.

Many studies in social sciences (e.g., education, human development, public health, sociology) are multilevel, longitudinal, or both. Multilevel data arise when participants are clustered within social settings. The variation and covariation within and between such settings are often of interest substantively and should not be ignored when assessing relationships between explanatory variables and outcomes. In longitudinal research, we repeatedly observe subjects. These repeated measures for each participant will be correlated and explanatory variables may be time-varying or time-invariant. This workshop will consider the issues of analysis that arise in multilevel and longitudinal research settings.

We will first consider two-level cross-sectional studies in which persons (level-1) are nested within groups (level-2). The level-1 model specifies a process within each group, and the level-2 model explains how these processes are different between groups. Next, we will discuss two-level studies of individual growth and compare the structures of these studies to multilevel studies. We will also consider three-level models. We will focus on the case in which repeated measures (level-1) are nested within persons (level-2) who are themselves nested in organizations (level-3).
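
As a rough sketch of the two-level cross-sectional case described above, a model with one person-level predictor and one group-level predictor is commonly written as follows. The notation is the conventional one for hierarchical linear models and is an assumption here, not taken from the workshop materials: $Y_{ij}$ is the outcome for person $i$ in group $j$, $X_{ij}$ is a hypothetical level-1 predictor, and $W_j$ is a hypothetical level-2 predictor.

```latex
% Level 1: the process within each group j
Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^2)

% Level 2: how the level-1 coefficients differ between groups
\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j}
```

Substituting the level-2 equations into the level-1 equation gives a single mixed model with fixed effects $\gamma$ and random effects $u_{0j}$, $u_{1j}$, which is why these models are also called mixed models.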

All of these studies will involve nearly continuous outcomes for which a normal distribution is at least plausible. They will also feature purely nested designs (e.g., persons nested within organizations). The workshop will provide participants with an overview of other types of applications where hierarchical linear models or generalized hierarchical linear models are appropriate (e.g., binary outcomes), and briefly discuss how the HLM software could be used to model such data.

Advanced Stata

Note: Topic order is subject to change.  Participants must sign up for the entire series.

This workshop provides additional Stata training on topics more advanced than those covered in the Introduction to Stata workshop. Models for clustered/longitudinal data will be discussed along with other regression modelling techniques such as quantile regression and multinomial logistic regression. Structural Equation Modelling and Survival Analysis in Stata will also be discussed. The workshop will end with an introduction to programming in Stata using .do files. Basic looping techniques and macros will be covered. Note that an entire workshop will be offered in spring term on Programming in Stata. This workshop is designed to teach participants how to implement the methods outlined above in Stata and only a brief overview of the theory behind these methods will be covered. Participants should have a working knowledge of Stata as a prerequisite.

Statistical Analysis with R

This workshop will introduce participants to R, a free and open source environment for data analysis and statistical computing.  While R contains many built-in statistical procedures, its most distinctive feature is the facility for users to extend these procedures to suit their own needs.  Excellent graphics are another reason R is gaining wide popularity.

Topics will include:

  • How to Obtain R
  • Help Tools
  • Importing / Exporting Data
  • Data Management
  • Descriptive Statistics
  • Multivariate Statistical Analyses (Regression Modeling, ANOVA, etc.)
  • Graphics
  • Creating Functions