QMSS provides students with a basis in quantitative skills for social science research through its core curriculum, with the flexibility to engage in interdisciplinary pursuits through their elective selections. Students can select these classes from the elective offerings by QMSS, or draw from the wider university through cross-registration in the many departments across Columbia with which QMSS has developed strong relationships over the years. Below are descriptions of the courses offered by QMSS as well as a selection of courses around the university that may be of interest. Students are advised to work with the program to determine a course of study that suits their goals for the program, as well as to receive guidance on their options for study throughout the university.
Core QMSS Courses
The five courses below or their equivalents are completed by ALL QMSS Students.
QMSS Theory and Methodology (QMSS G4010)
This interdisciplinary course, taken in the fall semester, is a comprehensive introduction to quantitative research in the social sciences. The course focuses on foundational ideas of social science research, including strengths and weaknesses of different research designs, interpretation of data drawn from contemporary and historical contexts, and strategies for evaluating evidence. The majority of the course consists of two-week units examining particular research designs, each with a set of scholarly articles that utilize that design. Topics include: the “science” of social science and the role of statistical models, causality and causal inference, concepts and measurement, understanding human decision making, randomization and experimental methods, observation and quasi-experimentation, sampling, survey research, and working with archival data.
QMSS Data Analysis Requirement (QMSS 4015)
The data analysis course covers specific statistical tools used in social science research using the statistical program R. Topics to be covered include statistical data structures and basic descriptives, regression models, multiple regression analysis, interactions, polynomials, Gauss-Markov assumptions and asymptotics, heteroskedasticity and diagnostics, models for binary outcomes, naive Bayes classifiers, models for ordered data, models for nominal data, first difference analysis, factor analysis, and a review of models that build upon OLS. Prerequisite: introductory statistics course that includes linear regression.
ECON W4412: Advanced Econometrics (Economics Concentration Only)
Students who are planning on pursuing the economics concentration are required to take this course. This course is intended for students who already have a firm grasp of introductory-level econometrics and are interested in advanced topics, including asymptotic theory. The prerequisites are linear algebra, intermediate microeconomics, macroeconomics, and econometrics. Topics to be covered include OLS in matrix form, finite sample and asymptotic properties, hypothesis testing, GLS, maximum likelihood, endogeneity, stationary time series, nonstationary time series, panel data, and discrete choice models.
STAT W5701: Probability and Statistics (Data Science Concentration Only)
Students who are planning on pursuing the data science concentration are required to take this course. This course is a self-contained introduction to probability and statistics with a focus on data science. The topics covered include fundamentals of probability theory and statistical inference, including: probabilistic models, random variables, useful distributions, expectations, the law of large numbers, the central limit theorem, point and confidence interval estimation, maximum likelihood methods, hypothesis tests, and linear regression (as time permits).
QMSS Seminar Series (QMSS G4021 & G4022)
This course is designed to expose students in the QMSS degree program to different methods and practices of social science research. QMSS students participate in a weekly seminar, with presentations on a wide range of topics given by faculty from Columbia and other New York City universities, as well as researchers from the numerous corporate, government, and non-profit settings where quantitative research tools are used. Topics have included: Now-Casting and the Real-Time Data-Flow; Art, Design & Science in Data Visualization; Educational Attainment and School Desegregation: Evidence from Randomized Lotteries; Practical Data Science: North American Oil and Gas Drilling Data.
QMSS Thesis (G5999)
All students must complete an MA thesis, which involves original statistical analysis, under the supervision of the student's advisor and the QMSS program director. Students should register for this course in the last semester of their program.
QMSS Students typically take between 4 and 6 elective courses. Any 4000-level or above course offered by QMSS, Computer Science, Economics, Statistics, Political Science, Sociology, Mathematics, or SIPA will satisfy one of these requirements. 4000-level courses outside these departments MAY satisfy an elective requirement but require approval by the Director of QMSS.
Each concentration has its own guidelines regarding elective distribution, so be sure to read your Degree Requirements worksheet carefully.
Some popular elective courses are listed below. Be aware that course listings are always subject to change. You should always check the Columbia Directory of Classes for the most up-to-date information.
Traditional Track QMSS Students must take TWO Research Methods Electives. We strongly encourage them to fulfill this requirement through QMSS Department Electives.
QMSS Regression Analysis of Temporal Processes (QMSS G4016)
This course will introduce students to the main concepts and methods behind regression analysis of temporal processes and highlight the benefits and limitations of using temporally ordered data. Students study the complementary areas of time series data and longitudinal (or panel) data. There are no formal prerequisites for the course, but a solid understanding of the mechanics and interpretation of OLS regression will be assumed (we will briefly review it at the beginning of the course). Topics to be covered include regression with panel data, probit and logit regression of pooled cross-sectional data, difference-in-difference models, time series regression, dynamic causal effects, vector autoregressions, cointegration, and GARCH models. Statistical computing will be carried out in R.
QMSS Advanced Analytic Techniques (QMSS G4018)
This course is meant to train students in advanced quantitative techniques in the social sciences. Statistical computing will be carried out in R. Topics include: review of multiple/linear regression, review of logistic regression, generalized linear models, models with limited dependent variables, first differences analysis, fixed effects, random effects, lagged dependent variables, growth curve analysis, instrumental variable and two stage least squares, natural experiments, regression discontinuity, propensity score matching, multilevel models or hierarchical linear models, and text-based quantitative analysis.
QMSS Introduction to Missing Data (QMSS G4059)
The goal of this course is to provide students with a basic knowledge of the potential implications of missing data for their data analyses as well as potential solutions. Students will look at different types of mechanisms that can generate missing data. This will lay the groundwork for discussions of what types of missing data scenarios can be accommodated by each missing data method discussed subsequently. Finally, students will learn how to deal with missing data in Stata. More advanced techniques will be covered in a course later on, using R and Stan. Any QMSS student is presumed to have sufficient background. Any non-QMSS students interested in taking this course should have sufficient background in regression modeling of discrete variables. Topics to be covered are probability theory, endogenous selection, mechanisms of missing data, single imputation methods, multiple imputation methods, multivariate normal imputation, conditional imputation, and post-imputation diagnostics.
QMSS Social Network Analysis (QMSS G4062)
The course is designed to teach students the foundations of network analysis including how to manipulate, analyze and visualize network data themselves using statistical software. We will focus on using the statistical program R for most of the work. Topics will include measures of network size, density, and tie strength, measures of network diversity, sampling issues, making ego-nets from whole networks, distance, dyads, homophily, balance and transitivity, structural holes, brokerage, measures of centrality (degree, betweenness, closeness, eigenvector, beta/Bonacich), statistical inference using network data, community detection, affiliation/bipartite networks, clustering and small worlds; positions, roles and equivalence; visualization, simulation, and network evolution over time.
QMSS Data Visualization (QMSS G4063)
QMSS Bayesian Statistics for the Social Sciences (QMSS G4065)
An introduction to Bayesian statistical methods with applications to the social sciences. Considerable emphasis will be placed on regression modeling and model checking. The primary software used will be Stan, which students do not need to be familiar with in advance. Students in the course will access the Stan library via R, so some experience with R is helpful but not required. Any QMSS student is presumed to have sufficient background. Any non-QMSS students interested in taking this course should have a comparable background to a QMSS student in basic probability. Topics to be covered are a review of calculus and probability, Bayesian principles, prediction and model checking, linear regression models, Bayesian data collection, Bayesian calculations, Stan, the BUGS language and JAGS, hierarchical linear models, nonlinear regression models, missing data, stochastic processes, and decision theory.
QMSS Experimentation in the Social Sciences (QMSS G4068)
The course is designed to provide students with a basic introduction to the use of experimental methods in political and social sciences. Students will be exposed to methodological, theoretical and practical aspects of experimentation. No prior knowledge of experimental methods is required. Topics to be covered are causal inference, randomization and validity; reporting experimental research; experimental design and analysis; ethics, human subjects research and the IRB; laboratory experiments; survey experiments; field experiments; quasi-experimentation, natural experiments and regression discontinuity designs; and integrating experimental research.
QMSS GIS Spatial Analysis (QMSS G4070)
This course introduces students to basic spatial analytic skills. It covers introductory concepts and tools in Geographic Information Systems (GIS) and database management. The course also introduces students to the process of developing and writing an original spatial research project. Topics to be covered include: social theories involving space, place and reflexive relationships; social demography concepts and databases; visualizing social data using geographic information systems; exploratory spatial data analysis of social data and spatially weighted regression models; spatial regression models of social data; and space-time models. Use of open-source software (primarily the R software package) will be taught as well.
QMSS Advanced GIS Spatial Analysis (QMSS G4071)
This course builds upon foundational spatial analysis concepts and skills built in the introductory GIS course through the application of advanced spatial statistical modeling tools. Topics covered include 1) graphical and quantitative description of spatial data, 2) kriging, block kriging and cokriging, 3) common variogram models, 4) spatial autoregressive models, estimation and testing, 5) spatial non-stationarity and associated modeling procedures, and 6) spatial sampling procedures. Use of open-source software (primarily the R software package) with emphasis on analysis of real data from the environmental and social sciences will be the substantive focus of the class. Students will do a series of in-class labs and develop a final research project from these labs or an independent project.
Students enrolled in the Quantitative Methods in the Social Sciences MA program have a number of opportunities for internships with various organizations in New York City. All internships will be graded on a pass/fail basis.
An internship must meet the following criteria:
- It is related to the core issues of concern to the MA Program in Quantitative Methods in the Social Sciences.
- The work is substantive (although students may perform some administrative tasks, we want to ensure that they receive experience in substantive research).
- It is a practical, professional experience.
QMSS Independent Study (QMSS G4998)
Students develop a course of study under the supervision of a faculty member. Please see the QMSS program coordinator for more details.
COMS W4170x User interface design 3 pts. Lect: 3. Prerequisites: COMS W3137. Introduction to the theory and practice of computer user interface design, emphasizing the software design of graphical user interfaces. Topics include basic interaction devices and techniques, human factors, interaction styles, dialogue design, and software infrastructure. Design and programming projects are required.
COMS W4705x Natural language processing 3 pts. Lect: 3. Prerequisites: COMS W3133, or W3134, or W3137, or W3139, or the instructor's permission. Computational approaches to natural language generation and understanding. Recommended preparation: some previous or concurrent exposure to AI or Machine Learning. Topics include information extraction, summarization, machine translation, dialogue systems, and emotional speech. Particular attention is given to robust techniques that can handle understanding and generation for the large amounts of text on the Web or in other large corpora. Programming exercises in several of these areas.
COMS W4771y Machine learning 3 pts. Lect: 3. Prerequisites: Any introductory course in linear algebra and any introductory course in statistics are both required. Highly recommended: COMS W4701 or knowledge of Artificial Intelligence. Topics from generative and discriminative machine learning including least squares methods, support vector machines, kernel methods, neural networks, Gaussian distributions, linear classification, linear regression, maximum likelihood, exponential family distributions, Bayesian networks, Bayesian inference, mixture models, the EM algorithm, graphical models, and hidden Markov models. Algorithms implemented in Matlab.
ECON G4301x Economic Growth and Development
Prerequisites: Econ W3211 and W3213. Empirical findings on economic development, theoretical development models; problems of efficient resource allocation in a growing economy; balanced and unbalanced growth in closed and open economic systems; the role of capital accumulation and innovation in economic growth.
ECON W4415: Game Theory
Prerequisites: ECON W3211 and W3213.
Introduction to the systematic treatment of game theory and its applications in economic analysis.
ECON W4020: Economics of Uncertainty and Information
Prerequisites: ECON W3211, W3213 and STAT 1201.
Topics include behavior under uncertainty, expected utility hypothesis, insurance, portfolio choice, principal-agent problems, screening and signaling, and information theories of financial intermediation.
MATH W4061x Introduction to Modern Analysis 3 pts.
Prerequisites: MATH V1202 or the equivalent and V2010. The second term of this course may not be taken without the first.
Real numbers, metric spaces, elements of general topology. Continuous and differentiable functions. Implicit functions. Integration. Change of variables. Function spaces.
MATH W5010 Introduction to the Mathematics of Finance 3 pts.
Prerequisites: MATH V1202, MATH V3027, STAT W5203, SIEO W3001, or their equivalents.
The mathematics of finance, principally the problem of pricing of derivative securities, developed using only calculus and basic probability. Topics include mathematical models for financial instruments, Brownian motion, normal and lognormal distributions, the Black-Scholes formula, and binomial models.
POLS W4700x Mathematical Methods for Political Science 3 pts. Nicholas Beauchamp
Provides students of political science with a basic set of tools needed to read, evaluate, and contribute in research areas that increasingly utilize sophisticated mathematical techniques.
POLS W4710x Principles of Quantitative Political Research 3 pts. David Park
Introduction to the use of quantitative techniques in political science and public policy. Topics include descriptive statistics and principles of statistical inference and probability through analysis of variance and ordinary least-squares regression. Computer applications are emphasized.
POLS W4714x Multivariate Political Analysis 3 pts. Shigeo Hirano
Prerequisite: basic data analysis through multiple regression (e.g., POLS W4910) and knowledge of basic calculus and matrix algebra. More mathematical treatment of topics covered in POLS W4911. Examines problems encountered in multivariate analysis of cross-sectional and time-series data.
STAT W3026 Applied Data Mining 3 pts. L. Hannah
Data Mining is a dynamic and fast-growing field at the interface of Statistics and Computer Science. The emergence of massive datasets containing millions or even billions of observations provides the primary impetus for the field. Such datasets arise, for instance, in large-scale retailing, telecommunications, and astronomy, and they pose computational and statistical challenges. This course will provide an overview of current practice in data mining. Specific topics covered will include databases and data warehousing, exploratory data analysis and visualization, descriptive modeling, predictive modeling, pattern and rule discovery, text mining, Bayesian data mining, and causal inference. The use of statistical software will be emphasized.
STAT W4291 Advanced Data Analysis Jingchen Liu
This is a course on getting the most out of data. The emphasis will be on hands-on experience, involving case studies with real data and using common statistical packages. The course covers, at a very high level, exploratory data analysis, model formulation, goodness of fit testing, and other standard and non-standard statistical procedures, including linear regression, analysis of variance, nonlinear regression, generalized linear models, survival analysis, time series analysis, and modern regression methods. Students will be expected to propose a data set of their choice for use as case study material.
STAT W4282 Linear Regression and Time Series Methods Flavio C Bartman and Duncan Szeto
A one-semester course covering: simple and multiple regression, including testing, estimation, and confidence procedures, modeling, regression diagnostics and plots, polynomial regression, collinearity and confounding, model selection, geometry of least squares, linear time series models, auto-regressive, moving average and ARIMA models, estimation and forecasting with time series models, confidence intervals, and prediction error. Students may not receive credit for more than two of STAT W4315, W4437, and W4440.
We encourage students to explore course offerings outside GSAS. Some popular options are listed below. Check each school's website for comprehensive listings.
Be aware that these schools have their own distinct registration procedures. Visit the Registration page for full instructions.
School of International and Public Affairs
INAF U6045x or y International Capital Markets 3 pts.
The course will acquaint you with modern international capital markets. You can expect to learn a substantial amount of up-to-date detail and some useful theory. Specifically, we will survey global markets for credit, equity, foreign exchange, foreign exchange derivatives, futures, interest rate swaps, credit default swaps, and asset-backed securities. In each case, students will learn the highlights of payments and settlement, documentation, regulation, applications for end-users, related economic theory, and pricing models. The class will cover options and asset pricing theory; however, the treatment will be informal and designed to help develop intuition. One lecture each will be devoted to international banking (with an emphasis on changing capital regulation), investment banks, and hedge funds.
EDPA 4050 001 Logic and Design of Research in Education Policy and Social Analysis
An introduction to understanding, designing, and conducting empirical research for education policy and the social sciences. Students explore philosophical foundations of research, the relationship between theory and evidence in research, and the mechanics of designing and conducting research, including strategies for sampling, data collection, and analysis. Quantitative, qualitative, and mixed methods approaches to research are addressed. This course is appropriate for students with little prior exposure to social science research.
Mailman School of Public Health
BIST P6104 - Introduction to Biostatistical Methods
Course pre-requisites: Placement exam required, and the instructor's permission
Enrollment priorities: Priority given to BIO students
Like many fields of learning, biostatistics has its own vocabulary often seen in medical and public health literature. Phrases like "statistical significance", "p-value less than 0.05", "95% confident", and "margin of error" can have enormous impact in a world that relies on statistics to make decisions: Should Drug A be recommended over Drug B? Should a national policy on X be implemented? Does Vitamin C truly prevent colds? However, do we really know what these terms and phrases mean? Understanding the theory and methodology behind study design, estimation and hypothesis testing is crucial to ensuring that findings and practices in public health and biomedicine are supported by reliable evidence. This course covers the basic tools for the collection, analysis, and presentation of data. Central to these skills is assessing the impact of chance and variability on the interpretation of research findings and subsequent recommendations for public health practice and policy. Topics covered include: general principles of study design; estimation; hypothesis testing; several methods for comparison of discrete and continuous data including chi-square test of independence, t-test, ANOVA, correlation, regression and logistic regression. This introductory course is a Core Course for the Biostatistics Department and is mandatory for all MS in Biostatistics students.
Columbia Business School
B8131-001: Sports Analytics
Sports analytics refers to the use of data and quantitative methods to measure performance and make decisions to gain advantage in the competitive sports arena. This course builds on the Business Analytics core course and is designed to help students to develop and apply analytical skills that are useful in business, using sports as the application area. These skills include critical thinking, mathematical modeling, statistical analysis, predictive analytics, game theory, optimization and simulation. These skills will be applied to sports in this course, but are equally useful in many areas of business. There will be three main topics in the course: (1) measuring and predicting player and team performance, (2) decision-making and strategy in sports, and (3) fantasy sports and sports betting. Typical questions addressed in sports analytics include: How to rank players or teams? How to predict future performance of players or teams? How much is a player on a team worth? How likely are extreme performances, i.e., streaks? Are there hot hands in sports performances? Which decision is more likely to lead to a win (e.g., attempt a stolen base or not in baseball, punt or go for it on fourth down in football, dump and chase or not in hockey, pull the goalie or not in hockey)? How to form lineups in daily fantasy sports? How to manage money in sports betting? How to analyze various "prop" bets?