QMSS provides students with a basis in quantitative skills for social science research through its core curriculum, with the flexibility to engage in interdisciplinary pursuits through their elective selections. Students can select these classes from the elective offerings by QMSS, or draw from the wider university through cross-registration in the many departments across Columbia with which QMSS has developed strong relationships over the years. Below are descriptions of the courses offered by QMSS as well as a selection of courses around the university that may be of interest. Students are advised to work with the program to determine a course of study that suits their goals for program design as well as to receive guidance on their options for study throughout the university.
See our Registration Procedures page for the enrollment process for both QMSS and non-QMSS Students.
QMSS students can read about the various degree tracks/focuses HERE.
Core QMSS Courses
The five courses below or their equivalents are completed by ALL QMSS Students.
Theory and Methodology (QMSS GR5010)
This interdisciplinary course, taken in the fall semester, is a comprehensive introduction to quantitative research in the social sciences. The course focuses on foundational ideas of social science research, including strengths and weaknesses of different research designs, interpretation of data drawn from contemporary and historical contexts, and strategies for evaluating evidence. The majority of the course consists of two-week units examining particular research designs, with a set of scholarly articles that utilize that design. Topics include: the “science” of social science and the role of statistical models, causality and causal inference, concepts and measurement, understanding human decision making, randomization and experimental methods, observation and quasi-experimentation, sampling, survey research, and working with archival data.
Data Analysis Requirement (QMSS GR5015)
The data analysis course covers specific statistical tools used in social science research using the statistical program R. Topics to be covered include statistical data structures and basic descriptives, regression models, multiple regression analysis, interactions, polynomials, Gauss-Markov assumptions and asymptotics, heteroskedasticity and diagnostics, models for binary outcomes, naive Bayes classifiers, models for ordered data, models for nominal data, first difference analysis, factor analysis, and a review of models that build upon OLS. Prerequisite: introductory statistics course that includes linear regression.
Equivalents:
Advanced Econometrics (ECON W4412) Economics Focus Only
Students who are planning on pursuing the economics focus are required to take this course. This course is intended for students who already have a firm grasp of introductory-level econometrics and are interested in advanced topics including asymptotic theory. The prerequisites are linear algebra, intermediate microeconomics, macroeconomics, and econometrics. Topics to be covered include OLS in matrix form, finite sample and asymptotic properties, hypothesis testing, GLS, maximum likelihood, endogeneity, stationary time series, nonstationary time series, panel data, and discrete choice models.
Probability and Statistics (STAT W5701) Data Science Concentration Only
Students who are planning on pursuing the data science focus are required to take this course. This course is a self-contained introduction to probability and statistics with a focus on data science. The topics covered include fundamentals of probability theory and statistical inference, including: probabilistic models, random variables, useful distributions, expectations, the law of large numbers, the central limit theorem, point and confidence interval estimation, maximum likelihood methods, hypothesis tests, and linear regression (as time permits).
Research Seminar I & II (QMSS GR5021 & GR5022)
This course is designed to expose students in the QMSS degree program to different methods and practices of social science research. QMSS students participate in a weekly seminar. Presentations are given on a wide range of topics by faculty from Columbia and other New York City universities, as well as researchers from the numerous corporate, government, and nonprofit settings where quantitative research tools are used. Topics have included: Now-Casting and the Real-Time Data Flow; Art, Design & Science in Data Visualization; Educational Attainment and School Desegregation: Evidence from Randomized Lotteries; Practical Data Science: North American Oil and Gas Drilling Data.
Master's Thesis (GR5999)
All students must complete an MA thesis, which involves original statistical analysis, under the supervision of the student's advisor and the QMSS program director. Students should register for this course in the last semester of their program.
Track-Specific Non-QMSS Requirements
ECONOMICS FOCUS
Advanced Econometrics (ECON GU4212)
Seyhan Erden
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010. Students must register for required discussion section. The linear regression model will be presented in matrix form and basic asymptotic theory will be introduced. The course will also introduce students to basic time series methods for forecasting and analyzing economic data. Students will be expected to apply the tools to real data.
VIEW PREVIOUS SYLLABUS HERE
Advanced Macroeconomics (ECON GU4213)
Andres Drenik
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010. Required discussion section: ECON GU4214. An introduction to the dynamic models used in the study of modern macroeconomics. Applications of the models will include theoretical issues such as optimal lifetime consumption decisions and policy issues such as inflation targeting. This course is strongly recommended for students considering graduate work in economics.
VIEW PREVIOUS SYLLABUS HERE
Advanced Microeconomics (ECON GU4211)
Andres Drenik
Prerequisites: ECON UN3211 and ECON UN3213 and ECON UN3412 and MATH UN2010. Required discussion section: ECON GU4214. An introduction to the dynamic models used in the study of modern macroeconomics. Applications of the models will include theoretical issues such as optimal lifetime consumption decisions and policy issues such as inflation targeting. This course is strongly recommended for students considering graduate work in economics.
VIEW PREVIOUS SYLLABUS HERE
DATA SCIENCE CONCENTRATION
Probability and Statistics (STAT W5701)
Prerequisites: Calculus. This course covers the following topics: fundamentals of probability theory and statistical inference used in data science; probabilistic models, random variables, useful distributions, expectations, law of large numbers, central limit theorem; statistical inference: point and confidence interval estimation, hypothesis tests, linear regression.
VIEW PREVIOUS SYLLABUS HERE
Algorithms for Data Science (CSOR W4246)
Prerequisites: basic knowledge of programming (e.g., at the level of COMS W1007), a basic grounding in calculus and linear algebra. Methods for organizing data, e.g., hashing, trees, queues, lists, priority queues. Streaming algorithms for computing statistics on the data. Sorting and searching. Basic graph models and algorithms for searching, shortest paths, and matching. Dynamic programming. Linear and convex programming. Floating point arithmetic, stability of numerical algorithms, eigenvalues, singular values, PCA, gradient descent, stochastic gradient descent, and block coordinate descent. Conjugate gradient, Newton and quasi-Newton methods. Large-scale applications from signal processing, collaborative filtering, recommendation systems, etc.
VIEW PREVIOUS SYLLABUS HERE
Exploratory Data Analysis & Visualization (STAT GR5702)
Prerequisites: A course in computer programming. This course covers visual approaches to exploratory data analysis, with a focus on graphical techniques for finding patterns in high-dimensional datasets. We consider data from a variety of fields, which may be continuous, categorical, hierarchical, temporal, and/or spatial in nature. We cover visual approaches to selecting, interpreting, and evaluating models/algorithms such as linear regression, time series analysis, clustering, and classification.
VIEW PREVIOUS SYLLABUS HERE
Machine Learning for Data Science (COMS W4721)
This course provides an introduction to supervised and unsupervised techniques for machine learning. We will cover both probabilistic and non-probabilistic approaches to machine learning. Focus will be on classification and regression models, clustering methods, matrix factorization and sequential models. Methods covered in class include linear and logistic regression, support vector machines, boosting, K-means clustering, mixture models, the expectation-maximization algorithm, and hidden Markov models, among others. We will cover algorithmic techniques for optimization, such as gradient and coordinate descent methods, as the need arises.
VIEW PREVIOUS SYLLABUS HERE
Elective Courses
QMSS Students typically take between 4 and 6 elective courses. Any 4000-level or above course offered by QMSS, Computer Science, IEOR, Economics, Statistics, Psychology, Political Science, Sociology, History, Mathematics, or SIPA will satisfy one of these requirements. 4000-level courses outside these departments MAY satisfy an elective requirement but require approval by the Director of QMSS. Send a copy of the syllabus to <qmss@columbia.edu> for approval to count a course towards your degree progress.

NOTE: Approval of a course does not grant you permission to enroll in a non-QMSS course. That is controlled by the course instructor. CLICK HERE for more detailed registration procedures.
Each focus has its own guidelines regarding elective distribution, so be sure to read your Degree Requirements worksheet carefully.
Some popular elective courses are listed below. Be aware that course listings are always subject to change. You should always check the Columbia Directory of Classes for the most up-to-date information.
QMSS Electives
*Please note that some QMSS classes are only offered once per academic year. Those courses are enumerated below. All other QMSS classes are offered during both fall and spring semesters. Summer semester classes change yearly depending upon need and availability.*
EXCLUSIVELY FALL CLASSES
GR5010 QUANTITATIVE THEORY & METHODOLOGY
GR5016 REGRESSION MODEL-TEMP PROCESS
GR5070 GIS & SPATIAL ANALYSIS-SOC SCI
GR5058 DATA MINING FOR SOCIAL SCIENCE
EXCLUSIVELY SPRING CLASSES
GR5018 ADV ANALYTIC TECHNIQUES
GR5062 SOCIAL NETWORK ANALYSIS
GR5063 DATA VISUALIZATION
GR5065 BAYESIAN STATS FOR THE SOC SCI
GR5069 APPLIED DATA SCI FOR SOC SCIENTISTS
QMSS Students in the Flexible Focus must take TWO Research Methods Electives. We strongly encourage them to fulfill this requirement through QMSS department electives. All QMSS students are guaranteed a seat in QMSS electives (that do not directly conflict with other enrollments).
Time Series, Panel Data, and Forecasting (QMSS GR5016)
This course will introduce students to the main concepts and methods behind regression analysis of temporal processes and highlight the benefits and limitations of using temporally ordered data. Students study the complementary areas of time series data and longitudinal (or panel) data. There are no formal prerequisites for the course, but a solid understanding of the mechanics and interpretation of OLS regression will be assumed (we will briefly review it at the beginning of the course). Topics to be covered include regression with panel data, probit and logit regression of pooled cross-sectional data, difference-in-difference models, time series regression, dynamic causal effects, vector autoregressions, cointegration, and GARCH models. Statistical computing will be carried out in R.
Advanced Analytic Techniques (QMSS GR5018)
This course is meant to train students in advanced quantitative techniques in the social sciences. Statistical computing will be carried out in R. Topics include: review of multiple/linear regression, review of logistic regression, generalized linear models, models with limited dependent variables, first differences analysis, fixed effects, random effects, lagged dependent variables, growth curve analysis, instrumental variable and two-stage least squares, natural experiments, regression discontinuity, propensity score matching, multilevel models or hierarchical linear models, and text-based quantitative analysis.
Practicum in Data Analysis (QMSS GR5052)
This practicum course mimics the typical conditions that students would face in an internship in a large data-intense institution. The practicum will focus on four core elements involved in most internships: (1) developing the intuition and skills to properly scope ambiguous project ideas; (2) practicing organizing and accessing a variety of large-scale data sources and formats; (3) conducting basic and advanced analysis of big data; and (4) communicating and “productizing” results and findings from the earlier steps, in things like dashboards, reports, interactive graphics, or apps. The practicum will also give students time to reflect on their work, and how it would best translate into corporate, nonprofit, startup and other contexts.
Data Mining for Social Science (QMSS GR5058)
The class is roughly divided into two parts: (1) programming best practices, exploratory data analysis (EDA), and unsupervised learning; and (2) supervised learning, including regression and classification methods. In the first part of the course we will focus on writing R programs in the context of simulations, data wrangling, and EDA. Unsupervised learning is focused on problems where the outcome variable is not known and the goal of the analysis is to find hidden structure in data, such as different market segments from buying patterns or human population structure from genetic data. Supervised learning deals with prediction problems where the outcome variable is known, such as predicting the price of a house in a certain neighborhood or the outcome of a congressional race.
Internship (QMSS GR5050 & QMSS GR5051)
Students enrolled in the Quantitative Methods in the Social Sciences MA program have a number of opportunities for internships with various organizations in New York City. All internships will be graded on a pass/fail basis.
An internship must meet the following criteria:
 It is related to the core issues of concern to the MA Program in Quantitative Methods in the Social Sciences.
 The work is substantive (although students may perform some administrative tasks, we want to ensure that they receive experience in substantive research).
 It is a practical, professional experience.
Social Network Analysis (QMSS GR5062)
The course is designed to teach students the foundations of network analysis including how to manipulate, analyze and visualize network data themselves using statistical software. We will focus on using the statistical program R for most of the work. Topics will include measures of network size, density, and tie strength, measures of network diversity, sampling issues, making egonets from whole networks, distance, dyads, homophily, balance and transitivity, structural holes, brokerage, measures of centrality (degree, betweenness, closeness, eigenvector, beta/Bonacich), statistical inference using network data, community detection, affiliation/bipartite networks, clustering and small worlds; positions, roles and equivalence; visualization, simulation, and network evolution over time.
Data Visualization (QMSS GR5063)
This course is designed as an introduction to the interdisciplinary and emerging field of data science. It will cover techniques and algorithms for creating effective visualizations based on principles from graphic design, visual art, perceptual psychology, and cognitive science to enhance the understanding of complex data. Students will be required to complete several scripting, data analysis and visualization design assignments as well as a final project. Topics include: data and image models, social and interactive visualizations, principles and designs, perception and attention, mapping and cartography, network visualization. Computational methods are emphasized and students will be expected to program in R, JavaScript, D3, HTML and CSS and will be expected to submit and peer review work through GitHub. Students will be expected to write up the results of the project in the form of a conference paper submission.
Bayesian Statistics for the Social Sciences (QMSS GR5065)
An introduction to Bayesian statistical methods with applications to the social sciences. Considerable emphasis will be placed on regression modeling and model checking. The primary software used will be Stan, which students do not need to be familiar with in advance. Students in the course will access the Stan library via R, so some experience with R would be helpful but not required. Any QMSS student is presumed to have sufficient background. Any non-QMSS students interested in taking this course should have a comparable background to a QMSS student in basic probability. Topics to be covered are a review of calculus and probability, Bayesian principles, prediction and model checking, linear regression models, Bayesian data collection, Bayesian calculations, Stan, the BUGS language and JAGS, hierarchical linear models, nonlinear regression models, missing data, stochastic processes, and decision theory.
Natural Language Processing (QMSS GR5067)
Social scientists need to engage with natural language processing (NLP) approaches that are found in computer science, engineering, AI, tech and in industry. This course will provide an overview of natural language processing as it is applied in a number of domains. The goal is to gain familiarity with a number of critical topics and techniques that use text as data, and then to see how those NLP techniques can be used to produce social science research and insights. This course will be hands-on, with several large-scale exercises. The course will start with an introduction to Python and associated key NLP packages and GitHub. The course will then cover topics like language modeling; part-of-speech tagging; parsing; information extraction; tokenizing; topic modeling; machine translation; sentiment analysis; summarization; supervised machine learning; and hidden Markov models. Prerequisites are basic probability and statistics, basic linear algebra and calculus. The course will use Python, so students who have programmed in at least one programming language will find it easier to keep up with the course.
Applied Data Science for Social Science (QMSS GR5069)
In his now classic Venn diagram, Drew Conway described Data Science as sitting at the intersection between good hacking skills, math and statistics knowledge, and substantive expertise. As a result of normal instruction, social scientists possess a fluid combination of all three but also bring an additional layer to the mix. We have acquired slightly different training, skills and expertise tailored to understand human behavior, and to explain why things happen the way they do. Social scientists are, thus, a particular kind of data scientist. This course is a collection of topics that fill very specific gaps identified over the years on what a social scientist should know at minimum when entering data science, and what a data scientist should know to hit the ground running and add immediate value to their teams.
GIS and Spatial Analysis for Social Science (QMSS GR5070)
This course introduces students to basic spatial analytic skills. It covers introductory concepts and tools in Geographic Information Systems (GIS) and database management. The course also introduces students to the process of developing and writing an original spatial research project. Topics to be covered include: social theories involving space, place and reflexive relationships; social demography concepts and databases; visualizing social data using geographic information systems; exploratory spatial data analysis of social data and spatially weighted regression models; spatial regression models of social data; and space-time models. Use of open-source software (primarily the R software package) will be taught as well.
Modern Data Structures (QMSS GR5072)
This course is intended to provide a detailed tour of how to access, clean, “munge” and organize data, both big and small. (It should also give students a flavor of what would be expected of them in a typical data science interview.) Each week will have simple, moderate and complex examples in class, with code to follow. Students will then practice additional exercises at home. The end point of each project is to get the data organized and cleaned enough that it is in a dataframe, ready for subsequent analysis and graphing. Therefore, no analysis or visualization (beyond basic tables and plots to make sure everything was correctly organized) will be taught; this will free up substantial time for the “nitty-gritty” of data wrangling.
Machine Learning for Social Sciences (QMSS GR5073)
This course will provide a comprehensive overview of machine learning as it is applied in a number of domains. Comparisons and contrasts will be drawn between this machine learning approach and more traditional regression-based approaches used in the social sciences. Emphasis will also be placed on opportunities to synthesize these two approaches. The course will start with an introduction to Python, the scikit-learn package and GitHub. After that, there will be some discussion of data exploration, visualization in matplotlib, preprocessing, feature engineering, variable imputation, and feature selection. Supervised learning methods will be considered, including OLS models, linear models for classification, support vector machines, decision trees, random forests, and gradient boosting. Calibration, model evaluation and strategies for dealing with imbalanced datasets, non-negative matrix factorization, and outlier detection will be considered next. This will be followed by unsupervised techniques: PCA, discriminant analysis, manifold learning, clustering, mixture models, cluster evaluation. Lastly, we will consider neural networks, convolutional neural networks for image classification and recurrent neural networks. This course will primarily use Python. Previous programming experience will be helpful but not requisite. Prerequisites: basic probability and statistics, basic linear algebra, and calculus.
Projects in Advanced Machine Learning (QMSS GR5074)
Machine learning algorithms continue to advance in their capacity to predict outcomes and rival human judgment in a variety of settings. This course is designed to offer insight into advanced machine learning models, including Deep Learning, Convolutional Neural Networks for image and text data, Object detection models, Recurrent Neural Networks (time-series data), and Adversarial Neural Networks. Roughly half of the course will engage machine learning methods while the other half of the course will be devoted to students working in key substantive areas where advanced machine learning will prove helpful: areas like computer vision and images, text and natural language processing, and tabular data. Students will be tasked to develop team projects in these areas and they will develop a public portfolio of three (or four) meaningful projects. By the end of the course, students will be able to show their work by launching their models in live REST APIs and web applications. Prerequisites are basic probability and statistics, basic linear algebra and calculus. Students are expected to have familiarity with using Python, the scikit-learn package, and GitHub.
Independent Study (QMSS GR5998)
Students develop a course of study under the supervision of a faculty member. Please see the QMSS program coordinator for more details.
NON-QMSS Electives
Below is just a sampling of some popular courses offered through other departments that may be counted towards the QMSS degree. Any 4000-level or above course offered by QMSS, Computer Science, Economics, Statistics, Political Science, Sociology, or Mathematics will satisfy one of these requirements. For full listings from each department, see the Directory of Classes.
Computer Science
COMS W4170x User interface design
Prerequisites: COMS W3137. Introduction to the theory and practice of computer user interface design, emphasizing the software design of graphical user interfaces. Topics include basic interaction devices and techniques, human factors, interaction styles, dialogue design, and software infrastructure. Design and programming projects are required.
COMS W4705x Natural language processing 3 pts.
Prerequisites: COMS W3133, or W3134, or W3137, or W3139, or the instructor's permission. Computational approaches to natural language generation and understanding. Recommended preparation: some previous or concurrent exposure to AI or Machine Learning. Topics include information extraction, summarization, machine translation, dialogue systems, and emotional speech. Particular attention is given to robust techniques that can handle understanding and generation for the large amounts of text on the Web or in other large corpora. Programming exercises in several of these areas.
Economics
ECON G4301x Economic Growth and Development
Prerequisites: Econ W3211 and W3213. Empirical findings on economic development, theoretical development models; problems of efficient resource allocation in a growing economy; balanced and unbalanced growth in closed and open economic systems; the role of capital accumulation and innovation in economic growth.
ECON W4415: Game Theory
Prerequisites: ECON W3211 and W3213.
Introduction to the systematic treatment of game theory and its applications in economic analysis.
ECON W4020: Economics of Uncertainty and Information
Prerequisites: ECON W3211, W3213 and STAT 1201.
Topics include behavior under uncertainty, expected utility hypothesis, insurance, portfolio choice, principal-agent problems, screening and signaling, and information theories of financial intermediation.
Mathematics
MATH W4061x Introduction to Modern Analysis
Prerequisites: MATH V1202 or the equivalent and V2010. The second term of this course may not be taken without the first.
Real numbers, metric spaces, elements of general topology. Continuous and differential functions. Implicit functions. Integration. Change of variables. Function spaces.
MATH W5010 Introduction to the Mathematics of Finance
Prerequisites: MATH V1202, MATH V3027, STAT W5203, SIEO W3001, or their equivalents.
The mathematics of finance, principally the problem of pricing of derivative securities, developed using only calculus and basic probability. Topics include mathematical models for financial instruments, Brownian motion, normal and lognormal distributions, the Black-Scholes formula, and binomial models.
Political Science
POLS W4700x Mathematical Methods for Political Science
Provides students of political science with a basic set of tools needed to read, evaluate, and contribute in research areas that increasingly utilize sophisticated mathematical techniques.
POLS W4710x Principles of Quantitative Political Research
Introduction to the use of quantitative techniques in political science and public policy. Topics include descriptive statistics and principles of statistical inference and probability through analysis of variance and ordinary least-squares regression. Computer applications are emphasized.
POLS W4714x Multivariate Political Analysis
Prerequisite: basic data analysis through multiple regression (e.g., POLS W4910) and knowledge of basic calculus and matrix algebra. More mathematical treatment of topics covered in POLS W4911. Examines problems encountered in multivariate analysis of cross-sectional and time-series data.
Statistics
STAT W4243 Applied Data Science
This course will incorporate knowledge and skills covered in a statistical curriculum with topics and projects in data science. Programming will be covered using existing tools in R. Computing best practices will be taught using test-driven development, version control, and collaboration. Students finish the class with a portfolio of projects and a deeper understanding of several core statistical/machine-learning algorithms. Short project cycles throughout the semester provide students extensive hands-on experience with various data-driven applications. Prerequisites: working knowledge of statistics and probability, data mining, statistical modeling and machine learning. Prior programming experience in R or Python is required.
STAT W4291 Advanced Data Analysis
This is a course on getting the most out of data. The emphasis will be on hands-on experience, involving case studies with real data and using common statistical packages. The course covers, at a very high level, exploratory data analysis, model formulation, goodness of fit testing, and other standard and nonstandard statistical procedures, including linear regression, analysis of variance, nonlinear regression, generalized linear models, survival analysis, time series analysis, and modern regression methods. Students will be expected to propose a data set of their choice for use as case study material.
STAT W4282 Linear Regression and Time Series Methods
A one-semester course covering: simple and multiple regression, including testing, estimation, and confidence procedures, modeling, regression diagnostics and plots, polynomial regression, collinearity and confounding, model selection, geometry of least squares, linear time series models, autoregressive, moving average and ARIMA models, estimation and forecasting with time series models, confidence intervals, and prediction error. Students may not receive credit for more than two of STAT W4315, W4437, and W4440.
Outside GSAS
We encourage students to explore course offerings outside GSAS. Some popular options are listed below. Check each school's website for comprehensive listings.
Be aware that these schools have their own distinct registration procedures. Visit the Registration page for full instructions.
School of International and Public Affairs
INAF U6045x or y International Capital Markets 3 pts.
The course will acquaint you with modern international capital markets. You can expect to learn a substantial amount of up-to-date detail and some useful theory. Specifically, we will survey global markets for credit, equity, foreign exchange, foreign exchange derivatives, futures, interest rate swaps, credit default swaps, and asset-backed securities. In each case, students will learn the highlights of payments and settlement, documentation, regulation, applications for end-users, related economic theory, and pricing models. The class will cover options and asset pricing theory; however, the treatment will be informal and designed to help develop intuition. One lecture each will be devoted to international banking (with an emphasis on changing capital regulation), investment banks, and hedge funds.
Teachers College
EDPA 4050 001 Logic and Design of Research in Education Policy and Social Analysis
An introduction to understanding, designing, and conducting empirical research for education policy and the social sciences. Students explore philosophical foundations of research, the relationship between theory and evidence in research, and the mechanics of designing and conducting research, including strategies for sampling, data collection, and analysis. Quantitative, qualitative, and mixed methods approaches to research are addressed. This course is appropriate for students with little prior exposure to social science research.
Mailman School of Public Health
BIST P6104 - Introduction to Biostatistical Methods
Course prerequisites: Placement exam required, and the instructor's permission
Enrollment priorities: Priority given to BIO students
Like many fields of learning, biostatistics has its own vocabulary often seen in medical and public health literature. Phrases like "statistical significance", "p-value less than 0.05", "95% confident", and "margin of error" can have enormous impact in a world that relies on statistics to make decisions: Should Drug A be recommended over Drug B? Should a national policy on X be implemented? Does Vitamin C truly prevent colds? However, do we really know what these terms and phrases mean? Understanding the theory and methodology behind study design, estimation and hypothesis testing is crucial to ensuring that findings and practices in public health and biomedicine are supported by reliable evidence. This course covers the basic tools for the collection, analysis, and presentation of data. Central to these skills is assessing the impact of chance and variability on the interpretation of research findings and subsequent recommendations for public health practice and policy. Topics covered include: general principles of study design; estimation; hypothesis testing; several methods for comparison of discrete and continuous data including chi-square test of independence, t-test, ANOVA, correlation, regression and logistic regression. This introductory course is a Core Course for the Biostatistics Department and is mandatory for all MS in Biostatistics students.
Columbia Business School
B8131-001: Sports Analytics
Sports analytics refers to the use of data and quantitative methods to measure performance and make decisions to gain advantage in the competitive sports arena. This course builds on the Business Analytics core course and is designed to help students develop and apply analytical skills that are useful in business, using sports as the application area. These skills include critical thinking, mathematical modeling, statistical analysis, predictive analytics, game theory, optimization and simulation. These skills will be applied to sports in this course, but are equally useful in many areas of business. There will be three main topics in the course: (1) measuring and predicting player and team performance, (2) decision-making and strategy in sports, and (3) fantasy sports and sports betting. Typical questions addressed in sports analytics include: How to rank players or teams? How to predict future performance of players or teams? How much is a player on a team worth? How likely are extreme performances, i.e., streaks? Are there hot hands in sports performances? Which decision is more likely to lead to a win (e.g., attempt a stolen base or not in baseball, punt or go for it on fourth down in football, dump and chase or not in hockey, pull the goalie or not in hockey)? How to form lineups in daily fantasy sports? How to manage money in sports betting? How to analyze various "prop" bets?