I teach training courses, varying in length from 1 to 5 days, on data science and statistics. The topics I cover include Bayesian data analysis, multilevel modelling, nonlinear regression, structural equation modelling and path analysis, reproducible research, general data analysis and statistics using R, and data science and scientific programming with Python.

Introduction to Stan for Bayesian Data Analysis

January 17 to January 20, 2022

Online course

Stan is “a state-of-the-art platform for statistical modeling and high-performance statistical computation. Thousands of users rely on Stan for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, and business.” In this course, we provide a general introduction to the Stan language, and describe how to use it to develop and run Bayesian models. We begin by covering the theory behind Stan, which includes Bayesian inference, Markov Chain Monte Carlo (MCMC) for sampling from probability distributions, and the efficient Hamiltonian Monte Carlo (HMC) method that Stan implements. Next, we learn how to write Stan models by creating simple Bayesian models such as binomial models and models using normal distributions. In so doing, the basics of the Stan language will become apparent. We then cover some widely used and practically useful models including linear regression, logistic regression, and multilevel and mixed effects models. We will end by covering some more complex models, including probabilistic mixture models.
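As a rough illustration of what MCMC sampling for a simple binomial model involves, here is a minimal sketch in plain Python. It is not Stan code and does not use HMC. It uses a basic random-walk Metropolis sampler, and the data (14 successes in 20 trials), the uniform prior, the step size, and all function names are assumptions made purely for illustration.

```python
import math
import random


def log_posterior(theta, successes, trials):
    """Unnormalised log posterior: binomial likelihood with a uniform prior on theta."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis(successes, trials, n_samples=20000, step=0.1, seed=1):
    """Random-walk Metropolis sampler for theta (a simpler relative of HMC)."""
    rng = random.Random(seed)
    theta = 0.5  # assumed starting value
    current = log_posterior(theta, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


samples = metropolis(successes=14, trials=20)
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 2))
```

With a uniform prior, the exact posterior here is a Beta(15, 7) distribution with mean 15/22, so the sample mean can be checked against that value.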

Bayesian data analysis

January 10 to January 14, 2022

Online course

The aim of this course is to provide a solid introduction to Bayesian methods, both theoretically and practically. We will begin by teaching the fundamental concepts of Bayesian inference and Bayesian modelling, including how Bayesian methods differ from their classical statistics counterparts, and show how to do Bayesian data analysis in practice in R. We then provide a solid introduction to Bayesian approaches to regression modelling using R and the brms package. We begin by covering Bayesian approaches to linear regression. We will then proceed to Bayesian approaches to generalized linear models, including binary logistic regression, ordinal logistic regression, Poisson regression, zero-inflated models, etc. Finally, we will cover Bayesian approaches to multilevel and mixed effects models. Throughout this course, we will be using, via the brms package, Stan-based Markov Chain Monte Carlo (MCMC) methods.

Model selection and model simplification

November 24 to November 25, 2021

Online course

This two day course covers the important and general topics of statistical model building, model evaluation, model selection, model comparison, model simplification, and model averaging. We cover topics such as how to measure model fit and a model’s predictive performance, nested model comparison, out-of-sample predictive performance, over-fitting, leave-one-out cross-validation, information criteria, variable selection, stepwise regression, ridge regression, Lasso, elastic nets, model averaging, and Bayesian methods of model comparison and evaluation.
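To give a flavour of one of these topics, the following is a minimal sketch of leave-one-out cross-validation in plain Python. It compares the out-of-sample squared error of an intercept-only model with that of a simple linear regression. The toy data and all function names are hypothetical and chosen only for illustration.

```python
def fit_mean(xs, ys):
    """Intercept-only model: always predict the mean of the training outcomes."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def loo_mse(fit, xs, ys):
    """Leave-one-out CV: refit without each point in turn, then predict that point."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predict = fit(train_x, train_y)
        squared_errors.append((ys[i] - predict(xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical toy data with a roughly linear trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

print("intercept-only:", loo_mse(fit_mean, xs, ys))
print("linear:", loo_mse(fit_line, xs, ys))
```

The model with the lower leave-one-out error is the one expected to predict new data better, which is the basic logic behind cross-validation as a tool for model comparison.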

Introduction to Machine Learning and Deep Learning using R

November 17 to November 18, 2021

Online course

In this two day course, we provide an introduction to machine learning and deep learning using R. We begin by providing an overview of the machine learning and deep learning landscape, and then turn to some major machine learning applications. We cover binary and multiclass classification problems using logistic regression, naive Bayes classifiers, support vector machines, decision trees, and random forests. We then look at unsupervised learning methods, including clustering methods and mixture models. We then cover artificial neural networks and deep learning using Torch in R.

Introduction to Multilevel (hierarchical, or mixed effects) Models

November 10 to November 11, 2021

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to multilevel models, also known as hierarchical or mixed effects models. We will focus primarily on multilevel linear models, but also cover multilevel generalized linear models. Likewise, we will also describe Bayesian approaches to multilevel modelling.

Introduction to Generalized Linear Models

November 3 to November 4, 2021

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models are generalizations of linear regression models for situations where the outcome variable is, for example, a binary, or ordinal, or count variable, etc. The specific models we cover include binary, binomial, ordinal, and categorical logistic regression, Poisson and negative binomial regression for count variables. We will also cover zero-inflated Poisson and negative binomial regression models.

Introduction to R and RStudio

October 20, 2021

Online course

In this one day course, we provide a comprehensive introduction to R and how it can be used for data science and statistics. We begin by providing a thorough introduction to RStudio, which is one of the most popular and powerful interfaces for using R. We then introduce all the fundamentals of the R language and R environment: variables and assignment, data structures, operators, functions, scripts, packages, projects, etc. We end with an introduction to data processing and formatting (aka, data wrangling), an introduction to data visualization, and an introduction to how to perform some of the most widely used statistical methods such as linear regression, etc. From this course, you will gain a comprehensive introduction to R, which will serve as a foundation for progressing further with R to any kind of data analysis, data science, or statistics.

Introduction to Python for Data Science

October 13 to October 15, 2021

Online course

Python is one of the most widely used and highly valued programming languages in the world, and is especially widely used in data science, machine learning, and in other scientific computing applications. Python can provide a complementary toolbox to what is available in the R statistical software. This course provides both a general introduction to programming with Python and a comprehensive introduction to using Python for data science. The course will also cover the use of Python through the R statistical software. The major topics that we will cover include the following: the fundamentals of general purpose programming in Python; numerical computing using numpy; data processing and manipulations using pandas; parallel processing. Finally, this course will address the integration of Python into R (i.e. how to use Python from R). Overall, this course aims to provide a solid introduction to Python generally as a programming language, and to its principal tools for doing data science, machine learning, and scientific computing.

Introduction to Data Wrangling and Data Visualization using R

October 4 to October 8, 2021

Online course

In this course, we provide a comprehensive practical introduction to data wrangling and data visualization using R. In the coverage of data wrangling, we will cover tools provided by R’s tidyverse, including dplyr, tidyr, purrr, etc. We will cover how to read data of different types into R using readr and related packages, and then cover in detail all the dplyr tools such as select, filter, mutate, summarize, etc. We will also cover the pipe operator (%>%) to create data wrangling pipelines that take raw messy data on the one end and return cleaned tidy data on the other. We will also cover how to reshape data using pivots, and how to merge data sets using merge operations. For the topic of visualization, we provide a comprehensive introduction to data visualization in R using ggplot. We begin by covering the major types of plots for visualizing distributions of univariate data: histograms, density plots, barplots, and Tukey boxplots. In all of these cases, we will consider how to visualize multiple distributions simultaneously on the same plot using different colours and “facet” plots. We then turn to the visualization of bivariate data using scatterplots. Here, we will explore how to apply linear and nonlinear smoothing functions to the data, how to add marginal histograms to the scatterplot, add labels to points, and scale each point by the value of a third variable.

Introduction to Data Science using R

September 14 to September 23, 2021

Online course

In this four day course (Sept 14, Sept 16, Sept 21, Sept 23), we provide an introduction to some advanced topics in statistics and data science. In particular, we will cover the following topics: data wrangling using dplyr and tidyr; data visualization using ggplot; how to read in, process, and visualize NetCDF satellite data; programming in R using loops and functionals; reproducible data analysis with RMarkdown; general linear models; generalized linear models, including models for count data.

Introduction to Machine Learning and Deep Learning using R

June 9 to June 10, 2021

Online course

In this two day course, we provide an introduction to machine learning and deep learning using R. We begin by providing an overview of the machine learning and deep learning landscape, and then turn to some major machine learning applications. We cover binary and multiclass classification problems using logistic regression, naive Bayes classifiers, support vector machines, decision trees, and random forests. We then look at unsupervised learning methods, including clustering methods and mixture models. We then cover artificial neural networks and deep learning using Torch in R.

Introduction to Bayesian Data Analysis

May 28, 2021

Online course

Bayesian methods are now increasingly widely used in data analysis across most scientific research fields. Given that Bayesian methods differ conceptually and theoretically from their classical statistical counterparts that are traditionally taught in statistics courses, many researchers do not have opportunities to learn the fundamentals of Bayesian methods, which makes using Bayesian data analysis in practice more challenging. This seminar provides an introduction to Bayesian methods, both theoretically and practically. We will give an overview of the fundamental concepts of Bayesian inference and Bayesian modelling, including how Bayesian methods differ from their classical statistics counterparts, and show how to do Bayesian data analysis in practice in R.

Bayesian Approaches to Regression and Mixed Effects Models using R and brms

May 26 to May 27, 2021

Online course

Bayesian methods are now increasingly widely used for data analysis based on linear and generalized linear models, and multilevel and mixed effects models. The aim of this course is to provide a solid introduction to Bayesian approaches to these topics using R and the brms package. Ultimately, in this course, we aim to show how Bayesian methods provide a very powerful, flexible, and extensible approach to general statistical data analysis.

The Fundamentals of Bayesian Data Analysis

May 19 to May 20, 2021

Online course

This course provides a solid introduction to Bayesian methods, both theoretically and practically. We will cover the fundamental concepts of Bayesian inference and Bayesian modelling, including how Bayesian methods differ from their classical statistics counterparts, and show how to do Bayesian data analysis in practice in R.

Introduction to Machine Learning and Deep Learning using Python

May 12 to May 13, 2021

Online course

In this two day course, we provide an introduction to machine learning and deep learning using Python. We begin by providing an overview of the machine learning and deep learning landscape, and discuss the prominent role that Python has come to play in this area. We then turn to machine learning in practice, and for this, we will primarily be using the widely used and acclaimed scikit-learn toolbox.

A Brief Introduction to Statistical Data Analysis with Python

May 7, 2021

Online course

In this 2hr course, we will provide a brief introduction to data analysis and statistics in Python. In particular, we will cover data processing using pandas, statistical analysis using statsmodels, and data visualization using matplotlib and seaborn.

Introduction to Python and Programming in Python

April 28 to April 29, 2021

Online course

This two day course provides a general introduction to the Python environment, the Python language, and general purpose programming in Python. We cover how to install and set up a Python computing environment, describing how to set up virtual environments and how to use Python package installers, and give an overview of some Python integrated development environments (IDEs) and Python Jupyter notebooks. We then provide a comprehensive introduction to programming in Python, covering all the following major topics: data types and data container types, conditionals, iterations, functional programming, object oriented programming, modules, packages, and imports.

Introduction to Data Wrangling using R

April 21 to April 22, 2021

Online course

In this two day course, we provide a comprehensive practical introduction to data wrangling using R. In particular, we focus on tools provided by R’s tidyverse, including dplyr, tidyr, purrr, etc.

Model selection and model simplification

April 14 to April 15, 2021

Online course

This two day course covers the important and general topics of statistical model building, model evaluation, model selection, model comparison, model simplification, and model averaging. We cover topics such as how to measure model fit and a model’s predictive performance, nested model comparison, out-of-sample predictive performance, over-fitting, leave-one-out cross-validation, information criteria, variable selection, stepwise regression, ridge regression, Lasso, elastic nets, model averaging, and Bayesian methods of model comparison and evaluation.

Introduction to Data Visualization using R

April 7 to April 8, 2021

Online course

In this two day course, we provide a comprehensive introduction to data visualization in R using ggplot. We begin by providing a brief overview of the general principles of data visualization, and an overview of the general principles behind ggplot. We cover the major types of plots including histograms, density plots, barplots, Tukey boxplots, scatterplots and scatterplot smoothing, frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping.

Introduction to Multilevel (hierarchical, or mixed effects) Models

March 31 to April 1, 2021

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to multilevel models, also known as hierarchical or mixed effects models. We will focus primarily on multilevel linear models, but also cover multilevel generalized linear models. Likewise, we will also describe Bayesian approaches to multilevel modelling.

Introduction to Generalized Linear Models

March 24 to March 25, 2021

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models are generalizations of linear regression models for situations where the outcome variable is, for example, a binary, or ordinal, or count variable, etc. The specific models we cover include binary, binomial, ordinal, and categorical logistic regression, Poisson and negative binomial regression for count variables. We will also cover zero-inflated Poisson and negative binomial regression models.

Introduction to statistics using R and RStudio

March 17 to March 18, 2021

Online course

In this two day course, we provide a comprehensive introduction to R and how it can be used for data science and statistics. We begin by providing a thorough introduction to RStudio, which is one of the most popular and powerful interfaces for using R. We then introduce all the fundamentals of the R language and R environment: variables and assignment, data structures, operators, functions, scripts, packages, projects, etc. We then provide an introduction to data processing and formatting (aka, data wrangling), an introduction to data visualization, an introduction to RMarkdown, and an introduction to how to perform some of the most widely used statistical methods such as linear regression, Anovas, etc.

Nonlinear Regression using Generalized Additive Models

December 16 to December 17, 2020

Online course

This course provides a general introduction to nonlinear regression analysis using generalized additive models. We cover spline and radial basis function regression, generalized additive models (GAMs) and generalized additive mixed models (GAMMs). GAMs can be viewed as the generalization of spline or basis function regression, but cover a wider range of topics including nonlinear spatial and temporal models and interaction models.

Introduction to Machine Learning and Deep Learning using Python

December 9 to December 10, 2020

Online course

In this two day course, we provide an introduction to machine learning and deep learning using Python. We begin by providing an overview of the machine learning and deep learning landscape, and discuss the prominent role that Python has come to play in this area. We then turn to machine learning in practice, and for this, we will primarily be using the widely used and acclaimed scikit-learn toolbox.

Introduction to Scientific, Numerical, and Data Analysis Programming in Python

December 2 to December 3, 2020

Online course

This two day course provides a general introduction to numerical programming in Python, particularly using numpy, data processing in Python using Pandas, data analysis in Python using statsmodels and rpy2. We will also cover the major data visualization and graphics tools in Python, particularly matplotlib, seaborn, and ggplot. Finally, we will cover some other major scientific Python tools, such as for symbolic mathematics and parallel programming and code acceleration.

Introduction to Multilevel (hierarchical, or mixed effects) Models

November 25 to November 26, 2020

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to multilevel models, also known as hierarchical or mixed effects models. We will focus primarily on multilevel linear models, but also cover multilevel generalized linear models. Likewise, we will also describe Bayesian approaches to multilevel modelling.

Introduction to Python and Programming in Python

November 18 to November 19, 2020

Online course

This two day course provides a general introduction to the Python environment, the Python language, and general purpose programming in Python. We cover how to install and set up a Python computing environment, describing how to set up virtual environments and how to use Python package installers, and give an overview of some Python integrated development environments (IDEs) and Python Jupyter notebooks. We then provide a comprehensive introduction to programming in Python, covering all the following major topics: data types and data container types, conditionals, iterations, functional programming, object oriented programming, modules, packages, and imports.

Introduction to Generalized Linear Models

November 11 to November 12, 2020

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models are generalizations of linear regression models for situations where the outcome variable is, for example, a binary, or ordinal, or count variable, etc. The specific models we cover include binary, binomial, ordinal, and categorical logistic regression, Poisson and negative binomial regression for count variables. We will also cover zero-inflated Poisson and negative binomial regression models.

Using Git and GitHub for R based data analysis projects

November 3 to November 4, 2020

Online course

In this course, which runs over two days with two 2 hour sessions each day, we cover how to use Git and GitHub to manage your code in R based data analysis projects. We begin by providing a conceptual overview of version control software, and why it is vital for managing software and code generally. We will then cover the Git software in particular, and some of its design principles and internal mechanics. We then cover GitHub, and how and why it, and its alternatives, are used. Having provided a conceptual overview, we then turn to our practical introduction to the major Git commands (e.g., init, clone, add, commit, push, pull, branch, etc.). Here, we will also cover the major topic of Git branching and merging, which allow for extremely powerful workflows, especially when working with others.

Introduction to statistics using R and RStudio

October 28 to October 29, 2020

Online course

In this two day course, we provide a comprehensive introduction to R and how it can be used for data science and statistics. We begin by providing a thorough introduction to RStudio, which is one of the most popular and powerful interfaces for using R. We then introduce all the fundamentals of the R language and R environment: variables and assignment, data structures, operators, functions, scripts, packages, projects, etc. We then provide an introduction to data processing and formatting (aka, data wrangling), an introduction to data visualization, an introduction to RMarkdown, and an introduction to how to perform some of the most widely used statistical methods such as linear regression, Anovas, etc. From this course, you will gain a comprehensive introduction to R, which will serve as a foundation for progressing further with R to any kind of data analysis, data science, or statistics.

Data wrangling and RMarkdown-based reproducible reports using R

September 14 to September 15, 2020

Online course

In this two day course, we provide a practical introduction to data wrangling and RMarkdown based reproducible reports using R. For the topic of data wrangling, we cover the dplyr tools such as select, filter, mutate, etc., the pipe operator (%>%) to create data wrangling pipelines, how to perform descriptive or summary statistics using dplyr’s summarize and group_by functions, how to merge data frames, and how to pivot between wide and long data formats. We then turn to the topic of writing reproducible data analysis reports using RMarkdown, knitr, and related tools. These are vital tools for reproducible research that allow us to produce data analysis reports, i.e. articles, slides, posters, websites, etc., by embedding R analysis code within the text of the report; that code is then executed, and the results it produces are inserted into the final output document. We will also mention some of the key additional tools that can be used for reproducible research in R including Git, R packages, Docker, and Make/Drake.

Introduction to Data Wrangling using R

September 3 to September 4, 2020

Online course

In this two day course, we provide a comprehensive practical introduction to data wrangling using R. In particular, we focus on tools provided by R’s tidyverse, including dplyr, tidyr, purrr, etc. Data wrangling is the art of taking raw and messy data and formatting and cleaning it so that data analysis, visualization, etc. may be performed on it. Done poorly, it can be time consuming, laborious, and error-prone. Fortunately, the tools provided by R’s tidyverse allow us to do data wrangling in a fast, efficient, and high-level manner, which can have dramatic consequences for the ease and speed with which we analyse data. On Day 1 of this course, having covered how to read data of different types into R, we cover in detail all the dplyr tools such as select, filter, mutate, etc. Here, we will also cover the pipe operator (%>%) to create data wrangling pipelines that take raw messy data on the one end and return cleaned tidy data on the other. On Day 2, we cover how to perform descriptive or summary statistics on our data using dplyr’s summarize and group_by functions. We then turn to combining and merging data. Here, we will consider how to concatenate data frames, including concatenating all data files in a folder, as well as cover the powerful SQL-like join operations that allow us to merge information in different data frames. The final topic we will consider is how to “pivot” data from a “wide” to “long” format and back using tidyr’s pivot_longer and pivot_wider.
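The idea of pivoting between “wide” and “long” formats can be sketched in a few lines of plain Python. The functions below only borrow the names of tidyr’s pivot_longer and pivot_wider. They are hypothetical stand-ins written for illustration, not the tidyr API, and the example data is invented.

```python
# Hypothetical wide-format data: one row per participant, one column per time point.
wide = [
    {"id": 1, "t1": 5.0, "t2": 6.5},
    {"id": 2, "t1": 4.2, "t2": 4.9},
]


def pivot_longer(rows, id_col, value_cols, names_to="name", values_to="value"):
    """Return one row per (id, column) pair, in the spirit of tidyr's pivot_longer."""
    long_rows = []
    for row in rows:
        for col in value_cols:
            long_rows.append({id_col: row[id_col], names_to: col, values_to: row[col]})
    return long_rows


def pivot_wider(rows, id_col, names_from="name", values_from="value"):
    """Invert pivot_longer: gather each id's name/value pairs back into one row."""
    wide_rows = {}
    for row in rows:
        target = wide_rows.setdefault(row[id_col], {id_col: row[id_col]})
        target[row[names_from]] = row[values_from]
    return list(wide_rows.values())


long = pivot_longer(wide, "id", ["t1", "t2"], names_to="time", values_to="score")
for row in long:
    print(row)
```

In the long format, each observation occupies its own row, which is the layout that most grouping, summarizing, and plotting tools expect.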

Introduction to Data Visualization using R

August 20 to August 21, 2020

Online course

In this two day course, we provide a comprehensive introduction to data visualization in R using ggplot. We begin by providing a brief overview of the general principles of data visualization, and an overview of the general principles behind ggplot. We cover the major types of plots including histograms, density plots, barplots, Tukey boxplots, scatterplots and scatterplot smoothing, frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping.

Introduction to Multilevel (hierarchical, or mixed effects) Models

August 6 to August 7, 2020

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to multilevel models, also known as hierarchical or mixed effects models. We will focus primarily on multilevel linear models, but also cover multilevel generalized linear models. Likewise, we will also describe Bayesian approaches to multilevel modelling.

Introduction to Generalized Linear Models

July 23 to July 24, 2020

Online course

In this two day course, we provide a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models are generalizations of linear regression models for situations where the outcome variable is, for example, a binary, or ordinal, or count variable, etc. The specific models we cover include binary, binomial, ordinal, and categorical logistic regression, Poisson and negative binomial regression for count variables. We will also cover zero-inflated Poisson and negative binomial regression models.

Reproducible Data Science using RMarkdown, Git, R packages, Docker, Make & Drake, and other tools

June 29 to July 4, 2020

Online course

This course provides a comprehensive introduction to doing reproducible data analysis, which we define as analysis where the entire workflow or pipeline is as open and transparent as possible, making it possible for others, including our future selves, to exactly reproduce any of its results. We cover this topic by providing a thorough introduction to a set of R based and general computing tools such as RMarkdown, Git & GitHub, R packages, Docker, GNU Make and Drake, and show how they can be used together to do reproducible data analysis that can then be shared with others.

Nonlinear regression using R: Generalized Linear Models, Generalized Additive Models, Basis function regression, Gaussian processes, and other methods

May 25 to May 29, 2020

Online course

This course provides a general introduction to nonlinear regression analysis, covering major topics including, but not limited to, general and generalized linear models, generalized additive models, spline and radial basis function regression, and Gaussian process regression. We approach the general topic of nonlinear regression by showing how the powerful and flexible statistical modelling framework of general and generalized linear models, and their multilevel counterparts, can be extended to handle nonlinear relationships between predictor and outcome variables.

Python for data science, machine learning, and scientific computing

May 4 to May 8, 2020

Online course

This five day course provides both a general introduction to programming with Python and a comprehensive introduction to using Python for data science, machine learning, and scientific computing. The major topics that we will cover include the following: the fundamentals of general purpose programming in Python; using Jupyter notebooks as a reproducible interactive Python programming environment; numerical computing using numpy; data processing and manipulations using pandas; data visualization using matplotlib, seaborn, ggplot, bokeh, altair, etc; symbolic mathematics using sympy; data science and machine learning using scikit-learn, PyTorch; high performance computing with Numba, IPyParallel, Dask. Overall, this course aims to provide a solid introduction to Python generally as a programming language, and to its principal tools for doing data science, machine learning, and scientific computing.

Introduction to R and RStudio

January 8, 2020

Stratford upon Avon, England

R is a very powerful free and open source software package for doing data analysis. It is increasingly widely used in psychology, both for research and for statistics teaching. In this workshop, you will be provided with a friendly introduction to R. It is intended to provide people who are new to R with all the basics and fundamentals that they need to get up and running with R so that they can use it on a regular basis. No prior experience with R is necessary. All we will assume is a familiarity with statistics typical of someone with an undergraduate, or higher, degree in psychology or a related discipline. You will be required to bring your own laptop with R, RStudio, and some R packages installed, and instructions on how to install these (free) software packages will be provided in advance of the workshop.

Python for data science, machine learning, and scientific computing

September 23 to September 27, 2019

Glasgow, Scotland

This five day long workshop provides both a general introduction to programming with Python and a comprehensive introduction to using Python for data science, machine learning, and scientific computing. The major topics that we will cover include the following: the fundamentals of general purpose programming in Python; using Jupyter notebooks as a reproducible interactive Python programming environment; numerical computing using numpy; data processing and manipulations using pandas; data visualization using matplotlib, seaborn, ggplot, bokeh, altair, etc; symbolic mathematics using sympy; data science and machine learning using scikit-learn, keras, and tensorflow; Bayesian modelling using PyMC3 and PyStan; high performance computing with Cython, Numba, IPyParallel, Dask. Overall, this course aims to provide a solid introduction to Python generally as a programming language, and to its principal tools for doing data science, machine learning, and scientific computing.

Structural Equation Models, Path Analysis, Causal Modelling and Latent Variable Models Using R

September 16 to September 20, 2019

Glasgow, Scotland

This five day workshop provides a comprehensive introduction to a set of inter-related topics of widespread applicability: structural equation modelling, path analysis, causal modelling, mediation analysis, latent variable modelling, Bayesian networks, graphical models, and other related topics. We begin with the topic of path analysis. At its simplest, path analysis can be seen as an extension of standard (e.g. linear) regression analysis to cases where there are more complex structural relationships between the predictor and outcome variables. More generally, and more usefully, we can view path analysis as specifying and modelling causal relationships between observed variables. In order to fully appreciate path analyses, and especially their role as causal models, we will introduce the concept of directed acyclic graphical models, also known as Bayesian networks, which are a powerful mathematical and conceptual tool for reasoning about causal relationships. We then turn to structural equation modelling, which can be seen as an extension of path analysis, particularly due to the inclusion of unobserved or latent variables.

Nonlinear Regression and Generalized Additive Models

September 9 to September 13, 2019

Glasgow, Scotland

This five day long workshop provides a general introduction to nonlinear regression analysis, covering major topics including, but not limited to, general and generalized linear models, generalized additive models, spline and radial basis function regression, and Gaussian process regression. We approach the general topic of nonlinear regression by showing how the powerful and flexible statistical modelling framework of general and generalized linear models, and their multilevel counterparts, can be extended to handle nonlinear relationships between predictor and outcome variables, particularly by using basis functions. We’ll begin our coverage of basis function regression with the major topic of spline regression, and then proceed to cover radial basis functions and the multilayer perceptron, both of which are types of artificial neural networks. We then move on to the major topic of generalized additive models (GAMs) and generalized additive mixed models (GAMMs), which can be viewed as the generalization of all the basis function regression topics, but cover a wider range of topics including nonlinear spatial and temporal models and interaction models. Finally, we will cover the powerful Bayesian nonlinear regression method of Gaussian process regression.

Introduction to Bayesian Multilevel Regression using R and brms

September 3, 2019

Stoke on Trent, UK

Multilevel regression analysis is now a widely used statistical technique in psychology, being used for repeated measures analyses or for when data are hierarchically nested or cross classified into different groups. Popular R based packages such as lme4 have done much to facilitate the use of multilevel regression. However, these packages are limited in terms of the types of regression analyses that they can accomplish. Bayesian modelling is a powerful and flexible approach to statistical modelling generally, including multilevel regression modelling. The brms R package, which uses the Stan probabilistic programming language, is a remarkably powerful yet easy to learn and easy to use tool for Bayesian regression modelling, including general, generalized, and nonlinear multilevel modelling. This workshop provides an introduction to multilevel modelling, describes the Bayesian approach to the topic, and provides hands-on advice on how to use brms using a range of interesting real world examples. This workshop is aimed at anyone interested in multilevel regression modelling in general, and Bayesian multilevel regression in particular.

Data Analysis Using R

July 15 to July 16, 2019

Hatfield, UK

In this two day workshop, we will provide a general introduction to R and how to do statistical data analysis using R. On the first day, we will provide an introduction to R fundamentals (RStudio overview, R commands and operations, assignment, data structures, functions, scripts, packages, reading in data files, viewing and summarizing data). This will be followed by data wrangling (using dplyr, tidyr, and pipelines), and data visualization using ggplot2. On the second day, we begin by covering introductory statistical tests (t-tests, correlation, Chi-square), and then proceed to linear models (simple linear regression, multiple linear regression, general linear models, oneway anova, factorial anova). We will then cover generalized linear models (logistic regression, Poisson regression, negative binomial regression), and general and generalized linear mixed effects (multilevel) modelling.

Doing open, transparent, and reproducible research with RMarkdown and Git

July 8, 2019

Rotterdam, Netherlands

One important contributing factor to the replication crisis in psychology is that the raw data and the analysis pipeline that lead to the final reported results are usually completely opaque on the basis of the published manuscript alone. Recent software tools, such as RMarkdown, allow data, analysis code, and the text of the manuscript to be coupled from the earliest stages of data-analysis, and all of the results and figures reported in the manuscript are automatically generated by the analysis code and data. This workshop will provide a comprehensive introduction to using RMarkdown. It will also provide an introduction to Git and Github and describe how these tools can be used to share RMarkdown documents with collaborators and the public.

Introduction to R and RStudio

April 23, 2019

London, UK

In this one day workshop, we will provide a friendly but comprehensive introduction to R. It is intended to provide people who are new to R with all the basics and fundamentals that they need to get up and running with R so that they can use it on a regular basis. We will provide an introduction to R fundamentals such as an overview of RStudio, R commands and operations, assignment, data structures, functions, scripts, packages, reading in data files, viewing and summarizing data, and data visualization using ggplot2. We will also cover how to do statistical data analysis in R, specifically covering linear regression and Anova models.

Bayesian Multilevel Modelling

March 25 to March 29, 2019

Cologne, Germany

This five day workshop provides a general introduction to Bayesian multilevel modelling. Throughout, we will extensively use R and the Bayesian probabilistic programming language Stan and its R based brms interface. The course begins by providing a solid introduction to all the fundamental principles and concepts of Bayesian data analysis including the likelihood function, prior distributions, posterior distributions, high posterior density intervals, posterior predictive distributions, marginal likelihoods, Bayes factors, etc. We will do this using some simple models that are easy to understand and easy to work with. We then turn to providing a solid introduction to multilevel models, again beginning with conceptually and computationally simple multilevel models. We then proceed to more practically useful Bayesian multilevel models, and ones that necessarily require Monte Carlo methods, particularly multilevel general and generalized linear regression models. For these models and analyses, we will make extensive use of the R based brms interface to Stan, which allows general Bayesian multilevel regression to be done remarkably easily. In the final part of the course, we will cover multilevel models that are not regression models per se. In particular, we will explore multilevel probabilistic mixture models, especially Latent Dirichlet Allocation and the Hierarchical Dirichlet Process model, which have been shown to be particularly valuable in the modelling of text data.

Introduction to Bayesian data analysis for social and behavioural sciences using R and Stan

December 3 to December 7, 2018

Glasgow, Scotland

In this five day long workshop, we provide a general introduction to Bayesian data analysis using R and the Bayesian probabilistic programming language Stan. We begin with a gentle introduction to all the fundamental principles and concepts of Bayesian data analysis, and then proceed to more practically useful Bayesian analyses using general linear models, generalized linear models, and their multilevel counterparts. For these analyses, we will use real world data sets, and carry out the analysis with Stan using the brms interface to Stan in R. We will explore general concepts such as model checking and improvement using posterior predictive checks, and model evaluation using cross-validation, WAIC, and Bayes factors. In the final part of the course, we will delve into some more advanced topics including understanding Markov Chain Monte Carlo in depth, general probabilistic programming with Stan, Gaussian process regression, and probabilistic mixture models.

Bayesian Data Analysis

September 25 to September 26, 2018

Munich, Germany

A two day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. We will start by providing a brief historical overview of statistical inference and introduce Bayes’s theorem. The fundamental concepts of Bayesian statistical inference will follow, contrasted with frequentist methods of inference. On the second day, we provide a solid theoretical and practical foundation for real-world Bayesian data analysis in psychology and social sciences. We will begin by focusing on some simple, but nontrivial, statistical inference problems. These problems are simple enough to handle with just a few calculations using R, but can nonetheless illustrate clearly all the major concepts of Bayesian inference including prior distributions, the likelihood function, posterior distributions, high posterior density intervals, posterior predictive distributions, marginal likelihood, Bayesian model comparison, and so on. We will then turn to looking at regression models, beginning with the linear/normal models, and then turn to generalized linear models, including logistic regression and Poisson regression, as well as their multilevel counterparts. We will explore these models using Stan, a probabilistic programming language, and brms, an R package that allows us to use Stan easily and efficiently.

Introduction to R

September 19, 2018

Nottingham, UK

In this one day workshop, we will provide a friendly but comprehensive introduction to R. It is intended to provide people who are new to R with all the basics and fundamentals that they need to get up and running with R so that they can use it on a regular basis. We will provide an introduction to R fundamentals such as an overview of RStudio, R commands and operations, assignment, data structures, functions, scripts, packages, reading in data files, viewing and summarizing data, and data visualization using ggplot2. We will also cover how to do statistical data analysis in R, particularly linear regression and Anova models.

Bayesian Data Analysis

August 28, 2018

Liverpool, UK

This workshop provides a practical introduction to Bayesian data analysis. We will start by describing the fundamental concepts of Bayesian statistical inference, and contrast these with frequentist methods of inference. Then, using a simple case study, we will explore all the major concepts and issues in Bayesian statistics including priors, likelihood functions, posterior inference, posterior predictive inference, Bayes factors, out-of-sample generalization, and so on. We will then turn to doing Bayesian general and generalized linear regression models and their multilevel counterparts. It is hoped that this workshop will provide a useful introduction to the theory and practice of Bayesian data analysis, and provide an introductory guide on how to use Bayesian methods in psychological research.

Bayesian Parameter Estimation

June 7, 2018

Lancaster, UK

This workshop provides a practical introduction to parameter estimation in Bayesian data analysis. We will start by describing the fundamental concepts of Bayesian statistical inference, and contrast these with frequentist methods of inference. Then, using a simple case study, we will explore all the major concepts and issues in Bayesian statistics including priors, likelihood functions, posterior inference, posterior predictive inference, Bayes factors, out-of-sample generalization, and so on. We will then turn to parameter estimation in Bayesian linear regression models. It is hoped that this workshop will provide a useful introduction to the theory and practice of parameter estimation in Bayesian data analysis, and provide an introductory guide on how to use Bayesian methods in psychological research.

Prior Exposure Bayesian Data Analysis Workshops: September, 2017

September 27 to September 28, 2017

Nottingham, UK

A two day workshop on some relatively advanced topics in Bayesian data analysis. On the first day, we cover Bayesian approaches to multilevel models including multilevel linear models and multilevel generalized linear models. For this, we will use the MCMCglmm R package and the Jags probabilistic programming language. On the second day, we cover Bayesian approaches to probabilistic mixture models and nonlinear regression. We also provide a more thorough introduction to the theoretical basis of Markov Chain Monte Carlo (MCMC) samplers. On the second day, we will primarily use Jags.

Bayesian Methods for Language Sciences

June 29 to June 30, 2017

Toulouse, France

A 5 day lecture course on Bayesian methods for language sciences taught at the 29th European Summer School in Logic, Language, and Information, University of Toulouse (France), 17-28 July, 2017.

Prior Exposure Bayesian Data Analysis Workshops: June, 2017

June 28 to June 30, 2017

Nottingham, UK

A three day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to R and how to use R for data processing and data analysis. On the second day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. On the third day, we delve more deeply into the theoretical and practical details of Bayesian inference, particularly focusing on Bayesian linear models.

Prior Exposure Bayesian Data Analysis Workshops: April, 2017

April 5 to April 7, 2017

Nottingham, UK

A three day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to R and how to use R for data processing and data analysis. On the second day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. On the third day, we delve more deeply into the theoretical and practical details of Bayesian inference, particularly focusing on Bayesian linear models.

Introduction to Bayesian Data Analysis using R and Jags

December 12 to December 13, 2016

Cork, Ireland

A two day workshop introducing Bayesian data analysis using R. On the first day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. We will also provide a general introduction to R and RStudio. On the second day, we will cover Bayesian linear and generalized linear regression using the probabilistic programming language Jags.

Prior Exposure Bayesian Data Analysis Workshops: September, 2016

September 14 to September 16, 2016

Nottingham, UK

A three day workshop on some relatively advanced topics in Bayesian data analysis. On the first day, we provide a general introduction to linear mixed effects modelling using R. This workshop will not focus on Bayesian methods per se, but provides a background for the content that follows on the second and third day. On the second day, we cover Bayesian approaches to multilevel models including multilevel linear models and multilevel generalized linear models. For this, we will use the MCMCglmm R package and the Jags probabilistic programming language. On the third day, we cover Bayesian approaches to probabilistic mixture models and nonlinear regression. We also provide a more thorough introduction to the theoretical basis of Markov Chain Monte Carlo (MCMC) samplers. On the third day, we will primarily use Jags.

Prior Exposure Bayesian Data Analysis Workshops: June, 2016

June 15 to June 17, 2016

Nottingham, UK

A three day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to R and how to use R for data processing and data analysis. On the second day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. On the third day, we delve more deeply into the theoretical and practical details of Bayesian inference, particularly focusing on Bayesian linear models.

Prior Exposure Bayesian Data Analysis Workshops: April, 2016

March 30 to April 1, 2016

Nottingham, UK

A three day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to R and how to use R for data processing and data analysis. On the second day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. On the third day, we delve more deeply into the theoretical and practical details of Bayesian inference, particularly focusing on Bayesian linear models.

Prior Exposure Bayesian Data Analysis Workshops: September, 2015

September 22 to September 23, 2015

Nottingham, UK

A two day workshop on some relatively advanced topics in Bayesian data analysis. On the first day, we cover Bayesian approaches to multilevel models including multilevel linear models and multilevel generalized linear models. For this, we will use the MCMCglmm R package and the Jags probabilistic programming language. On the second day, we cover Bayesian approaches to probabilistic mixture models and nonlinear regression. We also provide a more thorough introduction to the theoretical basis of Markov Chain Monte Carlo (MCMC) samplers. On the second day, we will primarily use Jags.

Prior Exposure Bayesian Data Analysis Workshops: April, 2015

March 31 to April 1, 2015

Nottingham, UK

A two day workshop introducing Bayesian data analysis. On the first day, we provide a general introduction to Bayesian data analysis and how it differs from the more familiar classical approaches to data analysis. On the second day, we delve more deeply into the theoretical and practical details of Bayesian inference, particularly focusing on Bayesian linear models.

An introduction to the Python programming language for beginners

September 2, 2014

Nottingham, UK

The objective of this workshop is to introduce and motivate the use of the Python programming language in research in cognitive psychology. Within the last ten years, the development of scientific and numerical libraries in Python has grown to the point where Python can now be used as a scientific and numerical computing environment comparable to products like Matlab, R and Mathematica. As yet, however, it appears that knowledge of the potential applications of Python to research in cognitive psychology is still rather limited. The aim of this tutorial, therefore, is to describe these areas of application and to advocate the advantages and appeals of using Python as the principal programming language in cognitive psychology research.

Python for Cognitive Science

August 1, 2012

Sapporo, Japan

The objective of this workshop is to introduce the Python programming language for use in cognitive science research. Within the last 10 years, the development of scientific and numerical libraries in Python has grown to the point where Python can now be used as a scientific and numerical computing environment comparable to products like Matlab. The aim of this tutorial is to describe these areas of application and to advocate the advantages and appeals of using Python as the principal programming language in cognitive science research.