Python is one of the most widely used and highly valued programming languages in the world, particularly in data science, machine learning, and other scientific computing applications. Python can also complement the toolbox available in the R statistical software. This course provides both a general introduction to programming with Python and a comprehensive introduction to using Python for data science. The major topics that we will cover are: the fundamentals of general-purpose programming in Python; numerical computing using numpy; data processing and manipulation using pandas; and parallel processing. Finally, the course will address the integration of Python into R (i.e. how to use Python from R). Overall, this course aims to provide a solid introduction to Python as a general programming language and to its principal tools for data science, machine learning, and scientific computing. (Note that this course will focus exclusively on Python 3, given that Python 2 has now reached its end of life.)

Target audience

This workshop is intended for anyone interested in using Python for data science who has little or no previous knowledge of the language.

Prerequisites

For the last topic in the outline below (integrating R and Python), prior experience with R and RStudio is necessary. Attendees should be familiar with R syntax and commands, and know how to use RStudio.

Outline

Day 1: Introduction to Python (4h)

  • Installing and setting up Python
  • Data structures
  • Programming in Python

Day 2: Programming and data processing with Python (2h)

  • Numerical programming with numpy
  • Data processing with pandas

Day 3: Parallel processing and integrating Python with R (4h)

  • Parallel processing
  • Integrating R and Python

Software

A JupyterLab session that can be used for this course can be launched through Binder. You can also use Python Jupyter notebooks and the pip package installer immediately using Google’s Colaboratory. If you are new to Python, either of these approaches is highly recommended: you will be able to start learning Python immediately, without installing or configuring any software. This entire course can be done using either approach.

Alternatively, for those wishing to install Python on their own devices (Windows or Mac), instructions are available here.

GitHub resources

Further resources for this training course can be found on GitHub at mark-andrews/intropy4ds.