MDS Summer Preparation
Here you will find a few resources you can explore before you join us here in Durham. There is no expectation that you read or complete any of these activities. Our introductory courses during term 1 (Michaelmas) will set the foundations you need to learn more advanced data science methods and techniques in term 2 (Epiphany).
Introduction to Mathematics and Statistics for Data Science
- Linear Algebra: This is one of the pillars of data science and statistics. Many machine learning problems can be approximated by, or reduced to, simpler linear problems, which gives you a good starting point. Visual and spatial intuition can help you understand what the mathematics does formally. Some great resources to develop this intuition can be found in The Essence of Linear Algebra by Grant Sanderson (3Blue1Brown YouTube channel).
- Calculus: The Essence of Calculus from 3Blue1Brown has great visual demonstrations of calculus concepts. Calculus is essential for understanding optimization procedures and approximations.
- Probability and Statistics: The Art of Statistics: Learning from Data by David Spiegelhalter (Pelican) is a lovely book to get you started with the basics. There are also great tutorials and classes you can watch on Khan Academy, listed in the item below.
- Khan Academy: Khan Academy is a great resource for anyone learning Maths, Probability and Stats, and many other skills. You can use it to learn new concepts or just to practice.
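If you are curious how calculus connects to the optimization procedures mentioned above, here is a minimal sketch in Python of gradient descent on a simple one-variable function. The function, the starting point, the learning rate and the number of steps are all illustrative assumptions chosen for this example. They are not taken from any of our modules.

```python
# Illustrative example: minimise f(x) = (x - 3)^2 using its derivative.
# The function and all parameter values are assumptions for demonstration.

def f(x):
    """A simple bowl-shaped function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def df(x):
    """Derivative of f, obtained with basic calculus: f'(x) = 2(x - 3)."""
    return 2.0 * (x - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope to move towards the minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * df(x)
    return x


x_min = gradient_descent(start=0.0)
print(f"Estimated minimum at x = {x_min:.4f}, f(x) = {f(x_min):.6f}")
```

The derivative tells the procedure which direction is downhill, so each step moves the estimate closer to x = 3. You do not need to understand this before you arrive. It is only meant to show why the calculus resources are useful.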
Our modules are taught primarily in Python and R. You will likely see other languages and tools throughout the year as part of the more specialized modules.
- Python: There are some great resources available from the Python for Non-Programmers page. If you want to install Python on your own laptop, please install Anaconda. You will also use the clusters we have here in Durham to learn how to code.
- R: This is one of the most widely used statistical programming languages. Combined with Python, it will give you a very powerful data science toolkit to get started. We recommend you install R and RStudio (a development environment for R). Links and instructions on how to install both can be found on the RStudio website.
We recommend the following book after you have built some basic knowledge in Python:
Karsdorp, F., Kestemont, M. and Riddell, A., Humanities Data Analysis: Case Studies with Python, Princeton University Press (2021). Open access e-book: https://www.humanitiesdataanalysis.org/
Here are a few interesting items you can check out before joining us:
- Hans Rosling TED Talk on “The good news of the decade? We’re winning the war against child mortality”.
- A case study on hospital process change developed with an NHS Trust: Unblocking a hospital in gridlock