Scientific Data Analysis, Data Science, and Machine Learning in Python
An independent study with two parts:
Part 1: Consolidate and advance our understanding of mainstream and cutting-edge scientific data analysis techniques using Chapters 1-8 of Pasha, Astronomical Python
Part 2: Survey data science techniques up to and including neural net and deep learning implementations (which are the relevant preparation for a subsequent study of natural-language processing and LLMs) using Chapters 1-11, 13-15, and 18-19 of Grus, Data Science from Scratch, 2nd Ed.
Term 6 of Academic Year 2024-2025, Deep Springs College
Mentor: Prof. Brian Hill
Student: Hexi Jin (DS 23)
Syllabus
- PDF of Syllabus (the same content as these web pages, in PDF form)
Materials
Required
- Imad Pasha, Astronomical Python
- Joel Grus, Data Science from Scratch, 2nd Edition
Optional
- Both Pasha and Grus include adequate introductions to Python features as they use them, but you may want a more systematic introduction to use as a reference. An excellent one is David Beazley, Python Distilled. It is a distillation and update of his time-tested Python: Essential Reference, which was growing overly long as the Python language feature set kept expanding.
- We will be using Git to keep and share all of our code and notes, and version control at this level of sophistication is de rigueur for working in a software team. Consider supplementing the understanding you get from the workflows we are using by reading Travis Swicegood’s Pragmatic Guide to Git.
Actual Daily Schedules (Kept Retrospectively)
Looking Beyond
Notes (mostly code samples)