How to self-study Computer Science and Data Science
The Lord of self-studying: The Fellowship of studying
Today, a friend asked me how to study computer science well, a question many other people have too. So I decided to write an all-in-one note explaining in detail how to learn computer science and data science. I’m by no means an expert in computer science; this note only shares some of my own experience.
Before you start
Knowing where you are is important: you need to understand how much CS knowledge you already have. This journey is long and exhausting, but I can assure you it’ll be fun as well if you really like this stuff.
Prepare some things:
- A notebook and a pen to note down and sketch important things while you learn, such as how HTTP requests are sent and received, how CPU and RAM work, and so on. Visualization always helps you understand things faster and more easily.
Here are some examples of things I drew when learning about CPU and RAM:


- A laptop/computer with a Linux/OSX operating system. Yup, you can use Windows if you like, but I prefer Linux for learning programming.
- A lot of time
- And last but not least, a superior, unbreakable, persistent mind
There are 3 parts of this note:
- Learning Maths
- Learning Computer Science
- Learning Data Science
Maths is the foundation that supports everything else, so I put it in a separate part.
All 3 parts only contain the materials that I personally like and find most useful.
Courses in bold are must-takes, or you should at least be familiar with their outcomes (so you don’t have to retake them). The other courses are useful and good to learn.
Learning Path
For those who want to apply programming in non-CS areas (like business, finance, health care, …), this is the learning path that I think will help you learn effectively:
- Read 2 starter books
- Do maths courses Math_01 → Math_02 → Math_03 or make sure you understand all the concepts in each course
- Do core courses in Computer Science: 1 → 2 → 3 → 8 (for course 8, you only need Lectures 1 to 8)
- Learn Python: Py_2 → Py_5 (do all projects)
- Learn DS and Algorithm: Al_01 → Al_02 → Al_05
- Do real projects
Recommended Certifications
If you want to earn some certifications, the courses below are recommended:
Algorithms: Stanford Course on Coursera - Intermediate Level. You should learn basic algorithms before taking this one (Al_01 is a good start).
C Programming: Dartmouth Course on edX - Intermediate Level. Read the book C_1 before taking this course. Self-paced, $308 fee.
C++ Programming: Series of 3 courses from Microsoft
Python: Python for Data Science - University of Michigan series
Mathematics
The courses below should be taken in order.
| No. | Course | What you are expected to learn |
|---|---|---|
| Math_01 | Single Variable Calculus | Limit, Derivatives, Chain Rules, Approximation, Mean Value Theorem, Riemann Sums, The Fundamental Theorem of Calculus, Integrals and Application, L’Hospital’s Rule, Infinite Series, Taylor’s Series |
| Math_02 | Multivariable Calculus | Vectors and operators on vectors, Determinants and Matrices (Basics), Partial Derivatives, Second Derivative Test, Total Differentials and Chain Rule, Gradient, Lagrange Multipliers, Double Integrals, Vector Fields, Curl, Green’s Theorem, Triple Integrals, Flux, Stokes’ Theorem |
| Math_03 | Math 54 Linear Algebra and Differential Equations | Vector Spaces and Linear transformations, subspaces, kernels, range, eigenvalues and eigenvectors, The Gram-Schmidt Process, Least-Squares Problems, Linear Systems of Differential Equations, Fourier Series |
| Math_04 | Math 55 - Discrete Mathematics | Propositional logic and equivalences, Sets and set operations, Functions, sequences, cardinality, Solving congruences, Mathematical induction, Recursion, Counting, Permutations and combinations, Graphs and graph models, Graph isomorphism |
Computer Science
Before studying courses
You need to read a few books to understand the big picture of computer science. Read these books in order:
2. Structure and Interpretation of Computer Programs (SICP)
A legendary book. SICP is the kind of book that gets recommended everywhere in the CS world. It covers many aspects of programming, from functional programming and abstraction to instruction processors and compilers. It can be quite hard for complete beginners, and because it uses the Scheme programming language, it’s pretty hard to read as well.
Core Courses
Even after you complete the CS 61 series, there are many things this path does not cover. If you want to explore other courses on security, networks, or machine learning, look for them on the Internet or search in this awesome guide.
1. CS50’s Introduction to Computer Science - Harvard
Link: edX Course
Arguably the best intro CS course I’ve ever taken.
What you can learn:
- Basic understanding of computer science and programming
- Basic algorithms and problem-solving thinking
- Concepts like abstraction, data structures, encapsulation, web development
- Languages: Python, C, SQL, JavaScript
2. CS 61A Structure and Interpretation of Computer Programs - Berkeley
Link: Course Website
A truly great course that made me extremely eager to learn everything from Berkeley. However, to really enjoy it and get useful knowledge out of it, you need to put in quite a large amount of effort: commit to doing all the homework and assignments, and take all the exams. It might take you a few months to actually get everything done.
What you can learn:
- Python and its implementation
- Higher-Order Functions, Recursion, Environment diagrams
- Data Abstraction
- Scheme, Exceptions, Iterators, Streams, SQL
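As a small taste of the first two topics in this list, here is a minimal Python sketch of a higher-order function and a recursive function. The names `compose` and `factorial` are my own illustrations, not taken from the course materials.

```python
def compose(f, g):
    """Higher-order function: takes two functions and returns a new one."""
    def composed(x):
        # g runs first, then f is applied to its result
        return f(g(x))
    return composed


def factorial(n):
    """Compute n! recursively; assumes n is a non-negative integer."""
    if n == 0:  # base case stops the recursion
        return 1
    return n * factorial(n - 1)


def add_one(x):
    return x + 1


def double(x):
    return x * 2


add_one_after_double = compose(add_one, double)
print(add_one_after_double(5))  # 11
print(factorial(5))  # 120
```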
3. CS 61B Data Structures - Berkeley
Videos: InfoCoBuild
Course: datastructur.es
This one is quite hard for me because I don’t like learning Java much. But I think it’s useful if you want to learn data structures.
What you can learn:
- Basic Java
- Data structures: Lists, Abstract Classes, Trees, BST, Graphs
- Algorithms: Sorting and Algorithmic Bounds
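To give an idea of what a BST looks like in code, here is a minimal sketch. The course itself uses Java; I wrote this in Python only to keep it short, and the names `Node`, `insert`, and `in_order` are my own, not from the course.

```python
class Node:
    """One node of a binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # a duplicate key leaves the tree unchanged


def in_order(root):
    """Return the keys in ascending order (left, node, right)."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


root = None
for key in [5, 2, 8, 1, 9, 2]:
    root = insert(root, key)
print(in_order(root))  # [1, 2, 5, 8, 9]
```

Notice that an in-order traversal of a BST gives you the keys already sorted, which is a nice link between the data structures and sorting parts of the course.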
4. CS 70 Discrete Mathematics and Probability Theory - Berkeley
Videos: InfoCoBuild
Course: Course Website
I’ve not done this course yet. Discrete maths is cool and probability is interesting so I’ll finish this course soon.
What you can learn:
- Induction and Recursion
- Graphs, Eulerian Tour
- Modular Arithmetic
- Polynomials, Secret Sharing, Erasure Codes
- Probability: Counting, Sample Spaces, Events, Independence, Conditional Probability
- Inference
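Modular arithmetic, polynomials, and secret sharing fit together nicely. Below is a rough sketch of polynomial-based (Shamir-style) secret sharing, written by me and not taken from the course. Assumptions: the prime `PRIME = 257` and the fixed coefficients are toy values chosen so the example is reproducible; a real scheme would use a large prime and random coefficients. It also needs Python 3.8+ because `pow(x, -1, m)` is used for the modular inverse.

```python
PRIME = 257  # toy prime modulus (assumption); the secret must be smaller than this


def make_shares(secret, coeffs, n):
    """Evaluate f(x) = secret + c1*x + c2*x^2 + ... (mod PRIME) at x = 1..n."""
    poly = [secret] + list(coeffs)
    shares = []
    for x in range(1, n + 1):
        y = sum(c * pow(x, i, PRIME) for i, c in enumerate(poly)) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover f(0) by Lagrange interpolation over the integers mod PRIME."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse of den
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret


# A degree-2 polynomial means any 3 of the 5 shares are enough.
shares = make_shares(123, [45, 67], 5)
print(recover_secret(shares[:3]))  # 123
print(recover_secret(shares[2:]))  # 123
```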
5. CS 61C Great Ideas in Computer Architecture - Berkeley
Videos: InfoCoBuild
Course: Course Website
My favorite one, because you learn a lot about how machines work under the hood. It doesn’t dig too deep, but it’s quite good.
What you can learn:
- C programming
- Assembly Language, MIPS Intro
6. CS 162 Operating Systems and Systems Programming - Berkeley
Videos: InfoCoBuild
Course: Course Website
A very tough course that I haven’t finished yet. More details about operating systems than CS61C.
What you can learn:
- Processes, Fork, I/O, Files
- Thread, Scheduling, Address Translation, Caching
- Performance, Storage Devices, Queueing theory
- Distributed Systems, TCP/IP
7. CS164 Programming Languages and Compilers - Berkeley
Videos: InfoCoBuild
Course: Course Website
I have not taken this course, but knowing about programming languages and compilers is surely useful for a software engineer.
What you can learn:
- Unit Calculator
- Scoping and desugaring
- Coroutines and Lazy Iterators
- Regular expressions
- CYK and Earley Parsers
8. CS169 Software Engineering - Berkeley
Videos: InfoCoBuild
Course: Course Website
I have not taken this course; I just read the syllabus and think it’s practical and useful.
What you can learn:
- Unit Testing
- Git
- UML
- Software processes
- MVC, Web development
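If you have never seen a unit test, here is a minimal sketch using Python’s built-in `unittest` module. This is my own illustration, not course material: `slugify` is a hypothetical helper that I made up just to have something to test.

```python
import unittest


def slugify(title):
    """Hypothetical helper: lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Data   Science "), "data-science")


# Load and run the tests explicitly so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Each test checks one small behavior, so when something breaks you know exactly which behavior changed.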
Strengthen your knowledge
I consider the 8 courses above as core courses that you should take seriously. After that, you now have a good understanding of computer science and are ready to improve yourself even more.
The first thing I want to dig into is programming languages. I’m going to cover a few of the most popular ones: C, C++, and Python.
To practice coding, join some popular competitive coding sites like Codeforces and TopCoder.
Programming Languages Resources
| No. | Language | Resource | Link | Description | Difficulty |
|---|---|---|---|---|---|
| C_1 | C | The C Programming Language | Amazon | Most popular C book. Short and great for learning basic concepts | ★★☆☆☆ |
| C_2 | C | Practical Programming in C | MIT OCW | MIT course with good practice problems and solutions | ★★★☆☆ |
| C++_1 | C++ | Learn C++ | learncpp.com | Great and detailed tutorial about C++ | ★★☆☆☆ |
| C++_2 | C++ | Accelerated C++ | Amazon | Practices and examples | ★★★☆☆ |
| C++_3 | C++ | The C++ Programming Language | Amazon | Detailed book from author of C++ itself | ★★★☆☆ |
| Py_1 | Python | Official Docs | python.org | Where I started learning Python — detailed and up-to-date | ★★★☆☆ |
| Py_2 | Python | Learn Python the Hard Way | Website | Simple intro to Python | ★☆☆☆☆ |
| Py_3 | Python | Real Python | realpython.com | Detailed examples for practice | ★★☆☆☆ |
| Py_4 | Python | Full Stack Python | fullstackpython.com | Learn what you need to build Python apps | ★★★★☆ |
| Py_5 | Python | Automate the Boring Stuff | Website | Learn to build cool Python projects | ★★☆☆☆ |
Data Structures and Algorithms
| No. | Resource | Link | What you are expected to learn |
|---|---|---|---|
| Al_01 | Introduction to Algorithms | MIT OCW | Basic algorithms from MIT |
| Al_02 | Design and Analysis of Algorithms | MIT OCW | Part 2: divide and conquer, dynamic programming, complexity |
| Al_03 | Advanced Algorithms | MIT OCW | Part 3 with advanced algorithms |
| Al_04 | Data Structures and Algorithms | YouTube Playlist | Richard Buckland’s course on basic data structures |
| Al_05 | CS 186: Introduction to Database Systems | InfoCoBuild | Database management systems, SQL |
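Two of the techniques named for Al_02 in the table above can be sketched in a few lines of Python. These are my own minimal examples, not code from the courses: merge sort for divide and conquer, and the length of the longest common subsequence for dynamic programming.

```python
def merge_sort(items):
    """Divide and conquer: split in half, sort each half, merge the results."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])  # at most one of these two still has items
    merged.extend(right[j:])
    return merged


def lcs_length(a, b):
    """Dynamic programming: table[i][j] is the LCS length of a[:i] and b[:j]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


print(merge_sort([5, 2, 9, 1, 5]))  # [1, 2, 5, 5, 9]
print(lcs_length("AGGTAB", "GXTXAYB"))  # 4
```

Both break a problem into smaller subproblems. The difference is that dynamic programming stores the answers to overlapping subproblems in a table instead of recomputing them.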
Data Science
Probability and Statistics
1. Probability and Statistics - Stanford Online
Link: Stanford Lagunita
A solid foundation in probability and statistics from Stanford.
2. Probabilistic Systems Analysis and Applied Probability - MIT
Link: MIT OCW
This course and the Stanford one can be taken separately (you can choose one of them). But for me, I took both and studied at the same time. The MIT course is more engineering-focused while the Stanford course is more statistics-focused.