Course Objectives:
While traditional areas of computer science remain highly important, future researchers will increasingly use computers to understand and extract usable information from the massive data sets that arise in applications. The main objective of this course is to introduce students to the theoretical and mathematical foundations of data science. The course is rigorous and explores the rich mathematics behind some of the popular techniques and ideas of modern-day data science and machine learning.
Course Content
High-dimensional space: Law of large numbers, the geometry of high dimensions (6 lectures)
Best-fit subspaces and SVD: Introduction, singular vectors, singular value decomposition (SVD), best rank-k approximations, left singular vectors, eigenvectors, applications of SVD (9 lectures)
Random walks and Markov chains: Introduction, stationary distribution, Markov chain Monte Carlo, areas and volumes, convergence of random walks on undirected graphs, random walks on undirected graphs with unit edge weights, random walks in Euclidean space (13 lectures)
Machine learning: Introduction, the perceptron algorithm, kernel functions, generalizing to new data, overfitting and uniform convergence, Occam’s razor, regularization, online learning, support-vector machines, VC-dimension, boosting, stochastic gradient descent, deep learning (14 lectures)
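To give a flavour of the "best rank-k approximations" topic in the SVD unit above, here is a minimal sketch in Python using NumPy. It is an illustration only, not prescribed course material: the helper name `best_rank_k`, the random 6×4 test matrix, and the choice k = 2 are all assumptions made for this example. It truncates the SVD to the top k singular triples and checks numerically that the Frobenius error equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np


def best_rank_k(A, k):
    """Return the rank-k truncation of the SVD of A (hypothetical helper)."""
    # Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the top-k singular triples.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Illustrative data: a random 6 x 4 matrix (an assumption for this sketch).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
k = 2

A_k = best_rank_k(A, k)

# Frobenius error of the truncation versus the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
err = np.linalg.norm(A - A_k, "fro")
tail = np.sqrt(np.sum(s[k:] ** 2))

print("rank of A_k:", np.linalg.matrix_rank(A_k))
print("Frobenius error:", err)
print("sqrt of sum of squared discarded singular values:", tail)
```

The two printed error values should agree up to floating-point rounding, which is the sense in which the truncated SVD is the best rank-k approximation in Frobenius norm.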
Learning Outcomes:
Upon successful completion of this course, the student will:
- have an understanding of basic mathematical concepts in data science, relating to linear algebra, probability, and calculus.
- be able to employ methods related to these concepts in a variety of data science applications.
- be able to adopt a rigorous and mathematical approach to solving problems in machine learning and data science.
- be able to apply the mathematical concepts discussed over the duration of the course.
Text Books:
- Avrim Blum, John Hopcroft and Ravindran Kannan, Foundations of Data Science, Cambridge University Press, February 29, 2020, ISBN-13: 978-1108485067
References:
Will be prescribed by the instructor on a topic-by-topic basis.
Past Offerings
- Offered in Jan-May, 2024 by Deepak Rajendraprasad
- Offered in Jan-May, 2023 by Deepak Rajendraprasad
- Offered in Jan-May, 2022 by Deepak Rajendraprasad
- Offered in Jan-May, 2021 by Deepak
Course Metadata
Item | Details |
---|---|
Course Title | Foundations of Data Science and Machine Learning |
Course Code | CS5014 |
Course Credits | 3-0-0-3 |
Course Category | PMT |
Proposing Faculty | Albert Sunny |
Approved on | Senate 11 of IIT Palakkad |
Course prerequisites | Probability & Linear Algebra |
Course status | NEW |