Additional course websites: UIUC Canvas
Ge Liu
Instructor, Assistant Prof. @ UIUC CS
geliu[AT]illinois[dot]edu
Stefan Ivanovic
TA
stefan4@illinois.edu
Course description
In this graduate-level course, we will introduce foundational and state-of-the-art machine learning approaches to key problems in the life sciences and biology, with an emphasis on deep learning (DL) for molecular engineering. The course is structured into four main modules: functional genomics, protein language models, geometric DL, and protein structure models. We will cover a wide range of DL architectures (CNN, Transformer, GNN, etc.), generative models (e.g., language models and diffusion), and machine learning techniques (supervised, unsupervised, optimization), and discuss their applications to modeling, understanding, and designing the sequence, structure, and function of genomes and proteins. Students will learn both theoretical concepts and their applications to biological problems, and gain practical experience by implementing solutions to problem sets, reading papers, and conducting a final course project.
Logistics
- Lecture: WF 11:00 - 12:15 in 1302 Siebel Center for Comp Sci
- Office hours: Thursday 3pm-4pm in 3212 Siebel Center for Comp Sci
- TA office hours: F 10:00-10:50 in Siebel Center floor 0 lobby
Grading
Grading will be based on four problem sets containing programming or written questions (40%), a course project (35%), reading assignments (20%), and participation (5%). The class moves quickly, so attending lecture is important. You can use three late days for problem set deadlines; otherwise, email the course staff.
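As a rough illustration of how the four weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and already averaged within its category; that scoring scheme, the names `WEIGHTS` and `final_score`, and the sample scores are hypothetical and not official course policy.

```python
# Category weights taken from the grading breakdown (they sum to 100%).
WEIGHTS = {
    "problem_sets": 0.40,
    "project": 0.35,
    "reading": 0.20,
    "participation": 0.05,
}


def final_score(scores):
    """Return the weighted total for per-category scores on a 0-100 scale.

    Assumption: each category score is already averaged within the category.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical example scores, for illustration only.
example = {"problem_sets": 90, "project": 80, "reading": 100, "participation": 100}
print(round(final_score(example), 2))
```

The project and the problem sets together account for 75% of the total, so those two components dominate the final score under this sketch.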
Syllabus and schedule
Date | Lecture | Title | Assignment
---|---|---|---
8/28/2024 | Lecture 1 | Logistics and intro |
8/30/2024 | Lecture 2 | ML foundations |
9/4/2024 | Lecture 3 | ML foundations |
Module I: Functional Genomics | | |
9/6/2024 | Lecture 4 | DL for functional genomics I |
9/11/2024 | Lecture 5 | CNN/ResNet |
9/13/2024 | Lecture 6 | RNN, LSTM, State Space Models (S4, Mamba, Hyena) | Reading assignment 1
9/18/2024 | Lecture 7 | Model uncertainty and interpretability | Problem set 1 out
9/20/2024 | Lecture 8 | DL for functional genomics II |
Module II: Protein Language Models | | |
9/25/2024 | Lecture 9 | Protein sequence modeling I |
9/27/2024 | Lecture 10 | Transformer and MLM I |
10/2/2024 | Lecture 11 | Transformer and MLM II | Reading assignment 2
10/4/2024 | Lecture 12 | Protein sequence modeling II | Problem set 1 due, Finalize project team-up
10/9/2024 | Lecture 13 | Protein function prediction (sequence based) | Problem set 2 out
10/11/2024 | Lecture 14 | Protein function optimization (sequence based) I |
10/16/2024 | Lecture 15 | Protein function optimization (sequence based) II |
Module III: Intro to Geometric DL | | |
10/18/2024 | Lecture 16 | GNN |
10/23/2024 | Lecture 17 | Intro to Geometric DL for molecular representation I | Problem set 2 due, Reading assignment 3
10/25/2024 | Lecture 18 | Intro to Geometric DL for molecular representation II | Problem set 3 out
Module IV: Protein Structure Models | | |
10/30/2024 | Lecture 19 | Protein structure prediction I |
11/1/2024 | Lecture 20 | Protein structure prediction II | Project proposal due
11/6/2024 | Lecture 21 | Folding and inverse folding | Reading assignment 4
11/8/2024 | Lecture 22 | Protein interaction, function prediction |
11/13/2024 | Lecture 23 | Generative model primers: VAE, Diffusion, Flow Matching I | Problem set 3 due
11/15/2024 | Lecture 24 | Generative model primers: VAE, Diffusion, Flow Matching II | Problem set 4, Reading assignment 5
11/20/2024 | Lecture 25 | Generative models for protein design I |
11/22/2024 | Lecture 26 | Generative models for protein design II |
Final projects | | |
11/27/2024 | No lecture | Fall break |
11/29/2024 | No lecture | Fall break |
12/6/2024 | Lecture 27 | Project Presentations I |
12/8/2024 | Lecture 28 | Project Presentations II |
Prerequisites
Linear algebra, Python programming, and an introduction to ML.
Project
This course has a substantial project component. We recommend (but do not require) working on projects in teams of 2-3 students. You are free to choose any problem related to the course lectures and develop a deep learning solution to it.