Deep Learning 8 Week Summary
This is a summary of my first 8 weeks following the 12-week deep learning path ChatGPT generated for me. The journey focused primarily on Andrew Ng's Deep Learning Specialization on Coursera, which proved to be an excellent foundational resource.
Week 1: Neural Network Fundamentals
I started with the basics of neural networks, focusing on understanding what they are and how they work at a fundamental level.
Material Covered:
- 3Blue1Brown: But what is a neural network? | Deep learning chapter 1
- Neural Networks and Deep Learning - Module 1, Weeks 1 & 2
- Started reading Deep Learning but decided to pause until completing the Coursera course
Reflection: The Coursera course is excellent and Andrew Ng is an amazing educator. Despite some work interruptions, I was happy with the progress.
Week 2: Gradient Descent and Multi-Layer Networks
This week focused on understanding the training process for multi-layer neural networks, particularly gradient descent optimization.
Material Covered:
- Interactive Gradient Descent Demo
- Master Gradient Descent through Visual Animation
- Neural Networks and Deep Learning - Module 1, Weeks 3 & 4
Reflection: Ended this week with a much better understanding of training multi-layer neural networks. While math has always been a weak point for me, Andrew's reassurance that you can grasp concepts without being a mathematical expert was encouraging.
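To make that concrete, here is a minimal sketch (my own illustration, not code from the course) of training a tiny two-layer network on XOR with plain gradient descent in numpy:

```python
import numpy as np

# Toy data: learn XOR with a two-layer network trained by gradient descent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # (4, 2)
y = np.array([[0], [1], [1], [0]], dtype=float)              # (4, 1)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0

for step in range(5000):
    # Forward pass
    A1 = np.tanh(X @ W1 + b1)
    A2 = sigmoid(A1 @ W2 + b2)

    # Backward pass (binary cross-entropy + sigmoid simplifies to A2 - y)
    dZ2 = (A2 - y) / len(X)
    dW2, db2 = A1.T @ dZ2, dZ2.sum(axis=0, keepdims=True)
    dZ1 = (dZ2 @ W2.T) * (1 - A1 ** 2)   # tanh derivative
    dW1, db1 = X.T @ dZ1, dZ1.sum(axis=0, keepdims=True)

    # Gradient descent update: step each parameter against its gradient
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```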
Week 3: Hyperparameter Tuning and Side Projects
I decided to complete all 5 courses in the Deep Learning Specialization to build foundational knowledge, and I also began planning an AI Scribe implementation project.
Material Covered:
- Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization - Week 1
- Structuring Machine Learning Projects - Started Week 1
Side Project: Set up a test server by dual-booting my gaming PC with Debian to work on an AI Scribe implementation. AI scribes are popular in primary care, making this both educationally valuable and relevant to my work evaluating the privacy and security of these products.
Week 4: Optimization Methods
Advanced optimization techniques for improving neural network training.
Material Covered:
- Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization - Week 2
Topics Covered:
- Apply optimization methods such as (Stochastic) Gradient Descent, Momentum, RMSProp and Adam (see the sketch after this list)
- Use random minibatches to accelerate convergence and improve optimization
- Describe the benefits of learning rate decay and apply it to your optimization
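The Adam update in particular clicked once I wrote it out: it combines Momentum's running average of gradients with RMSProp's per-parameter scaling. A minimal numpy sketch of one update step, as my own illustration rather than the course's code:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: Momentum's first moment plus RMSProp's second moment.
    t is the 1-indexed step count, used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad        # running average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Learning rate decay follows a simple schedule, e.g.:
# lr = lr0 / (1 + decay_rate * epoch_num)
```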
Reflection: The more I progressed in learning the basics of deep learning, the more excited I became about reading research papers and running my own experiments. There's so much here that's ripe for progress and new ideas.
Week 5: TensorFlow and Hyperparameter Tuning
After a brief pause for a convocation in Washington, DC, and recovering from an illness, I completed the second course in the specialization.
Material Covered:
- Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization - Week 3
Topics Covered:
- Master the process of hyperparameter tuning
- Describe softmax classification for multiple classes
- Apply batch normalization to make your neural network more robust
- Build a neural network in TensorFlow and train it on a TensorFlow dataset
- Describe the purpose and operation of GradientTape (see the sketch after this list)
- Use tf.Variable to modify the state of a variable
- Apply TensorFlow decorators to speed up code
- Explain the difference between a variable and a constant
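These TensorFlow pieces fit together in just a few lines. A minimal sketch of my own (not course code), using a toy scalar regression to show tf.Variable, GradientTape, and the tf.function decorator:

```python
import tensorflow as tf

w = tf.Variable(2.0)             # trainable state, unlike a tf.constant

@tf.function                     # compiles the Python function into a graph for speed
def train_step(x, y, lr=0.1):
    with tf.GradientTape() as tape:   # records operations to compute gradients
        loss = (w * x - y) ** 2
    grad = tape.gradient(loss, w)
    w.assign_sub(lr * grad)           # tf.Variable state is modified in place
    return loss

for _ in range(50):
    train_step(tf.constant(3.0), tf.constant(12.0))
# w approaches 4.0, since 4 * 3 == 12
```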
Applied Work: Set up test server with:
- OS: Debian 12
- CPU: 12th Gen Intel Core i5-12400 4.4 GHz
- GPU: GeForce RTX 3070
- Memory: 24G
- Disk: 2G SSD
Began experimenting with Whisper to transcribe recordings of patient conversations to text, then using a local LLM to convert the transcript into SOAP note format (sketched below). Also explored the new MCP Course from Hugging Face.
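A minimal sketch of the pipeline, using the openai-whisper package; the filename is a placeholder, and the LLM call is stubbed out since the local inference stack is still up in the air:

```python
import whisper

# Step 1: transcribe the recording with the openai-whisper package.
# ("visit_recording.wav" is a placeholder filename.)
model = whisper.load_model("base")
transcript = model.transcribe("visit_recording.wav")["text"]

# Step 2: ask a local LLM to restructure the transcript as a SOAP note.
SOAP_PROMPT = (
    "Rewrite the following patient-visit transcript as a SOAP note "
    "(Subjective, Objective, Assessment, Plan):\n\n" + transcript
)

def generate(prompt: str) -> str:
    """Placeholder for the local inference stack (llama.cpp, Ollama, vLLM, ...)."""
    raise NotImplementedError

# soap_note = generate(SOAP_PROMPT)
```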
Week 6: ML Strategy and Project Structure
This week focused on strategic approaches to structuring machine learning projects.
Material Covered:
- Structuring Machine Learning Projects - Week 1
Topics Covered:
- Why ML Strategy
- Orthogonalization
- Single Number Evaluation Metric
- Satisficing and Optimizing Metric (see the sketch after this list)
- Train/Dev/Test Distributions
- Size of the Dev and Test Sets
- When to Change Dev/Test Sets and Metrics
- Why Human-level Performance
- Avoidable Bias
- Understanding Human-level Performance
- Surpassing Human-level Performance
- Improving your Model Performance
- Andrej Karpathy Interview
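The satisficing/optimizing metric idea is simple enough to express in a few lines. A toy illustration of my own (the numbers are made up): pick the most accurate model among those that meet a latency constraint.

```python
# Optimizing metric: accuracy. Satisficing metric: latency under 100 ms.
models = [
    {"name": "A", "accuracy": 0.94, "latency_ms": 80},
    {"name": "B", "accuracy": 0.97, "latency_ms": 150},  # most accurate, but too slow
    {"name": "C", "accuracy": 0.92, "latency_ms": 40},
]

candidates = [m for m in models if m["latency_ms"] <= 100]  # satisfice first
best = max(candidates, key=lambda m: m["accuracy"])          # then optimize
print(best["name"])  # "A": best accuracy among models that meet the constraint
```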
Reflection: The quiz for this week was particularly interesting. Andrew compares the testing method to a "flight simulator": rather than just asking questions about the topics covered, it presents a fictional scenario and asks what actions you would take as you train an LLM for a customer. This method of testing was much more effective at connecting the underlying intuitions with real-world use.
Interesting Find: Came across this post by security researcher Sean Heelan on using o3 to find a 0-day vulnerability. o3 found a new 0-day in roughly 1 out of 100 attempts, which is likely better than human-level performance, even if you had 100 kernel experts looking at the same codebase.
Week 7: Error Analysis and Transfer Learning
Advanced project structuring topics including error analysis and transfer learning.
Material Covered:
- Structuring Machine Learning Projects - Week 2
Topics Covered:
- Carrying Out Error Analysis
- Cleaning Up Incorrectly Labeled Data
- Build your First System Quickly, then Iterate
- Training and Testing on Different Distributions
- Bias and Variance with Mismatched Data Distributions
- Addressing Data Mismatch
- Transfer Learning (see the sketch after this list)
- Multi-task Learning
- What is End-to-end Deep Learning
- Whether to use End-to-end Deep Learning
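Transfer learning is the technique here I expect to reach for most often. A minimal Keras sketch of my own (not lab code), assuming an ImageNet-pretrained MobileNetV2 as the frozen base:

```python
import tensorflow as tf

# Reuse a pretrained feature extractor, freeze it, and train a new head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False  # keep the pretrained convolutional features fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # new task-specific head
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(new_task_dataset, epochs=5)  # only the head's weights are updated
```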
Applied Work: Began working with a physician at work on converting unstructured data from clinical notes to the OMOP Common Data Model (CDM).
Used Synthea to generate synthetic patient data including unstructured clinical notes. The plan:
- Generate simple synthetic patient data including unstructured clinical notes
- Test different prompts to convert this data to OMOP CDM
- Test different models, both local and over API
- Measure accuracy of prompt/model combinations
Set up Mistral-7B-Instruct-v0.2 and started testing.
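A sketch of the evaluation harness I have in mind; the prompt strings are illustrative, the model call is stubbed out, and field-level exact match stands in for a real accuracy metric:

```python
import json

# Hypothetical harness: score each (prompt, model) pair by how many expected
# OMOP CDM fields it extracts from a synthetic note.
PROMPTS = {
    "zero_shot": "Convert this clinical note to OMOP CDM JSON:\n{note}",
    "with_schema": "Here is the OMOP CDM schema: <schema>. "
                   "Extract the fields from this clinical note as JSON:\n{note}",
}

def run_model(model_name: str, prompt: str) -> str:
    """Placeholder: call Mistral-7B-Instruct locally, or another model over API."""
    raise NotImplementedError

def field_accuracy(predicted: dict, expected: dict) -> float:
    hits = sum(predicted.get(k) == v for k, v in expected.items())
    return hits / len(expected)

def evaluate(model_name: str, cases: list[tuple[str, dict]]) -> dict[str, float]:
    results = {}
    for prompt_name, template in PROMPTS.items():
        scores = []
        for note, expected in cases:
            raw = run_model(model_name, template.format(note=note))
            try:
                scores.append(field_accuracy(json.loads(raw), expected))
            except json.JSONDecodeError:
                scores.append(0.0)  # unparseable output counts as a miss
        results[prompt_name] = sum(scores) / len(scores)
    return results
```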
Week 8: Convolutional Neural Networks
Introduction to CNNs and their application to computer vision tasks.
Material Covered:
- Convolutional Neural Networks - Week 1
Topics Covered:
- Explain the convolution operation
- Apply two different types of pooling operations
- Identify the components used in a convolutional neural network (padding, stride, filter, ...) and their purpose
- Build a convolutional neural network
- Implement convolutional and pooling layers in numpy, including forward propagation (see the sketch after this list)
- Implement helper functions to use when implementing a TensorFlow model
- Create a mood classifier using the TF Keras Sequential API
- Build a ConvNet to identify sign language digits using the TF Keras Functional API
- Build and train a ConvNet in TensorFlow for a binary classification problem
- Build and train a ConvNet in TensorFlow for a multiclass classification problem
- Explain different use cases for the Sequential and Functional APIs
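Writing the forward pass in numpy makes the convolution operation itself very clear. A minimal sketch of my own (not the lab code) of a single-channel valid convolution:

```python
import numpy as np

def conv_single_step(a_slice, W, b):
    """One step of convolution: elementwise product of the filter with an
    input slice, summed, plus a bias."""
    return np.sum(a_slice * W) + float(b)

def conv2d(A, W, b, stride=1):
    """Valid convolution of a single-channel 2D input A with square filter W."""
    f = W.shape[0]
    n_h = (A.shape[0] - f) // stride + 1
    n_w = (A.shape[1] - f) // stride + 1
    Z = np.zeros((n_h, n_w))
    for i in range(n_h):
        for j in range(n_w):
            a_slice = A[i * stride:i * stride + f, j * stride:j * stride + f]
            Z[i, j] = conv_single_step(a_slice, W, b)
    return Z

# A 3x3 vertical-edge filter on a 6x6 image yields a 4x4 output.
A = np.pad(np.ones((6, 3)), ((0, 0), (0, 3)))  # left half 1s, right half 0s
W = np.array([[1, 0, -1]] * 3, dtype=float)    # vertical edge detector
print(conv2d(A, W, b=0.0))                     # strong response at the edge
```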
Applied Work: The labs in the Coursera modules proved to be good introductions to applied techniques. This week's labs used TensorFlow to train models to recognize smiling faces (binary classification) and sign language digits (multiclass classification). They introduced the Sequential and Functional APIs and explained the benefits of the Functional API (both are sketched below).
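The distinction is easiest to see side by side. A minimal sketch of my own, not the lab code: the Sequential API stacks layers in a straight line, while the Functional API wires tensors explicitly, which allows branches, multiple inputs and outputs, and shared layers.

```python
import tensorflow as tf

# Sequential API: a plain linear stack of layers.
seq_model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary: smiling or not
])

# Functional API: layers are called on tensors, so the graph can branch.
inputs = tf.keras.Input(shape=(64, 64, 3))
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(6, activation="softmax")(x)  # 6 sign-language digits
func_model = tf.keras.Model(inputs=inputs, outputs=outputs)
```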
Key Takeaways
- Andrew Ng's Deep Learning Specialization is exceptional - The structured approach and clear explanations make complex topics accessible.
- Applied learning is critical - Setting up a dedicated test environment and working on real-world problems (AI Scribe, OMOP CDM conversion) reinforced theoretical knowledge.
- Math is helpful but not required to start - You can grasp concepts and build intuition before diving deep into mathematical intricacies.
- The field is moving rapidly - From o3 finding kernel vulnerabilities to new MCP courses, there's constant innovation and opportunity.
- Practical labs accelerate understanding - The Coursera labs using TensorFlow provided hands-on experience that cemented theoretical concepts.
Looking Ahead
The remaining weeks will continue with the Deep Learning Specialization, focusing on:
- More advanced CNN architectures
- Sequence models and RNNs
- Attention mechanisms and transformers
The combination of structured learning through Coursera and applied projects provides a solid foundation for moving into research and more advanced applications.