**Architectural Understanding**

1. **Define** deep learning as neural networks with multiple hidden layers that enable hierarchical feature learning (not just "3+ layers" but *why* depth matters for representation learning)
2. **Explain** why "deep" matters: each layer learns increasingly complex features
3. **Compare** deep learning with classical machine learning, and be able to identify other deep learning methods

**When to Use Deep Learning**

4. **Choose** deep learning when you have large datasets, unstructured data (images/text), or complex patterns. Also understand why pretrained models work and when to fine-tune vs. use feature extraction
5. **Avoid** deep learning when you need explainable results, have small datasets, or are modeling simple patterns

**Understanding BERT**

6. **Explain** what BERT does: it understands word meaning based on context (unlike older methods)
7. **Understand** how BERT is trained: pre-training on a massive dataset, masked language modeling, bidirectional learning, and the transformer architecture
8. **Recognize** why pretrained models save time and work better than training from scratch

**Practical Implementation and Evaluation**

9. **Implement** sentiment analysis using a pretrained BERT model via Hugging Face Transformers
10. **Evaluate** model performance using appropriate metrics for classification tasks
11. **Interpret** the model's confidence scores and predictions

**Notes**

What is deep learning? Video tutorial: [Link](https://www.youtube.com/watch?v=q6kJ71tEYqM)

- Previously we learned what machine learning is
- Deep learning is a subset of machine learning: AI -> ML -> neural networks -> deep learning
- A neural network with more than three layers is considered a deep neural network -> deep learning
- Deep learning can ingest unstructured data and determine distinguishing features on its own, unlike classical supervised approaches that rely on structured, labeled data

When to use deep learning. Video: [Link](https://www.youtube.com/watch?v=o3bWqPdWJ88)

- Unstructured data, like images, video, and text
- High volume of data -> deep learning will give you better results
- Complexity of features -> complicated features favor deep learning
- Interpretability (important)
  - Industries like healthcare and finance require high interpretability, which is better served by statistical ML
  - Deep learning's complex neural networks make it hard to interpret

BERT

- Google Search is powered by BERT (Bidirectional Encoder Representations from Transformers)
- Two standard sizes: BERT base and BERT large
- Intuition: if you have two homes, how can you say whether they are similar? If you can derive features for each object and express them as numbers, you can collect those numbers into vectors and compare the vectors
- The same idea applies to words: generate a feature vector (word embedding) for each word, then compare the embeddings
- How to generate word embeddings
  - word2vec is one approach
  - Issue with word2vec: each word gets a single static vector regardless of context -> you need a model that can generate the contextualized meaning of words -> this is what BERT provides

Pretrained BERT for sentiment analysis

- Download and install Transformers from Hugging Face
- Install and import dependencies
- Instantiate the model: bert-base-multilingual-uncased-sentiment
- Perform sentiment scoring
- Encode the text and calculate sentiment (see the sketch below)
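A minimal sketch of these steps in Python, assuming the `transformers` and `torch` packages are installed and that the model referenced in the notes is the `nlptown/bert-base-multilingual-uncased-sentiment` checkpoint on the Hugging Face hub (which predicts a 1-5 star rating):

```python
# Minimal sketch: sentiment scoring with a pretrained BERT model from Hugging Face.
# Assumes `transformers` and `torch` are installed (pip install transformers torch).

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hub ID assumed for the model named in the notes; it predicts a 1-5 star rating.
MODEL_NAME = "nlptown/bert-base-multilingual-uncased-sentiment"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def score_sentiment(text: str) -> tuple[int, float]:
    """Return (predicted star rating 1-5, confidence for that rating)."""
    # Encode the text into token IDs the model understands.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Run the model without tracking gradients (inference only).
    with torch.no_grad():
        logits = model(**inputs).logits
    # Convert logits to probabilities; argmax picks the most likely class.
    probs = torch.softmax(logits, dim=-1).squeeze()
    best = int(torch.argmax(probs))
    return best + 1, float(probs[best])  # classes 0-4 map to 1-5 stars

if __name__ == "__main__":
    stars, confidence = score_sentiment("The food was great, but the service was slow.")
    print(f"Predicted rating: {stars} stars (confidence {confidence:.2f})")
```

The higher-level `pipeline("sentiment-analysis", model=MODEL_NAME)` API from `transformers` wraps the same encode-and-score steps; the explicit version above makes the tokenization, logits, and confidence scores visible for the evaluation and interpretation objectives.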