Callum McDougall, Arthur Conmy, Cody Rushing and I go through our paper, Copy Suppression: Comprehensively Understanding an Attention Head, and discuss the key ideas and themes. Part 1 of 3.
Part 1: • A Walkthrough of Copy ...
Part 2: • A Walkthrough of Copy ...
Part 3: • A Walkthrough of Copy ...
Copy Suppression paper: arxiv.org/abs/...
Streamlit App: copy-suppressi...
Timestamps
0:08 Introduction
1:16 Quick Summary of Paper
2:09 What is Copy Suppression?
5:10 Backstory Behind Paper
7:38 Why did we look at this head?
10:50 The difference between Name Movers and Copy Suppression
13:40 Anti-Induction Scores Vs Copy-Suppression Scores
15:52 Question
24:16 Why 25%?
27:29 Why It's Surprising to Find Copy Suppression in a Model
37:16 Guesses as to Why Copy Suppression Forms
44:48 The Implications of Copy Suppression
46:24 Copy Suppression and Self-Repair
49:53 The Relationship Between Copy Suppression and Circuit Discovery Tools
52:15 Closing Thoughts
Thanks to Brooklyn Rose Ludlow for editing.
A Walkthrough of Copy Suppression w/ Callum McDougall, Arthur Conmy & Cody Rushing Part 1/3