Check out the Newsletter/Podcast with summaries of all the papers I kept:
open.substack.com/pub/evintun...
Support my learning journey by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!
/ tunadorable
account.venmo.com/u/tunadorable
Discuss this stuff with other Tunadorks on Discord
/ discord
All my other links
linktr.ee/tunadorable
Timestamps:
0:00 Intro
0:52 Accelerated Grokking by Amplifying Slow Gradients arxiv.org/abs/2405.20233
2:54 Standard Language Ideology in AI-Generated Language arxiv.org/abs/2406.08726
5:16 Optimizing Large Model Training through Overlapped Activation Recomputation arxiv.org/abs/2406.08756
6:01 Zoom and Shift are All You Need arxiv.org/abs/2406.08866
7:32 Diffusion - An Elementary Tutorial arxiv.org/abs/2406.08929
9:12 A Memory-Efficient Expert Switching Framework for LLMs arxiv.org/abs/2406.09041
11:17 Chain of Preference Optimization arxiv.org/abs/2406.09136
12:16 Scalable Functional Encryption in Federated Learning through Weight Clustering and Probabilistic Filters arxiv.org/abs/2406.09152
13:17 Towards Bidirectional Human-AI Alignment - A Systematic Review arxiv.org/abs/2406.09264
14:37 Analysing Neurons Across Languages and Tasks in LLMs arxiv.org/abs/2406.09265
15:42 Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding arxiv.org/abs/2406.09297
17:24 Why Warmup the Learning Rate? arxiv.org/abs/2406.09405
18:44 Interpreting the Weight Space of Customized Diffusion Models arxiv.org/abs/2406.09413
19:14 Explore the Limits of Omni-modal Pretraining at Scale arxiv.org/abs/2406.09412
21:14 Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? arxiv.org/abs/2406.04391
23:08 Scaling Speech Decoding With Self-Supervised Learning arxiv.org/abs/2406.04328
24:36 A Systematic Survey of Prompting Techniques arxiv.org/abs/2406.06608
27:48 When Swarm Learning meets energy series data arxiv.org/abs/2406.04743
29:47 Your Language Agents Already Know How to Achieve High-level Goals arxiv.org/abs/2406.04784
31:12 Zero, Finite, and Infinite Belief History of Theory of Mind Reasoning in LLMs arxiv.org/abs/2406.04800
31:51 LLMs emulate certain cognitive profiles arxiv.org/abs/2406.04988
32:49 Compositional Generalization with Grounded LLMs arxiv.org/abs/2406.04989
34:25 Large Generative Graph Models arxiv.org/abs/2406.05109
36:07 The Factorization Curse arxiv.org/abs/2406.05183
38:14 How to Strategize Human Content Creation in the Era of GenAI? arxiv.org/abs/2406.05187
40:41 Information Geometry of Evolution of NN Params While Training arxiv.org/abs/2406.05295
42:07 Concept Formation and Alignment in LLMs arxiv.org/abs/2406.05315
43:11 Critical Phase Transition in a LLM arxiv.org/abs/2406.05335
44:16 Natural Language-Oriented Programming arxiv.org/abs/2406.05409
48:10 Generalist Multimodal AI - A Review arxiv.org/abs/2406.05496
49:29 Automata Extraction from Transformers arxiv.org/abs/2406.05564
50:31 The Price of Debiasing Language Models arxiv.org/abs/2406.05587
51:22 Attention as a Hypernetwork arxiv.org/abs/2406.05816
52:52 LLM-powered Personalized Agent for Long-term Dialogue arxiv.org/abs/2406.05925
54:14 Recurrent Context Compression arxiv.org/abs/2406.06110
55:42 LLMs Resist Alignment arxiv.org/abs/2406.06144
56:46 Lifelong Learning of LLMs - A Survey arxiv.org/abs/2406.06391
57:25 What's in an embedding? arxiv.org/abs/2406.06870
58:43 Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot arxiv.org/abs/2406.06893
59:06 Effectively Compress KV Heads for LLM arxiv.org/abs/2406.07056
60:55 Teaching LLMs to Self-Improve by Learning from Language Feedback arxiv.org/abs/2406.07168
61:24 Ternarized LLM arxiv.org/abs/2406.07177
62:52 Needle In A Multimodal Haystack arxiv.org/abs/2406.07230
63:19 Limited Out-of-Context Knowledge Reasoning in LLMs arxiv.org/abs/2406.07393
63:41 Hybrid State Space Models for Efficient Unlimited Context Language Modeling arxiv.org/abs/2406.07522
66:31 Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy arxiv.org/abs/2406.07735
68:15 Are LLMs Good Statisticians? arxiv.org/abs/2406.07815
70:53 An Empirical Study of Mamba-based LLMs arxiv.org/abs/2406.07887
74:45 LLMs Must Be Taught to Know What They Don't Know arxiv.org/abs/2406.08391
75:25 Scaling Laws in Linear Regression arxiv.org/abs/2406.08466
76:19 Outro
Hella Brand New AI Papers - June 15, 2024