Sorry, it seems my recording software froze on the screen, so I guess it's a podcast this week.
Read the Substack podcast/newsletter:
open.substack.com/pub/evintun...
The scripts I use to automate the paper-finding process:
github.com/evintunador/arxiv-...
Support me by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!
account.venmo.com/u/tunadorable
Discuss with other Tunadorks on Discord
My other links
linktr.ee/tunadorable
Timestamps:
0:00 Intro
0:56 Accelerate LLM Inference with Bi-directional Multiple Drafting Heads in a Non-autoregressive Style arxiv.org/abs/2406.13170
1:58 Locating and Extracting Relational Concepts in LLMs arxiv.org/abs/2406.13184
3:44 AdaMoE - Token-Adaptive Routing with Null Experts for MoE LMs arxiv.org/abs/2406.13233
5:14 Understanding the RoPE Extensions of Long-Context LLMs - An Attention Perspective arxiv.org/abs/2406.13282
6:37 Lightning-fast Compressing Context for LLM arxiv.org/abs/2406.13618
8:42 Unveiling the Hidden Structure of Self-Attention via Kernel PCA arxiv.org/abs/2406.13762
10:16 Elliptical Attention arxiv.org/abs/2406.13770
12:23 Distributional reasoning in LLMs arxiv.org/abs/2406.13858
14:10 Complex fractal trainability boundary can arise from trivial non-convexity arxiv.org/abs/2406.13971
15:36 Ranking LLMs by compression arxiv.org/abs/2406.14171
16:37 In Tree Structure Should Sentence Be Generated arxiv.org/abs/2406.14189
19:28 Whiteboard-of-Thought - Thinking Step-by-Step Across Modalities arxiv.org/abs/2406.14562
20:53 Mixture-of-Agents Enhances LLM Capabilities arxiv.org/abs/2406.04692
23:05 Beyond Scaling Laws - Understanding Transformer Performance with Associative Memory arxiv.org/abs/2405.08707
25:06 Accessing GPT-4 level Mathematical Olympiad Solutions via MC Tree Self-refine with LLaMa-3 8B arxiv.org/abs/2406.07394
26:59 FL driven LLMs for Swarm Intelligence - A Survey arxiv.org/abs/2406.09831
27:50 3D-RPE - Enhancing Long-Context Modeling Through 3D Rotary Position Encoding arxiv.org/abs/2406.09897
30:02 An elementary proof of a universal approximation theorem arxiv.org/abs/2406.10002
32:20 Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask arxiv.org/abs/2406.10034
33:36 LieRE - Generalizing Rotary Position Encodings arxiv.org/abs/2406.10322
35:56 Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs arxiv.org/abs/2406.10209
38:15 Multilingual LLMs and Curse of Multilinguality arxiv.org/abs/2406.10602
39:29 Breaking the Attention Bottleneck arxiv.org/abs/2406.10906
41:16 Understanding Understanding - A Pragmatic Framework Motivated by LLMs arxiv.org/abs/2406.10937
42:11 A Peek into Token Bias - LLMs Are Not Yet Genuine Reasoners arxiv.org/abs/2406.11050
43:30 What Kinds of Tokens Benefit from Distant Text? An Analysis on Long Context LMing arxiv.org/abs/2406.11238
44:48 Dynamic Data Mixing Maximizes Instruction Tuning for MoE arxiv.org/abs/2406.11256
45:44 MetaGPT - Merging LLMs Using Model Exclusive Task Arithmetic arxiv.org/abs/2406.11385
47:24 Promises, Outlooks and Challenges of Diffusion LMing arxiv.org/abs/2406.11473
48:31 A Critical Study of What Code-LLMs (Do Not) Learn arxiv.org/abs/2406.11930
50:33 Tokenization Falling Short - The Curse of Tokenization arxiv.org/abs/2406.11687
52:55 Transcendence - Generative Models Can Outperform The Experts That Train Them arxiv.org/abs/2406.11741
53:54 Provable Guarantees for Model Performance via Mechanistic Interpretability arxiv.org/abs/2406.11779
55:15 How Do LLMs Acquire Factual Knowledge During Pretraining? arxiv.org/abs/2406.11813
56:32 Can LLMs Learn Macroeconomic Narratives from Social Media? arxiv.org/abs/2406.12109
57:54 LLMs Are Prone to Fallacies in Causal Inference arxiv.org/abs/2406.12158
59:27 Mixture of Scales - Memory-Efficient Token-Adaptive Binarization for LLMs arxiv.org/abs/2406.12311
1:00:56 Translation Equivariant Transformer Neural Processes arxiv.org/abs/2406.12409
1:02:50 What makes two models think alike? arxiv.org/abs/2406.12620
1:03:53 Estimating Knowledge in LLMs Without Generating a Single Token arxiv.org/abs/2406.12673
1:05:04 Can LLMs Always Solve Easy Problems if They Can Solve Harder Ones? arxiv.org/abs/2406.12809
1:06:15 LaMDA - Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation arxiv.org/abs/2406.12832
1:07:23 Synergizing Foundation Models and FL - A Survey arxiv.org/abs/2406.12844
1:08:41 Outro