Sorry, it seems my recording software froze on the screen, so I guess it's a podcast this week.
Read the Substack podcast/newsletter:
open.substack.com/pub/evintun...
The scripts I use to automate the paper-finding process:
github.com/evintunador/arxiv-...
Support me by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!
account.venmo.com/u/tunadorable
Discuss with other Tunadorks on Discord
My other links
linktr.ee/tunadorable
Timestamps:
0:00 Intro
0:56 Accelerate LLM Inference with Bi-directional Multiple Drafting Heads in a Non-autoregressive Style arxiv.org/abs/2406.13170
1:58 Locating and Extracting Relational Concepts in LLMs arxiv.org/abs/2406.13184
3:44 AdaMoE - Token-Adaptive Routing with Null Experts for MoE LMs arxiv.org/abs/2406.13233
5:14 Understanding the RoPE Extensions of Long-Context LLMs - An Attention Perspective arxiv.org/abs/2406.13282
6:37 Lightning-fast Compressing Context for LLM arxiv.org/abs/2406.13618
8:42 Unveiling the Hidden Structure of Self-Attention via Kernel PCA arxiv.org/abs/2406.13762
10:16 Elliptical Attention arxiv.org/abs/2406.13770
12:23 Distributional reasoning in LLMs arxiv.org/abs/2406.13858
14:10 Complex fractal trainability boundary can arise from trivial non-convexity arxiv.org/abs/2406.13971
15:36 Ranking LLMs by compression arxiv.org/abs/2406.14171
16:37 In Tree Structure Should Sentence Be Generated arxiv.org/abs/2406.14189
19:28 Whiteboard-of-Thought - Thinking Step-by-Step Across Modalities arxiv.org/abs/2406.14562
20:53 Mixture-of-Agents Enhances LLM Capabilities arxiv.org/abs/2406.04692
23:05 Beyond Scaling Laws - Understanding Transformer Performance with Associative Memory arxiv.org/abs/2405.08707
25:06 Accessing GPT-4 level Mathematical Olympiad Solutions via MC Tree Self-refine with LLaMa-3 8B arxiv.org/abs/2406.07394
26:59 FL driven LLMs for Swarm Intelligence - A Survey arxiv.org/abs/2406.09831
27:50 3D-RPE - Enhancing Long-Context Modeling Through 3D Rotary Position Encoding arxiv.org/abs/2406.09897
30:02 An elementary proof of a universal approximation theorem arxiv.org/abs/2406.10002
32:20 Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask arxiv.org/abs/2406.10034
33:36 LieRE - Generalizing Rotary Position Encodings arxiv.org/abs/2406.10322
35:56 Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs arxiv.org/abs/2406.10209
38:15 Multilingual LLMs and Curse of Multilinguality arxiv.org/abs/2406.10602
39:29 Breaking the Attention Bottleneck arxiv.org/abs/2406.10906
41:16 Understanding Understanding - A Pragmatic Framework Motivated by LLMs arxiv.org/abs/2406.10937
42:11 A Peek into Token Bias - LLMs Are Not Yet Genuine Reasoners arxiv.org/abs/2406.11050
43:30 What Kinds of Tokens Benefit from Distant Text? An Analysis on Long Context LMing arxiv.org/abs/2406.11238
44:48 Dynamic Data Mixing Maximizes Instruction Tuning for MoE arxiv.org/abs/2406.11256
45:44 MetaGPT - Merging LLMs Using Model Exclusive Task Arithmetic arxiv.org/abs/2406.11385
47:24 Promises, Outlooks and Challenges of Diffusion LMing arxiv.org/abs/2406.11473
48:31 A Critical Study of What Code-LLMs (Do Not) Learn arxiv.org/abs/2406.11930
50:33 Tokenization Falling Short - The Curse of Tokenization arxiv.org/abs/2406.11687
52:55 Transcendence - Generative Models Can Outperform The Experts That Train Them arxiv.org/abs/2406.11741
53:54 Provable Guarantees for Model Performance via Mechanistic Interpretability arxiv.org/abs/2406.11779
55:15 How Do LLMs Acquire Factual Knowledge During Pretraining? arxiv.org/abs/2406.11813
56:32 Can LLMs Learn Macroeconomic Narratives from Social Media? arxiv.org/abs/2406.12109
57:54 LLMs Are Prone to Fallacies in Causal Inference arxiv.org/abs/2406.12158
59:27 Mixture of Scales - Memory-Efficient Token-Adaptive Binarization for LLMs arxiv.org/abs/2406.12311
1:00:56 Translation Equivariant Transformer Neural Processes arxiv.org/abs/2406.12409
1:02:50 What makes two models think alike? arxiv.org/abs/2406.12620
1:03:53 Estimating Knowledge in LLMs Without Generating a Single Token arxiv.org/abs/2406.12673
1:05:04 Can LLMs Always Solve Easy Problems if They Can Solve Harder Ones? arxiv.org/abs/2406.12809
1:06:15 LaMDA - Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation arxiv.org/abs/2406.12832
1:07:23 Synergizing Foundation Models and FL - A Survey arxiv.org/abs/2406.12844
1:08:41 Outro