Optional introductory course for AI Engineers, free for all Summit attendees. Covers advanced AI Engineering topics, led by instructor Charles Frye of the massively popular Full Stack LLM Bootcamp.
Part I: Running Inference
What is the workload?
Open vs Proprietary Models
Execution
End User Device
Over a Network
Serving Inference
Timestamps
0:00:00 Intro & Overview
0:03:52 What is Inference?
0:10:16 Proprietary Models for Inference
0:21:22 Open Models for Inference
0:30:41 Will Open or Proprietary Models Win Long-Term?
0:36:19 Q&A on Models
0:44:12 Inference on End-User Devices
1:04:32 Inference-as-a-Service Providers
1:10:00 Cloud Inference and Serverless GPUs
1:17:46 Rack-and-Stack for Inference
1:20:12 Inference Arithmetic for GPUs
1:27:07 TPUs and Other Custom Silicon for Inference
1:36:11 Containerizing Inference and Inference Services
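The "Inference Arithmetic for GPUs" segment covers back-of-envelope throughput estimates. A minimal sketch of the standard reasoning, assuming illustrative (not measured) numbers: at small batch sizes, autoregressive decoding is memory-bandwidth bound, so token throughput is roughly GPU memory bandwidth divided by the bytes of weights read per token.

```python
def decode_tokens_per_second(n_params: float, bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Upper-bound tokens/sec when every weight is read once per generated token."""
    model_bytes = n_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# Hypothetical example: a 7B-parameter model in fp16 (2 bytes/param)
# on a GPU with ~2000 GB/s of memory bandwidth (roughly A100-class).
print(round(decode_tokens_per_second(7e9, 2, 2000)))  # ~143 tokens/sec
```

This is an upper bound only; real throughput is lower once KV-cache reads, kernel overheads, and compute limits at larger batch sizes come into play.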
[Workshop] AI Engineering 201: Inference