Load balancers are a staple of scalable, high-throughput, high-availability architectures, and they work great for scaling web services. When requests take longer, though, things get complicated: requests pile up on some backends, bursts of traffic send latency through the roof, and by the time autoscaling kicks in, it may be too late, too expensive, or both.
Asynchronous architectures and message queues, combined with event-driven autoscaling, can help a lot here.
We're going to see how to implement that pattern on Kubernetes, leveraging:
- A popular LLM to generate thousands of completions;
- RabbitMQ and PostgreSQL to store requests and responses;
- Bento to implement API servers, producers, and consumers without writing code;
- Prometheus, Grafana, and KEDA for observability, dashboards, and autoscaling;
- Helm and Helmfile to automate deployment as much as possible.
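The autoscaling piece of that stack can be sketched with a KEDA ScaledObject that scales consumers based on RabbitMQ queue depth. This is an illustrative fragment, not the webinar's actual manifest: the Deployment name, queue name, and thresholds are hypothetical.

```yaml
# Hypothetical KEDA ScaledObject: scale the "llm-consumer" Deployment
# on the backlog of a RabbitMQ queue. All names are illustrative.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: llm-consumer-scaler
spec:
  scaleTargetRef:
    name: llm-consumer          # consumer Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: completions  # queue holding pending LLM requests
        mode: QueueLength       # scale on number of queued messages
        value: "10"             # target messages per replica
        hostFromEnv: RABBITMQ_URL  # AMQP connection string from env var
```

With this in place, KEDA adds consumer replicas as the queue backlog grows and scales back down (to minReplicaCount) when the queue drains.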
***********************************************************
PerfectScale makes it easy for DevOps and SRE professionals to govern, right-size and scale Kubernetes to continually meet customer demand.
By comparing usage patterns over time with resource configurations, we provide actionable recommendations that improve performance while reducing wasted compute resources by up to 60%.
Get the data-driven intelligence needed to ensure peak Kubernetes performance at the lowest possible cost with PerfectScale.
👉 Start your free trial today: www.perfectsca...
👉 Book a demo and let's talk: www.perfectsca...
***********************************************************
► PerfectScale is platform agnostic, supporting EKS, EKS Anywhere, GKE, AKS, KOPS, and other Kubernetes distributions.
► Trusted globally by DevOps, SRE, and Platform Engineering teams at leading companies like Rapyd and Paramount Pictures.
#kubernetes #k8s #devops #sre #EKS #platformengineering #AKS #GKE #genai
[Webinar] Scaling Out GenAI with Message Queues on Kubernetes with Jérôme Petazzoni.