Building Data Tooling in Rust for Multimodal AI by Chang She
Visit rstats.ai for information on upcoming conferences.
Abstract: AI adoption is bringing a host of new data management challenges and new workloads. This is especially true for multi-modal AI, where the data challenges extend far beyond embeddings and require new tooling for working with images, audio, video, PDFs, and more. Traditional formats and tooling are optimized for purely tabular data and cannot manage unstructured data types effectively. Instead, a new set of infrastructure and tooling is being built in Rust. Rust makes high-performance data manipulation code much safer, which means developers can move faster and with more confidence. It is easy to bridge Rust into higher-level languages like Python and R, so the tools can be wrapped in APIs that are more familiar to data science and machine learning users. Finally, Rust offers powerful concurrency features that make it much easier for developers to parallelize data manipulation tasks.
In this talk we'll use Lance and LanceDB as a source of examples for building high-performance data tools for AI in Rust. We'll show you how Rust is used to create blazing fast vector search with hardware acceleration, how Rust helps us create new data management tooling for unstructured data, and how these tools can be exposed in higher-level languages like Python and JavaScript.
Bio: Chang She is the CEO and cofounder of LanceDB, the developer-friendly, open-source database for multi-modal AI. A serial entrepreneur, Chang has been building DS/ML tooling for nearly two decades and is one of the original contributors to the pandas library. Prior to founding LanceDB, Chang was VP of Engineering at TubiTV, where he focused on personalized recommendations and ML experimentation.
Twitter: @changhiskhan
Presented at the 2024 New York R Conference (May 16, 2024)
Hosted by Lander Analytics (landeranalytics.com)