June 19-20, 2024
Paris, France
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for AI_dev Europe to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in CEST (Central European Summer Time), UTC/GMT +2 hours.


Thursday, June 20 • 15:00 - 15:30
Efficient and Cross-Platform LLM Inference in the Heterogenous Cloud - Michael Yuan, Second State


As AI/LLM applications gain popularity, there is increasing demand to run and scale them in the cloud. However, compared with traditional cloud workloads, AI workloads rely heavily on the GPU. Linux containers are not portable across different hardware devices, and traditional container management tools are not set up to recompile applications for new devices at deployment time. Cloud native Wasm provides a portable bytecode format that abstracts away GPUs and hardware accelerators for these applications. With emerging W3C standards like WASI-NN, you can write and test LLM applications in Rust on your MacBook, and then deploy them on an NVIDIA cloud server or an ARM NPU device without recompiling or changing the Wasm bytecode file. The Wasm apps can also be managed by existing container tools such as Docker, Podman, and K8s, making them a great alternative to Linux containers for this new workload. This talk will discuss how WasmEdge (a CNCF sandbox project) implements WASI-NN and supports a large array of AI/LLM applications. You will learn practical skills for building and running LLM applications on all your local, edge, and cloud devices using a single binary application.

Speakers

Michael Yuan

Maintainer, CNCF WasmEdge and CEO, Second State
Dr. Michael Yuan is a maintainer of WasmEdge Runtime (a project under CNCF) and a co-founder of Second State. He is the author of 5 books on software engineering published by Addison-Wesley, Prentice-Hall, and O'Reilly. Michael is a long-time open-source developer and contributor...


Thursday June 20, 2024 15:00 - 15:30 CEST
Monge (Level 3)
  AI Systems & Performance