Name: Workshop: Efficient and Portable AI / LLM Inference on the Edge Cloud - Xiaowei Hu, Second State
Start: 2024-06-20T11:20:00+0200
End: 2024-06-20T12:35:00+0200

June 19-20, 2024
Paris, France
Note: The schedule is subject to change.

Workshop: Efficient and Portable AI / LLM Inference on the Edge Cloud - Xiaowei Hu, Second State

As AI applications gain popularity, there is growing demand to run AI and even LLM workloads on the edge cloud with heterogeneous hardware (e.g., GPU accelerators). However, the most straightforward approaches are too heavyweight, too slow, and not portable. For example, the PyTorch container image is 3 GB, and a container image for a C++ native toolchain is 300 MB. Python apps also require complex dependency packages and can be very slow. These container images also depend on the underlying host’s CPU and GPU, which makes them difficult to manage. Wasm has emerged as a lightweight runtime for cloud native applications. For an AI app, the entire Wasm runtime and app can be under 20 MB. The Wasm binary app runs at native speed, integrates with Kubernetes (k8s), and is portable across CPUs and GPUs. In this tutorial, we will demonstrate how to create and run Wasm-based AI applications on an edge server or a local host. We will showcase AI models and libraries for media processing (MediaPipe), vision (YOLO and LLaVA), and language (the Llama 2 series of models). You will be able to run all examples on your own laptop at the session.

Speakers

Vivian Hu

Product Manager, Second State

Vivian Hu is a Product Manager at Second State and a columnist at InfoQ. She is a founding member of the WasmEdge project. She organizes Rust and WebAssembly community events in Asia.

Thursday June 20, 2024 11:20 - 12:35 CEST
Monge (Level 3)

AI Systems & Performance

Audience Experience Level Beginner

AI_dev Europe 2024
