Name: Future-Proofing Agent Supervision - Alexandre Variengien & Diego Dorn, EffiSciences
Start: 2024-06-20T14:15:00+0200
End: 2024-06-20T14:45:00+0200

June 19-20, 2024
Paris, France
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for AI_dev Europe to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in CEST (Central European Summer Time) UTC/GMT +2 hours. To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Back To Schedule

Future-Proofing Agent Supervision - Alexandre Variengien & Diego Dorn, EffiSciences

As autonomous agents powered by LLM are deployed in the real world, there is a need for real-time monitoring to detect and mitigate their unpredictable failures. These challenges, including indirect prompt injection and strategic deception, diverge from traditional software issues due to the agents' emergent capabilities and continuous learning. The question arises: how do we ensure our monitoring systems can preemptively address unforeseen failures? This presentation advocates for rigorous evaluations of agent monitoring systems, highlighting the importance of diverse anomaly detection, engaging with more than just chat interfaces, and tackling nuanced issues like ethical boundaries. We propose a community-driven approach to refine LLM agent supervision, featuring a shared database of failure cases and a unified trace format across applications to foster collaborative innovation. Our framework introduce two metrics: i) accuracy on held-out anomaly, simulating the unforeseen failure modes that will emerge on the future, ii) its proficiency in spotting early warning signs before an harmful action. Join us in shaping the future of agent supervision, to anticipate the unexpected!

Speakers

Diego Dorn

Research Engineer, EffiSciences

Diego Dorn is a Research Engineer at EffiSciences, currently developing a supervisor for LLM agents and a benchmark to evaluate LLM monitoring systems. He draws his expertise from his many projects during his master's in Communication Systems at EPFL, from game jams and hackathons... Read More →

Alexandre Variengien

AI safety researcher, EffiSciences

Alexandre Variengien is a researcher who worked on scalable LLM interpretability at Conjecture and Redwood Research, publishing several papers in top conferences. He's now working as an independent researcher focusing his research efforts on generalist agent supervision. He is also... Read More →

Thursday June 20, 2024 14:15 - 14:45 CEST
Saint-Victor (Level 3)

AI Quality & Security

Audience Experience Level Intermediate

Feedback form isn't open yet.

AI_dev Europe 2024

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Diego Dorn

Alexandre Variengien