Introduction

With recent advances in LLMs and foundation models, AI agents have gained popularity across a wide range of sectors, including education, finance, healthcare, research, and transportation. These agentic systems may combine multiple LLM-based agents with tools, memory, and storage to solve complex tasks effectively. However, as AI agents become more embedded in complex environments and big data infrastructures, new challenges arise that demand careful attention. Among the most pressing is the need to make AI agents more secure, explainable, and resilient.

The growing number of AI agents and their integration with external tools and third-party applications significantly expand the attack surface within big data infrastructures. Each new connection or integration represents a potential vulnerability that adversaries can exploit to manipulate agents or gain unauthorized access to them or to sensitive data. Moreover, when AI agents handle data, compliance with regulatory frameworks must be ensured to meet data privacy and transparency standards.

Further, as AI agents interact with big data infrastructure systems to perform tasks autonomously or semi-autonomously, understanding their actions is critical for increasing user trust and enabling intervention to prevent harmful actions. For example, trace data generated by agents could be mined to help explain their behavior, including why they made certain decisions or failed to complete a task. However, how to generate and interpret trace data so as to ensure the faithfulness and accountability of agent actions remains an open question. This challenge is exacerbated in big data environments, where Volume, Velocity, Variety, and Veracity make traces much harder to both record and interpret.

In addition, AI agents should be robust to failures in dynamic big data systems. These systems often consist of multiple components, such as AI agents, infrastructure elements, and data sources, any of which can introduce various forms of noise, failures, and faults. Such issues may lead to cascading errors throughout the system. AI agents must therefore be designed for resilience so that they can handle these challenges and perform tasks reliably.

To address these challenges, this workshop aims to bring together researchers and practitioners from the big data community to exchange ideas and develop algorithms, practical tools, and frameworks for building secure and safe AI agents for big data infrastructure.

Important Dates

  • Oct 1, 2025: Paper submission deadline
  • Nov 4, 2025: Notification of acceptance to authors
  • Nov 23, 2025: Camera-ready deadline for accepted papers
  • Dec 8–11, 2025: Workshops