Let’s get real—AI agents aren’t easy to build or deploy. But once they’re embedded, the impact is incredible. I love hearing from UiPath customers like Ainara Etxeandia Sagasti, Head of Digital Services at Lantik, who’s "combining RPA, generative AI, and agentic technology [to make] public services more accessible, efficient, and citizen-focused than ever." Already, more than 10,000 AI agents have been built on the UiPath Platform™.

Agents can transform process efficiency and profitability, but they need strong orchestration and help from automation and humans in the loop. In this blog post, I’ll cover the most common pain points when building, testing, or deploying AI agents at scale. I’ll also explain how an orchestrated approach—built on controlled agency and interoperability—can mitigate them.

1. Performance and reliability of agents

Developers and users frequently cite the unreliability of AI agents as a barrier to production. Large language models (LLMs) make agents flexible and adaptable, but that same flexibility leads to inconsistent outputs, which can frustrate development and testing. As one engineer put it, “My agents sometimes work perfectly, then completely fail on similar inputs. We need better ways to simulate edge cases and reproduce failures consistently… monitoring agent ‘drift’ over time is a real headache.”

Another challenge is hallucinations—agents making up facts or tool inputs—which can grind processes to a halt. A user building AI workflows shared: “The biggest pain points we find are repeatability and hallucinations… ensuring that for the same or similar queries the LLM agents don’t go off the rails and hallucinate inputs to other tools.”

This unpredictability demands extensive testing and validation, but agent testing tools are still immature. And when errors do occur, they can be hard to diagnose because model reasoning is opaque.
This causes teams to be extremely cautious about changes: “We’re so wary of system prompt changes at this point because we’ve been burned by telling the agent not to do something and then it starts behaving weird… so many times.”
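One practical guardrail against hallucinated tool inputs is to validate an agent's proposed tool call against a schema before executing it, so a made-up argument fails fast instead of derailing the workflow. Here is a minimal sketch of that idea; the `lookup_invoice` tool, its registry, and the argument names are all hypothetical, and a real agent framework would supply the tool-call object rather than a hard-coded dict:

```python
# Sketch: guard agent tool calls with input validation so hallucinated
# arguments are caught before they reach downstream systems.
# The tool registry and tool names below are illustrative, not from UiPath.

def validate_tool_call(name, args, registry):
    """Return (ok, message); reject calls whose tool name or
    arguments don't match the registered schema."""
    if name not in registry:
        return False, f"unknown tool: {name}"
    schema = registry[name]
    missing = [k for k in schema if k not in args]
    wrong_type = [k for k, t in schema.items()
                  if k in args and not isinstance(args[k], t)]
    if missing or wrong_type:
        return False, f"missing={missing} wrong_type={wrong_type}"
    return True, "ok"

# Hypothetical registry: each tool maps argument names to expected types.
TOOLS = {"lookup_invoice": {"invoice_id": str, "year": int}}

# A well-formed call passes; a hallucinated argument set is rejected.
ok, _ = validate_tool_call(
    "lookup_invoice", {"invoice_id": "INV-42", "year": 2024}, TOOLS)
bad, reason = validate_tool_call(
    "lookup_invoice", {"invoice_number": "INV-42"}, TOOLS)
print(ok, bad)  # True False
```

The same check can run inside a test harness: replay the same or similar queries against the agent and assert that every proposed tool call validates, which turns the repeatability problem quoted above into something you can measure in CI.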