I tested 3 city AI chatbots. Here's what they actually do.

Chong Kee Siong via Getty Images
COMMENTARY | Results were decidedly mixed and revealed different problems with each, as well as consistent patterns.
Municipal governments are racing to deploy AI chatbots for resident services. Denver launched "Sunny" in 2024. Winter Haven, Florida rolled out "Ask Winter Haven." Atlanta introduced "Ava" in 2023, announcing it would "put the power of information directly in users' hands."
Municipal chatbots are not neutral service tools — they are public-facing governance instruments that shape how residents understand city policy.
I tested all three with questions residents actually ask: What's the city doing about crime? How do I get housing assistance? What's the policy on homelessness?
The results reveal that cities are deploying systems without understanding what they've bought or how those systems represent them to the public.
I attempted to test additional systems. Palo Alto, California's CityAssist — launched just weeks earlier in December 2025 — was offline due to "a technical disconnect between CityAssist vendor Citibot and our website provider Granicus." Phoenix's myPHX311 and Detroit's Emily require app downloads or phone access, making independent testing impossible.
The three systems I could access — Denver, Winter Haven and Atlanta — revealed different problems, but a consistent pattern.
What I Found
Denver's Sunny handled some questions well. Asked about crime, it provided detailed information about violent crime reduction, auto theft initiatives, and gun violence prevention, with links to State of the City reports. Asked about housing assistance, it listed five specific programs with eligibility requirements and contact information.
But ask about homelessness policy, and the system breaks down entirely. My question: "What is the city doing about the homelessness issue?" Sunny's response: "What is the address of the issue?"
The chatbot couldn't distinguish between a policy question and a service request. It assumed I wanted to report an encampment for removal, not understand the city's approach to homelessness.
Winter Haven's chatbot showed the opposite pattern. Asked about crime, it couldn't answer: "I'm not sure about the specific crime issues currently facing Winter Haven." Asked about local sinkholes, it drew a blank.
But ask about homelessness, and it delivers enforcement statistics: "Since May 2025, the WHPD has removed 25 unlawful campsites in the city." Someone configured this chatbot to emphasize specific messaging about homelessness enforcement. That's not a technical choice—it's a political one.
Atlanta's "Ava" isn't AI at all. It's a keyword search with a friendly name.
Asked "What are the biggest crime issues?" it returns instructions for reporting crimes to Crime Stoppers. Asked about homelessness, it dumps 50+ links including sewer backups and tree removal permits. Asked "How is the city addressing overpolicing?" it provides instructions for changing your address with the Municipal Court.
Atlanta marketed this as AI innovation. It's search results, often irrelevant, always unfiltered.
The Pattern
Three cities, three different systems, one consistent problem: none of them answer basic questions about contested civic issues.
Denver routes homelessness questions to service requests. Winter Haven can't discuss crime but emphasizes homelessness enforcement. Atlanta dumps keyword matches and calls it AI.
These aren't random technical glitches. Someone made choices about how these systems handle sensitive topics.
In Denver's case, "homelessness" triggers the service request flow. In Winter Haven's case, someone loaded specific talking points about encampment removals. In Atlanta's case, no one verified the system works before announcing it publicly.
What This Means for Municipal Leaders
Most city managers can't answer basic questions about their AI deployments:
- What does your chatbot say when residents ask about controversial programs?
- How does it handle questions where community and institutional perspectives differ?
- Can marginalized communities access it and trust what it says?
- Who configured it, and what assumptions did they make?
These questions matter because chatbots don't just provide information — they construct narratives about how government works and whose perspectives count.
When Denver's chatbot asks for a street address in response to a homelessness policy question, it's communicating something specific: this is a problem to report and remove, not a policy challenge to discuss.
When Winter Haven's chatbot can't explain crime trends but readily provides enforcement statistics about homeless encampment removals, it's making editorial choices about what information residents should receive.
When Atlanta's chatbot returns Municipal Court address change forms in response to questions about police accountability, it demonstrates that no one tested this system with questions residents actually ask.
What to Do About It
Test your chatbot on difficult questions before launch. Not just "What are your hours?" or "How do I pay a parking ticket?" Ask it about recent controversies, contested policies, administrative failures. See what it says. See what it omits.
Verify what you've actually deployed. Is it AI or rebranded search? Does it answer questions or just return links? Can it distinguish between policy questions and service requests?
Check who made configuration decisions. If your chatbot emphasizes certain narratives about sensitive topics, find out who decided that and whether it reflects city policy or vendor defaults.
Make systems testable. If journalists, researchers, and oversight bodies can't access your chatbot without downloading an app or creating an account, you can't verify how it works. Phoenix and Detroit require apps. Palo Alto's was down when I tried to test it. Require web accessibility for any system that speaks for the city.
Maintain editorial control. When your chatbot tells residents how the city handles homelessness or crime or police accountability, those are official statements. They should reflect deliberate choices by leadership, not vendor configurations or IT department defaults.
The Real Cost
Denver spent resources deploying Sunny. Winter Haven invested in Ask Winter Haven. Atlanta announced Ava as an innovation. All three systems fail basic tests of whether they help residents understand city policy on contested issues.
The problem isn't that AI is inherently flawed or that technology can't help. The problem is that cities are procuring systems without understanding what they do, configuring them without testing edge cases, and deploying them without mechanisms for residents to challenge what they say.
A resident asking about homelessness policy shouldn't get asked for a street address. A resident asking about police accountability shouldn't get routed to court address forms. A resident asking about crime shouldn't get 50 unfiltered search results.
These systems are already live. They're already representing your city to residents who have legitimate questions about how government works. If your chatbot speaks for your city, you need to know what it's saying — and who decided it should say it.
Katryna Peart researches AI governance in local government and works in LLM evaluation, stress-testing models for bias, nuance, and context. She has written about AI implementation for ICMA and other public-sector publications.



