Self-Improving and Self-Adapting Agents

Mon, 01 Dec 2025 00:00:00 +0000

🎯 Overview: We study test-time scaling (TTS) methods for self-improving and self-adapting agents, advancing a new paradigm of artificial intelligence in which autonomous systems do not merely act, but evolve reliably through learning from experience, refining behavior at test time, and autonomously modifying their own learning mechanisms.

🧠 Abstract: Although modern foundation models (FMs) like ChatGPT are extraordinarily capable, they remain largely fragile: small variations in input (aka “prompts”), such as subtle phrasing changes or unintended noise, can lead to contradictory or ungrounded reasoning. This limits their broader deployment in high-stakes domains like healthcare, law, and science, where precision, reliability, and interpretability are essential.

In this project, we aim to build the next generation of FM systems that can continually assess and correct their behavior at test time, making them more trustworthy and capable. Our studies span three main aspects:

(1) Understand Fragility of FMs: How can we interpretably understand model behavior through the lens of prompts to improve FMs reliably? (NLPromptEval @ ACL'25, FormatBiasEval @ NAACL'25)
(2) Study TTS Methods: With the above understanding, how can we develop TTS methods that enhance models effectively and reliably? (Multi-Expert Prompting @ EMNLP'24, adv-ICL @ ACL'24, Chain-of-Opinion @ COLING'25)
(3) Self-Improving Agents: How can we build powerful TTS-powered agentic systems that critique and refine their own behavior, ultimately enabling autonomous self-improvement? (LongGuide (Self-Adapting Agent) @ ACL'25, VISTA (Self-Improving Agent))

Our findings so far lay the groundwork for foundation model agents that reliably improve through interaction, feedback, and structure-aware test-time methods, without relying on additional gradient-based training.

🔥 News: We are expanding our studies in multiple directions. Please review our work above. If you have a strong interest in the topics listed and a solid background in mathematics and programming, and optionally in research, come join us!

WING Members Present at NUS AI Seminar at the Computer Science Department

Wed, 19 Nov 2025 00:00:00 +0000

The NUS AI Seminar at the Computer Science Department is a new initiative to bring together AI researchers to share their latest research findings and insights. The Fall 2025 semester features eight speakers, with talks usually held on Wednesday afternoons in Seminar Room 21. WING members are excited to present their research alongside other presenters from NUS Computing, MIT, and UCLA. We appreciate Prof. Lee Wee Sun and PhD student Guoji Fu for hosting the seminar.

Yisong Miao gave a seminar presentation titled “Faithfulness and Interpretability for Discourse Understanding” on November 19th, 2025. The talk combined his work on Discursive Socratic Questioning (ACL 2024) and Discursive Circuits (EMNLP 2025). He explained how our team tackles the technical challenges of evaluating a model’s discourse faithfulness and formalizing a unique setup for transformer circuit discovery. The seminar ended with an audience discussion on the conceptual difference between truthfulness and faithfulness, as well as on designing new methods using sparse circuits in future work.

Yajing Yang presented “From Data to Insights: LLMs for Financial Narration”, also on November 19th, 2025, highlighting her research on financial data narration. Drawing from DataTales: A Benchmark for Real-World Intelligent Data Narration (EMNLP 2024) and KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration for Financial Data Narration (EMNLP 2025), she described our team’s efforts to build a realistic benchmark and improve generation quality through knowledge integration and hierarchical analysis. The talk sparked discussion on remaining challenges in financial report generation and how LLMs could be used to produce more actionable and informative insights.

On December 3rd, 2025, Xuan Long Do delivered a seminar titled “Toward Autonomous Self-Improving Foundation Models”. The presentation spanned multiple projects, including “What Makes a Good Natural Language Prompt?” (ACL 2025), Multi-Expert Prompting (EMNLP 2024), and VISTA, his 2025 Google internship project on a self-improving video generation agent. Beginning with an analysis of prompt and model fragility, he showed how multi-agent reasoning can help address these limitations before moving toward the broader vision of autonomous self-improving foundation models. The seminar concluded with an audience discussion on future research directions in this area.

Below is a gallery of the seminar photos, some of which were enhanced and regenerated via Google’s Nano Banana.

Yisong Miao presenting his work on “Faithfulness and Interpretability for Discourse Understanding” at the AI Seminar on November 19th, 2025:

Yajing Yang presenting her work on “From Data to Insights: LLMs for Financial Narration” at the AI Seminar on November 19th, 2025:

Xuan Long Do presenting his work on “Toward Autonomous Self-Improving Foundation Models” at the AI Seminar on December 3rd, 2025:
