Publications

My research centers on building trustworthy knowledge-intensive AI systems, organized around a unified question-answering pipeline. Given a user query, the pipeline (1) understands what the query is truly asking, (2) assesses whether the model knows the answer, (3) evaluates whether external evidence is reliable, and (4) leverages external knowledge effectively when needed. Each stage addresses a critical challenge in ensuring that AI systems produce accurate, honest, and well-grounded responses.

What Is the Query Asking For?

Clarifying Question Generation/Facet Generation

A Comparative Study of Training Objectives for Clarification Facet Generation [PDF] [Code] [PPT]
Shiyu Ni, Keping Bi, Jiafeng Guo and Xueqi Cheng
SIGIR-AP’ 2023: Proceedings of the 1st International ACM SIGIR Conference on Information Retrieval in the Asia Pacific

Does the Model Know the Answer?

LLM Honesty Alignment (Self-assessment)

Annotation-Efficient Universal Honesty Alignment [Arxiv] [Code]
Shiyu Ni, Keping Bi, Jiafeng Guo, Minghao Tang, Jingtong Wu, Zengxin Han and Xueqi Cheng
ICLR’ 2026: The Fourteenth International Conference on Learning Representations
Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception [Arxiv] [Code] [Poster]
Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi and Xueqi Cheng
ACL’ 2025: The 63rd Annual Meeting of the Association for Computational Linguistics
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs’ Overconfidence Helps Retrieval Augmentation [Arxiv] [Blog] [Code]
Shiyu Ni, Keping Bi, Jiafeng Guo and Xueqi Cheng
ACL’ 2024: Findings of the Association for Computational Linguistics, 2024
Are Large Language Models More Honest in Their Probabilistic or Verbalized Confidence? [Arxiv]
Shiyu Ni, Keping Bi, Lulu Yu and Jiafeng Guo
CCIR’ 2024: The 30th China Conference on Information Retrieval
Do LVLMs Know What They Know? A Systematic Study of Knowledge Boundary Perception in LVLMs [Arxiv] [Code]
Zhikai Ding, Shiyu Ni and Keping Bi
EMNLP’ 2025: Findings of Empirical Methods in Natural Language Processing
My contribution: proposed the idea and refined the whole paper.
How Knowledge Popularity Influences and Enhances LLM Knowledge Boundary Perception [Arxiv]
Shiyu Ni, Keping Bi, Jiafeng Guo and Xueqi Cheng
Evaluating and Calibrating LLM Confidence on Questions with Multiple Correct Answers [Arxiv] [Code]
Yuhan Wang^†, Shiyu Ni^†, Zhikai Ding, Zihang Zhan, Yuanzi Li and Keping Bi

LLM-as-a-judge (External Verifier)

How Long Reasoning Chains Influence LLMs’ Judgment of Answer Factuality [Arxiv] [Code]
Minzhu Tu^†, Shiyu Ni^† and Keping Bi
ACL’ 2026: The 64th Annual Meeting of the Association for Computational Linguistics

Is External Evidence Good?

Query Performance Prediction

Can LLM Rerankers Predict Their Own Ranking Performance? [Arxiv]
Shiyu Ni, Keping Bi, Jiafeng Guo, Jingtong Wu, Zengxin Han and Xueqi Cheng

How to Leverage External Knowledge?

Interact Using Tokens

Injecting External Knowledge into the Reasoning Process Enhances Retrieval-Augmented Generation [Arxiv] [Code]
Minghao Tang, Shiyu Ni, Jiafeng Guo and Keping Bi
SIGIR-AP’ 2025: Proceedings of the 3rd International ACM SIGIR Conference on Information Retrieval in the Asia Pacific
My contribution: wrote the abstract and introduction, refined the whole paper, and conducted the experimental analysis.

Parametric Injection

The Role of Parametric Injection-A Systematic Study of Parametric Retrieval-Augmented Generation [Arxiv]
Minghao Tang, Shiyu Ni, Jingtong Wu, Zengxin Han and Keping Bi
My contribution: wrote the abstract and introduction, and refined the whole paper (Version 1).

Collaborations

Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness [Arxiv]
Baolong Bi, Shenghua Liu, Yiwei Wang, Lingrui Mei, Junfeng Fang, Hongcheng Gao, Shiyu Ni and Xueqi Cheng
ICLR’ 2025: The Thirteenth International Conference on Learning Representations
Deep Research: A Systematic Survey [Arxiv] [Repo]
Zhengliang Shi, Yiqun Chen, Haitao Li, Weiwei Sun, Shiyu Ni, Yougang Lyu, Runze Fan et al.
Preprint 2025
My contribution: wrote Sections 3.2.2 and 6.1.

^† Equal contribution.

Shiyu Ni
