About me

I am a Ph.D. student in the Programming Languages Lab at Peking University. My research focuses on explainable and trustworthy AI, especially model-agnostic local explanation methods for modern machine learning systems and large language models.

Recently, I have been working on making interpretability more practical: reducing the cost of explaining black-box LLMs, accelerating rule-based explanations, and turning explanations into actionable tools for model and prompt optimization.

Recent Publications

  • Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection
    IJCAI-ECAI 2026. Second author.
    We propose a coarse-to-fine framework that uses a proxy model to select an optimized neighborhood for faithful, fine-grained explanations of long-context large language models.
    [Paper] [arXiv]

  • MAnchors: Memorization-Based Acceleration of Anchors via Rule Reuse and Transformation
    ICML 2026. First author.
    We propose a memorization-based acceleration framework for Anchors, reusing and transforming previously generated rules to reduce explanation time while preserving fidelity and understandability.
    [Paper]

  • Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models
    ACL 2026. Second author.
    We introduce a budget-friendly proxy framework for LLM interpretability, using efficient models plus a statistical screen-and-apply mechanism to approximate expensive black-box explanations and support prompt compression and poisoned example removal.
    [Paper] [Code & Data]