Jingyu Zhang
I am a CS PhD student at the Center for Language and Speech Processing, Johns Hopkins University, advised by Daniel Khashabi and Benjamin Van Durme. I'm also a part-time student researcher at Microsoft.
My research is in natural language processing, particularly the safety, trustworthiness, and alignment of foundation models. My recent work focuses on safety alignment and on enhancing the verifiability of LLMs.
In the past, I have collaborated with Mark Dredze at JHU CLSP, Yulia Tsvetkov and Tianxing He at the University of Washington, and Jim Glass at MIT CSAIL. I also completed my B.S. at JHU, with majors in Computer Science, Mathematics, and Applied Mathematics and a minor in Economics. GO HOP!
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Jingyu Zhang, Ahmed Elgohary, Ahmed Magooda, Daniel Khashabi, Benjamin Van Durme. ICLR 2025.
The current paradigm for safety alignment of large language models (LLMs) follows a one-size-fits-all approach and lacks flexibility in the face of varying social norms across cultures and diverse user needs. We propose Controllable Safety Alignment, a framework that adapts models to diverse safety requirements without re-training.
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data
Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi. NAACL 2025 (oral).
To trust the fluent generations of large language models, humans must be able to verify their correctness against trusted external sources. We trivialize the verification process by developing models that quote verbatim statements from trusted sources in their pre-training data.
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
Abe Bohan Hou*, Jingyu Zhang*, Tianxing He*, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov. NAACL 2024.
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. To address this issue, we propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH), which partitions the semantic space of sentences.
Abe Bohan Hou, Hongru Du, Yichen Wang, Jingyu Zhang, Zixiao Wang, Paul Pu Liang, Daniel Khashabi, Lauren Gardner, Tianxing He. Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy. arXiv preprint.
Jingyu Zhang, Ahmed Elgohary, Ahmed Magooda, Daniel Khashabi, Benjamin Van Durme. Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements. ICLR 2025.
Dongwei Jiang, Guoxuan Wang, Yining Lu, Andrew Wang, Jingyu Zhang, Chuyu Liu, Benjamin Van Durme, Daniel Khashabi. Rationalyst: Pre-training Process-Supervision for Improving Reasoning. arXiv preprint.
Zhengping Jiang, Jingyu Zhang, Nathaniel Weir, Seth Ebner, Miriam Wanner, Kate Sanders, Daniel Khashabi, Anqi Liu, Benjamin Van Durme. Core: Robust Factual Precision Scoring with Informative Sub-Claim Identification. arXiv preprint.
Dongwei Jiang, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, Daniel Khashabi. Self-(In)Correct: LLMs Struggle with Refining Self-Generated Responses. AAAI 2025.
Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi. Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data. NAACL 2025 (oral).
Kevin Xu, Yeganeh Kordi, Kate Sanders, Yizhong Wang, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi. TurkingBench: A Challenge Benchmark for Web Agents. NAACL 2025.
Weiting Tan, Jingyu Zhang, Lingfeng Shen, Daniel Khashabi, Philipp Koehn. DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation. NeurIPS 2024.
Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He. k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text. Findings of ACL 2024.
Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts. Findings of ACL 2024.
Abe Bohan Hou*, Jingyu Zhang*, Tianxing He*, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov. SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. NAACL 2024.
Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He. On the Zero-Shot Generalization of Machine-Generated Text Detectors. Findings of EMNLP 2023.
Tianxing He*, Jingyu Zhang*, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov. On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. ACL 2023 (oral).
Jingyu Zhang, Alexandra DeLucia, Chenyu Zhang, Mark Dredze. Geo-Seq2seq: Twitter User Geolocation on Noisy Data through Sequence to Sequence Learning. Findings of ACL 2023.
Jingyu Zhang, James Glass, Tianxing He. PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation. *SEM 2023. Preliminary version accepted at 2nd Workshop on Efficient Natural Language and Speech Processing (ENLSP), NeurIPS 2022. Best Paper Award.
Jingyu Zhang, Alexandra DeLucia, Mark Dredze. Changes in Tweet Geolocation over Time: A Study with Carmen 2.0. Proceedings of the 8th Workshop on Noisy User-generated Text (W-NUT), COLING 2022.
Abhinav Chinta*, Jingyu Zhang*, Alexandra DeLucia, Anna L. Buzcak, Mark Dredze. Study of Manifestation of Civil Unrest on Twitter. Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT), EMNLP 2021.
*Equal Contribution
I was a course assistant for EN.601.465/665: Natural Language Processing, taught by Jason Eisner, in Fall 2022 and Fall 2021.
I was a section leader for Code in Place 2021, hosted by Stanford University.
In my free time, I enjoy go-karting and sim racing. I'm a big car enthusiast and love watching motorsports such as Formula 1. My favorite driver is Zhou Guanyu, the first-ever Chinese driver to compete in F1.