Jingyu (Jack) Zhang







I am a CS PhD student at the Center for Language and Speech Processing, Johns Hopkins University, advised by Daniel Khashabi and Benjamin Van Durme.

I also completed my undergraduate degree at Johns Hopkins University, majoring in Computer Science, Mathematics, and Applied Mathematics and Statistics, with a minor in Economics. GO HOP! 💙🤍

My research interests lie in Natural Language Processing, particularly the safety, trustworthiness, and alignment of foundation models. In the past, I have collaborated with Mark Dredze at JHU CLSP, Yulia Tsvetkov and Tianxing He at the University of Washington, and Jim Glass at MIT CSAIL.


Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi. Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data. arXiv preprint.

Dongwei Jiang, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, Daniel Khashabi. Self-(In)Correct: LLMs Struggle with Refining Self-Generated Responses. arXiv preprint.

Kevin Xu, Yeganeh Kordi, Kate Sanders, Yizhong Wang, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi. TurkingBench: A Challenge Benchmark for Web Agents. arXiv preprint.

Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He. k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text. arXiv preprint.

Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts. arXiv preprint.

Abe Bohan Hou*, Jingyu Zhang*, Tianxing He*, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov. SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. NAACL 2024.

Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He. On the Zero-Shot Generalization of Machine-Generated Text Detectors. Findings of EMNLP 2023.

Tianxing He*, Jingyu Zhang*, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov. On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. ACL 2023. Oral Presentation.

Jingyu Zhang, Alexandra DeLucia, Chenyu Zhang, Mark Dredze. Geo-Seq2seq: Twitter User Geolocation on Noisy Data through Sequence to Sequence Learning. Findings of ACL 2023.

Jingyu Zhang, James Glass, Tianxing He. PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation. *SEM 2023. Preliminary version accepted at 2nd Workshop on Efficient Natural Language and Speech Processing (ENLSP), NeurIPS 2022. Best Paper Award.

Jingyu Zhang, Alexandra DeLucia, Mark Dredze. Changes in Tweet Geolocation over Time: A Study with Carmen 2.0. Proceedings of the 8th Workshop on Noisy User-generated Text (W-NUT), COLING 2022.

Abhinav Chinta*, Jingyu Zhang*, Alexandra DeLucia, Anna L. Buczak, Mark Dredze. Study of Manifestation of Civil Unrest on Twitter. Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT), EMNLP 2021.

*Equal Contribution


I was a course assistant for EN.601.465/665: Natural Language Processing, taught by Jason Eisner, in Fall 2022 and Fall 2021.

I was a section leader for Code in Place 2021, hosted by Stanford University.