Jingyu (Jack) Zhang

I am a CS PhD student at the Center for Language and Speech Processing, Johns Hopkins University, advised by Daniel Khashabi and Benjamin Van Durme. I’m also a part-time student researcher at Microsoft.

My research interests lie in natural language processing, particularly the safety, trustworthiness, and alignment of foundation models. My recent work focuses on safety alignment and on improving the attribution of LLMs.

In the past, I have collaborated with Mark Dredze at JHU CLSP, Yulia Tsvetkov and Tianxing He at the University of Washington, and Jim Glass at MIT CSAIL. I also completed my B.S. at JHU, with majors in Computer Science, Mathematics, and Applied Mathematics, and a minor in Economics. GO HOP! 💙🤍

Publications

Jingyu Zhang, Ahmed Elgohary, Ahmed Magooda, Daniel Khashabi, Benjamin Van Durme. Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements. arXiv preprint.

Dongwei Jiang, Guoxuan Wang, Yining Lu, Andrew Wang, Jingyu Zhang, Chuyu Liu, Benjamin Van Durme, Daniel Khashabi. Rationalyst: Pre-training Process-Supervision for Improving Reasoning. arXiv preprint.

Zhengping Jiang, Jingyu Zhang, Nathaniel Weir, Seth Ebner, Miriam Wanner, Kate Sanders, Daniel Khashabi, Anqi Liu, Benjamin Van Durme. Core: Robust Factual Precision Scoring with Informative Sub-Claim Identification. arXiv preprint.

Jingyu Zhang, Marc Marone, Tianjian Li, Benjamin Van Durme, Daniel Khashabi. Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data. arXiv preprint.

Dongwei Jiang, Jingyu Zhang, Orion Weller, Nathaniel Weir, Benjamin Van Durme, Daniel Khashabi. Self-(In)Correct: LLMs Struggle with Refining Self-Generated Responses. arXiv preprint.

Kevin Xu, Yeganeh Kordi, Kate Sanders, Yizhong Wang, Adam Byerly, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi. TurkingBench: A Challenge Benchmark for Web Agents. arXiv preprint.

Weiting Tan, Jingyu Zhang, Lingfeng Shen, Daniel Khashabi, Philipp Koehn. DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation. NeurIPS 2024.

Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He. k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text. Findings of ACL 2024.

Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts. Findings of ACL 2024.

Abe Bohan Hou*, Jingyu Zhang*, Tianxing He*, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov. SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. NAACL 2024.

Xiao Pu, Jingyu Zhang, Xiaochuang Han, Yulia Tsvetkov, Tianxing He. On the Zero-Shot Generalization of Machine-Generated Text Detectors. Findings of EMNLP 2023.

Tianxing He*, Jingyu Zhang*, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov. On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. ACL 2023. Oral Presentation.

Jingyu Zhang, Alexandra DeLucia, Chenyu Zhang, Mark Dredze. Geo-Seq2seq: Twitter User Geolocation on Noisy Data through Sequence to Sequence Learning. Findings of ACL 2023.

Jingyu Zhang, James Glass, Tianxing He. PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation. *SEM 2023. Preliminary version accepted at 2nd Workshop on Efficient Natural Language and Speech Processing (ENLSP), NeurIPS 2022. Best Paper Award.

Jingyu Zhang, Alexandra DeLucia, Mark Dredze. Changes in Tweet Geolocation over Time: A Study with Carmen 2.0. Proceedings of the 8th Workshop on Noisy User-generated Text (W-NUT), COLING 2022.

Abhinav Chinta*, Jingyu Zhang*, Alexandra DeLucia, Anna L. Buzcak, Mark Dredze. Study of Manifestation of Civil Unrest on Twitter. Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT), EMNLP 2021.

*Equal Contribution

Teaching

I was a course assistant for EN.601.465/665: Natural Language Processing, taught by Jason Eisner, in Fall 2022 and Fall 2021.

I was a section leader for Code in Place 2021, hosted by Stanford University.

Service

Application Mentor, JHU CLSP pre-application support program
Curriculum Committee, Department of Computer Science, Johns Hopkins University
Recruitment Committee, Center for Language and Speech Processing, Johns Hopkins University

Misc

🏎️🏎️🏎️ In my free time, I enjoy go-karting and sim racing. I’m a big car enthusiast and love watching motorsports such as Formula 1. My favorite driver is Zhou Guanyu, the first Chinese driver to compete in F1.