About Me
Hi! I am Ansong Ni (倪安松), a Research Scientist at Meta FAIR. Previously, I finished my PhD in the Computer Science Department at Yale University, advised by Prof. Arman Cohan and (the late) Prof. Dragomir Radev. Prior to Yale, I obtained my M.S. in Computer Science from the School of Computer Science at Carnegie Mellon University and my B.Eng. from Nanjing University in China.
I worked as a research intern at Google DeepMind (Summer 2023), Meta AI (Summer 2022), MSR Redmond (Summer 2021), AI2 (Summer 2020), and MSR Asia (Summer and Fall 2017).
Research Interest
I teach large language models (LLMs) to solve complex tasks by reasoning in natural language and formal language (e.g., code). More recently, I’m interested in the domain of self-improving LLMs using synthetic data.
Previously, I have also done research on more traditional NLP tasks, such as semantic parsing, question answering, and text summarization.
For more details about my research, please refer to my publication list below.
Selected Publications and Preprints
For a full list, please refer to my Google Scholar or Semantic Scholar.
(* denotes equal contribution)
Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin
NExT: Teaching Large Language Models to Reason about Code Execution
Preprint.
[arxiv]
Ansong Ni, Pengcheng Yin, Yilun Zhao, Martin Riddell, Troy Feng, Rui Shen, Stephen Yin, Ye Liu, Semih Yavuz, Caiming Xiong, Shafiq Joty, Yingbo Zhou, Dragomir Radev, and Arman Cohan
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models
Preprint.
[arxiv]
Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, and Xi Victoria Lin
LEVER: Learning to Verify Language-to-Code Generation with Execution
The 2023 International Conference on Machine Learning (ICML’23)
[arxiv] [code]
Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, and Jianfeng Gao
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions
The 2023 International Conference on Learning Representations (ICLR’23)
[arxiv] [code]
Zhangir Azerbayev, Ansong Ni, Hailey Schoelkopf, and Dragomir Radev
Explicit Knowledge Transfer for Weakly-Supervised Code Generation
Deep Learning For Code (DL4C) Workshop @ ICLR’23
[arxiv]
Tianbao Xie*, Chen Henry Wu*,…, Ansong Ni,…, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, and Tao Yu
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP’22)
[arxiv] [website] [code]
Ansong Ni, Matt Gardner, and Pradeep Dasigi
Mitigating False-Negative Contexts in Multi-document Question Answering with Retrieval Marginalization
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP’21)
[arxiv] [code]
Ansong Ni, Zhangir Azerbayev, Mutethia Mutuma, Troy Feng, Yusen Zhang, Tao Yu, Ahmed Hassan Awadallah, and Dragomir Radev*
SummerTime: Text Summarization Toolkit for Non-experts
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP’21) Demo Track
[arxiv] [code] (230+ stars)
Ansong Ni*, Daniel Ramos*, Aidan Yang, Ines Lynce, Vasco Manquinho, Ruben Martins, and Claire Le Goues
SOAR: A Synthesis Approach for Data Science API Refactoring
The 43rd International Conference on Software Engineering (ICSE’21)
[arxiv] [code] [talk]
Ansong Ni, Pengcheng Yin, and Graham Neubig
Merging Weak and Active Supervision for Semantic Parsing
The 34th AAAI Conference on Artificial Intelligence (AAAI’20)
[arxiv] [code]
Talks and Presentations
- Foundation Models for Code and Math, Guest Lecture @ Yale CPSC 488/588 “AI Foundation Models”, Dec 2023 | @ HKU “Natural Language Processing”, Apr 2024
[slides] [recordings]
- Enhancing Language Models for Program Synthesis using Execution, Apr 2023, Invited Talk @ UT Austin TAUR Lab
- Enhancing Language Models for Program Synthesis using Execution, Mar 2023, Invited Talk @ HKUST CSE [recordings]
- Enhancing Language Models for Program Synthesis using Execution, Mar 2023, Invited Talk @ MIT CSAIL
[slides] [recordings]
- Learning from Self-Sampled Correct and Partially-Correct Programs, June 2022, Paper Presentation @ Meta AI Reading Group
- Merging Weak and Active Supervision for Semantic Parsing, Feb 2020, Oral Paper Presentation @ AAAI Conference
Professional Services
- Program Committee/Reviewer
- ICLR 2024
- ICML 2023, 2024
- NeurIPS 2022, 2023, 2024
- COLM 2024
- ACL 2023, 2024
- EMNLP 2022
- ACL Rolling Review (ARR) 2021-2022
- DL4C Workshop @ ICLR 2023
- SUKI Workshop @ NAACL 2022
- IntEx-SemPar Workshop @ EMNLP 2020
Work Experience
- Meta AI – FAIR, 2024 - Now
Research Scientist.
- Google DeepMind – Learning for Code Team, Summer 2023
Research Intern. Hosts: Pengcheng Yin and Charles Sutton
- Meta AI – FAIR NLP Group, Summer 2022
Research Intern. Hosts: Victoria Lin and Sida Wang
- Microsoft Research – Deep Learning Group, Summer+Fall 2021
Research Intern. Hosts: Alex Polozov, Chris Meek, Chenglong Wang and Jeevana Priya Inala
- Allen Institute for AI – AllenNLP Team, Summer 2020
Research Intern. Hosts: Pradeep Dasigi and Matt Gardner
- Carnegie Mellon University – Institute of Software Research, Fall 2020
Research Assistant. Host: Claire Le Goues
- Microsoft Research Asia – Software Analytics Group, Summer+Fall 2017
Research Intern. Host: Shi Han
Education
- Ph.D. in Computer Science, Yale University, 2020.8 - 2024.6
- M.S. in Computer Science, Carnegie Mellon University, 2018.8-2019.12
- B.Eng. in Software Engineering, Nanjing University, 2014.8-2018.6
Miscellaneous
- My first name can be pronounced as-is: [An-Song]; my last name sounds like [Nee].
- A tribute to my late advisor Drago.
- I know how to say “I don’t speak [this language]” in 9 different languages (Mandarin, English, Spanish, Cantonese, Italian, Greek, Korean, Hindi, Hebrew). Very handy, maybe you should know them too.
- I love soccer and I am a Barcelona and Messi fan. I don’t remember how Barca did in UCL 2018-2024 and don’t remind me. My favorite games are:
- 2015 UCL final: Barca 3-1 Juventus
- 2017 UCL round of 16: Barca 6-1 PSG (6-5 on agg.)
- 2022 WC final: Argentina 3-3 France (4-2 in pen.)
- I am from Nanchang, Jiangxi Province in China. I’ve also lived in Nanjing, Beijing, Berkeley, Pittsburgh, Houston, New Haven, Menlo Park, and Sunnyvale.