Philip Meng & Abhay Gupta

2025 Davidson Fellow
$25,000 Scholarship

Philip Meng, Age 18
Hometown: Andover, MA

Abhay Gupta, Age 17
Hometown: Hopewell, NY

Technology: “EnDive: A Cross-Dialect Benchmark for Fairness and Performance in Large Language Models”

About Philip

Hi! My name is Philip Meng, and I’m a rising senior at Phillips Academy in Andover, Massachusetts. I hope to continue combining machine learning and entrepreneurship to develop technology that creates meaningful social impact.

Outside the classroom, I serve as Andover’s student body co-president and compete on the varsity squash team, where I have been recognized as a three-time US Squash Scholar-Athlete. My passion for startups, AI, and venture capital led me to co-found Launchpad, a high school startup and nonprofit incubator with more than 30 school chapters worldwide, and to create The Early Founder, a media platform sharing entrepreneurial stories with over 15 million views. In parallel, my AI research for social good has been recognized at NeurIPS, ICLR, NAACL, and EMNLP.

"It is truly an honor to be selected as a Davidson Fellow. This recognition not only validates the hard work and passion I have poured into my project, but also welcomes me into a community of brilliant, innovation‑driven individuals dedicated to making a positive impact on society. As a member of this community, I’m inspired to continue my work in AI research and to pursue innovation that uplifts others."

About Abhay

I’m Abhay Gupta, a high school student from New York who is passionate about using AI to make the world a little fairer. I am currently applying to colleges and plan to major in data science and linguistics. I hope to continue developing research and tools that make AI more inclusive, accessible, and impactful.

Outside the classroom, I compete on my school’s varsity tennis team and have been playing drums for the past eight years, performing at both school and community events. I am also the co-founder of ResearchChat, a free platform that helps students engage with academic research. It is used across my school district in classes and clubs, with more than 4,000 active users to date. My passion for AI and equity has led me to work on research projects focused on fairness in language models, which have been recognized at NeurIPS, ACL, EMNLP, and ICLR.

"Being named a Davidson Fellow is surreal. It means so much to have this work recognized, especially knowing how selective and meaningful this award is. For me, it’s not just about the project, but about joining a group of young people who care deeply about solving real problems and pushing ideas forward. I’m incredibly grateful and excited to be part of this community."

Project Description

Our project, EnDive (English Diversity), is a cross-dialect benchmark that evaluates the fairness and performance of state-of-the-art large language models (LLMs) across five underrepresented English dialects: African American Vernacular English, Chicano English, Jamaican English, Indian English, and Colloquial Singaporean English. LLMs such as OpenAI’s ChatGPT, Anthropic’s Claude, and Meta’s Llama now reach hundreds of millions of weekly users, influencing decisions from high school classrooms to workplaces. Yet, as we discovered, they often struggle with dialectal inputs, misclassifying nonstandard English as incorrect or even toxic. These biases can have significant real-world consequences, such as disadvantaging job applicants or misgrading students whose work is written in their dialect. Through EnDive, we provide tools and evidence for developers and researchers to identify and address these biases, helping ensure that future AI systems treat all English speakers fairly and equitably.

Deeper Dive

For our project, we created a cross-dialect benchmark, EnDive (English Diversity), that evaluates the fairness and performance of large language models (LLMs) across five underrepresented English dialects. EnDive is particularly significant for two reasons. First, our benchmark is especially comprehensive: it spans 12 reasoning tasks and more than 300 evaluations of seven of the most widely used LLMs under zero-shot and chain-of-thought prompting, a scope far more extensive than that of most natural language processing papers. Second, our use of a wide variety of analytical and evaluation techniques is novel. We employed human validators to verify translation faithfulness, fluency, and formality; used ROUGE diversity scores, lexical diversity evaluations, BARTScore evaluations, and preference tests; and conducted select qualitative analyses. EnDive builds on our earlier work, AAVENUE, which focused on African American Vernacular English. AAVENUE was recognized at top AI workshops at EMNLP and the NeurIPS High School Track and has been cited by research institutions including Microsoft, Google Research, Oxford, and Stanford. After attending EMNLP ’24 and engaging with leading NLP researchers, including the Stanford SALT Lab researchers we cited in our paper, we recognized the urgent need to expand our benchmark to multiple dialects and reasoning tasks. This led to the creation of EnDive, a more comprehensive framework designed to reveal hidden AI biases and guide the development of more inclusive language technologies.
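To make the idea of a cross-dialect evaluation concrete, here is a minimal sketch of how a per-dialect accuracy gap could be computed. It is an illustration only and is not taken from the EnDive code: the function names `accuracy_by_dialect` and `gaps_vs_baseline`, the labels `baseline` and `dialect_a`, and the records themselves are hypothetical placeholders, not benchmark results.

```python
from collections import defaultdict


def accuracy_by_dialect(records):
    """Return accuracy per dialect from (dialect, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for dialect, is_correct in records:
        totals[dialect] += 1
        correct[dialect] += int(is_correct)
    return {d: correct[d] / totals[d] for d in totals}


def gaps_vs_baseline(accuracy, baseline="baseline"):
    """Return baseline accuracy minus each other dialect's accuracy."""
    base = accuracy[baseline]
    return {d: base - a for d, a in accuracy.items() if d != baseline}


# Made-up placeholder records (assumption): each pair marks whether a
# model answered one task item correctly for a given input variety.
records = [
    ("baseline", True), ("baseline", True),
    ("baseline", True), ("baseline", False),
    ("dialect_a", True), ("dialect_a", False),
    ("dialect_a", True), ("dialect_a", False),
]

accuracy = accuracy_by_dialect(records)
gaps = gaps_vs_baseline(accuracy)
print(accuracy)
print(gaps)
```

A positive gap means the model scored lower on the dialect version of the same items than on the baseline version, which is the kind of disparity a benchmark like this is meant to surface.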

We believe EnDive can directly improve quality of life for millions of people by ensuring that AI systems treat all English speakers fairly, regardless of dialect. By revealing systemic performance gaps in LLMs, our benchmark provides developers with the tools to identify and address hidden biases before deployment in high-stakes applications such as education, hiring, and healthcare. This work not only promotes equity in AI-powered tools but also affirms the cultural and linguistic identities of communities historically underrepresented and misrepresented in technology. In the long term, we envision EnDive as a foundation for dialect-aware NLP systems that are both powerful and inclusive, contributing to a world where technology serves everyone — not just those who speak Standard American English.

Q&A

If you could have dinner with the five most interesting people in the world, living or dead, who would they be?

Philip: Steve Jobs, Leonardo da Vinci, Confucius, J. Robert Oppenheimer, and Roger Federer.

Abhay: Jensen Huang, Elon Musk, LeBron James, Batman, and MrBeast.

If you could magically become fluent in any language, what would it be?

Philip: I would love to become fluent in Portuguese. I've been to Lisbon several times, yet haven't managed to pick it up.

Abhay: French. I think it’s a beautiful language.

What are the top three foreign countries you’d like to visit?

Philip: Japan (for the food and convenience stores), Peru (to brush up on my Spanish and to see Machu Picchu), and Thailand (for the beaches).

Abhay: Japan, Switzerland, and Italy.

In The News

BOSTON — Three Massachusetts students have been named 2025 Davidson Fellows, one of the nation’s most prestigious honors for students 18 and younger. Elizabeth Hanechak of Russell, Ethan Yan of Groton, and Philip Meng of Andover will share $175,000 in scholarships as part of the program’s 25th anniversary year, which is awarding a record $825,000 to 21 students nationwide.
