American NLP researcher and MIT associate professor whose work uses the compositional structure of language — as both a design principle and a research object — to build machine learning systems that learn from human guidance and to understand why such systems work.
Profile
| Field | Details |
| --- | --- |
| Nationality | American |
| Current Institution(s) | MIT CSAIL; MIT EECS (ITT Career Development Professor; Associate Professor, AI+D) |
| Research Areas | Natural Language Processing, Compositionality, Grounded Language Learning, Neural Module Networks, In-Context Learning, Language & Machine Learning |
| Doctoral Advisor | Dan Klein |
| Doctoral Thesis | Learning from Language (UC Berkeley, 2018) |
| Website | web.mit.edu/jda/www |
| X / Twitter | @jacobandreas |
| GitHub | jacobandreas |
| Google Scholar | Jacob Andreas |
Overview
Jacob Andreas is an associate professor at MIT in the Department of Electrical Engineering and Computer Science and the Computer Science and Artificial Intelligence Laboratory (CSAIL), where he holds the ITT Career Development Professorship in Computer Technology. He directs the Language & Intelligence (LINGO) group. His research occupies the intersection of natural language processing and machine learning, pursuing a central question from two directions: how can the compositional structure of language be used to build more capable and interpretable learning systems, and what does studying language understanding tell us about the computational principles underlying human cognition? He is best known for introducing neural module networks (NMNs), an architecture that dynamically composes modular neural components according to the syntactic structure of questions — a foundational contribution to neuro-symbolic and compositional AI. He trained at Columbia (B.S.), Cambridge (M.Phil. as a Churchill Scholar), and Berkeley (PhD under Dan Klein). Noam Chomsky is his great-great-grand-advisor, by way of Klein, Christopher Manning, and Joan Bresnan.
Early Life & Education
Columbia University — B.S. in Computer Science (2012)
Andreas completed his undergraduate degree in computer science at Columbia University, where he won the Theodore R. Bashkow Prize for computer science research and the Russell C. Mills Prize for coursework, early recognition of both his research and his academic performance. During this period he spent time with Columbia’s NLP Group and the (since-disbanded) Center for Computational Learning Systems.
University of Cambridge — M.Phil. (2013)
On a Winston Churchill Scholarship, Andreas pursued a Master of Philosophy at the Cambridge Computer Laboratory, working with the Natural Language and Information Processing (NLIP) Group. His M.Phil. dissertation received the Cambridge Computer Laboratory Dissertation Prize — one of the most competitive research prizes at that stage of a UK academic career.
UC Berkeley — PhD in EECS (2013–2018)
Andreas completed his doctoral work at the Berkeley NLP Group and the Berkeley AI Research Lab (BAIR) under the supervision of Dan Klein. His dissertation, Learning from Language, developed a unified framework for using the compositional structure of language to inform the structure of machine learning models across multiple settings: visual question answering (neural module networks), reinforcement learning (policy sketches), and representation analysis. He was supported by an NSF Graduate Research Fellowship (2013–2016) and a Facebook Graduate Fellowship (2016–2018). He graduated in 2018.
His academic lineage traces back to Noam Chomsky through three intermediate advisors: Jacob Andreas → Dan Klein → Christopher Manning → Joan Bresnan → Noam Chomsky.
Career
Post-Berkeley (2018–2019)
Following his PhD, Andreas briefly held research and visiting positions before joining MIT. During this transition period he continued developing the neural module network framework and its extensions.
MIT — Associate Professor (2019–present)
Andreas joined the MIT faculty in the Department of Electrical Engineering and Computer Science (EECS) and CSAIL. He holds the ITT Career Development Professorship in Computer Technology, a named chair for early-career faculty at MIT. He directs the Language & Intelligence (LINGO) Group and pursues three interrelated research programs.
Interactive learning from language. A core argument of Andreas’s research is that natural-language supervision is qualitatively richer than the labeled examples or scalar rewards used in standard machine learning, and that building systems that can learn from the kind of instruction humans provide to each other requires new architectures and training methods. This has led to work on learning reward functions, planning representations, and model behaviors from natural language descriptions rather than low-level demonstrations.
Scientific understanding of neural models. Andreas has developed methods for probing and interpreting what neural language models learn, asking whether the representations they acquire correspond to human-interpretable categories. His 2022 ICLR paper on natural language descriptions of deep visual features provided a tool for generating natural-language explanations of what individual neurons detect. His 2023 ICLR work on in-context learning asked which learning algorithm transformer-based in-context learners implement, using linear regression as a testbed: it showed that transformers can in principle implement gradient descent and closed-form ridge regression, and that trained models make predictions closely matching those estimators, approaching Bayesian estimators as model capacity grows.
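For intuition about the linear-model setting studied in the in-context learning work, the following is a minimal sketch of a standard textbook fact, not the paper's experimental setup. In one dimension, gradient descent on a ridge objective converges to the closed-form ridge estimate, and that estimate coincides with the Bayesian posterior mean under a zero-mean Gaussian prior when the regularizer equals the noise variance divided by the prior variance. All function names and the toy data are hypothetical.

```python
def ridge_1d(xs, ys, lam):
    """Closed-form ridge estimate for y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def posterior_mean_1d(xs, ys, noise_var, prior_var):
    """Posterior mean of w under prior w ~ N(0, prior_var) and Gaussian noise."""
    precision = sum(x * x for x in xs) / noise_var + 1.0 / prior_var
    return (sum(x * y for x, y in zip(xs, ys)) / noise_var) / precision


def gradient_descent_1d(xs, ys, lam, lr=0.01, steps=5000):
    """Minimise 0.5 * sum((w*x - y)^2) + 0.5 * lam * w^2 by gradient descent."""
    w = 0.0
    for _ in range(steps):
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) + lam * w
        w -= lr * grad
    return w


# Toy "in-context" examples (assumed data, for illustration only).
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.1, 1.9, 3.2, 3.9]
noise_var, prior_var = 0.25, 1.0
lam = noise_var / prior_var  # regulariser implied by the Gaussian prior

w_ridge = ridge_1d(xs, ys, lam)
w_bayes = posterior_mean_1d(xs, ys, noise_var, prior_var)
w_gd = gradient_descent_1d(xs, ys, lam)
```

The three estimates agree up to floating-point error, which is why questions about "which algorithm" a learner implements are usually posed in terms of the predictions it makes rather than its internal procedure.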
Compositionality and human-like language understanding. Running throughout his work is a concern with how humans and models handle compositional structure — the property that the meaning of a complex expression is built from the meanings of its parts. His work has analyzed compositionality as lexical symmetry, developed tree projections for characterizing how well transformers represent compositional structure internally, and built computational models of pragmatic reasoning. A 2024 paper on Deductive Closure Training and a 2025 ICLR paper on eliciting human preferences with language models extend this agenda into the alignment space.
His group regularly collaborates with cognitive scientists and linguists, reflecting his commitment to bidirectional exchange between the study of human language and machine learning systems. He is a recognized teacher: he has received MIT’s Junior Bose Award for Teaching (MIT School of Engineering, 2023) and the Kolokotrones Education Award (MIT EECS, 2021), and was named a co-winner of MIT’s Edgerton Award in April 2026.
Key Contributions
- Neural Module Networks (NMNs, NAACL 2016, Best Paper; CVPR 2016) — “Learning to Compose Neural Networks for Question Answering” and “Deep Compositional Question Answering with Neural Module Networks,” with Marcus Rohrbach, Trevor Darrell, and Dan Klein. Introduced a class of architectures in which a natural language question is first parsed into a structured computation tree, which is then used to dynamically assemble a network from a library of reusable, jointly trained neural modules (one for locating objects, one for classifying attributes, one for comparing, and so on). NMNs were among the first systems to explicitly couple linguistic compositional structure with neural network architecture, achieving state-of-the-art results on multiple visual question answering benchmarks at the time and influencing years of subsequent work in compositional and neuro-symbolic AI.
- Policy Sketches for Multitask RL (ICML 2017, Best Paper Honorable Mention) — “Modular Multitask Reinforcement Learning with Policy Sketches.” Extended the modular compositionality idea to reinforcement learning: high-level natural language “sketches” describing the subgoal structure of a task scaffold the learning of modular sub-policies, enabling faster learning and transfer across tasks. Demonstrated that language annotation is considerably more efficient than standard reward shaping.
- Natural Language Descriptions of Deep Visual Features (ICLR 2022) — Proposed MILAN (Mutual-Information-guided Linguistic Annotation of Neurons), a framework for generating natural-language descriptions of what individual neurons or feature detectors in a neural network respond to. Provided one of the first scalable tools for automatically interpreting the functional roles of internal representations in large vision models.
- What Learning Algorithm is In-Context Learning? (ICLR 2023, Notable Paper) — An investigation, using linear regression as a testbed, of which learning algorithm in-context learning implements. It showed that transformers can in principle implement gradient descent and closed-form ridge regression, and that trained in-context learners make predictions closely matching those estimators, approaching Bayesian estimators as model capacity grows. The work helped place in-context learning within the framework of classical learning theory and influenced subsequent work on understanding large language model behavior.
- Compositionality as Lexical Symmetry (ACL 2023, Area Chair’s Award) — Gave a formal mathematical characterization of compositionality as a symmetry property of lexical items in a language, connecting the intuitive notion of compositionality to group-theoretic structure and providing tools for measuring it in neural models.
- Eliciting Human Preferences with Language Models (ICLR 2025) — Explored using language models as active query systems for eliciting latent human preferences, contributing to the growing intersection of Andreas’s compositionality and language-learning agenda with alignment-relevant problems.
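The assemble-from-a-layout idea behind neural module networks can be illustrated with a toy sketch. The Python below is not the authors' implementation: the modules are hand-written symbolic functions standing in for learned neural modules, the scene is a list of dictionaries standing in for image features, and the layout is assumed to come from a question parser that is not shown. All names (`find`, `describe`, `count`, `assemble`) are hypothetical and chosen for illustration.

```python
def find(scene, word):
    """Toy 'attention' module: weight 1.0 on objects whose shape matches `word`."""
    return [1.0 if obj["shape"] == word else 0.0 for obj in scene]


def describe(scene, attribute, attention):
    """Read out `attribute` from the most strongly attended object."""
    best = max(range(len(scene)), key=lambda i: attention[i])
    return scene[best][attribute]


def count(scene, attention):
    """Count the attended objects."""
    return int(sum(attention))


# Library of reusable modules (learned networks in a real system).
MODULES = {"find": find, "describe": describe, "count": count}


def assemble(layout):
    """Build a callable 'network' from a nested-tuple layout.

    A layout looks like ("describe", "color", ("find", "cube")):
    the first element names a module, and the rest are either word
    arguments (strings) or sub-layouts (tuples) evaluated first.
    """
    name, *args = layout
    children = [assemble(a) if isinstance(a, tuple) else a for a in args]
    module = MODULES[name]

    def network(scene):
        inputs = [c(scene) if callable(c) else c for c in children]
        return module(scene, *inputs)

    return network


scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "blue"},
    {"shape": "sphere", "color": "green"},
]

# "What color is the cube?" and "How many spheres are there?"
color_of_cube = assemble(("describe", "color", ("find", "cube")))
how_many_spheres = assemble(("count", ("find", "sphere")))
```

Calling `color_of_cube(scene)` and `how_many_spheres(scene)` runs two different networks built from the same module library, which is the property the architecture exploits: a different network per question, with parameters shared across questions.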
Awards & Recognition
- Edgerton Award, MIT (2026) — MIT’s award for exceptional contributions to teaching, research, and service by assistant or associate professors; shared recognition announced April 2026.
- Samsung AI Researcher of the Year (2021) — Named as a top young AI researcher.
- Junior Bose Award for Teaching, MIT School of Engineering (2023) — MIT School of Engineering’s teaching excellence award for junior faculty.
- Kolokotrones Education Award, MIT EECS (2021) — Departmental teaching excellence award.
- NSF CAREER Award — National Science Foundation award for early-career researchers with high potential for leadership.
- Kavli Fellow, National Academy of Sciences — Named a Kavli Fellow by the NAS for contributions to AI and language research.
- Best Paper, NAACL 2016 — For “Learning to Compose Neural Networks for Question Answering,” the neural module networks paper.
- Best Paper Honorable Mention, ICML 2017 — For “Modular Multitask Reinforcement Learning with Policy Sketches.”
- Best Paper, NAACL 2024 — For “Visual Grounding Helps Learn Word Meanings in Low-Data Regimes.”
- Area Chair’s Award, ACL 2023 — For “Compositionality as Lexical Symmetry.”
- Notable Paper, ICLR 2023 — For “What Learning Algorithm is In-Context Learning?”
- Facebook Graduate Fellowship (2016–2018) — Competitive doctoral fellowship in AI research.
- NSF Graduate Research Fellowship (2013–2016) — National doctoral fellowship.
- M.Phil. Dissertation Prize, Cambridge Computer Laboratory (2013) — Top dissertation prize at Cambridge’s Computer Lab.
- Winston Churchill Scholarship (2012–2013) — Competitive US-to-UK postgraduate scholarship for STEM graduate study.
- Theodore R. Bashkow Prize (Columbia, 2012) — Prize for outstanding computer science research as an undergraduate.
Key Relationships
- Dan Klein — PhD advisor at Berkeley; computational linguist and Berkeley NLP Group leader whose work on parsing, grammar induction, and structured prediction directly shaped Andreas’s compositional approach. The Klein-Andreas intellectual lineage is one of the clearest in modern NLP.
- Christopher Manning — Grand-advisor (Klein’s PhD advisor) and leading Stanford NLP researcher; part of the academic lineage connecting modern NLP empiricism to Chomskyan formalism.
- Noam Chomsky — Great-great-grand-advisor; the conceptual concern with compositionality and linguistic structure that pervades Andreas’s work is in direct intellectual descent from Chomskyan generative grammar, mediated through Bresnan, Manning, and Klein.
- Trevor Darrell — Berkeley EECS professor and co-author on the neural module networks papers; provided the computer vision perspective that made NMNs concrete on visual question answering tasks.
- Marcus Rohrbach — Co-author on the original NMN papers; brought visual grounding and multimodal expertise to the neural module network framework.
- Kevin Knight — Co-author. Andreas has a “Kevin Knight number” of one (a playful NLP-community measure of co-authorship distance from Knight), meaning the two have co-authored directly; the connection reflects his embeddedness in the formal/statistical NLP tradition.
Personal Style
Andreas’s research style is unusually self-conscious about the relationship between linguistic theory and machine learning — he routinely frames machine learning problems in terms of concepts drawn from formal linguistics (compositionality, syntax, lexical semantics, pragmatics) and conversely uses machine learning methods to test linguistic hypotheses. His academic writing is precise and spare; his presentations and blog posts show a dry wit and a tolerance for formally specified thought experiments. He has spoken and written about the value of slow, careful theoretical work in an era dominated by scaling results, arguing that understanding why models work is as important as making them work better. His advising statement, publicly posted, is candid about the kind of student he works with best: those drawn to foundational questions about language and mind, not those primarily chasing leaderboard positions. His teaching reputation at MIT — reflected in multiple awards — suggests an unusual commitment to undergraduate and graduate pedagogy alongside research.
References
- Personal website: web.mit.edu/jda/www
- MIT EECS profile: eecs.mit.edu
- LINGO Group: lingo.csail.mit.edu
- Google Scholar: scholar.google.com
- Berkeley PhD dissertation: eecs.berkeley.edu
- CV: mit.edu
- Simons Institute bio: simons.berkeley.edu