Diederik P. (Durk) Kingma

Dutch machine learning researcher who co-invented the Variational Autoencoder and the Adam optimizer, two of the most-cited contributions in deep learning, and has since worked on normalizing flows, variational diffusion models, and responsible AI development at OpenAI, Google Brain, and Anthropic.


Profile

Born: Netherlands (date not publicly disclosed)
Nationality: Dutch
Current Institution(s): Anthropic (Research Scientist, 2024–present)
Research Areas: Generative Modeling, Variational Inference, Optimization, Normalizing Flows, Diffusion Models, Large-Scale Machine Learning
Doctoral Advisor: Max Welling
Doctoral Thesis: Variational Inference and Deep Learning: A New Synthesis (University of Amsterdam, 2017, cum laude)
Website: dpkingma.com
X / Twitter: @dpkingma
GitHub: dpkingma
Google Scholar: Diederik P. Kingma

Overview

Diederik P. Kingma, known by the Frisian nickname Durk (pronounced like Dirk), is a Dutch machine learning researcher whose two most recognized contributions, the Variational Autoencoder (VAE, 2013) and the Adam optimizer (2014), are among the most-cited papers in computer science. The VAE established the reparameterization trick and the evidence lower bound as the central machinery for scalable latent-variable deep learning; Adam became the default optimizer for training neural networks in most domains after 2015. Kingma was a founding team member and algorithms lead at OpenAI (2015–2018), spent six years as a Research Scientist at Google Brain and Google DeepMind (2018–2024), and joined Anthropic in October 2024, working remotely from the Netherlands. His PhD at the University of Amsterdam, completed cum laude under Max Welling in 2017, was the first cum laude in the UvA CS department in thirty years. His Google Scholar profile reflects hundreds of thousands of citations, driven primarily by the Adam paper.


Early Life & Education

Kingma was born and raised in the Netherlands. He began research at New York University in Yann LeCun’s laboratory in 2009 as a Junior Research Scientist — a formative early encounter with the deep learning research program before it had become mainstream. He returned to LeCun’s lab for a second period in 2012. Between these stints, he co-founded Advanza, a Dutch technology company, serving as its technical lead from 2010 to 2012; Advanza was successfully acquired in 2016.

Kingma began his PhD in 2013 at the University of Amsterdam under the supervision of Max Welling, working on deep learning and generative models. During his doctoral years he also spent the summers of 2014 and 2015 at DeepMind in London for collaborations, and received Google’s inaugural European Doctoral Fellowship in Deep Learning in 2015. He completed his PhD in 2017 with the distinction cum laude — the highest honor in the Dutch system — and the first such distinction in the UvA CS department in thirty years. His thesis, Variational Inference and Deep Learning: A New Synthesis, integrated the reparameterization-based variational inference framework he had developed with a broader treatment of deep generative modeling.


Career

NYU / Advanza (2009–2013)

Kingma’s earliest research positions at Yann LeCun’s laboratory exposed him to the neural network tradition that would shortly become the dominant paradigm in machine learning. His co-founding of Advanza during the intervening years added an early startup and product-building dimension to his background that distinguished him from purely academic researchers.

University of Amsterdam — PhD (2013–2017)

Two papers produced during Kingma’s PhD years transformed the field.

Variational Autoencoder (ICLR 2014). “Auto-Encoding Variational Bayes,” co-authored with Max Welling, introduced the VAE: a neural network architecture for learning deep latent-variable models in which an encoder maps inputs to a distribution over a latent space and a decoder reconstructs inputs from sampled latent representations. The critical technical contribution was the reparameterization trick, which expresses a latent sample as a deterministic function of the encoder outputs and independent noise, so that gradients can flow through the sampling step and the encoder and decoder can be trained jointly by stochastic gradient descent. The VAE connected probabilistic generative modeling with scalable deep learning, established the evidence lower bound (ELBO) as a training objective for deep generative models, and became the conceptual backbone of latent diffusion models including Stable Diffusion. Danilo Rezende, Shakir Mohamed, and Daan Wierstra independently published a closely related approach (stochastic backpropagation for deep latent Gaussian models) at around the same time. The VAE paper received the ICLR 2024 Test of Time Award, in the inaugural year of that award, in recognition of its lasting impact.
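The reparameterization trick described above can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the function names, the single latent dimension, and the toy objective f(z) = z^2 are assumptions chosen so the result can be checked by hand. The sample z is written as mu + sigma * eps with eps drawn independently of the parameters, which makes the gradient with respect to mu an ordinary average of derivatives. The closed-form Gaussian KL term that appears in the ELBO is included for reference.

```python
import math
import random


def reparam_sample(mu, log_var, eps):
    """Return z = mu + sigma * eps, where eps ~ N(0, 1) is drawn outside the model."""
    sigma = math.exp(0.5 * log_var)
    return mu + sigma * eps


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ) for one latent dimension."""
    return 0.5 * (mu * mu + math.exp(log_var) - 1.0 - log_var)


def pathwise_grad_mu(f_prime, mu, log_var, n_samples=20000, seed=0):
    """Monte Carlo estimate of d/dmu E[f(z)], using dz/dmu = 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = reparam_sample(mu, log_var, rng.gauss(0.0, 1.0))
        total += f_prime(z)  # chain rule: f'(z) * dz/dmu, with dz/dmu = 1
    return total / n_samples


# Toy check (hypothetical objective): for f(z) = z^2, E[f(z)] = mu^2 + sigma^2,
# so the exact gradient with respect to mu is 2 * mu.
grad_estimate = pathwise_grad_mu(lambda z: 2.0 * z, mu=1.5, log_var=0.0)
```

With mu = 1.5 the estimate should land close to the exact value 2 * mu = 3. In a full VAE the same construction is applied per latent dimension, with mu and log_var produced by the encoder network.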

Adam Optimizer (ICLR 2015). “Adam: A Method for Stochastic Optimization,” co-authored with Jimmy Ba, introduced the Adam algorithm: an adaptive learning-rate optimizer that maintains per-parameter estimates of the first and second moments of the gradient and corrects the bias introduced by initializing those estimates at zero. Adam combined the benefits of AdaGrad (adaptation to sparse gradients) and RMSProp (effective in non-stationary settings) in a single algorithm with intuitive hyperparameters. It was quickly adopted as the default optimizer for training neural networks across most domains and remained so for a decade. The Adam paper became one of the most-cited papers in computer science and in science as a whole, with citations numbering in the hundreds of thousands. It received the ICLR 2025 Test of Time Award (with Jimmy Ba).
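The update rule summarized above can be sketched in a few lines of plain Python. The default values for lr, beta1, beta2, and eps follow those suggested in the paper; the list-of-floats parameter layout, the function names, and the toy quadratic objective are assumptions made for illustration only.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on lists of floats; t is the 1-based step count."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g       # first-moment (mean) estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g   # second-moment estimate
        m_hat = m_i / (1.0 - beta1 ** t)            # bias correction for zero init
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def grad_fn(p):
    """Gradient of the toy objective f(x, y) = (x - 3)^2 + 10 * (y + 1)^2."""
    return [2.0 * (p[0] - 3.0), 20.0 * (p[1] + 1.0)]


params, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
for t in range(1, 1001):
    params, m, v = adam_step(params, grad_fn(params), m, v, t, lr=0.05)
```

After the loop, params should sit close to the minimizer (3, -1) even though the two coordinates have very different curvature, because each coordinate's step is scaled by its own second-moment estimate. Thanks to bias correction, the first step has magnitude close to lr regardless of the gradient's scale.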

OpenAI — Founding Team, Research Scientist, Algorithms Team Lead (2015–2018)

Kingma joined OpenAI as a member of its founding team in 2015 and served as a Research Scientist and leader of the Algorithms team, focused on fundamental research into generative AI methods. During this period he continued developing the variational inference framework (Improved Variational Inference with Inverse Autoregressive Flow, NIPS 2016, with Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling, which introduced inverse autoregressive flows, a type of normalizing flow that gives more expressive approximate posteriors in high-dimensional latent spaces) and contributed to work on semi-supervised learning and representation learning. He left OpenAI in 2018.
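As a rough illustration of the inverse autoregressive flow idea mentioned above, the sketch below applies one flow step of the form z_new = mu + sigma * z to a two-dimensional latent. The hand-written expressions for mu and log_sigma are hypothetical stand-ins for the autoregressive neural network used in the paper; what matters is that entry i depends only on earlier entries of z, so the Jacobian is triangular and its log-determinant is simply the sum of log_sigma.

```python
import math


def iaf_step(z):
    """One inverse autoregressive step on a 2-D latent.

    Returns the transformed latent and log|det J| of the transformation.
    """
    # Entry 0 uses constants; entry 1 conditions only on z[0] (autoregressive).
    mu = [0.5, 0.3 * z[0]]
    log_sigma = [-0.2, 0.1 * math.tanh(z[0])]
    z_new = [m + math.exp(ls) * zi for m, ls, zi in zip(mu, log_sigma, z)]
    return z_new, sum(log_sigma)


def log_density_after_step(log_q_z, log_det):
    """Change of variables: log q(z_new) = log q(z) - log|det J|."""
    return log_q_z - log_det


z_new, log_det = iaf_step([0.7, -1.2])
```

Because every entry of z_new is computed from the already known input z, all dimensions can be updated in parallel, which is the property that makes this family of flows attractive for sampling from the approximate posterior.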

Google Brain / Google DeepMind — Research Scientist (2018–2024)

After a brief period as a part-time angel investor and advisor, Kingma joined Google’s research organization in July 2018, starting at Google Brain. He led research projects across generative models for text, image, and video.

Glow (NeurIPS 2018). “Glow: Generative Flow with Invertible 1×1 Convolutions,” co-authored with Prafulla Dhariwal at OpenAI and presented at NeurIPS 2018 after Kingma’s move to Google, introduced a normalizing flow model using invertible 1×1 convolutions as the core architectural element. Glow generated high-resolution, photorealistic face images while providing a tractable exact likelihood, a demonstration that flows could approach the visual quality of GANs while remaining likelihood-based. The Glow demo, released as an interactive website, became one of the first publicly accessible demonstrations of high-fidelity photorealistic AI image generation.
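The invertible 1×1 convolution at the core of Glow can be sketched as follows. A 1×1 convolution applies the same c × c matrix W to the channel vector at every pixel, so its inverse is a 1×1 convolution with the inverse matrix, and its log-determinant is height × width × log|det W|. The sketch restricts itself to two channels so that the inverse has a closed form; that restriction, the function names, and the example values are assumptions for illustration and not the paper's code.

```python
import math


def conv1x1(x, W):
    """Apply the channel-mixing matrix W at every pixel; x has shape [h][w][c]."""
    out = []
    for row in x:
        out_row = []
        for pixel in row:
            mixed = [sum(W[i][j] * pixel[j] for j in range(len(pixel)))
                     for i in range(len(W))]
            out_row.append(mixed)
        out.append(out_row)
    return out


def inverse_2x2(W):
    """Closed-form inverse of a 2 x 2 matrix (assumes det != 0)."""
    (a, b), (c, d) = W
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def log_det_jacobian(W, height, width):
    """log|det J| of the 1x1 convolution: one log|det W| per pixel."""
    (a, b), (c, d) = W
    return height * width * math.log(abs(a * d - b * c))


W = [[2.0, 1.0], [0.0, 1.0]]                      # det W = 2
x = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
     [[0.5, -1.0], [0.0, 2.5], [-3.0, 1.5]]]      # 2 x 3 image, 2 channels
y = conv1x1(x, W)
x_back = conv1x1(y, inverse_2x2(W))               # recovers x up to rounding
log_det = log_det_jacobian(W, height=2, width=3)  # equals 6 * log(2)
```

The cheap log-determinant is what keeps the exact likelihood tractable: the cost depends on the number of channels, not on the image resolution.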

Variational Diffusion Models (NeurIPS 2021). “Variational Diffusion Models,” co-authored with Tim Salimans, Ben Poole, and Jonathan Ho, connected the VAE and diffusion model frameworks by treating a diffusion model as a very deep hierarchical VAE whose variational lower bound can be written compactly in terms of the signal-to-noise ratio of the noised data. The paper showed that, in continuous time, this bound depends on the noise schedule only through the signal-to-noise ratio at its endpoints, proved an equivalence between several previously distinct proposed methods, and achieved state-of-the-art likelihoods on image density estimation benchmarks. It contributed to the theoretical grounding of the diffusion model paradigm that underlies most contemporary image and video generation systems.
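The signal-to-noise ratio formulation can be made concrete with a small sketch. It assumes the variance-preserving parameterization used in the paper, where alpha_t^2 = sigmoid(-gamma(t)) and sigma_t^2 = sigmoid(gamma(t)), so that SNR(t) = alpha_t^2 / sigma_t^2 = exp(-gamma(t)). The linear gamma schedule and its endpoint values are hypothetical placeholders (the paper learns the schedule), and the weight function follows the discrete-time form of the bound, in which each term weights the squared denoising error by the drop in SNR between adjacent timesteps.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def gamma_linear(t, gamma_min=-10.0, gamma_max=10.0):
    """Hypothetical monotone schedule gamma(t) for t in [0, 1]; endpoints are assumed."""
    return gamma_min + (gamma_max - gamma_min) * t


def alpha_sigma(t):
    """Variance-preserving coefficients for z_t = alpha_t * x + sigma_t * eps."""
    g = gamma_linear(t)
    return math.sqrt(sigmoid(-g)), math.sqrt(sigmoid(g))


def snr(t):
    """Signal-to-noise ratio alpha_t^2 / sigma_t^2, which equals exp(-gamma(t))."""
    alpha, sigma = alpha_sigma(t)
    return (alpha * alpha) / (sigma * sigma)


def discrete_term_weight(i, T):
    """Weight on ||x - x_hat(z_t; t)||^2 for step i of T: 0.5 * (SNR(s) - SNR(t))."""
    t = i / T
    s = (i - 1) / T
    return 0.5 * (snr(s) - snr(t))


weights = [discrete_term_weight(i, T=10) for i in range(1, 11)]
```

Every weight is positive because the SNR decreases monotonically as t increases, and the sum of the weights telescopes to 0.5 * (SNR(0) - SNR(1)), which depends only on the schedule's endpoints.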

During his Google period Kingma also contributed to research on large language models and other generative models for text and video, consistent with Google Brain’s expanding focus on foundation models.

Anthropic — Research Scientist (2024–present)

In October 2024, Kingma announced he was joining Anthropic, working mostly remotely from the Netherlands with regular visits to the San Francisco Bay Area. In announcing the move, he wrote that Anthropic’s approach to AI development resonated with his own beliefs about developing powerful AI systems responsibly. His arrival continued Anthropic’s pattern of recruiting prominent researchers from OpenAI and Google who share its safety-oriented research culture.


Key Contributions

  • Variational Autoencoder (VAE, ICLR 2014) — “Auto-Encoding Variational Bayes,” with Max Welling. Introduced the reparameterization trick and the evidence lower bound as the machinery for scalable training of deep latent-variable models. Established the canonical framework for deep generative modeling and representation learning; conceptually foundational to latent diffusion models including Stable Diffusion. Received the inaugural ICLR 2024 Test of Time Award. Over 35,000 citations.

  • Adam Optimizer (ICLR 2015) — “Adam: A Method for Stochastic Optimization,” with Jimmy Ba. Introduced adaptive moment estimation for per-parameter learning rates with bias correction. Became the default optimizer for training deep neural networks across virtually every domain for a decade. Among the most-cited scientific papers in any discipline, with over 200,000 citations. Received the ICLR 2025 Test of Time Award.

  • Inverse Autoregressive Flow (NIPS 2016) — “Improved Variational Inference with Inverse Autoregressive Flow,” with Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. Extended the VAE framework by composing flexible normalizing flows to enrich the approximate posterior, substantially improving the expressiveness and performance of variational inference for latent-variable models.

  • Weight Normalization (NIPS 2016) — “Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks,” with Tim Salimans. Introduced reparameterizing weight vectors by their magnitude and direction, accelerating convergence and providing a simpler alternative to batch normalization in certain settings.

  • Glow (NeurIPS 2018) — “Glow: Generative Flow with Invertible 1×1 Convolutions,” with Prafulla Dhariwal. Demonstrated high-fidelity, exact-likelihood image generation using normalizing flows; produced the first widely seen publicly accessible demo of photorealistic AI face generation, predating widespread public exposure to image synthesis.

  • Variational Diffusion Models (NeurIPS 2021) — “Variational Diffusion Models,” with Tim Salimans, Ben Poole, and Jonathan Ho. Provided a principled variational framework unifying VAEs and diffusion models, achieved state-of-the-art likelihoods on image benchmarks, and contributed theoretical clarity to the signal-to-noise ratio formulation of diffusion processes that underpins modern text-to-image systems.
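Of the contributions listed above, weight normalization is the simplest to state in code: each weight vector w is rewritten as w = g * v / ||v||, so that its length is controlled by the scalar g and its direction by v. The sketch below is a minimal plain-Python illustration with assumed names and example values, not the authors' implementation.

```python
import math


def weight_norm(v, g):
    """Return w = g * v / ||v||: direction taken from v, length set by the scalar g."""
    norm = math.sqrt(sum(v_i * v_i for v_i in v))
    return [g * v_i / norm for v_i in v]


v = [3.0, 4.0]                                   # direction parameter, ||v|| = 5
w = weight_norm(v, g=2.0)                        # has length 2 by construction
w_rescaled = weight_norm([10.0 * v_i for v_i in v], g=2.0)
```

Rescaling v leaves w unchanged, which is the sense in which the reparameterization decouples the magnitude of the weights from their direction during optimization.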


Awards & Recognition

  • ICLR 2025 Test of Time Award — For the Adam optimizer paper (with Jimmy Ba).
  • ICLR 2024 Test of Time Award (inaugural) — For the VAE paper (with Max Welling), awarded in the first year the conference introduced this recognition.
  • Dutch Datascience Award (2019) — From the Royal Holland Society of Sciences and Humanities, for contributions to machine learning research.
  • ELLIS PhD Award (2019) — From the European Laboratory for Learning and Intelligent Systems, for outstanding research achievements during the dissertation phase.
  • PhD cum laude, University of Amsterdam (2017) — The highest Dutch doctoral distinction; the first at the UvA CS department in thirty years.
  • Google European Doctoral Fellowship in Deep Learning (2015) — Google’s first such fellowship awarded in Europe.

Key Relationships

  • Max Welling — PhD supervisor at the University of Amsterdam and VAE co-author; the most consequential intellectual partnership of Kingma’s career. Welling’s probabilistic machine learning orientation shaped the entire VAE program and its connection to Bayesian inference.
  • Jimmy Ba — Co-author of the Adam optimizer; the paper’s impact is one of the most dramatic examples of a two-author work reshaping an entire field.
  • Tim Salimans — The closest long-running collaborator of Kingma’s industry years; co-authored Weight Normalization, Inverse Autoregressive Flow, and Variational Diffusion Models across both the OpenAI and Google Brain periods.
  • Prafulla Dhariwal — OpenAI colleague; Glow co-author; later known for DALL-E and diffusion model advances at OpenAI.
  • Jonathan Ho — Variational Diffusion Models co-author; separately known as lead author of DDPM (Denoising Diffusion Probabilistic Models), the paper that anchored the modern diffusion model paradigm.
  • Yann LeCun — Early mentor at NYU’s lab in 2009 and 2012; one of the first senior researchers to give Kingma research experience before his PhD.
  • Ilya Sutskever — OpenAI founding team colleague and co-author on the Inverse Autoregressive Flow paper.
  • Dario Amodei — Anthropic CEO and former OpenAI VP of Research; his recruitment of Kingma to Anthropic reflects shared history and values from the OpenAI period.

Personal Style

Kingma’s research practice is organized around a small number of foundational ideas pursued with mathematical depth rather than broad empirical coverage. The reparameterization trick, introduced for the VAE, is characteristic: a conceptually simple move that resolved a long-standing obstacle to training latent-variable models by gradient descent, and then turned out to apply across a wide range of problems. Adam is similarly constructed — an intuitive, well-motivated algorithm whose broad applicability was not immediately obvious, but which accumulated citations at a rate unmatched by most theoretical results. His trajectory across four institutions (OpenAI, Google Brain, Google DeepMind, Anthropic) reflects a preference for environments where fundamental research is treated as an end in itself rather than a means to product launches. He has made few public statements but those he has made — including his reasoning for joining Anthropic — emphasize the alignment between institutional values and his personal beliefs about responsible AI development. He continues to work from the Netherlands, maintaining a European base unusual among researchers at his level of recognition in the US-centric AI industry.


References