Sergey Levine

Associate Professor at UC Berkeley and co-founder and Chief Scientist of Physical Intelligence, whose research on deep reinforcement learning, offline RL, and visuomotor policy learning has established him as one of the most cited researchers in robotics and machine learning, with over 247,000 Google Scholar citations.


Profile

Full name: Sergey Levine
Date of birth: Not publicly available
Nationality: American
Primary institution: UC Berkeley, EECS
Current role: Associate Professor; Co-founder & Chief Scientist, Physical Intelligence
Research areas: Deep reinforcement learning, offline RL, robot learning, imitation learning, visuomotor policy, foundation models for robotics
PhD: Computer Science, Stanford University, 2014
PhD advisor: Vladlen Koltun
Research group: RAIL Lab (Robotic Artificial Intelligence and Learning)
Personal website: people.eecs.berkeley.edu/~svlevine
Substack: Learning and Control
X / Twitter: @svlevine
Google Scholar ID: 8R35rCwAAAAJ

Overview

Sergey Levine is an Associate Professor of Electrical Engineering and Computer Sciences at UC Berkeley and co-founder and Chief Scientist of Physical Intelligence (physicalintelligence.company), a startup building foundation models for general-purpose robotics. His research spans deep reinforcement learning for robotic control, offline RL, meta-learning, and large-scale visuomotor policy training. It has produced a series of widely adopted algorithms and benchmarks, including guided policy search, end-to-end visuomotor policies, Soft Actor-Critic (SAC), Conservative Q-Learning (CQL), and the D4RL benchmark suite, that have shaped much of modern robot learning. With over 247,000 Google Scholar citations as of 2026, he is among the most cited researchers working at the intersection of ML and robotics. At Berkeley he leads the RAIL Lab and teaches the widely followed CS 285: Deep Reinforcement Learning course. He co-founded Physical Intelligence in March 2024 alongside Chelsea Finn, Karol Hausman, Brian Ichter, and others; the company has raised more than $1 billion in total funding and is developing cross-embodiment robot foundation models.


Early Life & Education

Levine completed both his B.S. and M.S. in Computer Science at Stanford University in 2009, then remained at Stanford for his doctoral work. His Ph.D. in Computer Science was completed in 2014 under the supervision of Vladlen Koltun (then an assistant professor of computer science at Stanford and researcher at Adobe Research). His doctoral thesis developed the guided policy search framework, which uses trajectory optimization as a mechanism to guide and stabilize policy learning in complex robotic systems. Following his doctorate, he conducted postdoctoral research at the Robot Learning Lab at UC Berkeley with Pieter Abbeel, where he developed the end-to-end visuomotor policy work that became one of the most influential early papers in deep robotic learning. He also had an early internship at NVIDIA during his undergraduate years.


Career

Google Brain — Research Scientist (2015–c. 2021)

Levine joined Google Brain as a part-time research scientist in 2015, while completing his postdoc, and maintained a joint affiliation after joining Berkeley’s faculty in 2016. At Google Brain he collaborated on large-scale robot learning, contributing to the Learning Hand-Eye Coordination project (a fleet of 14 robot arms collecting grasping data concurrently), the Robotics Transformer (RT-1, RT-2) line of work, and several foundational deep RL papers including Soft Actor-Critic. His concurrent industry and academic affiliation during this period was productive in both directions, bringing scale and hardware access to Berkeley-led research and theoretical rigor to Google’s applied robotics efforts.

UC Berkeley, EECS — Associate Professor (Fall 2016–present)

Levine joined the Berkeley faculty in fall 2016, establishing the RAIL Lab (Robotic Artificial Intelligence and Learning Lab). The lab has produced an outsized body of research across deep RL, meta-learning, offline RL, and robot learning at scale. Notable doctoral graduates include Chelsea Finn (now Associate Professor at Stanford and co-founder of Physical Intelligence), who co-developed MAML under Levine’s supervision. Levine teaches CS 285: Deep Reinforcement Learning, one of the most widely watched open courses in the field, with lecture recordings available on YouTube. He also teaches CS 182: Deep Learning. He holds an NSF CAREER Award and a Sloan Research Fellowship (2019), and received the Presidential Early Career Award for Scientists and Engineers (PECASE).

Physical Intelligence (physicalintelligence.company) — Co-founder & Chief Scientist (March 2024–present)

Physical Intelligence was founded in early 2024 by Levine alongside Karol Hausman (CEO, formerly Google DeepMind), Chelsea Finn (Research Lead, Stanford), Brian Ichter (formerly Google Research), Lachy Groom (former Stripe executive), Adnan Esmail, and Quan Vuong. The company raised a $70 million seed round in March 2024 from Thrive Capital, Khosla Ventures, Lux Capital, OpenAI, and Sequoia Capital. A $400 million Series A (valuation: $2.4 billion) followed in late 2024, and a $600 million Series B led by CapitalG closed in November 2025, bringing the total across these three rounds to about $1.07 billion. Physical Intelligence’s mission is to build cross-embodiment foundation models, meaning a single generalist policy that can control any robot for any physical task, by combining large-scale robot data collection, algorithmic advances, and the techniques of foundation model training. Levine serves as Chief Scientist, guiding the scientific direction of the foundation models while maintaining his Berkeley faculty position.


Key Contributions

  • Guided Policy Search (GPS) (Levine and Koltun; ICML 2013) — Introduced trajectory optimization as a principled exploration and supervision strategy for policy learning, using differential dynamic programming to generate guiding samples and supervised learning to train the policy. GPS was among the first deep RL methods shown to scale to real robotic systems and underpinned several years of subsequent robot learning work.

  • End-to-End Training of Deep Visuomotor Policies (Levine, Finn, Darrell, Abbeel; Journal of Machine Learning Research, 2016) — Demonstrated that deep convolutional neural network policies mapping raw image observations directly to robot motor torques could be trained end-to-end using guided policy search, enabling robots to learn visually-guided manipulation tasks without hand-engineered perception modules. One of the most cited papers in deep robotic learning.

  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection (Levine et al.; International Journal of Robotics Research, 2018) — Used a fleet of up to 14 robotic arms to autonomously collect over 800,000 grasp attempts, then trained a deep neural network to predict grasp success from images. Demonstrated that scale and self-supervised data collection could substitute for manual engineering in robotic perception.

  • Model-Agnostic Meta-Learning (MAML) (Finn, Abbeel, Levine; ICML 2017) — Co-developed with doctoral student Chelsea Finn; proposed a meta-learning algorithm that optimizes for fast adaptation to new tasks via a small number of gradient steps, applicable to any gradient-based model. Among the most cited papers in meta-learning, with broad influence across few-shot learning and robotics.

  • Soft Actor-Critic (SAC) (Haarnoja, Zhou, Abbeel, Levine; ICML 2018) — Introduced a maximum entropy reinforcement learning framework with an off-policy actor-critic algorithm that stabilizes training and improves sample efficiency. SAC became the default baseline algorithm for continuous control tasks in deep RL and is among the most widely used RL algorithms in both research and applications.

  • Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (Levine, Kumar, Tucker, Fu; 2020) — Canonical survey that formalized offline RL as a field, articulated the distributional shift problem, and organized prior and future work under a unified framework. Cited as the standard reference for offline RL.

  • D4RL: Datasets for Deep Data-Driven Reinforcement Learning (Fu, Kumar, Nachum, Tucker, Levine; 2020) — Released the standard offline RL benchmark suite, providing datasets reflecting real-world data distribution challenges — human demonstrations, mixed-quality data, diverse task structures — that previously lacked systematic coverage. D4RL became the canonical evaluation framework for offline RL algorithms.

  • Conservative Q-Learning (CQL) (Kumar, Zhou, Tucker, Levine; NeurIPS 2020) — Proposed a pessimism-based offline RL algorithm that regularizes Q-values to be conservative on out-of-distribution actions, provably addressing overestimation in offline settings. CQL set state-of-the-art results on D4RL and influenced a generation of subsequent offline RL methods.

  • Implicit Q-Learning (IQL) (Kostrikov, Nair, Levine; ICLR 2022) — Introduced an offline RL algorithm that avoids querying out-of-distribution actions entirely by learning a value function over in-distribution data, combining simplicity with strong empirical performance on D4RL and real robot tasks.

  • RT-2: Vision-Language-Action Models (Brohan et al. including Levine; CoRL 2023) — Co-developed at Google with a large team; demonstrated that pretrained vision-language models could be fine-tuned directly as robotic action policies, enabling generalization to novel objects and instructions and showing emergent semantic reasoning, including a chain-of-thought variant. A foundational paper in the VLA (Vision-Language-Action) paradigm now pursued by Physical Intelligence and others.
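
The conservative regularizer described in the CQL entry above can be illustrated with a small sketch. It assumes a discrete action space and the log-sum-exp form of the penalty; the function name cql_penalty and the toy Q-values are hypothetical and are not taken from the authors' code. Only the penalty term is shown, not the Bellman error it would be added to during training.

```python
import numpy as np

def cql_penalty(q_values, data_actions):
    """Illustrative CQL-style penalty for discrete actions (hypothetical helper).

    q_values:     array of shape (batch, num_actions), Q(s, a) for every action
    data_actions: array of shape (batch,), the action actually taken in the dataset
    Returns the batch mean of logsumexp_a Q(s, a) - Q(s, a_data).
    """
    q = np.asarray(q_values, dtype=float)
    a = np.asarray(data_actions, dtype=int)
    # Numerically stable log-sum-exp over the action dimension.
    m = q.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(q - m).sum(axis=1))
    # Q-value of the action that appears in the offline data.
    q_data = q[np.arange(q.shape[0]), a]
    return float(np.mean(lse - q_data))

# Toy comparison: the dataset only ever contains action 0.
balanced = cql_penalty([[0.0, 0.0]], [0])
inflated = cql_penalty([[0.0, 5.0]], [0])  # unseen action 1 has a high Q-value
```

In this toy comparison, raising the Q-value of the action that never appears in the data makes the penalty larger, which is the pressure that keeps the learned Q-function conservative on out-of-distribution actions.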


Awards & Recognition

  • Presidential Early Career Award for Scientists and Engineers (PECASE) — U.S. government award for early-career scientists and engineers, conferred by the White House on nominees put forward by federal agencies such as NSF
  • Sloan Research Fellowship (2019) — Alfred P. Sloan Foundation
  • NSF CAREER Award
  • Google Scholar citations: 247,000+ (as of May 2026), placing him among the most cited researchers in ML and robotics

Key Relationships

  • Vladlen Koltun — PhD advisor at Stanford; computer scientist who co-authored guided policy search and shaped Levine’s early thinking on combining trajectory optimization with policy learning.

  • Pieter Abbeel — Postdoctoral host at Berkeley and long-term collaborator; co-author of the end-to-end visuomotor paper, MAML, and SAC. Abbeel’s Robot Learning Lab and Levine’s RAIL Lab run complementary research programs at Berkeley.

  • Chelsea Finn — Levine’s most prominent PhD student; co-developer of MAML and the end-to-end visuomotor line of research; now Associate Professor at Stanford, co-founder and Research Lead at Physical Intelligence. The intellectual and institutional relationship between Levine and Finn is the academic lineage at the center of Physical Intelligence.

  • Karol Hausman — Co-founder (with Levine and Finn) and CEO of Physical Intelligence; formerly a researcher at Google DeepMind.

  • Aviral Kumar — Former PhD student at Berkeley; first author on CQL and co-author of D4RL and the offline RL tutorial; represents the offline RL strand of Levine’s lab, which became one of its most impactful research directions.

  • Tuomas Haarnoja — Collaborator and first author on Soft Actor-Critic; part of the Berkeley-Google joint research environment that produced SAC; now a researcher at Google DeepMind.

  • Vladlen Koltun (see above) and Levine’s broader Stanford lineage — Koltun’s own work is rooted in optimization and geometry, and that computationally rigorous style carried into Levine’s approach to robotic control.


Personal Style

Levine’s research is characterized by a consistent ambition to replace hand-engineering with general learning systems, a conviction that runs from guided policy search in 2013 through cross-embodiment foundation models in 2024. He has described the central challenge of robotics as a data and learning problem rather than a hardware problem, arguing repeatedly in lectures, blog posts, and his Substack (“Learning and Control”) that the data scaling dynamics that transformed NLP and vision will also transform robotics, given the right algorithms and data infrastructure. He is an unusually active science communicator: CS 285 attracts hundreds of thousands of YouTube views and is widely used as a graduate course at other institutions, and his Substack and Medium articles offer accessible explanations of offline RL and robot learning for broad technical audiences. On X (@svlevine), his posts are mostly informative and instructional, consistent with someone who views public education as integral to the research mission rather than ancillary to it.

