Hanie Sedghi

Staff Research Scientist

Google DeepMind

I am a Staff Research Scientist at Google DeepMind, where I lead the DeepPhenomena team. My research addresses a pivotal challenge in AI: enabling robust planning and scientific reasoning in foundation models. My approach is to systematically deconstruct their reasoning failures to uncover the underlying mechanisms and develop targeted interventions. This work builds on more than a decade of my research in the science of deep learning and aims to push the boundaries of what these models can achieve.
I was a workshop chair for NeurIPS 2022, a tutorial chair for ICML 2022 and 2023, and a program chair for CoLLAs 2023. I have also been an area chair for NeurIPS, ICLR, and ICML and a member of the JMLR editorial board.

Curriculum Vitae

Google Scholar

Follow @haniesedghi, email: haniesedghi(at)google.com

Current/past mentees

Xinran Zhao (PhD Student at CMU)
Pratyush Maini (PhD Student at CMU)
Hattie Zhou (now at Anthropic)
Mahsa Forouzesh (now at Google)
Rahim Entezari (now at Stability.AI)
Saurabh Garg (now at Thinking Machines Lab)
Samira Abnar (now at Apple ML Research)
Preetum Nakkiran (now at Apple ML Research)
Niladri Chatterji (now at Meta)

Publications

Improving Large Language Model Planning with Action Sequence Similarity
Xinran Zhao, Hanie Sedghi, Bernd Bohnet, Dale Schuurmans, Azade Nova
ICLR 2025, [arXiv:2505.01009]

Exploring and Benchmarking the Planning Capabilities of Large Language Models
B. Bohnet, A. Nova, A. Parisi, K. Swersky, K. Goshvadi, H. Dai, D. Schuurmans, N. Fiedel, Hanie Sedghi
[arXiv:2406.13094]

Training language models on the knowledge graph: Insights on hallucinations and their detectability
J. Hron, L. Culp, G. Elsayed, R. Liu, B. Adlam, et al.
COLM 2024, [arXiv:2408.07852]

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation
B. Bohnet, K. Swersky, R. Liu, P. Awasthi, A. Nova, et al.
[arXiv:2406.00179]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Gemini team
[arXiv:2403.05530]

Beyond human data: Scaling self-training for problem-solving with language models
A. Singh, J. D. Co-Reyes, R. Agarwal, A. Anand, P. Patil, et al.
[arXiv:2312.06585]

Frontier Language Models are not Robust to Adversarial Arithmetic
D. Freeman, L. Culp, A. Parisi, M. Bileschi, G. Elsayed, et al.
[arXiv:2311.07587]

Can Neural Network Memorization be Localized?
Pratyush Maini, Mike Mozer, Hanie Sedghi, Zachary C. Lipton, Zico Kolter, Chiyuan Zhang
ICML 2023, [arXiv:2307.09542]

Teaching Algorithmic Reasoning via In-Context Learning,
Hattie Zhou, Azade Nova, Hugo Larochelle, Aaron Courville, Behnam Neyshabur, Hanie Sedghi
NeurIPS 2022 MathAI, [arXiv:2211.09066]

The role of Pre-training data in Transfer Learning,
Rahim Entezari, Mitchell Wortsman, Olga Saukh, Moein Shariatnia, Hanie Sedghi, Ludwig Schmidt
[arXiv:2302.13602]

Leveraging Unlabeled Data to Track Memorization,
Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran
ICLR 2023, [arXiv:2212.04461]

REPAIR: REnormalizing Permuted Activations for Interpolation Repair,
Keller Jordan, Hanie Sedghi, Olga Saukh, Rahim Entezari, Behnam Neyshabur
ICLR 2023, [arXiv:2211.08403] [Code]

Layer-stack temperature scaling,
Amr Khalifa, Mike Mozer, Hanie Sedghi, Behnam Neyshabur, Ibrahim Abdulmohsin
In submission, [arXiv:2211.10193]

Exploring the limits of large scale pre-training,
Samira Abnar, Mostafa Dehghani, Behnam Neyshabur, Hanie Sedghi
ICLR 2022, spotlight, [arXiv:2110.02095]

Leveraging Unlabeled Data to Predict Out-of-Distribution Performance,
Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi
ICLR 2022, [arXiv:2201.04234]

The role of permutation invariance in linear mode connectivity of neural networks,
Rahim Entezari, Hanie Sedghi, Olga Saukh, Behnam Neyshabur
ICLR 2022, [arXiv:2110.06296]

Avoiding Spurious Correlations: Bridging Theory and Practice,
Thao Nguyen, Vaishnavh Nagarajan, Hanie Sedghi, Behnam Neyshabur
NeurIPS 2021, DistShift workshop, [paper]

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent,
Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi
[arXiv:2106.06080]

Understanding the effect of sparsity on neural networks robustness,
Lukas Timpl, Rahim Entezari, Hanie Sedghi, Behnam Neyshabur, Olga Saukh
Overparametrization: Pitfalls and Opportunities, ICML 2021

The Deep Bootstrap: Good Online Learners are Good Offline Generalizers,
Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
International Conference on Learning Representations (ICLR), 2021.
[arXiv:2010.08127] [GoogleAI blog post] [CIFAR-5m dataset] [code]

What is being transferred in transfer learning?,
Hanie Sedghi*, Behnam Neyshabur*, Chiyuan Zhang*. (equal contribution)
Neural Information Processing Systems (NeurIPS), 2020.
[arXiv:2008.11687] [code] [poster] [short video] [talk at Harvard ML Theory]

Regularizing the training of convolutional neural networks,
Vineet Gupta, Phil Long, Hanie Sedghi.
US Patent 16422797

The intriguing role of module criticality in the generalization of deep networks,
Niladri Chatterji, Behnam Neyshabur, Hanie Sedghi.
International Conference on Learning Representations (ICLR), 2020 (spotlight).
[arXiv:1912.00528] [code]

Generalization bounds for deep convolutional neural networks,
Phil Long*, Hanie Sedghi*. (alphabetical order)
International Conference on Learning Representations (ICLR), 2020.
[arXiv:1905.12600] [talk at Simons Institute]

On the effect of activation function on distribution of hidden nodes in a deep network,
Phil Long*, Hanie Sedghi*. (alphabetical order)
Neural Computation 31 (12), 2562-2580.
[arXiv:1901.02104]

The singular values of convolutional layers,
Hanie Sedghi, Vineet Gupta, Phil Long
International Conference on Learning Representations (ICLR), 2019.
[arXiv:1805.10408] [code]

MLSys: The new frontiers of machine learning systems,
Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R Ganger, Lise Getoor, Phillip B Gibbons, Garth A Gibson, Joseph E Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
[arXiv:1904.03257]

Knowledge completion for generics using guided tensor factorization,
Hanie Sedghi, Ashish Sabharwal
Transactions of the Association for Computational Linguistics 6, 197-210, 2018
[paper]

How Good Are My Predictions? Efficiently Approximating Precision-Recall Curves for Massive Datasets,
Ashish Sabharwal*, Hanie Sedghi* (alphabetical order)
Conference on Uncertainty in Artificial Intelligence (UAI), 2017
[paper]

Provable tensor methods for learning mixtures of generalized linear models,
Hanie Sedghi, Majid Janzamin, Anima Anandkumar
Artificial Intelligence and Statistics (AIStats), 2016
[paper]

Training Input-Output Recurrent Neural Networks through Spectral Methods,
Hanie Sedghi, Anima Anandkumar
[arXiv:1603.00954]

Beating the perils of non-convexity: Guaranteed training of neural networks using tensor methods
Majid Janzamin, Hanie Sedghi, Anima Anandkumar
Artificial Intelligence and Statistics (AIStats), 2016
[arXiv:1506.08473] [talk at MLconf]

FEAST at play: Feature ExtrAction using score function Tensors
Majid Janzamin*, Hanie Sedghi*, U.N. Niranjan, Anima Anandkumar (equal contribution)
Feature Extraction: Modern Questions and Challenges, NeurIPS, 2015
[paper]

Learning mixed membership community models in social tagging networks through tensor methods,
Anima Anandkumar, Hanie Sedghi
[arXiv:1503.04567]

Score Function Features for Discriminative Learning
Majid Janzamin, Hanie Sedghi, Anima Anandkumar
International Conference on Learning Representations (ICLR), 2015.
[arXiv:1412.2863]

Provable Methods for training neural networks with sparse connectivity,
Hanie Sedghi, Anima Anandkumar
International Conference on Learning Representations (ICLR), 2015.
[arXiv:1412.2693]

Stochastic optimization in high dimensions
Hanie Sedghi
University of Southern California.
[thesis]

Multi-step stochastic ADMM in high dimensions: Applications to sparse optimization and matrix decomposition
Hanie Sedghi, Anima Anandkumar, Edmond Jonckheere
Neural Information Processing Systems (NIPS), 2014.
[conference version] [full paper] [video]

Statistical Structure Learning to Ensure Data Integrity in Smart Grid
Hanie Sedghi, Edmond Jonckheere
IEEE Transactions on Smart Grid, Vol 6, issue 4.
[paper]

Statistical Structure Learning of Smart Grid for Detection of False Data Injection
Hanie Sedghi, Edmond Jonckheere
IEEE Power and Energy Society General Meeting, 2013.
[paper]

On Conditional Mutual Information in Gaussian-Markov Structured Grids
Hanie Sedghi, Edmond Jonckheere
Information and Control in Networks, G. Como, B. Bernhardsson, and A. Rantzer (eds.), Springer.
[paper]

A Misbehavior-Tolerant Multipath Routing Protocol for Wireless Ad hoc Networks
Hanie Sedghi, MohammadReza Pakravan, MohammadReza Aref
International Journal of Wireless Information Networks, 2011
[paper]

A Game-Theoretic Approach for Power Allocation in Bidirectional Cooperative Communication
Majid Janzamin, MohammadReza Pakravan, Hanie Sedghi
IEEE Wireless Communication and Networking Conference, 2010
[paper]