A PhD position is open in the Machine Learning Group QARMA at the Computer Science Lab of Aix-Marseille University. The position is available for 3 years, starting in November or December 2024. Information about the QARMA Group can be found at https://qarma.lis-lab.fr.
The PhD project is funded by the MLChem project of the AMidex Foundation.
Description: This project is part of a collaboration between machine learning researchers and theoretical chemists at Aix-Marseille University (AMU), with the aim of developing highly accurate machine learning (ML) models that enable fast and reliable nonadiabatic molecular dynamics (NAMD). The challenge is to increase the accuracy of ML predictions for arbitrary molecular geometries using physics-informed ML models based on sophisticated learning techniques. Previous investigations that benchmarked several existing ML models developed for ground-state molecular dynamics (MD) revealed that, when applied to NAMD, these models make significant errors in specific regions of the potential energy surface (PES), degrading the quality of ML-NAMD [1,2]. To ensure ML-NAMD quality, our goal is to achieve higher prediction accuracy by exploring sophisticated physics-informed ML models [3]. The aim of this PhD work is to develop new ML models that combine earlier findings from pure ML on multitask learning [4] and gradient learning [5] with more recent results in physics-informed ML, in order to improve the prediction of excited-state energy gradients for unseen molecular geometries.
Building upon recent progress in the field [6, 7, 8, 9, 10], we will explore advanced techniques at all levels of the data science pipeline, aiming to systematically improve the quality of the ML model so that it meets the tight accuracy requirements on force predictions for NAMD simulations. Introducing the training paradigms of multitask learning and gradient learning can boost the performance of an ML model. Multitask learning is a paradigm in which multiple tasks are learned simultaneously, so that the generalization performance on one task improves with the help of other related tasks. The typical protocol for ML-NAMD is to train one model independently to predict each state’s PES energy, gradient, and couplings; ML-NAMD could benefit significantly from multitask extensions, which have not been employed so far. Gradient learning is a lesser-known but potentially valuable framework in which the goal is to learn the gradient of a classification or regression function, with or without supervision. In addition to the conventional learning of the energy using gradient information, we will explore strategies based on explicitly learning the gradient function, starting from neural networks in a multioutput and multitask setting and expanding to other designs. Gradient learning may be pivotal to inferring geometry and statistical dependence in predictive ML-NAMD.
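To make the idea of learning the energy with gradient information concrete, here is a minimal toy sketch in plain Python. It fits a one-dimensional energy model to both energy and gradient labels with a weighted loss. The polynomial model, the reference potential, the function names, and the weights w_e and w_g are all illustrative assumptions and do not represent the models to be developed in the project.

```python
# Toy sketch: fit a 1D energy model using both energy and gradient labels.
# The polynomial model, the reference potential, and the weights w_e / w_g
# are illustrative assumptions, not the project's actual method.

def features(x):
    """Polynomial features [1, x, x^2, x^3] of a scalar coordinate x."""
    return [1.0, x, x * x, x ** 3]

def feature_derivs(x):
    """Derivatives of the features with respect to x."""
    return [0.0, 1.0, 2.0 * x, 3.0 * x * x]

def predict(coeffs, x):
    """Return the predicted energy and its gradient dE/dx at x."""
    energy = sum(c * p for c, p in zip(coeffs, features(x)))
    grad = sum(c * d for c, d in zip(coeffs, feature_derivs(x)))
    return energy, grad

def loss_and_param_grad(coeffs, data, w_e=1.0, w_g=1.0):
    """Weighted mean squared error on energies and gradients, plus its
    derivative with respect to the model coefficients."""
    n = len(data)
    loss = 0.0
    param_grad = [0.0] * len(coeffs)
    for x, e_ref, g_ref in data:
        e_pred, g_pred = predict(coeffs, x)
        e_err, g_err = e_pred - e_ref, g_pred - g_ref
        loss += (w_e * e_err ** 2 + w_g * g_err ** 2) / n
        for k, (p, d) in enumerate(zip(features(x), feature_derivs(x))):
            param_grad[k] += 2.0 * (w_e * e_err * p + w_g * g_err * d) / n
    return loss, param_grad

def fit(data, steps=2000, lr=0.1):
    """Plain gradient descent starting from zero coefficients."""
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        _, g = loss_and_param_grad(coeffs, data)
        coeffs = [c - lr * gk for c, gk in zip(coeffs, g)]
    return coeffs

# Hypothetical reference potential E(x) = 0.1 + 0.5 x^2 - 0.3 x^3.
TRUE_COEFFS = [0.1, 0.0, 0.5, -0.3]
xs = [-1.0 + 0.1 * i for i in range(21)]
data = [(x, *predict(TRUE_COEFFS, x)) for x in xs]

initial_loss, _ = loss_and_param_grad([0.0] * 4, data)
coeffs = fit(data)
final_loss, _ = loss_and_param_grad(coeffs, data)
```

In this sketch, raising w_g relative to w_e puts more emphasis on gradient accuracy. A multitask variant would share parameters across several electronic states and sum such losses over the states.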
[1] Mukherjee et al. Philos Trans R Soc A 2022, 380, 20200382. https://doi.org/10.1098/rsta.2020.0382
[2] Pinheiro Jr et al. Chem Sci 2021, 12, 14396. https://doi.org/10.1039/d1sc03564a
[3] Karniadakis et al. Nat Rev Phys 2021, 3, 422. https://doi.org/10.1038/s42254-021-00314-5
[4] Crawshaw. arXiv 2020, in press. https://doi.org/10.48550/ARXIV.2009.09796
[5] Wu et al. J Mach Learn Res 2010, 11, 2175. https://jmlr.org/papers/v11/wu10a.html
[6] Gilmer et al. ICML 2017. https://proceedings.mlr.press/v70/gilmer17a
[7] Batatia et al. NeurIPS 2022, 11423-11436. https://dl.acm.org/doi/10.5555/3600270.3601100
[8] Batzner et al. Nat Commun 2022, 13, 2453. https://doi.org/10.1038/s41467-022-29939-5
[9] Thölke et al. ICLR, 2022. https://openreview.net/forum?id=zNHzqZ9wrRB
[10] Ishiai et al. J Chem Theory Comput 2024, 819-831. https://doi.org/10.1021/acs.jctc.3c00995
Supervisors:
- Prof. Thierry Artières, Machine Learning team (QARMA), Computer Science lab (LIS, CNRS), Aix-Marseille University, Ecole Centrale Marseille, thierry.artieres@lis-lab.fr
- Prof. Mario Barbatti, Theoretical Chemistry team, Institut de Chimie Radicalaire (ICR, CNRS), Aix-Marseille University, mario.barbatti@univ-amu.fr
- Prof. Hachem Kadri, Machine Learning team (QARMA), Computer Science lab (LIS, CNRS), Aix-Marseille University, hachem.kadri@lis-lab.fr
Selection Criteria: We are looking for a highly motivated candidate with a master's degree in machine learning or quantum chemistry. Experience in applying machine learning to chemistry would be strongly appreciated. The candidate should also have good knowledge of deep learning frameworks; strong communication, data presentation, and visualization skills; and a strong motivation to advance the project by proactively developing their own ideas.
Application procedure: All correspondence regarding this position, including informal inquiries and formal applications, should be addressed to Prof. Hachem Kadri.
Applications must include:
1) A cover letter detailing how you meet the selection criteria for the post;
2) A complete academic CV;
3) Master’s transcripts;
4) A sample of scientific output, e.g. a chapter of the thesis;
5) The email addresses of at least two referees who have agreed to provide a reference letter.
Review of applications will start on the 1st of October at the latest, and the position will remain open until a suitable candidate is identified. A first round of interviews is expected to take place remotely no later than mid-October.