Materials informatics

Artificial Intelligence for materials discovery 

Increasing computational capabilities present a tremendous opportunity to distill structure-property relations using statistical learning, creating entirely novel strategies and intuition for designing materials. In practice, however, computational materials discovery is a battle against complexity on two levels: the high cost of quantum electronic-structure calculations of properties, and the vast space of materials to explore. For example, ab initio molecular dynamics is an accurate but resource-intensive method for studying rare events, and the kinetic properties governing catalysis and ionic transport are sensitive to atomic structure, which makes brute-force materials screening challenging. Combining machine learning with physical models, we develop strategies to identify and eliminate redundant and irrelevant atomic and electronic degrees of freedom. This leads to higher computational efficiency without significant loss of predictive power, allowing properties such as thermoelectric and ionic transport to be screened at previously inaccessible speed. To reduce the number of candidates to evaluate, the optimal approach is to first identify and validate rapidly computable descriptors that predict or classify the desired material properties. Supervised learning techniques can then be used to quickly screen and interpolate the space of materials structures, bypassing many slow computations and narrowing down the interesting regions.
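The descriptor-based screening idea can be illustrated with a minimal Python sketch. It is not the group's actual method: the two-component descriptors, the property values, the candidate names, and the choice of a distance-weighted nearest-neighbor model are all assumptions made for illustration. A small set of materials with expensively computed properties trains the model, which then interpolates the property for new candidates from their cheap descriptors alone, so that only the shortlisted candidates would proceed to full calculations.

```python
import math

# Hypothetical training set: (descriptor vector, computed property) pairs.
# In practice the property would come from an expensive calculation.
TRAIN = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((1.0, 1.0), 4.0),
]

# Hypothetical candidates for which only the cheap descriptors are known.
CANDIDATES = {
    "candidate_A": (0.0, 0.0),
    "candidate_B": (0.9, 0.9),
    "candidate_C": (0.1, 0.1),
}


def knn_predict(train, query, k=3):
    """Interpolate a property as the inverse-distance-weighted mean
    of the k training points nearest to the query descriptor."""
    neighbors = sorted((math.dist(x, query), y) for x, y in train)[:k]
    # An exact descriptor match returns the known value directly.
    for distance, value in neighbors:
        if distance == 0.0:
            return value
    weights = [1.0 / distance for distance, _ in neighbors]
    total = sum(w * value for w, (_, value) in zip(weights, neighbors))
    return total / sum(weights)


def screen(train, candidates, threshold, k=3):
    """Return the names of candidates whose predicted property meets
    the threshold, ordered from highest to lowest prediction."""
    scored = [(knn_predict(train, desc, k), name)
              for name, desc in candidates.items()]
    return [name for pred, name in sorted(scored, reverse=True)
            if pred >= threshold]


# Only the shortlisted candidates would go on to full calculations.
shortlist = screen(TRAIN, CANDIDATES, threshold=3.0)
```

The descriptors must be validated against reference calculations before a shortlist like this can be trusted; the model only interpolates within the region covered by the training data.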


Automation of materials informatics

Computational materials discovery requires sophisticated software tools to automate complex sequences of calculations and to store and analyze large data sets. Motivated by the needs of diverse materials screening projects, Kozinsky pioneered the use of modern computer science paradigms of object-relational mapping and functional programming to couple code automation with database storage. Based on this unified architecture we founded the AiiDA project (“Automated Interactive Infrastructure and Database”), which is actively developed in collaboration with EPFL and powers a growing number of materials discovery efforts. Current work focuses on designing simplified architectures and implementing efficient ways to accelerate computational materials discovery using the latest data science and informatics technology. The goals are:

  • Intelligent adaptive automation of computational and data management tasks
  • Minimal high-level programming interfaces using object-oriented workflows
  • Graph storage and traversal of data captured on the fly
  • Full reproducibility and reuse of computed data and codes
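
The goals of capturing data on the fly in a graph and keeping results reproducible can be illustrated with a minimal Python sketch. This is not AiiDA's API: the ProvenanceGraph class, its method names, and the toy relax and total_energy functions are hypothetical and exist only to show the idea. Every calculation is recorded as a node linked to its inputs and output at the moment it runs, so the full history of any result can be recovered later by traversing the graph.

```python
class ProvenanceGraph:
    """Toy directed graph linking data nodes and calculation nodes."""

    def __init__(self):
        self.payload = {}   # node id -> stored value or function name
        self.parents = {}   # node id -> set of ids it was derived from

    def add_data(self, node_id, value):
        """Store an input data node that has no recorded history."""
        self.payload[node_id] = value
        self.parents.setdefault(node_id, set())

    def record(self, func, input_ids, output_id):
        """Run func on the stored inputs and capture the provenance
        links input -> calculation -> output as the call happens."""
        calc_id = f"calc:{func.__name__}->{output_id}"
        result = func(*(self.payload[i] for i in input_ids))
        self.payload[calc_id] = func.__name__
        self.parents[calc_id] = set(input_ids)
        self.payload[output_id] = result
        self.parents[output_id] = {calc_id}
        return result

    def ancestors(self, node_id):
        """Traverse the graph upward and return every node that
        contributed to node_id."""
        seen, stack = set(), [node_id]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Toy stand-ins for real calculations (values are placeholders).
def relax(lattice):
    return lattice * 0.5


def total_energy(lattice, cutoff):
    return -lattice * cutoff


graph = ProvenanceGraph()
graph.add_data("structure", 4.0)
graph.add_data("cutoff", 3.0)
graph.record(relax, ["structure"], "relaxed")
graph.record(total_energy, ["relaxed", "cutoff"], "energy")
history = graph.ancestors("energy")
```

Because the links are written by record itself rather than by the user, the stored graph always matches what was actually executed, which is what makes later reuse and reproduction of a result possible.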