Machine‐learning models are capable of capturing the structure–property relationship from a dataset of computationally demanding ab initio calculations. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of calculated electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 82 atoms per unit cell, makes this database a challenging platform for machine learning applications. In this paper, the focus is on predicting the band gap which represents one of the basic properties of a crystalline material. With this aim, a consistent dataset of 12 500 crystal structures and their corresponding DFT band gap are released, freely available for download at https://omdb.mathub.io/dataset. An ensemble of two state‐of‐the‐art models reach a mean absolute error (MAE) of 0.388 eV, which corresponds to a percentage error of 13% for an average band gap of 3.05 eV. Finally, the trained models are employed to predict the band gap for 260 092 materials contained within the Crystallography Open Database (COD) and made available online so that the predictions can be obtained for any arbitrary crystal structure uploaded by a user.
Advanced Quantum Technologies