Mass spectrometry (MS) is typically the first step in identifying an unknown compound; when coupled to chromatography, it can provide information on even sub-nanogram quantities of a compound in complex mixtures. Unfortunately, extracting this information from the raw data (the mass spectrum) requires either manual interpretation, which is extremely time-consuming, or a spectral database (library) search, which is restricted to identifying known, commonly available compounds. In his research project, Sean aims to develop a computational method that generates the most likely molecular structure of an unknown compound solely from MS data. He will first devise a method that, given certain assumptions about the physico-chemical process within the mass spectrometer, generates all candidate compounds that could produce a given mass spectrum. He will then develop a way of evaluating how well these candidates match the experimental mass spectrum, using a combination of statistical (machine learning) and physical simulation-based (quantum chemistry) methods.