Objective: Machine learning has been effective in other areas of medicine. This study investigates whether it is similarly effective for head and neck cancer (HNC) and identifies which algorithm best classifies patients with malignancy.

Design: An observational cohort study.

Setting: Queen Elizabeth University Hospital.

Participants: Patients referred via the urgent suspicion of cancer (USOC) pathway between January 2019 and May 2021.

Main outcome measures: Prediction of each patient's diagnosis as one of three categories (benign, potentially malignant or malignant) using demographic and symptom data.

Results: Logistic regression models with a penalty term performed best on the data, with the ridge-penalised model achieving an AUC of 0.7081. The demographic features describing living alone and a history of recreational drug use were the most important variables, alongside the red flag symptom of a neck lump.

Conclusion: Further studies should aim to collect larger samples of malignant and pre-malignant patients to reduce the class imbalance and improve the performance of the machine learning models.
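
As a rough illustration of the kind of model described above, the following minimal sketch fits a ridge-penalised (L2) logistic regression and scores it by AUC. It is not the study's pipeline: the data are simulated, the three binary features (`lives_alone`, `drug_use`, `neck_lump`) and their effect sizes are hypothetical, the three-category outcome is simplified to malignant versus not malignant, and the penalty strength `lam` is an arbitrary choice.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_ridge_logistic(X, y, lam=1.0, lr=0.5, epochs=200):
    """Gradient descent on mean log-loss + (lam / 2n) * ||w||^2.

    The intercept is not penalised and starts at the log-odds of the
    observed base rate, which helps when the positive class is rare.
    """
    n, d = len(X), len(X[0])
    base_rate = sum(y) / n
    w = [0.0] * d
    b = math.log(base_rate / (1.0 - base_rate))
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The ridge term shrinks each weight towards zero.
        w = [wj - lr * (gw + lam * wj) / n for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical simulation settings: columns are lives_alone, drug_use, neck_lump.
FEATURE_P = [0.3, 0.2, 0.4]   # assumed prevalence of each binary feature
TRUE_W = [1.0, 1.0, 1.5]      # assumed log-odds effects, chosen for illustration
TRUE_B = -3.0                 # low intercept gives a rare (imbalanced) positive class

def simulate(n):
    X, y = [], []
    for _ in range(n):
        xi = [1.0 if random.random() < p else 0.0 for p in FEATURE_P]
        p_mal = sigmoid(TRUE_B + sum(wj * xj for wj, xj in zip(TRUE_W, xi)))
        X.append(xi)
        y.append(1 if random.random() < p_mal else 0)
    return X, y

random.seed(0)
X_train, y_train = simulate(1000)
X_test, y_test = simulate(500)

w, b = fit_ridge_logistic(X_train, y_train)
test_auc = auc(y_test, predict_proba(X_test, w, b))
print("weights:", [round(wj, 2) for wj in w], "intercept:", round(b, 2))
print("held-out AUC:", round(test_auc, 3))
```

The pairwise definition of AUC used here makes the effect of class imbalance visible: with few malignant cases there are few positive-negative pairs to compare, so the estimate is noisy, which is consistent with the call for larger malignant and pre-malignant samples.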