Convolutional Neural Networks (CNNs) can detect patterns that are otherwise difficult to identify, and have been shown to excel at predicting fault characteristics in laboratory shear experiments and slow slip \emph{in situ}. Here we show that a suitably designed CNN can be trained to identify precursory changes in the seismic signal up to a few hours before some large natural earthquakes, with a variable success rate. We use 65 events of $\textrm{M}_w\geq 6$ that occurred in the NW Pacific, in and around Japan, from 2012 to 2022. By repeating the training/testing cycle with different random initial weights, we obtained accuracies of up to 98\% in training and 96\% in testing when discriminating between noise and precursor windows. In the $\sim 3$ hours preceding the earthquakes, the network identifies precursors progressively more frequently as the earthquake time approaches. A final subset of more recent seismic events was used for further verification, with mixed results. While the network appears to differentiate noise from precursor at a rate above chance, the results vary strongly with the events analysed, indicating poor potential for generalisation. This may indicate that not all earthquakes in the catalogue contain precursor signals, or at least none similar to those in the training subset. The features that discriminate between precursor and noise windows appear most dominant over a frequency range of $\approx 0.1$--$0.9$~Hz (in particular $\approx 0.16$ and $\approx 0.21$~Hz), broadly coinciding with observations made elsewhere of microseismic noise and broadband slow earthquake signals \cite{masuda_bridging_2020}.