Machine learning techniques have greatly advanced seismological analysis, especially earthquake detection and seismic phase picking. However, machine learning models still struggle to generalize to datasets that differ from their training distribution. Previous studies have addressed such scenarios by retraining models or applying transfer learning, approaches that are constrained by the availability of high-quality labeled datasets. This paper demonstrates a new approach for augmenting already-trained models without requiring additional training data. We propose four strategies (rescaling, model aggregation, shifting, and filtering) to enhance the performance of pre-trained models on out-of-distribution datasets. We further devise several methodologies for ensembling the individual predictions from these strategies into a unified final prediction that combines robustness with detection sensitivity. We develop an open-source Python module, quakephase, that implements these methods and can flexibly process continuous seismic input data at any sampling rate. Using quakephase and pre-trained ML models from SeisBench, we perform systematic benchmark tests on data recorded by different types of instruments, ranging from acoustic emission sensors to distributed acoustic sensing, and collected at different scales, spanning from laboratory acoustic emission events to major tectonic earthquakes. Our tests highlight that rescaling is essential for handling small-magnitude seismic events recorded at high sampling rates, as well as larger-magnitude events with long codas and remote events with long wave trains. Our results demonstrate that the proposed methods effectively augment pre-trained models on out-of-distribution datasets, especially in scenarios with limited labeled data for transfer learning.
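As a minimal illustration of the rescaling idea, the sketch below uses the public SeisBench and ObsPy APIs to bring an out-of-distribution stream to the sampling rate a pre-trained picker expects before annotating it. The model choice (a STEAD-trained PhaseNet), the input file name, and the 100 Hz target rate are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of the rescaling strategy: adapt an out-of-distribution
# stream to a pre-trained picker's native sampling rate, then run picking.
# Model weights ("stead"), file name, and the 100 Hz target are assumptions.
import obspy
import seisbench.models as sbm

TARGET_RATE = 100.0  # rate most published seismic pickers were trained on

# Load a pre-trained phase picker from SeisBench.
model = sbm.PhaseNet.from_pretrained("stead")

# Read continuous data; out-of-distribution recordings (e.g. laboratory
# acoustic emissions) may have sampling rates far above TARGET_RATE.
stream = obspy.read("waveforms.mseed")  # hypothetical input file

# Rescaling: for extreme rates, one simple variant is to relabel the
# sampling rate so the waveform is stretched in time rather than
# low-pass decimated; otherwise resample conventionally.
for trace in stream:
    if trace.stats.sampling_rate > 10 * TARGET_RATE:
        trace.stats.sampling_rate = TARGET_RATE  # relabel (time-stretch)
    else:
        trace.resample(TARGET_RATE)  # conventional resampling

# annotate() returns an ObsPy Stream of continuous P/S phase probabilities,
# which can then be thresholded or ensembled with other strategies' outputs.
annotations = model.annotate(stream)
print(annotations)
```

The same pattern extends to the other strategies, for instance annotating with several pre-trained models and averaging their probability traces (model aggregation), or annotating shifted and filtered copies of the stream and ensembling the results.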