Fusing data from multiple monitoring sensors is crucial for improving the accuracy and robustness of machinery fault diagnosis. However, existing fault diagnosis methods may underestimate the interference of noise during multi-sensor fusion, leading to unsatisfactory performance. To address this problem, this paper proposes a deep model based on a frequency-adaptive wavelet pyramid. First, an adaptive frequency selection strategy is designed to prune severely polluted frequencies and retain only the key ones. Then, a self-attention mechanism fuses information across the selected frequency bands of different sensors. Finally, a wavelet fusion pyramid is built by repeating this fusion process at multiple wavelet decomposition levels, so that different sensors are fused in a more fine-grained manner. Experimental results on two multi-sensor fault diagnosis datasets demonstrate the anti-noise capability of the proposed method.
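To make the three steps concrete, the following is a minimal NumPy sketch of the overall data flow, not the model proposed in the paper. Several simplifications are assumptions made purely for illustration: a Haar wavelet is used for the decomposition, frequency selection is replaced by a fixed top-k energy rule (the paper's strategy is adaptive, and high energy does not by itself imply a clean frequency), and self-attention uses identity projections instead of trained weights. All function names (`haar_dwt`, `select_frequencies`, `self_attention`, `wavelet_fusion_pyramid`) and the parameters `levels` and `keep` are hypothetical.

```python
import numpy as np


def haar_dwt(x):
    """One level of the Haar wavelet transform along the last axis."""
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass (approximation) band
    detail = (even - odd) / np.sqrt(2.0)  # high-pass (detail) band
    return approx, detail


def select_frequencies(band, keep):
    """Keep the `keep` highest-energy frequency bins, shared by all sensors.

    Assumption: a fixed energy ranking stands in for the adaptive
    selection strategy described in the text.
    """
    spec = np.fft.rfft(band, axis=-1)              # (sensors, bins)
    energy = (np.abs(spec) ** 2).sum(axis=0)       # (bins,)
    idx = np.sort(np.argsort(energy)[-keep:])      # indices of retained bins
    return spec[:, idx], idx


def self_attention(tokens):
    """Scaled dot-product self-attention across sensors.

    Assumption: identity query/key/value projections (no learned weights).
    """
    # Represent each sensor's complex spectrum as a real feature vector.
    feats = np.concatenate([tokens.real, tokens.imag], axis=-1)
    scores = feats @ feats.T / np.sqrt(feats.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ feats, weights


def wavelet_fusion_pyramid(signals, levels=3, keep=8):
    """Repeat select-then-fuse on the detail band of each wavelet level."""
    approx = np.asarray(signals, dtype=float)      # (sensors, length)
    fused = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        selected, _ = select_frequencies(detail, keep)
        out, _ = self_attention(selected)
        fused.append(out)                          # (sensors, 2 * keep)
    return fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signals = rng.standard_normal((3, 256))        # 3 sensors, 256 samples
    for level, feats in enumerate(wavelet_fusion_pyramid(signals), start=1):
        print(level, feats.shape)
```

In this sketch the signal length must be divisible by `2 ** levels`, and `keep` must not exceed the number of frequency bins available at the deepest level. A trained model would replace the energy ranking and the identity projections with learned components and would feed the per-level fused features to a classifier.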