Seismological Society of America, Seismological Research Letters, 3(95), p. 1834-1848, 2024
DOI: 10.1785/0220230078
Abstract Unsupervised machine learning methods are gaining attention in the seismological community as more and larger datasets of continuous waveforms are collected. Recently, contrastive learning for unsupervised feature learning has shown great success in computer vision and other domains, and we aim to transfer these methods to seismology. Contrastive learning algorithms use data augmentation to implement an instance-level discrimination task: the feature representations of two augmented versions of the same data example are trained to be similar to each other while being dissimilar to the representations of other data examples. In particular, we use the popular contrastive learning method SimCLR. We test data augmentation strategies that vary the amplitude and frequency of seismological signals, and apply contrastive learning to learn features automatically. We use a dataset containing various, mostly cryogenic, waveforms detected by a short-term average/long-term average (STA/LTA) algorithm on continuous waveform recordings from the geophysical observatory at Neumayer station, Antarctica. The quality of the features is evaluated on a hand-labeled dataset that includes icequakes, earthquakes, and spikes, and on a larger unlabeled dataset using a classical clustering method, k-means. Results show that the approach separates the different hand-labeled groups with an accuracy of up to 88% and separates meaningful groups within the unlabeled data. Thus, we provide an effective tool for the unsupervised exploration of large seismological datasets and the automated compilation of event catalogs.
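The instance-level discrimination task described in the abstract can be illustrated with a minimal NumPy sketch of the normalized temperature-scaled cross-entropy (NT-Xent) loss used by SimCLR. This is not the authors' implementation: the function name `nt_xent_loss`, the temperature value of 0.5, and the toy embeddings are assumptions made for illustration, and in practice the embeddings would come from a neural network applied to two augmented versions of each waveform.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for paired embeddings (illustrative sketch).

    z_a, z_b: arrays of shape (n, d). Row i of z_a and row i of z_b are
    assumed to be embeddings of two augmented views of the same example.
    temperature: assumed value; the abstract does not state one.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    # Unit-normalize so the dot product is the cosine similarity.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature
    # An example must not be compared with itself.
    np.fill_diagonal(sim, -np.inf)
    # Row i's positive partner sits n rows away in the stacked batch.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Average negative log-probability of picking the positive partner.
    return -log_prob[np.arange(2 * n), pos].mean()


# Toy check: identical views of each example versus mismatched views.
views = np.eye(4)
loss_matched = nt_xent_loss(views, views)
loss_mismatched = nt_xent_loss(views, np.roll(views, 1, axis=0))
```

In this toy case `loss_matched` is smaller than `loss_mismatched`, which is the behavior the training objective rewards: views of the same example are pulled together and views of different examples are pushed apart.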