arXiv, 2020
DOI: 10.48550/arxiv.2008.10178
Oxford University Press, Monthly Notices of the Royal Astronomical Society, 4(499), p. 6009-6017, 2020
ABSTRACT The amount of observational data produced by time-domain astronomy is increasing exponentially. Human inspection alone is not an effective way to identify genuine transients in the data. An automatic real-bogus classifier is needed, and machine learning techniques are commonly used to achieve this goal. Building a training set with a sufficiently large number of verified transients is challenging because of the requirement for human verification. We present an approach for creating a training set that uses all detections in the science images as the sample of real detections and all detections in the difference images, which are generated by the process of difference imaging to detect transients, as the sample of bogus detections. This strategy effectively minimizes the labour involved in data labelling for supervised machine learning methods. We demonstrate the utility of the training set by using it to train several classifiers, utilizing as the feature representation the normalized pixel values in 21 × 21 pixel stamps centred at the detection position, observed with the Gravitational-wave Optical Transient Observer (GOTO) prototype. The real-bogus classifier trained with this strategy can provide up to 95 per cent prediction accuracy on the real detections at a false alarm rate of 1 per cent.
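The labelling strategy and feature representation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`extract_stamp`, `normalize_stamp`, `build_training_set`, `recall_at_false_alarm_rate`) are hypothetical, the min-max scaling is an assumed choice (the abstract says only that pixel values are normalized), and detections too close to the image edge are simply skipped. The last function shows one way to read "accuracy on the real detections at a false alarm rate of 1 per cent": the score threshold is set so that about 1 per cent of bogus detections pass, and the fraction of real detections above that threshold is reported.

```python
import numpy as np

STAMP_SIZE = 21  # stamp side length in pixels, as stated in the abstract


def extract_stamp(image, x, y, size=STAMP_SIZE):
    """Cut a size x size stamp centred on pixel (x, y); None if it leaves the image."""
    half = size // 2
    x, y = int(round(x)), int(round(y))
    if (y - half < 0 or x - half < 0
            or y + half + 1 > image.shape[0] or x + half + 1 > image.shape[1]):
        return None
    return image[y - half:y + half + 1, x - half:x + half + 1]


def normalize_stamp(stamp):
    """Scale pixel values to [0, 1] and flatten (min-max scaling is an assumption)."""
    stamp = stamp.astype(float)
    lo, hi = stamp.min(), stamp.max()
    if hi == lo:
        return np.zeros(stamp.size)
    return ((stamp - lo) / (hi - lo)).ravel()


def build_training_set(science_image, science_xy, difference_image, difference_xy):
    """Label science-image detections as real (1) and difference-image detections as bogus (0)."""
    features, labels = [], []
    for image, positions, label in ((science_image, science_xy, 1),
                                    (difference_image, difference_xy, 0)):
        for x, y in positions:
            stamp = extract_stamp(image, x, y)
            if stamp is None:
                continue  # skip detections too close to the edge
            features.append(normalize_stamp(stamp))
            labels.append(label)
    return np.array(features), np.array(labels)


def recall_at_false_alarm_rate(scores, labels, false_alarm_rate=0.01):
    """Fraction of real detections scoring above the threshold that passes
    roughly `false_alarm_rate` of the bogus detections."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    threshold = np.quantile(scores[labels == 0], 1.0 - false_alarm_rate)
    return float(np.mean(scores[labels == 1] > threshold))
```

The resulting feature matrix has one row of 441 values per detection and could be passed to any supervised classifier; the abstract does not specify which classifiers were trained, so none is assumed here.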