Machine Learning
The scientific frontiers advancing my research use artificial intelligence (AI), particularly machine learning (ML), to solve problems too difficult for traditional computational physics methods, or to reduce the need for manual inspection when processing noisy astronomical data, including dynamical and spectroscopic data from distant galaxies.
Supervised Machine Learning
Computer Assisted Spectroscopic Inspection of Gravitational Lensing Objects (CASIGLO)
Joseph E. Summers M.Sc. Project April 2022 [PDF] |
In addition to replacing manual inspection, which is both tedious and inconsistent, a neural network trained on existing spectroscopic detections, many of which have been confirmed (and graded) by follow-up imaging, including from space telescopes, will make possible the automated detection and characterization of lenses in new large datasets. A well-trained CASIGLO AI could detect many of the lensed objects overlooked by experts, significantly increasing the number of lensed objects available for study. |
Strategy | Create a neural network using supervised machine learning trained on the largest dataset of spectroscopically identified gravitational lenses, given by the ACSLens, BELLS, and SILO data. |
Goal | To extend the existing discoveries to new data from SDSS-V and DESI, without needing to manually inspect each candidate as was done for the training set. |
Classifier | Linear classification with a set of knowledge-based labels that meet a threshold limit gives the neural network a linear function that spans the parameter space with an intuitive geometry. Examples include a support vector machine, which uses a hinge-loss cost function, and logistic regression, which is probabilistic and uses a logistic loss function. |
Hypothesis Space | Every input (feature) gets a multiplicative weight, plus a global additive bias; both are derived from the training. Note: least-squares regression finds the best-fit line for continuous data by minimizing the sum of squared errors, while the chi-squared test is a statistical test that compares observed frequencies with expected frequencies in categorical data. |
Empirical Loss Minimization | The cost of linear regression is minimized by pairing a loss function with a regularizer, which biases the learner toward simpler hypotheses rather than memorization. |
Neural Network | A standard convolutional neural network (Fukushima, LeCun) is used for simplicity, but other choices are available. |
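
The Classifier, Hypothesis Space, and Empirical Loss Minimization rows can be illustrated with a minimal sketch. This is not the CASIGLO code: the function names, the two-feature toy data, the L2 regularizer, the regularization strength `lam`, and the plain gradient-descent settings are all assumptions chosen for illustration. It shows a linear hypothesis (one multiplicative weight per feature plus a global additive bias), the hinge and logistic losses, and a regularized cost minimized by gradient descent.

```python
import math


def linear_score(weights, bias, features):
    """Linear hypothesis: one multiplicative weight per feature plus a global bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def hinge_loss(score, label):
    """Hinge loss (support vector machine) for a label in {-1, +1}."""
    return max(0.0, 1.0 - label * score)


def logistic_loss(score, label):
    """Logistic loss log(1 + exp(-label * score)), written to avoid overflow."""
    margin = label * score
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def regularized_cost(weights, bias, data, loss, lam):
    """Mean loss over the data plus an L2 penalty on the weights (assumed regularizer)."""
    mean_loss = sum(loss(linear_score(weights, bias, x), y) for x, y in data) / len(data)
    return mean_loss + lam * sum(w * w for w in weights)


def train_logistic(data, lam=0.01, learning_rate=0.5, epochs=500):
    """Minimize the L2-regularized logistic cost with batch gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [2.0 * lam * w for w in weights]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in data:
            margin = y * linear_score(weights, bias, x)
            # d/d(score) of log(1 + exp(-y * score)) is -y * sigmoid(-margin)
            coeff = -y * sigmoid(-margin) / len(data)
            for j in range(n_features):
                grad_w[j] += coeff * x[j]
            grad_b += coeff
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical, made-up feature vectors: +1 marks a lens candidate, -1 a non-lens.
TOY_DATA = [
    ([2.0, 1.5], 1), ([1.8, 2.2], 1), ([2.5, 1.9], 1),
    ([-1.0, -0.5], -1), ([-1.5, -1.2], -1), ([-0.7, -2.0], -1),
]

weights, bias = train_logistic(TOY_DATA)
print("weights:", weights, "bias:", bias)
print("logistic cost:", regularized_cost(weights, bias, TOY_DATA, logistic_loss, 0.01))
print("hinge cost:", regularized_cost(weights, bias, TOY_DATA, hinge_loss, 0.01))
```

Raising `lam` pulls the weights toward zero, which is the bias toward simpler hypotheses described above; swapping `logistic_loss` for `hinge_loss` in `regularized_cost` scores the same linear function under the support vector machine criterion.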