The algorithm we use is a generalized instance of the coordinate descent approach [37], which alternates between optimizing the matrix A with S held fixed and optimizing S with A held fixed, according to the rules: where ?? is a learning rate, the superscript r indicates the iteration, and the subscript j denotes the j-th column of a matrix.
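As an illustration of the alternating scheme only, the following is a minimal sketch under stated assumptions. It assumes a plain least-squares objective 0.5 * ||X - A S||_F^2, which may differ from the objective actually used here, and it takes one gradient step per factor in each iteration. The function name `alternating_descent`, the data matrix `X`, the rank `k`, and the scaling of the learning rate `eta` by a Lipschitz bound are all hypothetical choices made for the example and are not taken from the text.

```python
import numpy as np


def alternating_descent(X, k, eta=1.0, n_iter=200, seed=0):
    """Alternate gradient steps on A (S fixed) and on S (A fixed).

    Assumed objective (not from the source): 0.5 * ||X - A @ S||_F^2.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    A = rng.standard_normal((m, k))
    S = rng.standard_normal((k, n))

    # Record the loss before any update, then once per iteration r.
    losses = [0.5 * np.linalg.norm(X - A @ S) ** 2]

    for r in range(n_iter):
        # A-step with S fixed. The gradient with respect to A is
        # (A S - X) S^T; dividing eta by the squared spectral norm of S
        # keeps the step from increasing the loss for any eta in (0, 1].
        L_A = np.linalg.norm(S, 2) ** 2 + 1e-12
        A = A - (eta / L_A) * (A @ S - X) @ S.T

        # S-step with A fixed. Written in matrix form, this updates every
        # column S[:, j] by the same rule, one column at a time.
        L_S = np.linalg.norm(A, 2) ** 2 + 1e-12
        S = S - (eta / L_S) * A.T @ (A @ S - X)

        losses.append(0.5 * np.linalg.norm(X - A @ S) ** 2)

    return A, S, losses
```

Under these assumptions, each half-step is an ordinary gradient step on a quadratic in one factor, so the recorded loss should not increase from one iteration to the next. A fixed, unscaled learning rate would also fit the description above, but it would have to be chosen small enough for the data at hand.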