Title: | A Matrix-Free Multigrid Preconditioner for Spline Smoothing |
---|---|
Description: | Data smoothing with penalized splines is a popular method and is well established for one- or two-dimensional covariates. The extension to multiple covariates is straightforward but suffers from exponentially increasing memory requirements and computational complexity. This toolbox provides a matrix-free implementation of a conjugate gradient (CG) method for the regularized least squares problem resulting from tensor product B-spline smoothing with multivariate and scattered data. It further provides matrix-free preconditioned versions of the CG-algorithm where the user can choose between a simpler diagonal preconditioner and an advanced geometric multigrid preconditioner. The main advantage is that all algorithms are performed matrix-free and therefore require only a small amount of memory. For further detail see Siebenborn & Wagner (2021). |
Authors: | Martin Siebenborn [aut, cre, cph], Julian Wagner [aut, cph] |
Maintainer: | Martin Siebenborn <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2 |
Built: | 2024-11-09 04:52:40 UTC |
Source: | https://github.com/splinesmoothing/mgss |
Fits a smooth spline to a set of given observations using penalized splines with curvature or difference penalty and multiple covariates. The underlying linear system is solved with a matrix-free conjugated gradient (CG) method.
CG_smooth( m, q, lambda, X, y, pen_type = "curve", l = NULL, alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE )
CG_smooth( m, q, lambda, X, y, pen_type = "curve", l = NULL, alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE )
m |
Vector of non-negative integers. Each entry gives the number of inner knots for the respective covariate. |
q |
Vector of positive integers. Each entry gives the spline degree for the respective covariate. |
lambda |
Positive number as weight for the penalty term. |
X |
Matrix containing the covariates as columns and the units as rows. |
y |
Vector of length |
pen_type |
Utilized penalization method. Either |
l |
Positive integer vector of length |
alpha_start |
Vector of length |
K_max |
Positive integer as upper bound for the number of CG-iterations. Defaults to |
tolerance |
Positive number as error tolerance for the stopping criterion of the CG-method. Defaults to |
print_error |
Logical, indicating if the iteration error should be printed or not. |
Returns a list containing the input m
, q
, and Omega
. Further gives the fitted spline coefficients alpha
, the fitted values fitted_values
, the residuals residuals
, the root mean squared error rmse
and the R-squared value R_squared
.
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train CG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y)
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train CG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y)
Estimates the trace of the (unknown) hat-matrix by stochastic estimation in a matrix-free manner.
estimate_trace(m, q, lambda, X, pen_type = "curve", l = NULL, n_random = 5)
estimate_trace(m, q, lambda, X, pen_type = "curve", l = NULL, n_random = 5)
m |
Vector of non-negative integers. Each entry gives the number of inner knots for the respective covariate. |
q |
Vector of positive integers. Each entry gives the spline degree for the respective covariate. |
lambda |
Positive number as weight for the penalty term. |
X |
Matrix containing the covariates as columns and the units as rows. |
pen_type |
Utilized penalization method. Either |
l |
Positive integer vector of length |
n_random |
Positive integer for the number of random vectors in the trace estimate. Defaults to |
An estimate of the trace of the hat-matrix.
data <- generate_test_data(100, 2) X <- data$X_train estimate_trace(m = c(7,7), q = c(2,2), lambda = 0.1, X = X)
data <- generate_test_data(100, 2) X <- data$X_train estimate_trace(m = c(7,7), q = c(2,2), lambda = 0.1, X = X)
Generate a P
-dimensional test data set based on a sigmoid function.
generate_test_data(n, P, split = 0.8)
generate_test_data(n, P, split = 0.8)
n |
Numer of samples |
P |
Spatial dimension |
split |
A value between |
A list of the covarite matrices for the train and test data X_train
and X_test
and of the variable of interest y_train
and y_test
.
generate_test_data(100, 2)
generate_test_data(100, 2)
Fits a smooth spline to a set of given observations using penalized splines with curvature penalty and multiple covariates. The underlying linear system is solved with a matrix-free preconditioned conjugated gradient method using a geometric multigrid method as preconditioner.
MGCG_smooth( G, q, lambda, X, y, w = 0.1, nu = c(3, 1), alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE, coarse_grid_solver = "Cholesky" )
MGCG_smooth( G, q, lambda, X, y, w = 0.1, nu = c(3, 1), alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE, coarse_grid_solver = "Cholesky" )
G |
Positive integer greater than one for the maximum number of grids. |
q |
Vector of positive integers. Each entry gives the spline degree for the respective covariate. |
lambda |
Positive number as weight for the penalty term. |
X |
Matrix containing the covariates as columns and the units as rows. |
y |
Vector of length |
w |
Damping factor of the Jacobi smoother. Defaults to |
nu |
Two-dimensional vector of non-negative integers. Gives the number of pre- and post-smoothing steps in the multigrid algorithm. |
alpha_start |
Vector of length |
K_max |
Positive integer as upper bound for the number of MGCG-iterations. Defaults to |
tolerance |
Positive number as error tolerance for the stopping criterion of the MGCG-method. Defaults to |
print_error |
Logical, indicating if the iteration error should be printed or not. |
coarse_grid_solver |
Utilized coarse grid solver. Either |
Returns a list containing the input m = 2^G-1
, q
, and Omega
. Further gives the fitted spline coefficients alpha
, the fitted values fitted_values
, the residuals residuals
, the root mean squared error rmse
and the R-squared value R_squared
.
Siebenborn, M. and Wagner, J. (2019) A Multigrid Preconditioner for Tensor Product Spline Smoothing. arXiv:1901.00654
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train MGCG_smooth(G = 3, q = c(3,3), lambda = 0.1, w = 0.8, X = X, y = y)
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train MGCG_smooth(G = 3, q = c(3,3), lambda = 0.1, w = 0.8, X = X, y = y)
Fits a smooth spline to a set of given observations using penalized splines with curvature or difference penalty and multiple covariates. The underlying linear system is solved with a matrix-free preconditioned conjugated gradient (PCG) method using a diagonal preconditioner.
PCG_smooth( m, q, lambda, X, y, pen_type = "curve", l = NULL, alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE )
PCG_smooth( m, q, lambda, X, y, pen_type = "curve", l = NULL, alpha_start = NULL, K_max = NULL, tolerance = 1e-06, print_error = TRUE )
m |
Vector of non-negative integers. Each entry gives the number of inner knots for the respective covariate. |
q |
Vector of positive integers. Each entry gives the spline degree for the respective covariate. |
lambda |
Positive number as weight for the penalty term. |
X |
Matrix containing the covariates as columns and the units as rows. |
y |
Vector of length |
pen_type |
Utilized penalization method. Either |
l |
Positive integer vector of length |
alpha_start |
Vector of length |
K_max |
Positive integer as upper bound for the number of PCG-iterations. Defaults to |
tolerance |
Positive number as error tolerance for the stopping criterion of the PCG-method. Defaults to |
print_error |
Logical, indicating if the iteration error should be printed or not. |
Returns a list containing the input m
, q
, and Omega
. Further gives the fitted spline coefficients alpha
, the fitted values fitted_values
, the residuals residuals
, the root mean squared error rmse
and the R-squared value R_squared
.
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train PCG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y)
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train PCG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y)
Makes predictions of new observations from a fitted spline model.
predict_smooth(model_smooth, X)
predict_smooth(model_smooth, X)
model_smooth |
A spline model resulting from |
X |
Matrix containing the new observations. |
Vector of length nrow(X)
of predictions.
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train result <- PCG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y, print_error = FALSE) X_test <- data$X_test predict_smooth(model_smooth = result, X = X_test)
data <- generate_test_data(100, 2) X <- data$X_train y <- data$y_train result <- PCG_smooth(m = c(7,7), q = c(3,3), lambda = 0.1, X = X, y = y, print_error = FALSE) X_test <- data$X_test predict_smooth(model_smooth = result, X = X_test)