# Ridge Regression
Ridge regression is the primary training method in SPIRES. It is a batch algorithm that collects all reservoir states from a training run, assembles them into a matrix, and solves a regularized least-squares problem for the readout weights. Because this objective is convex with a closed-form solution, a single pass yields the global optimum.
## The Training Problem
After driving the reservoir with an input sequence $u(t)$, the reservoir produces a sequence of state vectors $x(t) \in \mathbb{R}^N$, where $t = 1, \dots, T$ and $N$ is the number of neurons.

The readout layer computes:

$$\hat{y}(t) = W_{\text{out}} \, x(t)$$

where $W_{\text{out}} \in \mathbb{R}^{M \times N}$ is the readout weight matrix and $M$ is the number of outputs. The goal of training is to find $W_{\text{out}}$ such that $\hat{y}(t)$ approximates the target $y(t)$ as closely as possible.
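The readout is just a dense matrix-vector product. As a minimal illustration in plain C (this helper is not part of the SPIRES API, only a sketch of the formula above):

```c
#include <stddef.h>

/* Compute y_hat = W_out * x, where W_out is M x N stored row-major.
   Illustrative helper, not part of the SPIRES API. */
void readout(const double *w_out, const double *x,
             double *y_hat, size_t m, size_t n)
{
    for (size_t i = 0; i < m; i++) {
        double acc = 0.0;
        for (size_t j = 0; j < n; j++)
            acc += w_out[i * n + j] * x[j];
        y_hat[i] = acc;
    }
}
```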
## State Matrix Assembly
Define the state matrix $X \in \mathbb{R}^{T \times N}$ by stacking the reservoir state vectors as rows:

$$X = \begin{bmatrix} x(1)^\top \\ x(2)^\top \\ \vdots \\ x(T)^\top \end{bmatrix}$$

and the target matrix $Y \in \mathbb{R}^{T \times M}$ similarly:

$$Y = \begin{bmatrix} y(1)^\top \\ y(2)^\top \\ \vdots \\ y(T)^\top \end{bmatrix}$$

The training problem is to find $W_{\text{out}}$ that minimizes:

$$\left\| X W_{\text{out}}^\top - Y \right\|_F^2$$

where $\|\cdot\|_F$ denotes the Frobenius norm.
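In C, the state matrix is naturally a flat row-major array, matching the layout of the flat input and target arrays used elsewhere in the API. A minimal sketch of row stacking (the buffer sizes are the caller's responsibility; this helper is illustrative, not a SPIRES function):

```c
#include <stddef.h>
#include <string.h>

/* Append the state vector x_t (length n) as row t of the
   flat row-major matrix X (capacity at least (t+1) * n doubles). */
void append_state_row(double *X, size_t t, size_t n, const double *x_t)
{
    memcpy(&X[t * n], x_t, n * sizeof(double));
}
```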
## Tikhonov Regularization
The ordinary least-squares solution is numerically unstable when the state matrix is rank-deficient or ill-conditioned, which is common in reservoir computing because:
- The number of neurons $N$ may exceed the number of training samples $T$.
- Correlated neuron activity produces near-singular covariance matrices.
- Spiking dynamics can produce degenerate states (e.g., all-zero or all-one columns).
Tikhonov regularization (ridge regression) adds a penalty term to the objective:

$$\left\| X W_{\text{out}}^\top - Y \right\|_F^2 + \lambda \left\| W_{\text{out}} \right\|_F^2$$

The closed-form solution is:

$$W_{\text{out}}^\top = \left( X^\top X + \lambda I \right)^{-1} X^\top Y$$
The regularization parameter $\lambda$ has two effects:

- Numerical stability: Adding $\lambda I$ to $X^\top X$ ensures the matrix is positive definite and invertible.
- Generalization: Penalizing large weights prevents overfitting to noise in the training data.
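The closed-form solution can be sketched directly in C for a single output. This toy version forms the normal equations and solves them by Gaussian elimination; it is not the SPIRES implementation (which uses BLAS/LAPACK) and assumes a small, well-scaled problem:

```c
#include <stddef.h>
#include <math.h>

/* Solve w = (X^T X + lambda*I)^{-1} X^T y for one output.
   X: t x n row-major, y: length t, w: length n (output).
   Returns 0 on success, -1 on failure. Toy sketch: n <= 4. */
int ridge_solve(const double *X, const double *y,
                size_t t, size_t n, double lambda, double *w)
{
    double A[16], b[4];
    if (n > 4) return -1;

    /* A = X^T X + lambda*I,  b = X^T y */
    for (size_t i = 0; i < n; i++) {
        b[i] = 0.0;
        for (size_t k = 0; k < t; k++) b[i] += X[k * n + i] * y[k];
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < t; k++) s += X[k * n + i] * X[k * n + j];
            A[i * n + j] = s + (i == j ? lambda : 0.0);
        }
    }

    /* Gaussian elimination (A is SPD, so no pivoting needed) */
    for (size_t i = 0; i < n; i++) {
        if (fabs(A[i * n + i]) < 1e-12) return -1;
        for (size_t r = i + 1; r < n; r++) {
            double f = A[r * n + i] / A[i * n + i];
            for (size_t c = i; c < n; c++) A[r * n + c] -= f * A[i * n + c];
            b[r] -= f * b[i];
        }
    }
    for (size_t ii = n; ii-- > 0; ) {
        double s = b[ii];
        for (size_t c = ii + 1; c < n; c++) s -= A[ii * n + c] * w[c];
        w[ii] = s / A[ii * n + ii];
    }
    return 0;
}
```

With near-zero $\lambda$ and clean data, the fit recovers the underlying linear map; increasing $\lambda$ shrinks the weights toward zero.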
## Choosing the Regularization Parameter
The regularization parameter $\lambda$ controls the bias-variance trade-off:

| $\lambda$ | Behavior |
|---|---|
| Very small ($< 10^{-10}$) | Near-zero regularization; fits training data closely but may overfit |
| Small ($10^{-8}$ to $10^{-6}$) | Mild regularization; good for clean signals |
| Moderate ($10^{-5}$ to $10^{-2}$) | Strong regularization; better for noisy data |
| Large ($10^{-1}$ and above) | Over-regularized; underfits the data |
A common approach is to sweep $\lambda$ over a logarithmic grid (e.g., powers of ten from $10^{-10}$ to $10^{0}$) and select the value that minimizes error on a validation set. The SPIRES optimizer can automate this search.

In log-space, the optimal $\log_{10} \lambda$ found by the optimizer is stored as `best_log10_ridge` in the `spires_opt_result` struct.
## SPIRES API
Ridge regression training is performed with `spires_train_ridge()`:

```c
spires_status spires_train_ridge(
    spires_reservoir *r,
    const double     *input_series,
    const double     *target_series,
    size_t            series_length,
    double            lambda
);
```

Parameters:
| Parameter | Description |
|---|---|
| `r` | Pointer to an initialized reservoir |
| `input_series` | Flat array of input values, length `series_length * num_inputs` |
| `target_series` | Flat array of target values, length `series_length * num_outputs` |
| `series_length` | Number of time steps in the training data |
| `lambda` | Regularization parameter ($\lambda$) |
Returns: `SPIRES_OK` on success, or an error status code.
## Example Usage
```c
#include <spires.h>
#include <stdio.h>

/* Assume reservoir r is already created */
/* Assume input_train[] and target_train[] are populated */
double lambda = 1e-6;

spires_status s = spires_train_ridge(r, input_train, target_train,
                                     N_TRAIN, lambda);
if (s != SPIRES_OK) {
    fprintf(stderr, "Ridge training failed with status %d\n", s);
    spires_reservoir_destroy(r);
    return 1;
}

/* Reservoir is now trained -- run inference */
double *predictions = spires_run(r, input_test, N_TEST);
```

## Internal Implementation
When you call `spires_train_ridge()`, SPIRES performs the following steps:

1. Reset the reservoir to its initial state.
2. Drive the reservoir with the input series, recording the state vector $x(t)$ at each time step $t$.
3. Assemble the state matrix $X$ and target matrix $Y$.
4. Compute $X^\top X$ and $X^\top Y$ using BLAS `dgemm`.
5. Regularize by adding $\lambda$ to the diagonal of $X^\top X$.
6. Solve the linear system using LAPACKE (`dposv` for symmetric positive definite systems).
7. Store the resulting $W_{\text{out}}$ inside the reservoir struct.
The heavy computation is in steps 4 and 6, which are performed by optimized BLAS and LAPACK routines. For a reservoir with $N$ neurons and $T$ time steps, the matrix $X^\top X$ is $N \times N$, and the solve completes in milliseconds on modern hardware.
## Memory Considerations
Ridge regression requires storing the entire state matrix $X$ in memory:

- Memory: $T \times N \times 8$ bytes (for `double` precision)
- For $N = 1000$ and $T = 5000$: approximately 40 MB
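The footprint is a direct arithmetic check of the formula above (illustrative helper, not a SPIRES function):

```c
#include <stddef.h>

/* Bytes needed to store the T x N state matrix in double precision. */
size_t state_matrix_bytes(size_t t, size_t n)
{
    return t * n * sizeof(double);
}
```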
If memory is a constraint (e.g., very long training sequences or very large reservoirs), consider:
- Reducing the training length (ensure it is still long enough to capture the dynamics).
- Using the online delta rule instead, which processes one time step at a time and does not store the state matrix.
## Washout Period
For reservoir computing, the first few time steps of the reservoir response depend on the initial conditions rather than the input signal. These transient states should be discarded from the training data to avoid corrupting the readout weights. This is called the washout period.
A common practice is to discard the first 100 to 500 time steps of the state matrix before assembling $X$. SPIRES handles this automatically when a washout length is specified, or you can account for it by starting your target series after the washout period.
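Because states are stored row-major, discarding the washout manually is just a pointer offset into the flat array. A sketch (this `trim_washout` helper is an assumption for illustration, not the SPIRES built-in handling):

```c
#include <stddef.h>

/* Given a flat T x N state matrix, return a pointer to the
   post-washout rows and store the effective row count in t_eff. */
const double *trim_washout(const double *X, size_t t, size_t n,
                           size_t washout, size_t *t_eff)
{
    if (washout >= t) { *t_eff = 0; return NULL; }
    *t_eff = t - washout;
    return X + washout * n;
}
```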
## Multi-Output Training
Ridge regression naturally supports multiple outputs. If `num_outputs > 1`, the target matrix $Y$ has multiple columns, and the solution $W_{\text{out}}$ has multiple rows. Each output is trained jointly, sharing the same regularization parameter $\lambda$. If different outputs require different regularization, you can train them separately by setting `num_outputs = 1` and calling `spires_train_ridge()` multiple times with different target series.