Losses API

anfis_toolbox.losses

Loss functions and their gradients for ANFIS Toolbox.

This module centralizes the loss definitions used during training, making it explicit which objective is being optimized. Trainers import from here so that the chosen loss is declared in one place.

CrossEntropyLoss

Bases: LossFunction

Categorical cross-entropy loss operating on logits.

Implements cross-entropy loss for multi-class classification tasks. Accepts raw logits (unbounded scores) and computes numerically stable loss using log-softmax formulation.

The loss is defined as

L = -(1/n) * Σ_i Σ_j y_true[i,j] * log(softmax(logits)[i,j])

And its gradient with respect to logits is

∇L = (1/n) * (softmax(logits) - y_true)

Numerical stability is achieved through
  • a stable log-softmax computation in the loss method
  • a stable softmax (via maximum subtraction) in the gradient method
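As a sketch of the stabilization idea described above (standalone NumPy, not the toolbox's actual implementation), the log-softmax can be computed with the log-sum-exp trick so that even extreme logits neither overflow nor underflow to `-inf`:

```python
import numpy as np

def stable_cross_entropy(y_true: np.ndarray, logits: np.ndarray) -> float:
    """Mean cross-entropy from one-hot targets and raw logits.

    Uses log softmax(z) = (z - max(z)) - log(sum(exp(z - max(z)))),
    so exp() is only ever applied to non-positive values.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(y_true * log_softmax).sum(axis=1).mean())

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
logits = np.array([[1000.0, 0.0], [0.0, 1000.0]])  # extreme logits, no overflow
print(stable_cross_entropy(y_true, logits))        # ~0.0 for a confident correct prediction
```

A naive `np.log(softmax)` would return `log(0) = -inf` for the losing classes here; the shifted formulation sidesteps that.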

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Compute gradient of cross-entropy with respect to logits.

The gradient simplifies to `(softmax(logits) - one_hot(y_true)) / n`. This form is derived from the chain rule applied to the cross-entropy loss.

Accepts integer labels or one-hot encoded targets. Returns gradient with the same shape as logits.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array of shape `(n_samples,)` of integer class labels, or one-hot encoded array of shape `(n_samples, n_classes)`. | *required* |
| `y_pred` | `ndarray` | Raw logit scores of shape `(n_samples, n_classes)`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Gradient of shape `(n_samples, n_classes)`, with values typically in `[-1, 1]`, indicating the direction that decreases the loss. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If a one-hot `y_true` shape doesn't match the logits shape. |
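A minimal sketch of this simplified gradient (standalone NumPy, handling both integer labels and one-hot targets; function name is illustrative, not the toolbox's internal code):

```python
import numpy as np

def cross_entropy_gradient(y_true: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Gradient of mean cross-entropy w.r.t. logits: (softmax(logits) - y_true) / n."""
    n = logits.shape[0]
    if y_true.ndim == 1:                                   # integer labels -> one-hot
        one_hot = np.zeros_like(logits)
        one_hot[np.arange(n), y_true.astype(int)] = 1.0
        y_true = one_hot
    shifted = logits - logits.max(axis=1, keepdims=True)   # stable softmax
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return (probs - y_true) / n

grad = cross_entropy_gradient(np.array([0, 1]), np.array([[2.0, 0.0], [0.0, 3.0]]))
```

A useful sanity check: since each softmax row and each one-hot row both sum to 1, every row of the returned gradient sums to zero.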

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute mean cross-entropy from integer labels or one-hot vs logits.

Uses stable log-softmax computation to prevent numerical underflow. Handles both integer class labels and one-hot encoded targets.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array of shape `(n_samples,)` of integer class labels (`0` to `n_classes - 1`), or one-hot encoded array of shape `(n_samples, n_classes)`. | *required* |
| `y_pred` | `ndarray` | Raw logit scores of shape `(n_samples, n_classes)`. | *required* |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `float` | `float` | Mean cross-entropy loss across all samples. |

Notes
  • Returns 0.0 if batch is empty (n_samples == 0)
  • Numerically stable for arbitrarily large or small logit values
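The integer-label versus one-hot handling and the empty-batch guard noted above might look like this (a sketch under those assumptions, not the toolbox's exact code):

```python
import numpy as np

def ce_loss(y_true: np.ndarray, logits: np.ndarray) -> float:
    """Mean cross-entropy accepting integer labels or one-hot targets."""
    n = logits.shape[0]
    if n == 0:
        return 0.0                                   # empty batch -> 0.0 by convention
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    if y_true.ndim == 1:                             # integer labels: pick the true-class log-prob
        picked = log_softmax[np.arange(n), y_true.astype(int)]
    else:                                            # one-hot: weighted sum per row
        picked = (y_true * log_softmax).sum(axis=1)
    return float(-picked.mean())
```

For two equally likely classes and logits of all zeros, both input forms give the same value, `log(2) ≈ 0.693`.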

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Convert labels or one-hot encodings into dense float matrices.

Accepts either
  • 1D integer class labels (0 to n_classes-1)
  • 2D one-hot encoded targets

If 1D labels are provided, automatically converts to one-hot encoding. If model is provided with an n_classes attribute, validates consistency.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y` | `Any` | Target labels as a 1D array of integers or a 2D one-hot array. | *required* |
| `model` | `Any \| None` | Optional model instance. If provided, uses `model.n_classes` to infer the number of classes and validate dimensions. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | One-hot encoded targets of shape `(n_samples, n_classes)`. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `y` is not 1- or 2-dimensional, or if its dimensions don't match the model. |
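The conversion logic described above can be sketched as follows (a standalone illustration; the helper name and the `n_classes` parameter standing in for the model attribute are assumptions):

```python
import numpy as np

def prepare_class_targets(y, n_classes=None) -> np.ndarray:
    """Convert integer labels or a one-hot array into a dense float one-hot matrix."""
    y = np.asarray(y)
    if y.ndim == 1:                                   # integer labels -> one-hot
        k = n_classes if n_classes is not None else int(y.max()) + 1
        out = np.zeros((y.shape[0], k), dtype=float)
        out[np.arange(y.shape[0]), y.astype(int)] = 1.0
        return out
    if y.ndim == 2:                                   # already one-hot: validate and cast
        if n_classes is not None and y.shape[1] != n_classes:
            raise ValueError(f"expected {n_classes} classes, got {y.shape[1]}")
        return y.astype(float)
    raise ValueError("targets must be 1D labels or a 2D one-hot array")

print(prepare_class_targets([0, 2, 1], n_classes=3))
```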

LossFunction

Base interface for losses used by trainers.

This abstract class defines the contract that all loss functions must implement. Subclasses should override the loss, gradient, and optionally prepare_targets methods to implement specific loss functions.

The typical workflow is
  1. Call prepare_targets to format raw targets into the expected format
  2. Call loss to compute the scalar loss value
  3. Call gradient to compute loss gradients for backpropagation
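The contract above might be expressed as an abstract base class along these lines (a sketch; the method names match the docs, the ABC mechanics and the pass-through default are assumptions):

```python
from abc import ABC, abstractmethod
from typing import Any

import numpy as np

class LossFunction(ABC):
    """Contract shared by all trainer losses."""

    @abstractmethod
    def loss(self, y_true: np.ndarray, y_pred: np.ndarray) -> float:
        """Scalar loss for the given targets and predictions."""

    @abstractmethod
    def gradient(self, y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
        """Gradient of the loss with respect to the predictions."""

    def prepare_targets(self, y: Any, *, model: Any = None) -> np.ndarray:
        """Default pass-through; subclasses may reshape or one-hot encode."""
        return np.asarray(y, dtype=float)
```

Making `prepare_targets` concrete while keeping `loss` and `gradient` abstract matches the note that subclasses override it only optionally.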

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Return the gradient of the loss with respect to the predictions.

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute the scalar loss for the given targets and predictions.

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Return targets in a format compatible with forward/gradient computations.

MSELoss

Bases: LossFunction

Mean squared error loss packaged for trainer consumption.

Implements the MSE loss function commonly used for regression tasks. MSE measures the average squared difference between predicted and actual values.

The loss is defined as

L = (1/n) * Σ(y_pred - y_true)²

And its gradient with respect to predictions is

∇L = (2/n) * (y_pred - y_true)
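The two formulas above translate directly into NumPy; as a hedged illustration (standalone functions, not the toolbox's methods):

```python
import numpy as np

def mse_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """L = (1/n) * sum((y_pred - y_true)**2)."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(diff ** 2))

def mse_gradient(y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """dL/dy_pred = (2/n) * (y_pred - y_true)."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return 2.0 * diff / diff.size

y_true = np.array([[1.0], [2.0]])
y_pred = np.array([[1.5], [1.0]])
print(mse_loss(y_true, y_pred))  # mean of 0.25 and 1.0 -> 0.625
```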

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Compute gradient of MSE with respect to predictions.

The gradient is computed as: ∇L = (2/n) * (y_pred - y_true)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | True target values, shape `(n_samples, 1)`. | *required* |
| `y_pred` | `ndarray` | Predicted values, same shape as `y_true`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Gradient array with the same shape as `y_pred`. |

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute the mean squared error (MSE).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array-like of true target values, shape `(...,)`. | *required* |
| `y_pred` | `ndarray` | Array-like of predicted values, same shape as `y_true`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `float` | The mean of squared differences over all elements, as a float. |

Notes
  • Inputs are coerced to NumPy arrays with dtype=float.
  • Broadcasting follows NumPy semantics. If shapes are not compatible for element-wise subtraction, a ValueError will be raised by NumPy.

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Convert 1D targets into column vectors expected by MSE computations.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y` | `Any` | Array-like target values. Can be 1D or already 2D. | *required* |
| `model` | `Any \| None` | Optional model instance (unused for MSE). | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Targets as a 2D column vector of shape `(n_samples, 1)`. |
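The column-vector conversion amounts to a reshape; a minimal sketch (helper name is illustrative):

```python
import numpy as np

def prepare_regression_targets(y) -> np.ndarray:
    """Return targets as a float column vector of shape (n_samples, 1)."""
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y.reshape(-1, 1)  # 1D targets -> (n_samples, 1) column vector
    return y

print(prepare_regression_targets([1.0, 2.0, 3.0]).shape)  # (3, 1)
```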

resolve_loss

resolve_loss(
    loss: str | LossFunction | None,
) -> LossFunction

Resolve user-provided loss spec into a concrete LossFunction instance.

Provides flexible loss specification allowing string names, instances, or None.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `loss` | `str \| LossFunction \| None` | Loss specification: `None` returns `MSELoss()` as the default; a `str` is looked up as a key in `LOSS_REGISTRY` (case-insensitive); a `LossFunction` instance is returned as-is. | *required* |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `LossFunction` | `LossFunction` | Instantiated loss function ready for use. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If a string `loss` is not in `LOSS_REGISTRY`. |
| `TypeError` | If `loss` is not `None`, a `str`, or a `LossFunction` instance. |

Examples:

>>> loss1 = resolve_loss(None)  # Returns MSELoss()
>>> loss2 = resolve_loss("mse")
>>> loss3 = resolve_loss("cross_entropy")
>>> loss4 = resolve_loss(CrossEntropyLoss())
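The resolution rules could be implemented along these lines; this is a sketch in which minimal stubs stand in for the real classes so it runs standalone, and the registry contents are illustrative rather than the toolbox's actual mapping:

```python
# Stubs standing in for the real LossFunction hierarchy (illustrative only).
class LossFunction: ...
class MSELoss(LossFunction): ...
class CrossEntropyLoss(LossFunction): ...

LOSS_REGISTRY = {"mse": MSELoss, "cross_entropy": CrossEntropyLoss}

def resolve_loss(loss):
    if loss is None:
        return MSELoss()                        # default objective
    if isinstance(loss, LossFunction):
        return loss                             # instance passes through untouched
    if isinstance(loss, str):
        key = loss.lower()                      # case-insensitive registry lookup
        if key not in LOSS_REGISTRY:
            raise ValueError(f"unknown loss {loss!r}; expected one of {sorted(LOSS_REGISTRY)}")
        return LOSS_REGISTRY[key]()
    raise TypeError(f"loss must be None, str, or LossFunction, got {type(loss).__name__}")
```

Note that strings instantiate a fresh loss each call, while a passed-in instance is returned unchanged, so any state it carries is preserved.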