Losses API

anfis_toolbox.losses

Loss functions and their gradients for ANFIS Toolbox.

This module centralizes the loss definitions used during training, making it explicit which objective is being optimized. Trainers import from here so that the chosen loss is declared in one place.

CrossEntropyLoss

Bases: LossFunction

Categorical cross-entropy loss operating on logits.

Implements cross-entropy loss for multi-class classification tasks. Accepts raw logits (unbounded scores) and computes numerically stable loss using log-softmax formulation.

The loss is defined as

L = -(1/n) * Σ_i Σ_j y_true[i,j] * log(softmax(logits)[i,j])

And its gradient with respect to logits is

∇L = (1/n) * (softmax(logits) - y_true)

Numerical stability is achieved through
  • a stable log-softmax computation in the loss method
  • a stable softmax (via maximum subtraction) in the gradient method
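As a sketch of the stabilization idea described above (standalone NumPy, not the toolbox's actual implementation), the log-softmax can be computed with the log-sum-exp trick so that even extreme logits neither overflow nor underflow to `-inf`:

```python
import numpy as np

def stable_cross_entropy(y_true: np.ndarray, logits: np.ndarray) -> float:
    """Mean cross-entropy from one-hot targets and raw logits.

    Uses log softmax(z) = (z - max(z)) - log(sum(exp(z - max(z)))),
    so exp() is only ever applied to non-positive values.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for stability
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(y_true * log_softmax).sum(axis=1).mean())

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
logits = np.array([[1000.0, 0.0], [0.0, 1000.0]])  # extreme logits, no overflow
print(stable_cross_entropy(y_true, logits))        # ~0.0 for a confident correct prediction
```

A naive `np.log(softmax)` would return `log(0) = -inf` for the losing classes here; the shifted formulation sidesteps that.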

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Compute gradient of cross-entropy with respect to logits.

The gradient simplifies to `(softmax(logits) - one_hot(y_true)) / n`. This form is derived from the chain rule applied to the cross-entropy loss.

Accepts integer labels or one-hot encoded targets. Returns gradient with the same shape as logits.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array of shape `(n_samples,)` of integer class labels, or one-hot encoded array of shape `(n_samples, n_classes)`. | *required* |
| `y_pred` | `ndarray` | Raw logit scores of shape `(n_samples, n_classes)`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Gradient of shape `(n_samples, n_classes)`, with values typically in `[-1, 1]`, indicating the direction that decreases the loss. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If a one-hot `y_true` shape doesn't match the logits shape. |
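A minimal sketch of this simplified gradient (standalone NumPy, handling both integer labels and one-hot targets; function name is illustrative, not the toolbox's internal code):

```python
import numpy as np

def cross_entropy_gradient(y_true: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Gradient of mean cross-entropy w.r.t. logits: (softmax(logits) - y_true) / n."""
    n = logits.shape[0]
    if y_true.ndim == 1:                                   # integer labels -> one-hot
        one_hot = np.zeros_like(logits)
        one_hot[np.arange(n), y_true.astype(int)] = 1.0
        y_true = one_hot
    shifted = logits - logits.max(axis=1, keepdims=True)   # stable softmax
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return (probs - y_true) / n

grad = cross_entropy_gradient(np.array([0, 1]), np.array([[2.0, 0.0], [0.0, 3.0]]))
```

A useful sanity check: since each softmax row and each one-hot row both sum to 1, every row of the returned gradient sums to zero.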

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute mean cross-entropy from integer labels or one-hot vs logits.

Uses stable log-softmax computation to prevent numerical underflow. Handles both integer class labels and one-hot encoded targets.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array of shape `(n_samples,)` of integer class labels (`0` to `n_classes - 1`), or one-hot encoded array of shape `(n_samples, n_classes)`. | *required* |
| `y_pred` | `ndarray` | Raw logit scores of shape `(n_samples, n_classes)`. | *required* |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `float` | `float` | Mean cross-entropy loss across all samples. |

Notes
  • Returns 0.0 if batch is empty (n_samples == 0)
  • Numerically stable for arbitrarily large or small logit values
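The integer-label versus one-hot handling and the empty-batch guard noted above might look like this (a sketch under those assumptions, not the toolbox's exact code):

```python
import numpy as np

def ce_loss(y_true: np.ndarray, logits: np.ndarray) -> float:
    """Mean cross-entropy accepting integer labels or one-hot targets."""
    n = logits.shape[0]
    if n == 0:
        return 0.0                                   # empty batch -> 0.0 by convention
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    if y_true.ndim == 1:                             # integer labels: pick the true-class log-prob
        picked = log_softmax[np.arange(n), y_true.astype(int)]
    else:                                            # one-hot: weighted sum per row
        picked = (y_true * log_softmax).sum(axis=1)
    return float(-picked.mean())
```

For two equally likely classes and logits of all zeros, both input forms give the same value, `log(2) ≈ 0.693`.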

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Convert labels or one-hot encodings into dense float matrices.

Accepts either
  • 1D integer class labels (0 to n_classes-1)
  • 2D one-hot encoded targets

If 1D labels are provided, automatically converts to one-hot encoding. If model is provided with an n_classes attribute, validates consistency.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y` | `Any` | Target labels as a 1D array of integers or a 2D one-hot array. | *required* |
| `model` | `Any \| None` | Optional model instance. If provided, uses `model.n_classes` to infer the number of classes and validate dimensions. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | One-hot encoded targets of shape `(n_samples, n_classes)`. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `y` is not 1- or 2-dimensional, or if its dimensions don't match the model. |
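The conversion logic described above can be sketched as follows (a standalone illustration; the helper name and the `n_classes` parameter standing in for the model attribute are assumptions):

```python
import numpy as np

def prepare_class_targets(y, n_classes=None) -> np.ndarray:
    """Convert integer labels or a one-hot array into a dense float one-hot matrix."""
    y = np.asarray(y)
    if y.ndim == 1:                                   # integer labels -> one-hot
        k = n_classes if n_classes is not None else int(y.max()) + 1
        out = np.zeros((y.shape[0], k), dtype=float)
        out[np.arange(y.shape[0]), y.astype(int)] = 1.0
        return out
    if y.ndim == 2:                                   # already one-hot: validate and cast
        if n_classes is not None and y.shape[1] != n_classes:
            raise ValueError(f"expected {n_classes} classes, got {y.shape[1]}")
        return y.astype(float)
    raise ValueError("targets must be 1D labels or a 2D one-hot array")

print(prepare_class_targets([0, 2, 1], n_classes=3))
```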

LossFunction

Base interface for losses used by trainers.

This abstract class defines the contract that all loss functions must implement. Subclasses should override the loss, gradient, and optionally prepare_targets methods to implement specific loss functions.

The typical workflow is
  1. Call prepare_targets to format raw targets into the expected format
  2. Call loss to compute the scalar loss value
  3. Call gradient to compute loss gradients for backpropagation
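The contract above might be expressed as an abstract base class along these lines (a sketch; the method names match the docs, the ABC mechanics and the pass-through default are assumptions):

```python
from abc import ABC, abstractmethod
from typing import Any

import numpy as np

class LossFunction(ABC):
    """Contract shared by all trainer losses."""

    @abstractmethod
    def loss(self, y_true: np.ndarray, y_pred: np.ndarray) -> float:
        """Scalar loss for the given targets and predictions."""

    @abstractmethod
    def gradient(self, y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
        """Gradient of the loss with respect to the predictions."""

    def prepare_targets(self, y: Any, *, model: Any = None) -> np.ndarray:
        """Default pass-through; subclasses may reshape or one-hot encode."""
        return np.asarray(y, dtype=float)
```

Making `prepare_targets` concrete while keeping `loss` and `gradient` abstract matches the note that subclasses override it only optionally.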

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Return the gradient of the loss with respect to the predictions.

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute the scalar loss for the given targets and predictions.

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Return targets in a format compatible with forward/gradient computations.

MSELoss

Bases: LossFunction

Mean squared error loss packaged for trainer consumption.

Implements the MSE loss function commonly used for regression tasks. MSE measures the average squared difference between predicted and actual values.

The loss is defined as

L = (1/n) * Σ(y_pred - y_true)²

And its gradient with respect to predictions is

∇L = (2/n) * (y_pred - y_true)
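The two formulas above translate directly into NumPy; as a hedged illustration (standalone functions, not the toolbox's methods):

```python
import numpy as np

def mse_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """L = (1/n) * sum((y_pred - y_true)**2)."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(diff ** 2))

def mse_gradient(y_true: np.ndarray, y_pred: np.ndarray) -> np.ndarray:
    """dL/dy_pred = (2/n) * (y_pred - y_true)."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return 2.0 * diff / diff.size

y_true = np.array([[1.0], [2.0]])
y_pred = np.array([[1.5], [1.0]])
print(mse_loss(y_true, y_pred))  # mean of 0.25 and 1.0 -> 0.625
```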

gradient

gradient(y_true: ndarray, y_pred: ndarray) -> np.ndarray

Compute gradient of MSE with respect to predictions.

The gradient is computed as: ∇L = (2/n) * (y_pred - y_true)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | True target values, shape `(n_samples, 1)`. | *required* |
| `y_pred` | `ndarray` | Predicted values, same shape as `y_true`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Gradient array with the same shape as `y_pred`. |

loss

loss(y_true: ndarray, y_pred: ndarray) -> float

Compute the mean squared error (MSE).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y_true` | `ndarray` | Array-like of true target values, shape `(...,)`. | *required* |
| `y_pred` | `ndarray` | Array-like of predicted values, same shape as `y_true`. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `float` | The mean of squared differences over all elements, as a float. |

Notes
  • Inputs are coerced to NumPy arrays with dtype=float.
  • Broadcasting follows NumPy semantics. If shapes are not compatible for element-wise subtraction, a ValueError will be raised by NumPy.

prepare_targets

prepare_targets(
    y: Any, *, model: Any | None = None
) -> np.ndarray

Convert 1D targets into column vectors expected by MSE computations.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `y` | `Any` | Array-like target values. Can be 1D or already 2D. | *required* |
| `model` | `Any \| None` | Optional model instance (unused for MSE). | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Targets as a 2D column vector of shape `(n_samples, 1)`. |
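The column-vector conversion amounts to a reshape; a minimal sketch (helper name is illustrative):

```python
import numpy as np

def prepare_regression_targets(y) -> np.ndarray:
    """Return targets as a float column vector of shape (n_samples, 1)."""
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y.reshape(-1, 1)  # 1D targets -> (n_samples, 1) column vector
    return y

print(prepare_regression_targets([1.0, 2.0, 3.0]).shape)  # (3, 1)
```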

resolve_loss

resolve_loss(
    loss: str | LossFunction | None,
) -> LossFunction

Resolve user-provided loss spec into a concrete LossFunction instance.

Provides flexible loss specification allowing string names, instances, or None.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `loss` | `str \| LossFunction \| None` | Loss specification: `None` returns `MSELoss()` as the default; a `str` is looked up as a key in `LOSS_REGISTRY` (case-insensitive); a `LossFunction` instance is returned as-is. | *required* |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `LossFunction` | `LossFunction` | Instantiated loss function ready for use. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If a string `loss` is not in `LOSS_REGISTRY`. |
| `TypeError` | If `loss` is not `None`, a `str`, or a `LossFunction` instance. |

Examples:

>>> loss1 = resolve_loss(None)  # Returns MSELoss()
>>> loss2 = resolve_loss("mse")
>>> loss3 = resolve_loss("cross_entropy")
>>> loss4 = resolve_loss(CrossEntropyLoss())
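The resolution rules could be implemented along these lines; this is a sketch in which minimal stubs stand in for the real classes so it runs standalone, and the registry contents are illustrative rather than the toolbox's actual mapping:

```python
# Stubs standing in for the real LossFunction hierarchy (illustrative only).
class LossFunction: ...
class MSELoss(LossFunction): ...
class CrossEntropyLoss(LossFunction): ...

LOSS_REGISTRY = {"mse": MSELoss, "cross_entropy": CrossEntropyLoss}

def resolve_loss(loss):
    if loss is None:
        return MSELoss()                        # default objective
    if isinstance(loss, LossFunction):
        return loss                             # instance passes through untouched
    if isinstance(loss, str):
        key = loss.lower()                      # case-insensitive registry lookup
        if key not in LOSS_REGISTRY:
            raise ValueError(f"unknown loss {loss!r}; expected one of {sorted(LOSS_REGISTRY)}")
        return LOSS_REGISTRY[key]()
    raise TypeError(f"loss must be None, str, or LossFunction, got {type(loss).__name__}")
```

Note that strings instantiate a fresh loss each call, while a passed-in instance is returned unchanged, so any state it carries is preserved.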