Models

Concrete TSK model variants.

Each class in this module is a subclass of BaseTSK that bundles a specific antecedent strategy (t-norm), defuzzification head, and consequent architecture. Users typically access these through the sklearn-style estimator wrappers in highfis.estimators.

Model Family Overview

HTSK Configuration: t_norm="gmean" + SoftmaxLogDefuzzifier

Classes:
    - `HTSKClassifier`
    - `HTSKRegressor`

Behavior:
    - `softmax(log(w^{1/D}))`

TSK (vanilla) Configuration: t_norm="prod" + SumBasedDefuzzifier

Classes:
    - `TSKClassifier`
    - `TSKRegressor`

Behavior:
    - `w_r / Σw`

LogTSK Configuration: t_norm="prod" + InvLogDefuzzifier

Classes:
    - `LogTSKClassifier`
    - `LogTSKRegressor`

Behavior:
    - Inverse-log normalization of log-domain rule weights

DombiTSK Configuration: t_norm="dombi" + SumBasedDefuzzifier

Classes:
    - `DombiTSKClassifier`
    - `DombiTSKRegressor`

ADMTSK Configuration: adaptive Dombi T-norm + CompositeGMF + SumBasedDefuzzifier

Classes:
    - `ADMTSKClassifier`
    - `ADMTSKRegressor`

AYATSK Configuration: t_norm="yager" + SumBasedDefuzzifier

Classes:
    - `AYATSKClassifier`
    - `AYATSKRegressor`

AdaTSK Configuration: adaptive softmin (Ada-softmin) + SumBasedDefuzzifier

Classes:
    - `AdaTSKClassifier`
    - `AdaTSKRegressor`

ADPTSK Configuration: adaptive double-parameter softmin (ADP-softmin) + SumBasedDefuzzifier

Classes:
    - `ADPTSKClassifier`
    - `ADPTSKRegressor`

FSRE-AdaTSK Configuration: adaptive softmin + SoftmaxLogDefuzzifier

Classes:
    - `FSREAdaTSKClassifier`
    - `FSREAdaTSKRegressor`

DG-ALETSK Configuration: ALE-softmin + SoftmaxLogDefuzzifier

Classes:
    - `DGALETSKClassifier`
    - `DGALETSKRegressor`

DG-TSK Configuration: product + M-gate + SoftmaxLogDefuzzifier

Classes:
    - `DGTSKClassifier`
    - `DGTSKRegressor`

HDFIS Configuration:
    - HDFIS-prod: t_norm="prod" with DimensionDependentGaussianMF + SumBasedDefuzzifier
    - HDFIS-min: t_norm="min" with frozen antecedents + SumBasedDefuzzifier

Classes:
    - `HDFISProdClassifier`
    - `HDFISProdRegressor`
    - `HDFISMinClassifier`
    - `HDFISMinRegressor`

Notes
  • All variants normalize rule firing strengths across rules.
  • SoftmaxLogDefuzzifier improves numerical stability via log-space normalization (see the sketch after these notes).
  • InvLogDefuzzifier applies inverse-log normalization.
  • Adaptive softmin variants dynamically adjust aggregation behavior.
  • All classes are exported by this module and are intended for use as concrete TSK classifiers and regressors.
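
The two normalization behaviours listed in the overview can be illustrated directly. The following is a minimal sketch of the math only, not the library's defuzzifier implementations; the firing-strength values and the dimension `D` are made up for illustration.

```python
import torch

D = 50  # number of input features (illustrative)
w = torch.tensor([1e-30, 5e-31, 2e-29, 1e-32])  # raw rule firing strengths

# Vanilla TSK / SumBasedDefuzzifier behaviour: w_r / sum(w).
sum_norm = w / w.sum()

# HTSK / SoftmaxLogDefuzzifier behaviour: softmax(log(w^{1/D})) = softmax(log(w) / D).
# Staying in the log domain keeps the normalization stable even when products
# of D memberships underflow toward zero.
htsk_norm = torch.softmax(torch.log(w) / D, dim=0)

print(sum_norm)   # dominated by the largest raw strength
print(htsk_norm)  # a smoother distribution over rules
```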

ADMTSKClassifier

Bases: BaseTSKClassifier

Adaptive Dombi TSK classifier with Composite Gaussian membership functions.

ADMTSK is an adaptive Dombi TSK fuzzy system designed for high-dimensional inference. It combines a Dombi T-norm antecedent with a positive lower-bound Composite Gaussian membership function (CGMF) and normalized first-order consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.
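
For context, the standard two-argument Dombi t-norm with parameter λ > 0 is shown below; this is the general textbook definition, not an excerpt from the highfis source. The adaptive variant used by ADMTSK derives λ from the feature dimension and the CGMF lower bound, controlled by the `adaptive`, `lower_bound`, and `K` parameters documented below.

$$
T_\lambda(a, b) = \frac{1}{1 + \left[\left(\tfrac{1-a}{a}\right)^{\lambda} + \left(\tfrac{1-b}{b}\right)^{\lambda}\right]^{1/\lambda}},
\qquad a, b \in (0, 1],\ \lambda > 0 .
$$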

Initialize the ADMTSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of membership functions. | *required* |
| `n_classes` | `int` | Number of output classes. Must be >= 2. | *required* |
| `rule_base` | `str` | Rule base strategy, either `"coco"` or `"cartesian"`. | `'coco'` |
| `t_norm` | `str` | T-norm identifier. Defaults to `"dombi"`. | `'dombi'` |
| `adaptive` | `bool` | If True, compute adaptive lambda using the feature dimension and membership lower bound. | `True` |
| `lambda_` | `float` | Fixed Dombi parameter λ > 0 when `adaptive` is False. | `1.0` |
| `lower_bound` | `float` | The lower bound for Composite GMF values. | `1.0 / math.e` |
| `K` | `float` | Heuristic constant used to compute adaptive lambda. | `10.0` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom T-norm implementation. Overrides `adaptive` and `lambda_` when provided. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices for custom rule bases. | `None` |
| `defuzzifier` | `nn.Module \| None` | Optional defuzzifier module. | `None` |
| `consequent_batch_norm` | `bool` | If True, apply batch normalization to consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2` or if `lambda_` is invalid when `adaptive` is False. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    t_norm: str = "dombi",
    adaptive: bool = True,
    lambda_: float = 1.0,
    lower_bound: float = 1.0 / math.e,
    K: float = 10.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the ADMTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            membership functions.
        n_classes: Number of output classes. Must be >= 2.
        rule_base: Rule base strategy, either ``"coco"`` or
            ``"cartesian"``.
        t_norm: T-norm identifier. Defaults to ``"dombi"``.
        adaptive: If True, compute adaptive lambda using the feature
            dimension and membership lower bound.
        lambda_: Fixed Dombi parameter ``λ > 0`` when adaptive is False.
        lower_bound: The lower bound for Composite GMF values.
        K: Heuristic constant used to compute adaptive lambda.
        t_norm_fn: Optional custom T-norm implementation. Overrides
            ``adaptive`` and ``lambda_`` when provided.
        rules: Explicit rule antecedent indices for custom rule bases.
        defuzzifier: Optional defuzzifier module.
        consequent_batch_norm: If True, apply batch normalization to
            consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2`` or if ``lambda_`` is invalid
            when adaptive is False.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if not adaptive and lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.n_classes = int(n_classes)
    self.adaptive = bool(adaptive)
    self.lambda_ = float(lambda_)
    self.lower_bound = float(lower_bound)
    self.K = float(K)

    if t_norm_fn is None:
        if self.adaptive:
            t_norm_fn = AdaptiveDombiTNorm(
                dimension=len(input_mfs),
                lower_bound=self.lower_bound,
                K=self.K,
            )
        else:
            t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
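
A hedged usage sketch of the training loop above. It assumes `model` is an already-constructed `ADMTSKClassifier` (building the `input_mfs` mapping is covered by the constructor documentation and the membership-function module); the tensor shapes and hyperparameter values are illustrative only.

```python
import torch

# Assume `model` is an already-constructed ADMTSKClassifier with 20 inputs
# and 3 classes; the data below is random and purely illustrative.
x_train, y_train = torch.randn(200, 20), torch.randint(0, 3, (200,))
x_val, y_val = torch.randn(50, 20), torch.randint(0, 3, (50,))

history = model.fit(
    x_train,
    y_train,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val,
    y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # roll back to the best validation epoch
    verbose=1,          # progress bar
)

print(history["train"][-1], history["val"][-1], history["stopped_epoch"])
```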

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
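
Since all variants normalize firing strengths across rules (see the module notes), `forward_antecedents` is a convenient hook for inspecting which rules fire for a given batch. A small hedged sketch, again assuming an already-constructed `model` and illustrative data:

```python
import torch

x = torch.randn(8, 20)                 # batch of 8 samples, 20 input features
norm_w = model.forward_antecedents(x)  # normalized firing strengths, one column per rule
top_rule = norm_w.argmax(dim=1)        # index of the most active rule for each sample
print(top_rule)
```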

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

ADMTSKRegressor

Bases: BaseTSKRegressor

Adaptive Dombi TSK regressor with Composite Gaussian membership functions.

ADMTSK is an adaptive Dombi TSK fuzzy system designed for high-dimensional inference. It combines a Dombi T-norm antecedent with a positive lower-bound Composite Gaussian membership function (CGMF) and normalized first-order consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialize the ADMTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of membership functions. | *required* |
| `rule_base` | `str` | Rule base strategy, either `"coco"` or `"cartesian"`. | `'coco'` |
| `t_norm` | `str` | T-norm identifier. Defaults to `"dombi"`. | `'dombi'` |
| `adaptive` | `bool` | If True, compute adaptive lambda using the feature dimension and membership lower bound. | `True` |
| `lambda_` | `float` | Fixed Dombi parameter λ > 0 when `adaptive` is False. | `1.0` |
| `lower_bound` | `float` | The lower bound for Composite GMF values. | `1.0 / math.e` |
| `K` | `float` | Heuristic constant used to compute adaptive lambda. | `10.0` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom T-norm implementation. Overrides `adaptive` and `lambda_` when provided. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices for custom rule bases. | `None` |
| `defuzzifier` | `nn.Module \| None` | Optional defuzzifier module. | `None` |
| `consequent_batch_norm` | `bool` | If True, apply batch normalization to consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `lambda_` is invalid when `adaptive` is False. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    t_norm: str = "dombi",
    adaptive: bool = True,
    lambda_: float = 1.0,
    lower_bound: float = 1.0 / math.e,
    K: float = 10.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the ADMTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            membership functions.
        rule_base: Rule base strategy, either ``"coco"`` or
            ``"cartesian"``.
        t_norm: T-norm identifier. Defaults to ``"dombi"``.
        adaptive: If True, compute adaptive lambda using the feature
            dimension and membership lower bound.
        lambda_: Fixed Dombi parameter ``λ > 0`` when adaptive is False.
        lower_bound: The lower bound for Composite GMF values.
        K: Heuristic constant used to compute adaptive lambda.
        t_norm_fn: Optional custom T-norm implementation. Overrides
            ``adaptive`` and ``lambda_`` when provided.
        rules: Explicit rule antecedent indices for custom rule bases.
        defuzzifier: Optional defuzzifier module.
        consequent_batch_norm: If True, apply batch normalization to
            consequent inputs.

    Raises:
        ValueError: If ``lambda_`` is invalid when adaptive is False.
    """
    if not adaptive and lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.adaptive = bool(adaptive)
    self.lambda_ = float(lambda_)
    self.lower_bound = float(lower_bound)
    self.K = float(K)

    if t_norm_fn is None:
        if self.adaptive:
            t_norm_fn = AdaptiveDombiTNorm(
                dimension=len(input_mfs),
                lower_bound=self.lower_bound,
                K=self.K,
            )
        else:
            t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
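
The regressor shares the same training loop; the main differences on the user side are the 1-D float targets and the squeezed 1-D predictions shown above. A brief hedged sketch, again assuming a pre-built `model` (here an `ADMTSKRegressor`) and illustrative data:

```python
import torch

# Assume `model` is an already-constructed ADMTSKRegressor with 20 inputs.
x, y = torch.randn(300, 20), torch.randn(300)  # targets have shape (N,)

history = model.fit(x, y, epochs=200, batch_size=64)
y_hat = model.predict(x)                       # shape (N,), squeezed from (N, 1)
print(history["train"][-1], y_hat.shape)
```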

ADPTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with adaptive double-parameter softmin antecedent (ADPTSK).

The firing strength of each rule is computed with the ADP-softmin operator, and the membership functions are wrapped as Gaussian PIMFs to preserve a positive infimum during high-dimensional training.

Reference

Ma, M., Qian, L., Zhang, Y., Fang, Q., & Xue, G. (2025). An adaptive double-parameter softmin based Takagi-Sugeno-Kang fuzzy system for high-dimensional data. Fuzzy Sets and Systems, 521, 109582. https://doi.org/10.1016/j.fss.2025.109582
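
As background for the antecedent above: a softmin is a smooth, differentiable surrogate for the hard minimum over per-feature memberships. The sketch below shows one generic exponentially weighted softmin; it is illustrative only and is not the ADP double-parameter form, whose exact parameterisation (with `kappa` and `xi`) is defined in the cited paper.

```python
import torch

def generic_softmin(u: torch.Tensor, q: float = 50.0, dim: int = -1) -> torch.Tensor:
    """Exponentially weighted softmin: approaches min(u) as q grows."""
    weights = torch.softmax(-q * u, dim=dim)  # emphasizes the smallest entries
    return (weights * u).sum(dim=dim)

mu = torch.tensor([[0.9, 0.2, 0.7, 0.95]])    # per-feature memberships of one rule
print(generic_softmin(mu, q=50.0), mu.min())  # the soft value is close to 0.2
```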

Initialise the ADPTSK classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    kappa: float = 690.0,
    xi: float = 730.0,
    eps: float | None = None,
) -> None:
    """Initialise the ADPTSK classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.kappa = float(kappa)
    self.xi = float(xi)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = ADPSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        kappa=self.kappa,
        xi=self.xi,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

ADPTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with adaptive double-parameter softmin antecedent (ADPTSK).

The firing strength of each rule is computed with the ADP-softmin operator, and the membership functions are wrapped as Gaussian PIMFs to preserve a positive infimum during high-dimensional training.

Reference

Ma, M., Qian, L., Zhang, Y., Fang, Q., & Xue, G. (2025). An adaptive double-parameter softmin based Takagi-Sugeno-Kang fuzzy system for high-dimensional data. Fuzzy Sets and Systems, 521, 109582. https://doi.org/10.1016/j.fss.2025.109582

Initialise the ADPTSK regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    kappa: float = 690.0,
    xi: float = 730.0,
    eps: float | None = None,
) -> None:
    """Initialise the ADPTSK regressor."""
    self.kappa = float(kappa)
    self.xi = float(xi)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = ADPSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        kappa=self.kappa,
        xi=self.xi,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
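
Example usage (a minimal sketch): the tensors below are random stand-ins for real data, and model denotes any classifier or regressor from this module that has already been constructed (see the constructor examples later on this page).

import torch

# Hedged sketch: `model` is an already-constructed TSK variant from this module;
# the tensors are illustrative random data, not a real dataset.
x_train = torch.randn(256, model.n_inputs)
y_train = torch.randint(0, 2, (256,))          # 1-D targets of shape (N,)
x_val = torch.randn(64, model.n_inputs)
y_val = torch.randint(0, 2, (64,))

history = model.fit(
    x_train, y_train,
    epochs=200,
    batch_size=32,
    ur_weight=0.05,          # optional uniform-rule regularization; 0.0 disables it
    x_val=x_val, y_val=y_val,
    patience=20,             # early stopping on the validation metric
    restore_best=True,
    verbose=1,               # progress bar
)

# Per-epoch losses and the epoch at which training stopped.
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])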

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
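
Because forward_antecedents returns the normalized firing strengths, it can be used to inspect which rules dominate a prediction. A minimal sketch follows; the output shape (batch, n_rules) is an assumption based on the docstring, and `model` denotes an already-constructed instance.

import torch

# Hedged sketch: inspect normalized rule activations for a small batch.
x = torch.randn(8, model.n_inputs)
with torch.no_grad():
    norm_w = model.forward_antecedents(x)    # assumed shape: (batch, n_rules)

# Rank rules by mean activation over the batch.
mean_activation = norm_w.mean(dim=0)
top_rules = torch.argsort(mean_activation, descending=True)
print(top_rules[:5], mean_activation[top_rules[:5]])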

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

AYATSKClassifier

Bases: BaseTSKClassifier

TSK classifier with an adaptive Yager T-norm in the antecedent.

AYATSK extends TSK by using an adaptive Yager T-norm aggregation and optional positive lower-bound membership functions to improve stability and performance in high-dimensional settings.

Reference

G. Xue, Y. Yang and J. Wang, "Adaptive Yager T-Norm-Based Takagi-Sugeno-Kang Fuzzy Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 12, pp. 9802-9815, Dec. 2025, doi: 10.1109/TSMC.2025.3621346.

Initialise the AYATSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
t_norm str

T-norm identifier (default "yager").

'yager'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    t_norm: str = "yager",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the AYATSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        t_norm: T-norm identifier (default ``"yager"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
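
Example construction (a minimal sketch): GaussianMF and its (center, sigma) signature are assumptions used purely for illustration; substitute whichever MembershipFunction subclasses highfis.memberships actually provides.

import torch
from highfis.models import AYATSKClassifier
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

# Two fuzzy sets ("low"/"high") per feature, three features, binary classification.
input_mfs = {
    name: [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for name in ("f0", "f1", "f2")
}

clf = AYATSKClassifier(
    input_mfs,
    n_classes=2,
    rule_base="coco",    # default; "cartesian" enumerates all MF combinations
    t_norm="yager",      # default adaptive Yager t-norm
)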

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
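
A short prediction sketch: `clf` stands for any trained classifier from this module; x_new is illustrative random data.

import torch

x_new = torch.randn(5, clf.n_inputs)
proba = clf.predict_proba(x_new)   # shape (5, n_classes); rows sum to 1
labels = clf.predict(x_new)        # shape (5,); argmax over classes
print(proba, labels)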

AYATSKRegressor

Bases: BaseTSKRegressor

TSK regressor with an adaptive Yager T-norm in the antecedent.

AYATSK extends TSK by using an adaptive Yager T-norm aggregation and optional positive lower-bound membership functions to improve stability and performance in high-dimensional settings.

Reference

G. Xue, Y. Yang and J. Wang, "Adaptive Yager T-Norm-Based Takagi-Sugeno-Kang Fuzzy Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 12, pp. 9802-9815, Dec. 2025, doi: 10.1109/TSMC.2025.3621346.

Initialise the AYATSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
t_norm str

T-norm identifier (default "yager").

'yager'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    t_norm: str = "yager",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the AYATSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        t_norm: T-norm identifier (default ``"yager"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
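
Example regression workflow (a minimal sketch): as above, GaussianMF is an illustrative stand-in for a concrete MembershipFunction subclass from highfis.memberships, and the data are random placeholders.

import torch
from highfis.models import AYATSKRegressor
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

input_mfs = {
    f"x{i}": [GaussianMF(center=c, sigma=1.0) for c in (-1.0, 0.0, 1.0)]
    for i in range(4)
}
reg = AYATSKRegressor(input_mfs)     # defaults: rule_base="coco", t_norm="yager"

x = torch.randn(128, reg.n_inputs)
y = torch.randn(128)                 # 1-D regression targets
reg.fit(x, y, epochs=50, verbose=0)
y_hat = reg.predict(x)               # 1-D tensor of predictions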

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

AdaTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with adaptive softmin antecedent (AdaTSK).

The firing strength of each rule is computed with the Ada-softmin operator.
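
To convey the idea, the sketch below shows a plain fixed-exponent softmin, which smoothly approximates the minimum of the rule's membership values; it is not the library's Ada-softmin, whose adaptive exponent is chosen per input to avoid numerical under- and overflow.

import torch

# Illustrative only: plain softmin with a fixed negative exponent q.
# As q -> -inf this tends to the exact minimum over the antecedents.
def softmin(mu: torch.Tensor, q: float = -20.0) -> torch.Tensor:
    """mu: membership values, shape (batch, n_rules, n_antecedents)."""
    return mu.pow(q).mean(dim=-1).pow(1.0 / q)   # -> (batch, n_rules)

mu = torch.tensor([[[0.9, 0.2, 0.7]]])
print(softmin(mu))   # close to min(0.9, 0.2, 0.7) = 0.2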

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the AdaTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
) -> None:
    """Initialise the AdaTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        eps=self.eps,
    )
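
Example construction (a minimal sketch): unlike the t-norm-based variants, AdaTSK fixes the antecedent to the Ada-softmin rule layer, so there is no t_norm argument. GaussianMF is again an illustrative stand-in for a highfis.memberships class.

import torch
from highfis.models import AdaTSKClassifier
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(10)
}

# eps, when given, is the numerical-stability epsilon of the Ada-softmin operator.
clf = AdaTSKClassifier(input_mfs, n_classes=3, eps=1e-12)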

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

AdaTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with adaptive softmin antecedent (AdaTSK).

The firing strength of each rule is computed with the Ada-softmin operator.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the AdaTSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
) -> None:
    """Initialise the AdaTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
    """
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
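
As a usage sketch (all tensor names are placeholders; shapes follow the documented contract), a typical call with validation-based early stopping looks like:

history = model.fit(
    x_train, y_train,              # (N, n_inputs) features, (N,) targets
    epochs=200,
    learning_rate=1e-3,
    batch_size=64,
    x_val=x_val, y_val=y_val,      # enables per-epoch validation
    patience=20,                   # stop after 20 epochs without improvement
    restore_best=True,             # reload the best validation weights
    verbose=1,                     # progress bar
)

print(history["stopped_epoch"])                  # last epoch that ran
print(history["train"][-1], history["val"][-1])  # final per-epoch losses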

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

BaseTSKClassifier

Bases: BaseTSK

Abstract classifier base that provides task-specific training and inference helpers.

Initialize the TSK pipeline layers.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature names to sequences of :class:~highfis.memberships.MembershipFunction objects. Must not be empty.

required
rule_base str

Rule-base construction strategy. Supported values: "cartesian" (all MF combinations), "coco" (same-index compact), "en" (enhanced FRB), or "custom" (explicit rules via rules).

'cartesian'
t_norm str

Built-in T-norm name. Ignored when t_norm_fn is provided. Common values: "prod", "gmean", "min", "dombi", "yager".

'gmean'
t_norm_fn TNormFn | None

Optional custom T-norm callable. When provided, t_norm is internally set to "prod" and the rule layer applies this function instead.

None
rules Sequence[Sequence[int]] | None

Explicit rule index sequences. Required when rule_base is "custom".

None
defuzzifier nn.Module | None

Normalization module applied to raw rule firing strengths. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

If True, insert a :class:~torch.nn.BatchNorm1d layer on the inputs before the consequent computation.

False

Raises:

Type Description
ValueError

If input_mfs is empty.

Source code in highfis/base.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    *,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the TSK pipeline layers.

    Args:
        input_mfs: Mapping from feature names to sequences of
            :class:`~highfis.memberships.MembershipFunction` objects.
            Must not be empty.
        rule_base: Rule-base construction strategy.  Supported values:
            ``"cartesian"`` (all MF combinations), ``"coco"``
            (same-index compact), ``"en"`` (enhanced FRB), or
            ``"custom"`` (explicit rules via *rules*).
        t_norm: Built-in T-norm name.  Ignored when *t_norm_fn* is
            provided.  Common values: ``"prod"``, ``"gmean"``,
            ``"min"``, ``"dombi"``, ``"yager"``.
        t_norm_fn: Optional custom T-norm callable.  When provided,
            *t_norm* is internally set to ``"prod"`` and the rule
            layer applies this function instead.
        rules: Explicit rule index sequences.  Required when
            *rule_base* is ``"custom"``.
        defuzzifier: Normalization module applied to raw rule firing
            strengths.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: If ``True``, insert a
            :class:`~torch.nn.BatchNorm1d` layer on the inputs before
            the consequent computation.

    Raises:
        ValueError: If *input_mfs* is empty.
    """
    super().__init__()
    if not input_mfs:
        raise ValueError("input_mfs must not be empty")

    self.input_mfs = input_mfs
    self.input_names = list(input_mfs.keys())
    self.n_inputs = len(self.input_names)
    mf_per_input = [len(input_mfs[name]) for name in self.input_names]

    self.membership_layer = MembershipLayer(input_mfs)
    self.rule_layer = RuleLayer(
        self.input_names,
        mf_per_input,
        rules=rules,
        rule_base=rule_base,
        t_norm=t_norm if t_norm_fn is None else "prod",
        t_norm_fn=t_norm_fn,
    )
    self.n_rules = self.rule_layer.n_rules
    self.defuzzifier = defuzzifier or SoftmaxLogDefuzzifier()
    self.consequent_batch_norm = bool(consequent_batch_norm)
    self.consequent_bn = nn.BatchNorm1d(self.n_inputs) if self.consequent_batch_norm else None
    self.consequent_layer = self._build_consequent_layer()
    self.logger = logging.getLogger(f"{self.__class__.__module__}.{self.__class__.__name__}")
    if not self.logger.handlers:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(stream_handler)
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False
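
Once any concrete subclass has been constructed, the attributes set up above summarise the pipeline. A quick inspection sketch (model is a placeholder for an already built classifier or regressor):

print(model.n_inputs)     # number of input features
print(model.input_names)  # feature names, in input_mfs insertion order
print(model.n_rules)      # rules generated by the chosen rule_base strategy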

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
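
Because every variant normalizes firing strengths across rules, the output of forward_antecedents can be inspected directly. A small sketch (x_batch is a placeholder tensor of shape (batch, n_inputs)):

norm_w = model.forward_antecedents(x_batch)  # shape (batch, n_rules)
print(norm_w.sum(dim=1))                     # each row sums to (approximately) 1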

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
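
Inference on a trained classifier then reduces to two calls (x_test is a placeholder of shape (M, n_inputs)):

proba = model.predict_proba(x_test)  # (M, n_classes) softmax probabilities
labels = model.predict(x_test)       # (M,) integer class indices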

BaseTSKRegressor

Bases: BaseTSK

Abstract regressor base that provides task-specific training and inference helpers.

Initialize the TSK pipeline layers.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature names to sequences of :class:~highfis.memberships.MembershipFunction objects. Must not be empty.

required
rule_base str

Rule-base construction strategy. Supported values: "cartesian" (all MF combinations), "coco" (same-index compact), "en" (enhanced FRB), or "custom" (explicit rules via rules).

'cartesian'
t_norm str

Built-in T-norm name. Ignored when t_norm_fn is provided. Common values: "prod", "gmean", "min", "dombi", "yager".

'gmean'
t_norm_fn TNormFn | None

Optional custom T-norm callable. When provided, t_norm is internally set to "prod" and the rule layer applies this function instead.

None
rules Sequence[Sequence[int]] | None

Explicit rule index sequences. Required when rule_base is "custom".

None
defuzzifier nn.Module | None

Normalization module applied to raw rule firing strengths. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

If True, insert a :class:~torch.nn.BatchNorm1d layer on the inputs before the consequent computation.

False

Raises:

Type Description
ValueError

If input_mfs is empty.

Source code in highfis/base.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    *,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the TSK pipeline layers.

    Args:
        input_mfs: Mapping from feature names to sequences of
            :class:`~highfis.memberships.MembershipFunction` objects.
            Must not be empty.
        rule_base: Rule-base construction strategy.  Supported values:
            ``"cartesian"`` (all MF combinations), ``"coco"``
            (same-index compact), ``"en"`` (enhanced FRB), or
            ``"custom"`` (explicit rules via *rules*).
        t_norm: Built-in T-norm name.  Ignored when *t_norm_fn* is
            provided.  Common values: ``"prod"``, ``"gmean"``,
            ``"min"``, ``"dombi"``, ``"yager"``.
        t_norm_fn: Optional custom T-norm callable.  When provided,
            *t_norm* is internally set to ``"prod"`` and the rule
            layer applies this function instead.
        rules: Explicit rule index sequences.  Required when
            *rule_base* is ``"custom"``.
        defuzzifier: Normalization module applied to raw rule firing
            strengths.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: If ``True``, insert a
            :class:`~torch.nn.BatchNorm1d` layer on the inputs before
            the consequent computation.

    Raises:
        ValueError: If *input_mfs* is empty.
    """
    super().__init__()
    if not input_mfs:
        raise ValueError("input_mfs must not be empty")

    self.input_mfs = input_mfs
    self.input_names = list(input_mfs.keys())
    self.n_inputs = len(self.input_names)
    mf_per_input = [len(input_mfs[name]) for name in self.input_names]

    self.membership_layer = MembershipLayer(input_mfs)
    self.rule_layer = RuleLayer(
        self.input_names,
        mf_per_input,
        rules=rules,
        rule_base=rule_base,
        t_norm=t_norm if t_norm_fn is None else "prod",
        t_norm_fn=t_norm_fn,
    )
    self.n_rules = self.rule_layer.n_rules
    self.defuzzifier = defuzzifier or SoftmaxLogDefuzzifier()
    self.consequent_batch_norm = bool(consequent_batch_norm)
    self.consequent_bn = nn.BatchNorm1d(self.n_inputs) if self.consequent_batch_norm else None
    self.consequent_layer = self._build_consequent_layer()
    self.logger = logging.getLogger(f"{self.__class__.__module__}.{self.__class__.__name__}")
    if not self.logger.handlers:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(stream_handler)
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

DGALETSKClassifier

Bases: BaseTSKClassifier

DG-ALETSK classifier with ALE-softmin antecedent and double-group gates.

DG-ALETSK extends FSRE-AdaTSK by replacing the adaptive softmin with the Adaptive Ln-Exp (ALE) softmin — a smoother variant with improved numerical stability. It also uses a zero-order consequent in the DG (data-guided) training phase and optionally converts to first-order after gate-based pruning.

Reference

G. Xue, J. Wang, B. Yuan and C. Dai, "DG-ALETSK: A High-Dimensional Fuzzy Approach With Simultaneous Feature Selection and Rule Extraction," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 11, pp. 3866-3880, Nov. 2023, doi: 10.1109/TFUZZ.2023.3270445.

Initialise the DG-ALETSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
lambda_init float

Initial ALE-softmin parameter alpha > 0 (default 1.0).

1.0
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the ALE-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB).

False

Raises:

Type Description
ValueError

If n_classes < 2 or lambda_init <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    lambda_init: float = 1.0,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-ALETSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        lambda_init: Initial ALE-softmin parameter ``alpha > 0``
            (default ``1.0``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the ALE-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).

    Raises:
        ValueError: If ``n_classes < 2`` or ``lambda_init <= 0``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if lambda_init <= 0.0:
        raise ValueError("lambda_init must be > 0")

    self.n_classes = int(n_classes)
    self.lambda_init = float(lambda_init)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGALETSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        alpha_init=self.lambda_init,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
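
A construction sketch, reusing the illustrative input_mfs mapping from the AdaTSK example above (all values shown are placeholders, not recommendations):

from highfis.models import DGALETSKClassifier

clf = DGALETSKClassifier(
    input_mfs,
    n_classes=3,       # must be >= 2
    rule_base="coco",
    lambda_init=1.0,   # initial ALE-softmin alpha, must be > 0
    use_en_frb=False,  # set True to start directly from the Enhanced FRB
)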

apply_thresholds

Apply threshold pruning to feature and rule gates.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Apply threshold pruning to feature and rule gates."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    rule_layer = self.rule_layer
    cast(Tensor, rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    consequent = self.consequent_layer
    cast(Tensor, consequent.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute feature and rule thresholds from gate values and coefficient pairs.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute feature and rule thresholds from gate values and coefficient pairs."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta

convert_to_first_order

Convert the DG phase zero-order consequent to first-order form.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG phase zero-order consequent to first-order form."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedClassificationZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent
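
Putting the three helpers together, a plausible end-of-training pruning pass might look like the following (the zeta coefficients are illustrative only):

# 1. Derive pruning thresholds from the current gate values.
tau_lambda, tau_theta = clf.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)

# 2. Zero out feature gates <= tau_lambda and rule gates <= tau_theta.
clf.apply_thresholds(tau_lambda, tau_theta)

# 3. Optionally promote the pruned zero-order consequent to first order
#    before a final fine-tuning fit(...).
clf.convert_to_first_order()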

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
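
A minimal usage sketch of fit with a validation split and early stopping. It assumes a DG-ALETSK classifier instance named model (constructed as documented for this class) and synthetic tensors of the documented shapes; the sizes are illustrative only:

import torch

# Shapes must match the model: x is (N, n_inputs), y is (N,) with integer class labels.
x_train = torch.randn(200, model.n_inputs)
y_train = torch.randint(0, model.n_classes, (200,))
x_val = torch.randn(50, model.n_inputs)
y_val = torch.randint(0, model.n_classes, (50,))

history = model.fit(
    x_train,
    y_train,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val,
    y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # roll back to the weights of the best validation epoch
    verbose=1,          # progress bar
)
print(history["stopped_epoch"], history["train"][-1], history["val"][-1])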

fit_dg_phase

Train the DG phase using zero-order TSK and joint FS+RE.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG phase using zero-order TSK and joint FS+RE."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-ALETSK model after converting to first-order TSK.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-ALETSK model after converting to first-order TSK."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized antecedent feature gate values for the DG phase.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized antecedent feature gate values for the DG phase."""
    rule_layer = self.rule_layer
    return _gate_activation(rule_layer.lambda_gates)

get_rule_gate_values

Return normalized consequent rule gate values for the DG phase.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized consequent rule gate values for the DG phase."""
    consequent = self.consequent_layer
    return _gate_activation(consequent.theta_gates)

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
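
A short inference sketch continuing from the fitted model above: predict_proba returns one probability row per sample, and predict returns the argmax class index.

proba = model.predict_proba(x_val)   # shape (M, n_classes); rows sum to 1 via softmax
labels = model.predict(x_val)        # shape (M,); argmax class indices
print(proba[0], labels[0])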

search_thresholds

Search threshold coefficients for feature and rule pruning.

The search follows the DG-ALETSK paper strategy: thresholds are computed from gate values, applied to prune gates, and the first-order consequent parameters are refit with antecedents fixed.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search threshold coefficients for feature and rule pruning.

    The search follows the DG-ALETSK paper strategy: thresholds are
    computed from gate values, applied to prune gates, and the first-order
    consequent parameters are refit with antecedents fixed.
    """
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
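
Taken together, fit_dg_phase, search_thresholds, and fit_finetune give the two-phase DG-ALETSK recipe. A sketch with the same tensors as above; the epoch budgets are illustrative, not values prescribed by the paper:

# Phase 1: data-guided training with the zero-order consequent and both gate groups.
model.fit_dg_phase(x_train, y_train, epochs=100, x_val=x_val, y_val=y_val)

# Grid-search the pruning coefficients. With inplace=True (the default) the best
# candidate is applied: the consequent is converted to first order if needed and
# the pruned state is loaded into the model.
result = model.search_thresholds(x_train, y_train, x_val=x_val, y_val=y_val)
print(result["best_zeta_lambda"], result["best_zeta_theta"], result["best_score"])

# Phase 2: fine-tune the pruned, first-order model.
model.fit_finetune(x_train, y_train, epochs=100, x_val=x_val, y_val=y_val)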

DGALETSKRegressor

Bases: BaseTSKRegressor

DG-ALETSK regressor with ALE-softmin antecedent and double-group gates.

DG-ALETSK extends FSRE-AdaTSK by replacing the adaptive softmin with the Adaptive Ln-Exp (ALE) softmin — a smoother variant with improved numerical stability. It also uses a zero-order consequent in the DG (data-guided) training phase and optionally converts to first-order after gate-based pruning.

Reference

G. Xue, J. Wang, B. Yuan and C. Dai, "DG-ALETSK: A High-Dimensional Fuzzy Approach With Simultaneous Feature Selection and Rule Extraction," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 11, pp. 3866-3880, Nov. 2023, doi: 10.1109/TFUZZ.2023.3270445.

Initialise the DG-ALETSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
lambda_init float

Initial ALE-softmin parameter alpha > 0 (default 1.0).

1.0
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the ALE-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB).

False

Raises:

Type Description
ValueError

If lambda_init <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    lambda_init: float = 1.0,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-ALETSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        lambda_init: Initial ALE-softmin parameter ``alpha > 0``
            (default ``1.0``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the ALE-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).

    Raises:
        ValueError: If ``lambda_init <= 0``.
    """
    if lambda_init <= 0.0:
        raise ValueError("lambda_init must be > 0")

    self.lambda_init = float(lambda_init)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGALETSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        alpha_init=self.lambda_init,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
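
A construction sketch for the regressor. The build_input_mfs helper below is a hypothetical placeholder for however you assemble the per-feature MembershipFunction sequences (see highfis.memberships); everything else follows the signature above:

from highfis.models import DGALETSKRegressor

# Hypothetical helper, not part of highfis: returns a dict mapping each feature
# name to a sequence of MembershipFunction objects covering that feature's range.
input_mfs = build_input_mfs()

reg = DGALETSKRegressor(
    input_mfs,
    rule_base="coco",   # default; "cartesian" is the alternative
    lambda_init=1.0,    # initial ALE-softmin parameter, must be > 0
    use_en_frb=False,   # set True to start directly from the Enhanced FRB
)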

apply_thresholds

Apply threshold pruning to feature and rule gates.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Apply threshold pruning to feature and rule gates."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    rule_layer = self.rule_layer
    cast(Tensor, rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    consequent = self.consequent_layer
    cast(Tensor, consequent.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute feature and rule thresholds from gate values and coefficient pairs.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute feature and rule thresholds from gate values and coefficient pairs."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta
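
compute_thresholds and apply_thresholds can also be driven manually with a fixed pair of zeta coefficients instead of the grid search; a sketch assuming the reg instance from the construction example, already trained with fit_dg_phase:

# Thresholds derived from the current gate values and fixed coefficients.
tau_lambda, tau_theta = reg.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
print(f"feature threshold={tau_lambda:.4f}, rule threshold={tau_theta:.4f}")

# Gate parameters whose activation is at or below the threshold are zeroed out.
reg.apply_thresholds(tau_lambda, tau_theta)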

convert_to_first_order

Convert the DG phase zero-order consequent to first-order form.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG phase zero-order consequent to first-order form."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedRegressionZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
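
The uniform rule (UR) regularization is controlled through ur_weight and ur_target, and its per-epoch loss is tracked separately in the returned history; a sketch on the regressor, assuming reg from above and regression targets of the documented shape:

import torch

x = torch.randn(256, reg.n_inputs)
y = torch.randn(256)   # regression targets, shape (N,)

history = reg.fit(
    x,
    y,
    epochs=100,
    batch_size=64,
    ur_weight=0.1,      # weight of the uniform rule regularization term
    ur_target=None,     # None defaults to 1 / n_rules
    weight_decay=1e-8,  # L2 decay applied to consequent parameters only
    verbose=2,          # summary logging
)
print(history["ur"][-1])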

fit_dg_phase

Train the DG phase using zero-order TSK and joint FS+RE.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG phase using zero-order TSK and joint FS+RE."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-ALETSK model after converting to first-order TSK.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-ALETSK model after converting to first-order TSK."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized antecedent feature gate values for the DG phase.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized antecedent feature gate values for the DG phase."""
    rule_layer = self.rule_layer
    return _gate_activation(rule_layer.lambda_gates)

get_rule_gate_values

Return normalized consequent rule gate values for the DG phase.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized consequent rule gate values for the DG phase."""
    consequent = self.consequent_layer
    return _gate_activation(consequent.theta_gates)

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

search_thresholds

Search threshold coefficients for feature and rule pruning.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search threshold coefficients for feature and rule pruning."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
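
search_thresholds can also be run non-destructively with inplace=False, in which case the model is left untouched and only the best coefficients and score are reported; a sketch reusing the tensors from the earlier regressor examples:

result = reg.search_thresholds(
    x,
    y,
    zeta_lambda=[0.0, 0.5, 1.0],  # coarser grid than the default five values
    zeta_theta=[0.0, 0.5, 1.0],
    use_lse=True,                 # refit first-order consequents (LSE) after pruning
    inplace=False,                # report the best candidate without modifying the model
)
print(result)  # best_score, best_zeta_lambda, best_zeta_theta, tau_lambda, tau_theta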

DGTSKClassifier

Bases: BaseTSKClassifier

DG-TSK classifier with M-gate antecedent and point-based FRB (P-FRB).

DG-TSK uses a data-guided M-gate function to automatically select relevant features and rules.

Reference

Guangdong Xue, Jian Wang, Bingjie Zhang, Bin Yuan, Caili Dai, Double groups of gates based Takagi-Sugeno-Kang (DG-TSK) fuzzy system for simultaneous feature selection and rule extraction, Fuzzy Sets and Systems, Volume 469, 2023, 108627, ISSN 0165-0114, https://doi.org/10.1016/j.fss.2023.108627.

Initialise the DG-TSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
gate_fea str | Callable[[Tensor], Tensor] | None

Gate function for antecedent feature selection. "gate_m" (default) uses the M-gate from the DG-TSK paper. Can also be any callable Tensor → Tensor.

'gate_m'
gate_rule str | Callable[[Tensor], Tensor] | None

Gate function for consequent rule selection. Same options as gate_fea.

'gate_m'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon.

None
use_en_frb bool

Use the Enhanced FRB (P-FRB) rule base.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    gate_fea: str | Callable[[Tensor], Tensor] | None = "gate_m",
    gate_rule: str | Callable[[Tensor], Tensor] | None = "gate_m",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        gate_fea: Gate function for antecedent feature selection.
            ``"gate_m"`` (default) uses the M-gate from the DG-TSK paper.
            Can also be any callable ``Tensor → Tensor``.
        gate_rule: Gate function for consequent rule selection.
            Same options as ``gate_fea``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon.
        use_en_frb: Use the Enhanced FRB (P-FRB) rule base.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.gate_fea = gate_fea
    self.gate_rule = gate_rule
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGTSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        gate_fea=self.gate_fea,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
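
gate_fea and gate_rule accept either the "gate_m" identifier or any callable mapping a tensor to a tensor; a construction sketch with a custom sigmoid feature gate (build_input_mfs is the same hypothetical helper as in the earlier sketch):

import torch
from highfis.models import DGTSKClassifier

def sigmoid_gate(values: torch.Tensor) -> torch.Tensor:
    # Any callable Tensor -> Tensor is accepted as a gate function.
    return torch.sigmoid(values)

clf = DGTSKClassifier(
    input_mfs=build_input_mfs(),  # hypothetical helper, not part of highfis
    n_classes=3,
    gate_fea=sigmoid_gate,        # custom feature gate instead of the default "gate_m"
    gate_rule="gate_m",           # keep the M-gate for rule selection
)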

apply_thresholds

Prune DG-TSK feature and rule gates using the computed thresholds.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Prune DG-TSK feature and rule gates using the computed thresholds."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    cast(Tensor, self.rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    cast(Tensor, self.consequent_layer.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute DG-TSK pruning thresholds from gate values and zeta parameters.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute DG-TSK pruning thresholds from gate values and zeta parameters."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta

convert_to_first_order

Convert the DG-TSK model from zero-order to first-order consequent.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG-TSK model from zero-order to first-order consequent."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedClassificationZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

fit_dg_phase

Train the DG-TSK zero-order phase before first-order conversion.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG-TSK zero-order phase before first-order conversion."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-TSK classifier after conversion to first-order consequents.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-TSK classifier after conversion to first-order consequents."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized DG-TSK feature gate activations from lambda values.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized DG-TSK feature gate activations from lambda values."""
    return self.rule_layer.gate_fn(self.rule_layer.lambda_gates)

get_rule_gate_values

Return normalized DG-TSK rule gate activations from theta values.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized DG-TSK rule gate activations from theta values."""
    return self.consequent_layer.gate_fn(self.consequent_layer.theta_gates)
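
After training, the two gate accessors expose how strongly each feature and rule is retained, and compute_thresholds turns those activations into pruning cut-offs; a small inspection sketch assuming a trained clf:

feature_gates = clf.get_feature_gate_values()  # one activation per antecedent feature gate
rule_gates = clf.get_rule_gate_values()        # one activation per consequent rule gate

tau_lambda, tau_theta = clf.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
print("feature gates above threshold:", int((feature_gates > tau_lambda).sum()))
print("rule gates above threshold:", int((rule_gates > tau_theta).sum()))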

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

search_thresholds

Search DG-TSK threshold combinations and optionally apply the best candidate.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search DG-TSK threshold combinations and optionally apply the best candidate."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result

DGTSKRegressor

Bases: BaseTSKRegressor

DG-TSK regressor with M-gate antecedent and point-based FRB (P-FRB).

DG-TSK uses a data-guided M-gate function to automatically select relevant features and rules.

Reference

Guangdong Xue, Jian Wang, Bingjie Zhang, Bin Yuan, Caili Dai, Double groups of gates based Takagi-Sugeno-Kang (DG-TSK) fuzzy system for simultaneous feature selection and rule extraction, Fuzzy Sets and Systems, Volume 469, 2023, 108627, ISSN 0165-0114, https://doi.org/10.1016/j.fss.2023.108627.

Initialise the DG-TSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
gate_fea str | Callable[[Tensor], Tensor] | None

Gate function for antecedent feature selection (default "gate_m").

'gate_m'
gate_rule str | Callable[[Tensor], Tensor] | None

Gate function for consequent rule selection (default "gate_m").

'gate_m'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon.

None
use_en_frb bool

Use the Enhanced FRB (P-FRB) rule base.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    gate_fea: str | Callable[[Tensor], Tensor] | None = "gate_m",
    gate_rule: str | Callable[[Tensor], Tensor] | None = "gate_m",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        gate_fea: Gate function for antecedent feature selection
            (default ``"gate_m"``).
        gate_rule: Gate function for consequent rule selection
            (default ``"gate_m"``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon.
        use_en_frb: Use the Enhanced FRB (P-FRB) rule base.
    """
    self.gate_fea = gate_fea
    self.gate_rule = gate_rule
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGTSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        gate_fea=self.gate_fea,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()

apply_thresholds

Prune DG-TSK feature and rule gates using the computed thresholds.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Prune DG-TSK feature and rule gates using the computed thresholds."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    cast(Tensor, self.rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    cast(Tensor, self.consequent_layer.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute DG-TSK pruning thresholds from gate values and zeta parameters.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute DG-TSK pruning thresholds from gate values and zeta parameters."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta
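
Example

compute_thresholds and apply_thresholds can be used together when a single, fixed zeta setting is preferred over the grid search performed by search_thresholds. A sketch, assuming model is a trained DG-TSK model as constructed above:

# Derive pruning thresholds from the current gate values, then zero out
# every gate at or below its threshold.
tau_lambda, tau_theta = model.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
model.apply_thresholds(tau_lambda, tau_theta)

# Gates pruned to zero no longer contribute; count what survived.
kept_features = int((model.get_feature_gate_values() > tau_lambda).sum())
kept_rules = int((model.get_rule_gate_values() > tau_theta).sum())
print(f"kept {kept_features} features and {kept_rules} rules")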

convert_to_first_order

Convert the DG-TSK regressor from zero-order to first-order consequent.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG-TSK regressor from zero-order to first-order consequent."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedRegressionZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
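
Example

A usage sketch with synthetic data: a regression fit with a validation split, early stopping, and restoration of the best weights. The shapes follow the docstring above; model is the DGTSKRegressor constructed earlier.

import torch

x = torch.randn(256, 3)                    # (N, n_inputs)
y = x.sum(dim=1) + 0.1 * torch.randn(256)  # (N,)
x_val = torch.randn(64, 3)
y_val = x_val.sum(dim=1) + 0.1 * torch.randn(64)

history = model.fit(
    x, y,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val, y_val=y_val,  # enables per-epoch validation and early stopping
    patience=20,               # stop after 20 epochs without improvement
    restore_best=True,         # reload the best validation-epoch weights
)
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])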

fit_dg_phase

Train the DG-TSK regression zero-order phase before first-order conversion.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG-TSK regression zero-order phase before first-order conversion."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-TSK regression model after converting to first order.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-TSK regression model after converting to first order."""
    return self.fit(x, y, **kwargs)
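
Example

fit_dg_phase and fit_finetune are thin wrappers around fit; one plausible DG-TSK regression sequence (a sketch, not a prescribed schedule) is a zero-order gating phase, conversion to first order, and a fine-tuning phase. It reuses model, x, y, x_val, and y_val from the previous examples.

# Phase 1: train the gated zero-order model (feature and rule gates learn here).
model.fit_dg_phase(x, y, epochs=100, x_val=x_val, y_val=y_val)

# Replace the zero-order consequent with a first-order one, carrying the
# learned theta gates over (see convert_to_first_order above).
model.convert_to_first_order()

# Phase 2: fine-tune the first-order model.
model.fit_finetune(x, y, epochs=100, x_val=x_val, y_val=y_val)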

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized DG-TSK feature gate activations from lambda values.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized DG-TSK feature gate activations from lambda values."""
    return self.rule_layer.gate_fn(self.rule_layer.lambda_gates)

get_rule_gate_values

Return normalized DG-TSK rule gate activations from theta values.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized DG-TSK rule gate activations from theta values."""
    return self.consequent_layer.gate_fn(self.consequent_layer.theta_gates)
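
Example

The two getters expose the current gate activations, which is convenient for ranking features after training. A sketch continuing the earlier examples; the per-feature and per-rule shapes are assumptions based on the gate definitions above.

feature_gates = model.get_feature_gate_values()  # one activation per feature (assumed)
rule_gates = model.get_rule_gate_values()        # one activation per rule (assumed)

# Rank features by gate strength (higher = more strongly selected).
order = torch.argsort(feature_gates, descending=True)
for idx in order.tolist():
    print(f"feature {idx}: gate={feature_gates[idx].item():.4f}")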

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

search_thresholds

Search DG-TSK regression threshold combinations and optionally apply the best candidate.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search DG-TSK regression threshold combinations and optionally apply the best candidate."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
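
Example

Threshold search is typically the last step: it grid-searches the zeta pairs, optionally refits the first-order consequents by least squares (use_lse=True), and, with inplace=True, loads the best candidate into the model. A sketch continuing the earlier examples:

result = model.search_thresholds(
    x, y,
    zeta_lambda=[0.0, 0.25, 0.5, 0.75, 1.0],  # the default grid, shown explicitly
    zeta_theta=[0.0, 0.25, 0.5, 0.75, 1.0],
    x_val=x_val, y_val=y_val,  # score candidates on held-out data
    use_lse=True,
    inplace=True,
)
print(result["best_score"], result["best_zeta_lambda"], result["best_zeta_theta"])
print(result["tau_lambda"], result["tau_theta"])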

DombiTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with a fixed Dombi T-norm in the antecedent.

DombiTSK extends TSK fuzzy inference by using a Dombi t-norm for antecedent aggregation while keeping first-order linear consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialise the Dombi TSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"cartesian" or "coco" rule-base strategy.

'cartesian'
t_norm str

T-norm identifier (default "dombi").

'dombi'
lambda_ float

Dombi parameter λ > 0. λ = 1 gives the algebraic product.

1.0
t_norm_fn TNormFn | None

Optional custom t-norm callable; overrides lambda_ and t_norm when provided.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SumBasedDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2 or lambda_ <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "dombi",
    lambda_: float = 1.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the Dombi TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: T-norm identifier (default ``"dombi"``).
        lambda_: Dombi parameter ``λ > 0``.  ``λ = 1`` gives the
            algebraic product.
        t_norm_fn: Optional custom t-norm callable; overrides
            ``lambda_`` and ``t_norm`` when provided.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2`` or ``lambda_ <= 0``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.n_classes = int(n_classes)
    self.lambda_ = float(lambda_)
    if t_norm_fn is None:
        t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
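
Example

A construction sketch; input_mfs is the same feature-name to membership-function mapping used in the earlier examples (built with a hypothetical MF class), and only the DombiTSKClassifier signature is taken from this page.

from highfis.models import DombiTSKClassifier

model = DombiTSKClassifier(
    input_mfs,
    n_classes=3,            # must be >= 2
    rule_base="cartesian",  # default rule-base strategy for this class
    lambda_=2.0,            # Dombi parameter, must be > 0
)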

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
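
Example

A prediction sketch for a fitted classifier, continuing the construction sketch above; the input width must match n_inputs.

x_new = torch.randn(8, 3)            # (batch, n_inputs)
proba = model.predict_proba(x_new)   # (batch, n_classes); rows sum to 1
labels = model.predict(x_new)        # (batch,) integer class indices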

DombiTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with a fixed Dombi T-norm in the antecedent.

DombiTSK extends TSK fuzzy inference by using a Dombi t-norm for antecedent aggregation while keeping first-order linear consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialise the Dombi TSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"cartesian" or "coco" rule-base strategy.

'cartesian'
t_norm str

T-norm identifier (default "dombi").

'dombi'
lambda_ float

Dombi parameter λ > 0.

1.0
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SumBasedDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If lambda_ <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "dombi",
    lambda_: float = 1.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the Dombi TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: T-norm identifier (default ``"dombi"``).
        lambda_: Dombi parameter ``λ > 0``.
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``lambda_ <= 0``.
    """
    if lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.lambda_ = float(lambda_)
    if t_norm_fn is None:
        t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

FSREAdaTSKClassifier

Bases: BaseTSKClassifier

FSRE-AdaTSK classifier with adaptive softmin antecedent and gated consequents.

FSRE-AdaTSK (Feature Selection and Rule Extraction) extends AdaTSK with gated antecedents and consequents, trained in a feature-selection (FS) phase on the CoCo-FRB followed by a rule-extraction (RE) phase on an Enhanced FRB.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the FSRE-AdaTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB) instead of CoCo-FRB.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the FSRE-AdaTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB)
            instead of CoCo-FRB.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()

expand_to_en_frb

Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase.

Source code in highfis/models.py
def expand_to_en_frb(self) -> None:
    """Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase."""
    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(self.input_mfs[name]) for name in self.input_names],
        rule_base="en",
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()
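
Example

One plausible FS-then-RE sequencing (a sketch, not the paper's exact schedule): train on the CoCo-FRB first, then switch to the Enhanced FRB and continue training. Note that expand_to_en_frb rebuilds the rule and consequent layers, so the consequent parameters restart from scratch. Reuses input_mfs from the earlier construction sketch.

import torch
from highfis.models import FSREAdaTSKClassifier

x = torch.randn(256, 3)           # (N, n_inputs)
y = torch.randint(0, 3, (256,))   # (N,) integer class labels

model = FSREAdaTSKClassifier(input_mfs, n_classes=3)  # starts on the CoCo-FRB

# FS phase: train with the compact CoCo rule base.
model.fit(x, y, epochs=100)

# RE phase: expand to the Enhanced FRB, then train again.
model.expand_to_en_frb()
model.fit(x, y, epochs=100)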

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
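
For orientation, a minimal training sketch with validation-based early stopping; `model` is assumed to be an already-constructed FSREAdaTSKClassifier and the four tensors below are placeholders (float features, integer class labels), not part of this API:

```python
# Placeholders: x_train, y_train, x_valid, y_valid are assumed to exist.
history = model.fit(
    x_train, y_train,
    epochs=200,
    batch_size=64,
    learning_rate=1e-3,
    x_val=x_valid,      # providing a validation split enables the metric
    y_val=y_valid,
    patience=20,        # stop after 20 epochs without improvement
    restore_best=True,  # reload the weights from the best validation epoch
    verbose=1,          # tqdm-style progress bar
)

print(history["stopped_epoch"])
print(history["train"][-1], history["val"][-1])
```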

fit_finetune

Fine-tune with no gates — plain TSK consequent (eq. 5).

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune with no gates — plain TSK consequent (eq. 5)."""
    self.consequent_layer.mode = "finetune"
    return self.fit(x, y, **kwargs)

fit_fs

Train the FS phase: only feature gates M(λ_d) are active (eq. 21).

Source code in highfis/models.py
def fit_fs(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the FS phase: only feature gates M(λ_d) are active (eq. 21)."""
    self.consequent_layer.mode = "fs"
    return self.fit(x, y, **kwargs)

fit_re

Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22).

Source code in highfis/models.py
def fit_re(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22)."""
    self.expand_to_en_frb()
    self.consequent_layer.mode = "re"
    return self.fit(x, y, **kwargs)
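
Taken together, `fit_fs`, `fit_re`, and `fit_finetune` implement the staged FSRE training procedure. A hedged sketch of the call order suggested by the method docstrings (feature selection, then rule extraction on the Enhanced FRB, then fine-tuning; the data tensors are placeholders):

```python
# Phase 1 (FS): only the feature gates M(λ_d) are active (eq. 21).
model.fit_fs(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)

# Phase 2 (RE): only the rule gates M(θ_r) are active (eq. 22).
# fit_re calls expand_to_en_frb() internally before training.
model.fit_re(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)

# Phase 3: fine-tune the plain TSK consequent with the gates removed (eq. 5).
model.fit_finetune(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)
```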

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
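
For rule-level inspection, the normalized firing strengths can be pulled out directly; a small sketch (`model` and `x_test` are placeholders, and the `(batch, n_rules)` shape is an assumption based on the per-rule normalization described above):

```python
import torch

with torch.no_grad():
    norm_w = model.forward_antecedents(x_test)  # assumed shape (batch, n_rules)
top_rule = norm_w.argmax(dim=1)                 # dominant rule index per sample
```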

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
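
Inference then follows the usual classifier pattern (`x_test` is a placeholder feature tensor of shape `(M, n_inputs)`):

```python
proba = model.predict_proba(x_test)  # (M, n_classes); softmax over the class logits
labels = model.predict(x_test)       # (M,); argmax of the probabilities
```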

FSREAdaTSKRegressor

Bases: BaseTSKRegressor

FSRE-AdaTSK regressor with adaptive softmin antecedent and gated consequents.

FSRE-AdaTSK (Feature Selection and Rule Extraction) extends AdaTSK with gated consequents that integrate feature selection and rule extraction into training.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the FSRE-AdaTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of :class:`~highfis.memberships.MembershipFunction` objects. | *required* |
| `rule_base` | `str` | `"coco"` (default) or `"cartesian"`. | `'coco'` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices; ignored when `use_en_frb=True`. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |
| `eps` | `float \| None` | Numerical stability epsilon for the Ada-softmin operator. | `None` |
| `use_en_frb` | `bool` | Start directly from the Enhanced FRB (En-FRB). | `False` |
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the FSRE-AdaTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).
    """
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()
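
A construction sketch contrasting the two routes to the Enhanced FRB; `input_mfs` is assumed to have been built elsewhere as a `Mapping[str, Sequence[MembershipFunction]]`:

```python
# Route 1: start from the default "coco" rule base and expand later
# (fit_re performs this expansion automatically before the RE phase).
reg = FSREAdaTSKRegressor(input_mfs)
reg.expand_to_en_frb()

# Route 2: start directly from the Enhanced FRB.
# Note that any explicit `rules` argument is ignored in this case.
reg_en = FSREAdaTSKRegressor(input_mfs, use_en_frb=True)
```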

expand_to_en_frb

Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase.

Source code in highfis/models.py
def expand_to_en_frb(self) -> None:
    """Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase."""
    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(self.input_mfs[name]) for name in self.input_names],
        rule_base="en",
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

fit_finetune

Fine-tune with no gates — plain TSK consequent (eq. 5).

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune with no gates — plain TSK consequent (eq. 5)."""
    self.consequent_layer.mode = "finetune"
    return self.fit(x, y, **kwargs)

fit_fs

Train the FS phase: only feature gates M(λ_d) are active (eq. 21).

Source code in highfis/models.py
def fit_fs(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the FS phase: only feature gates M(λ_d) are active (eq. 21)."""
    self.consequent_layer.mode = "fs"
    return self.fit(x, y, **kwargs)

fit_re

Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22).

Source code in highfis/models.py
def fit_re(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22)."""
    self.expand_to_en_frb()
    self.consequent_layer.mode = "re"
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HDFISMinClassifier

Bases: BaseTSKClassifier

HDFIS-min classifier with frozen antecedents and minimum aggregation.

HDFIS-min uses the minimum T-norm in the antecedent and only optimizes consequent parameters, which avoids the nondifferentiability of the minimum operator during training.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-min classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "min",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-min classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
    for param in self.membership_layer.parameters():
        param.requires_grad = False
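
Since the constructor disables gradients on the membership layer, only the remaining layers are trained; a quick sanity-check sketch (`input_mfs` is assumed to be built elsewhere):

```python
clf = HDFISMinClassifier(input_mfs, n_classes=2)

# All antecedent MF parameters are frozen ...
assert all(not p.requires_grad for p in clf.membership_layer.parameters())

# ... so the default optimizer built by fit() effectively updates only the
# remaining (rule/consequent) parameter groups.
trainable = [name for name, p in clf.named_parameters() if p.requires_grad]
print(trainable)
```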

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
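
The uniform rule (UR) regularizer is off by default; a hedged example of turning it on during training (the data tensors are placeholders, and `clf` is the classifier constructed above):

```python
history = clf.fit(
    x_train, y_train,
    epochs=150,
    ur_weight=0.1,   # weight of the uniform rule regularization term
    ur_target=None,  # None falls back to a target activation of 1 / n_rules
)
print(history["ur"][-1])  # the UR loss is tracked per epoch alongside "train"
```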

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

HDFISMinRegressor

Bases: BaseTSKRegressor

HDFIS-min regressor with frozen antecedents and minimum aggregation.

HDFIS-min uses the minimum T-norm in the antecedent and only optimizes consequent parameters, which avoids the nondifferentiability of the minimum operator during training.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-min regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "min",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-min regressor."""
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
    for param in self.membership_layer.parameters():
        param.requires_grad = False

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HDFISProdClassifier

Bases: BaseTSKClassifier

HDFIS-prod classifier with dimension-dependent Gaussian MFs.

HDFIS-prod combines the standard product T-norm with a dimension-dependent Gaussian membership function (DMF) to avoid numeric underflow in very high-dimensional feature spaces while preserving first-order TSK consequents.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-prod classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-prod classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
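
A hedged construction sketch (not taken from the library's own examples): the `GaussianMF` name, its import path, and its `center`/`sigma` arguments are assumptions standing in for whatever membership-function classes `highfis.memberships` actually provides; only the `HDFISProdClassifier` signature above is documented.

# Hypothetical usage sketch -- GaussianMF and its constructor are assumptions.
from highfis.memberships import GaussianMF  # assumed import path

input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(20)
}
model = HDFISProdClassifier(input_mfs, n_classes=3, rule_base="coco")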

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
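
A hedged training sketch continuing the hypothetical `model` above. The fit arguments follow the signature documented here; the data tensors are synthetic placeholders.

# Continues the hypothetical sketch above; x/y tensors are synthetic placeholders.
import torch

x_train, y_train = torch.randn(200, 20), torch.randint(0, 3, (200,))
x_val, y_val = torch.randn(50, 20), torch.randint(0, 3, (50,))

history = model.fit(
    x_train, y_train,
    epochs=200, learning_rate=1e-3, batch_size=32,
    x_val=x_val, y_val=y_val,        # enables per-epoch validation
    patience=20, restore_best=True,  # early stopping on the validation metric
    verbose=1,                       # tqdm progress bar
)
print(history["stopped_epoch"], history["train"][-1], history["val"][-1])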

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
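
Inference on the fitted classifier, again continuing the hypothetical sketch above:

# predict_proba returns softmax probabilities; predict returns argmax class indices.
proba = model.predict_proba(x_val)     # shape (50, 3), rows sum to 1
labels = model.predict(x_val)          # shape (50,)
assert torch.equal(labels, proba.argmax(dim=1))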

HDFISProdRegressor

Bases: BaseTSKRegressor

HDFIS-prod regressor with dimension-dependent Gaussian MFs.

HDFIS-prod combines the standard product T-norm with a dimension-dependent Gaussian membership function (DMF) to avoid numeric underflow in very high-dimensional feature spaces while preserving first-order TSK consequents.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-prod regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-prod regressor."""
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HTSKClassifier

Bases: BaseTSKClassifier

HTSK classifier for high-dimensional TSK inference.

HTSK replaces the standard product t-norm with a geometric mean over membership values and performs rule normalization in log-space.
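
A minimal PyTorch sketch of that computation (illustrative only, not the library's internals): with memberships `mu` of shape (batch, n_rules, D), the normalized rule strengths are the softmax over the per-rule mean of log-memberships, i.e. softmax(log(w^{1/D})).

import torch

def htsk_rule_strengths(mu: torch.Tensor) -> torch.Tensor:
    # mu: (batch, n_rules, D) membership values in (0, 1]
    log_w = mu.clamp_min(1e-12).log().mean(dim=-1)   # (1/D) * sum_d log mu == log(w^(1/D))
    return torch.softmax(log_w, dim=-1)              # normalize across rules

mu = torch.rand(4, 8, 500).clamp_min(1e-3)           # stays stable even with D = 500 inputs
print(htsk_rule_strengths(mu).sum(dim=-1))           # each row sums to 1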

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the HTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

Rule-base construction strategy. "cartesian" builds the full Cartesian product; "coco" uses a one-cluster-per-rule scheme.

'cartesian'
t_norm str

Antecedent aggregation operator name (default "gmean" for HTSK).

'gmean'
t_norm_fn TNormFn | None

Optional custom t-norm callable; overrides t_norm when provided.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices. If None, rules are inferred from rule_base.

None
defuzzifier nn.Module | None

Custom defuzzifier module. Defaults to `highfis.defuzzifiers.SoftmaxLogDefuzzifier`.

None
consequent_batch_norm bool

Apply batch normalisation to the consequent layer inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the HTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: Rule-base construction strategy.  ``"cartesian"``
            builds the full Cartesian product; ``"coco"`` uses a
            one-cluster-per-rule scheme.
        t_norm: Antecedent aggregation operator name (default
            ``"gmean"`` for HTSK).
        t_norm_fn: Optional custom t-norm callable; overrides
            ``t_norm`` when provided.
        rules: Explicit rule antecedent indices.  If ``None``, rules
            are inferred from ``rule_base``.
        defuzzifier: Custom defuzzifier module.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Apply batch normalisation to the
            consequent layer inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier,
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
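
The uniform rule (UR) regularizer is only described by its interface here (ur_weight, ur_target, with a default target of 1 / n_rules). One plausible form, stated purely as an assumption about what `_uniform_regularization_loss` might compute, is a mean-squared penalty pulling each rule's average normalized firing strength toward the target:

import torch

def uniform_rule_loss(norm_w: torch.Tensor, target: float | None = None) -> torch.Tensor:
    # Assumed form, not the library's actual implementation.
    # norm_w: (batch, n_rules) normalized firing strengths.
    n_rules = norm_w.shape[-1]
    t = 1.0 / n_rules if target is None else target   # documented default: 1 / n_rules
    return ((norm_w.mean(dim=0) - t) ** 2).mean()

w = torch.softmax(torch.randn(32, 8), dim=-1)
print(uniform_rule_loss(w))   # scalar penalty, added to the main loss as ur_weight * loss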

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

HTSKRegressor

Bases: BaseTSKRegressor

HTSK regressor for high-dimensional TSK inference.

HTSK replaces the standard product t-norm with a geometric mean over membership values and performs rule normalization in log-space.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the HTSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

Rule-base construction strategy ("cartesian" or "coco").

'cartesian'
t_norm str

Antecedent aggregation operator (default "gmean").

'gmean'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SoftmaxLogDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the HTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: Rule-base construction strategy (``"cartesian"`` or
            ``"coco"``).
        t_norm: Antecedent aggregation operator (default ``"gmean"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier,
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
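
For reference, a minimal sketch of consuming the history dictionary returned by `fit`; `model` and the data tensors are placeholders, and only the documented keys (`"train"`, `"ur"`, `"val"`, `"stopped_epoch"`) are used.

```python
# Sketch (placeholder model/data names): inspecting the history dict returned
# by fit(). "val" is populated only when x_val/y_val were supplied.
history = model.fit(x_train, y_train, x_val=x_val, y_val=y_val, epochs=200, patience=20)

best_epoch = min(range(len(history["val"])), key=history["val"].__getitem__) + 1
print(
    f"stopped after {history['stopped_epoch']} epochs; "
    f"best validation loss {min(history['val']):.4f} at epoch {best_epoch}"
)
```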

LogTSKClassifier

Bases: BaseTSKClassifier

LogTSK classifier with inverse-log normalization of log-domain rules.

Firing strengths are normalized with the inverse-log formula, which does not saturate the way softmax-based normalization does in high-dimensional input spaces.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the LogTSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `n_classes` | `int` | Number of output classes (must be ≥ 2). | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.InvLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2`. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the LogTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.InvLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or InvLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
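
For orientation, a minimal construction sketch. `GaussianMF(center, sigma)` is a hypothetical name and signature used only for illustration; substitute the actual membership-function classes from `highfis.memberships`. Everything else follows the constructor signature above.

```python
from highfis.models import LogTSKClassifier
# Hypothetical membership-function class and signature, for illustration only;
# replace with the real classes from highfis.memberships.
from highfis.memberships import GaussianMF  # assumed name/signature

# Two fuzzy sets ("low", "high") per feature, three features, binary target.
input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(3)
}

clf = LogTSKClassifier(
    input_mfs,
    n_classes=2,
    rule_base="cartesian",  # full grid: 2^3 = 8 rules
    t_norm="prod",          # product t-norm, per the LogTSK configuration
)
```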

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
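
A typical training call for the classifier sketched above; the data tensors are placeholders (`y_train`/`y_val` assumed to hold integer class labels), and only documented arguments are used.

```python
# Sketch: training with validation-based early stopping.
history = clf.fit(
    x_train, y_train,
    epochs=300,
    learning_rate=1e-3,
    batch_size=64,
    x_val=x_val, y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # reload the weights from the best validation epoch
    verbose=1,          # progress bar
)
print(f"stopped at epoch {history['stopped_epoch']}, "
      f"final train loss {history['train'][-1]:.4f}")
```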

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
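
An inference sketch using the two prediction helpers above; `x_test`/`y_test` are placeholders.

```python
# Class probabilities and hard labels for a batch of inputs.
probs = clf.predict_proba(x_test)   # shape (M, n_classes), rows sum to 1
labels = clf.predict(x_test)        # shape (M,), argmax over the probabilities
accuracy = (labels == y_test).float().mean().item()
print(f"test accuracy: {accuracy:.3f}")
```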

LogTSKRegressor

Bases: BaseTSKRegressor

LogTSK regressor with inverse-log normalization of log-domain rules.

Firing strengths are normalized with the inverse-log formula, which does not saturate the way softmax-based normalization does in high-dimensional input spaces.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the LogTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.InvLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the LogTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.InvLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or InvLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
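
A corresponding regressor sketch, reusing the (hypothetical) `input_mfs` mapping from the classifier example above; only documented constructor arguments are used.

```python
from highfis.models import LogTSKRegressor

reg = LogTSKRegressor(
    input_mfs,                    # same feature -> membership-function mapping as above
    rule_base="coco",             # "coco" rule-base strategy (alternative to "cartesian")
    consequent_batch_norm=True,   # batch-normalise consequent inputs
)
```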

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
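
A sketch of overriding the default optimizer and loss via the `optimizer` and `criterion` arguments documented above. The `(prediction, target)` argument order for the criterion and the `.squeeze(-1)` guard are assumptions; data tensors are placeholders.

```python
import torch
import torch.nn.functional as F

# Pre-built optimizer: used as-is, so learning_rate/weight_decay from fit() are not applied.
opt = torch.optim.Adam(reg.parameters(), lr=5e-3)

history = reg.fit(
    x_train, y_train,
    epochs=150,
    optimizer=opt,
    # Assumed criterion call order (prediction, target); squeeze guards against
    # a (B, 1) prediction vs (B,) target mismatch.
    criterion=lambda pred, target: F.smooth_l1_loss(pred.squeeze(-1), target),
    batch_size=None,   # full-batch updates
)
```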

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
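
Since `predict` returns a 1-D tensor, a test-set RMSE can be computed directly; `x_test`/`y_test` are placeholders.

```python
import torch

y_pred = reg.predict(x_test)                              # shape (M,)
rmse = torch.sqrt(torch.mean((y_pred - y_test) ** 2)).item()
print(f"test RMSE: {rmse:.4f}")
```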

TSKClassifier

Bases: BaseTSKClassifier

Vanilla TSK classifier with sum-based rule normalization.

The vanilla Takagi-Sugeno-Kang inference computes rule firing strengths with the product t-norm and normalizes them by their total sum.

References

T. Takagi and M. Sugeno, "Fuzzy identification of systems and its applications to modeling and control," in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 1, pp. 116-132, Jan.-Feb. 1985, doi: 10.1109/TSMC.1985.6313399.

Initialise the vanilla TSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `n_classes` | `int` | Number of output classes (must be ≥ 2). | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2`. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the vanilla TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
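
A sketch of supplying an explicit rule base. Reading each inner sequence as one membership-function index per input feature is an assumption for illustration, not something stated in the signature above.

```python
from highfis.models import TSKClassifier

# Assumption: each inner list picks one membership-function index per input
# feature (here three features, two fuzzy sets each).
rules = [
    [0, 0, 0],   # "all low"
    [1, 1, 1],   # "all high"
    [0, 1, 0],
]

clf = TSKClassifier(input_mfs, n_classes=2, rules=rules)
```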

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
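
A sketch of enabling the uniform rule (UR) regularization during training, using only the arguments documented above; data tensors are placeholders.

```python
# ur_weight scales the uniform-rule penalty; ur_target=None targets 1 / n_rules.
history = clf.fit(
    x_train, y_train,
    epochs=200,
    ur_weight=0.1,
    ur_target=None,
    verbose=2,       # per-epoch summary logging
)
print(f"final UR penalty: {history['ur'][-1]:.6f}")
```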

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
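
A small interpretability sketch built on `forward_antecedents`; the `(batch, n_rules)` output shape is an assumption consistent with the description above, and `x_test` is a placeholder.

```python
import torch

# Inspect which rules fire for a single input (normalized rule strengths).
with torch.no_grad():
    norm_w = clf.forward_antecedents(x_test[:1])   # assumed shape (1, n_rules)

top = torch.topk(norm_w[0], k=min(3, norm_w.shape[1]))
for strength, rule_idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"rule {rule_idx}: normalized firing strength {strength:.3f}")
```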

TSKRegressor

Bases: BaseTSKRegressor

Vanilla TSK regressor with sum-based rule normalization.

The vanilla Takagi-Sugeno-Kang inference computes rule firing strengths with the product t-norm and normalizes them by their total sum.

References

T. Takagi and M. Sugeno, "Fuzzy identification of systems and its applications to modeling and control," in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 1, pp. 116-132, Jan.-Feb. 1985, doi: 10.1109/TSMC.1985.6313399.

Initialise the vanilla TSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the vanilla TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
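
As a standalone numerical illustration of the inference described above (not library code), the following reproduces the product t-norm and sum-based normalization for a single sample with two rules:

```python
import torch

mu = torch.tensor([[0.9, 0.2],    # rule 1: memberships of x1, x2
                   [0.4, 0.7]])   # rule 2
w = mu.prod(dim=1)                        # product t-norm -> tensor([0.18, 0.28])
norm_w = w / w.sum()                      # sum-based normalization of firing strengths
rule_outputs = torch.tensor([1.5, -0.3])  # first-order consequent values for this input
y_hat = (norm_w * rule_outputs).sum()     # TSK output: weighted average of rule outputs
```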

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
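
A minimal usage sketch of the training method above. The method name (assumed here to be ``fit``), the ``model`` variable, the data tensors, and all hyper-parameter values are illustrative placeholders, not library defaults:

history = model.fit(
    x_train,                # placeholder tensor of shape (N, n_inputs)
    y_train,                # placeholder tensor of shape (N,)
    epochs=100,
    batch_size=64,
    ur_weight=0.1,          # weight of the uniform-regularization penalty
    ur_target=None,         # None -> defaults to 1 / n_rules
    x_val=x_val,            # optional validation split, shape (M, n_inputs)
    y_val=y_val,            # shape (M,)
    patience=10,            # early stopping on the validation metric
    restore_best=True,      # reload weights from the best validation epoch
    verbose=1,              # progress bar
)
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])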

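For intuition, the uniform-regularization term penalizes normalized rule strengths that drift away from a uniform activation level. The sketch below is a hypothetical stand-in for ``_uniform_regularization_loss``, assuming a simple mean-squared-deviation form; the library's actual implementation may differ:

import torch
from torch import Tensor

def uniform_regularization_sketch(norm_w: Tensor, target: float | None = None) -> Tensor:
    # norm_w: normalized rule strengths of shape (batch, n_rules).
    # When target is None, fall back to the uniform activation 1 / n_rules,
    # matching the documented default for ur_target.
    n_rules = norm_w.shape[-1]
    if target is None:
        target = 1.0 / n_rules
    # Mean squared deviation from the uniform target, averaged over batch and rules.
    return ((norm_w - target) ** 2).mean()
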
forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

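A short sketch of inspecting the antecedent side in isolation; ``model`` and ``x_batch`` are placeholders, and the sum-to-one check applies to sum-normalizing defuzzifiers:

import torch

with torch.no_grad():
    norm_w = model.forward_antecedents(x_batch)  # shape (batch, n_rules)
print(norm_w.sum(dim=-1))     # ~1 per sample for sum-normalizing defuzzifiers
print(norm_w.argmax(dim=-1))  # index of the dominant rule per sample
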
predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
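
Illustrative call on a fitted regressor; ``model`` and ``x_new`` (a float tensor of shape ``(batch, n_inputs)``) are placeholders:

y_hat = model.predict(x_new)            # 1-D tensor of shape (batch,), no gradients tracked
same = model.forward(x_new).squeeze(1)  # same values, but gradients are tracked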