Models

Concrete TSK model variants.

Each class in this module is a subclass of BaseTSK that bundles a specific antecedent strategy (t-norm), defuzzification head, and consequent architecture. Users typically access these through the sklearn-style estimator wrappers in highfis.estimators.

Model Family Overview

HTSK Configuration: t_norm="gmean" + SoftmaxLogDefuzzifier

Classes:
    - `HTSKClassifier`
    - `HTSKRegressor`

Behavior:
    - `softmax(log(w^{1/D}))`

TSK (vanilla) Configuration: t_norm="prod" + SumBasedDefuzzifier

Classes:
    - `TSKClassifier`
    - `TSKRegressor`

Behavior:
    - `w_r / Σw`

LogTSK Configuration: t_norm="prod" + InvLogDefuzzifier

Classes:
    - `LogTSKClassifier`
    - `LogTSKRegressor`

Behavior:
    - Inverse-log normalization of log-domain rule weights

DombiTSK Configuration: t_norm="dombi" + SumBasedDefuzzifier

Classes:
    - `DombiTSKClassifier`
    - `DombiTSKRegressor`

ADMTSK Configuration: adaptive Dombi T-norm + CompositeGMF + SumBasedDefuzzifier

Classes:
    - `ADMTSKClassifier`
    - `ADMTSKRegressor`

AYATSK Configuration: t_norm="yager" + SumBasedDefuzzifier

Classes:
    - `AYATSKClassifier`
    - `AYATSKRegressor`

AdaTSK Configuration: adaptive softmin (Ada-softmin) + SumBasedDefuzzifier

Classes:
    - `AdaTSKClassifier`
    - `AdaTSKRegressor`

ADPTSK Configuration: adaptive double-parameter softmin (ADP-softmin) + SumBasedDefuzzifier

Classes:
    - `ADPTSKClassifier`
    - `ADPTSKRegressor`

FSRE-AdaTSK Configuration: adaptive softmin + SoftmaxLogDefuzzifier

Classes:
    - `FSREAdaTSKClassifier`
    - `FSREAdaTSKRegressor`

DG-ALETSK Configuration: ALE-softmin + SoftmaxLogDefuzzifier

Classes:
    - `DGALETSKClassifier`
    - `DGALETSKRegressor`

DG-TSK Configuration: product + M-gate + SoftmaxLogDefuzzifier

Classes:
    - `DGTSKClassifier`
    - `DGTSKRegressor`

HDFIS Configuration:
    - HDFIS-prod: t_norm="prod" with DimensionDependentGaussianMF + SumBasedDefuzzifier
    - HDFIS-min: t_norm="min" with frozen antecedents + SumBasedDefuzzifier

Classes:
    - `HDFISProdClassifier`
    - `HDFISProdRegressor`
    - `HDFISMinClassifier`
    - `HDFISMinRegressor`

Notes
  • All variants normalize rule firing strengths across rules.
  • SoftmaxLogDefuzzifier improves numerical stability via log-space normalization (see the sketch after these notes).
  • InvLogDefuzzifier applies inverse-log normalization.
  • Adaptive softmin variants dynamically adjust aggregation behavior.
  • All classes are exported by this module and are intended for use as concrete TSK classifiers and regressors.
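
The two normalization behaviours listed in the overview can be illustrated directly. The following is a minimal sketch of the math only, not the library's defuzzifier implementations; the firing-strength values and the dimension `D` are made up for illustration.

```python
import torch

D = 50  # number of input features (illustrative)
w = torch.tensor([1e-30, 5e-31, 2e-29, 1e-32])  # raw rule firing strengths

# Vanilla TSK / SumBasedDefuzzifier behaviour: w_r / sum(w).
sum_norm = w / w.sum()

# HTSK / SoftmaxLogDefuzzifier behaviour: softmax(log(w^{1/D})) = softmax(log(w) / D).
# Staying in the log domain keeps the normalization stable even when products
# of D memberships underflow toward zero.
htsk_norm = torch.softmax(torch.log(w) / D, dim=0)

print(sum_norm)   # dominated by the largest raw strength
print(htsk_norm)  # a smoother distribution over rules
```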

ADMTSKClassifier

Bases: BaseTSKClassifier

Adaptive Dombi TSK classifier with Composite Gaussian membership functions.

ADMTSK is an adaptive Dombi TSK fuzzy system designed for high-dimensional inference. It combines a Dombi T-norm antecedent with a positive lower-bound Composite Gaussian membership function (CGMF) and normalized first-order consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.
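
For context, the standard two-argument Dombi t-norm with parameter λ > 0 is shown below; this is the general textbook definition, not an excerpt from the highfis source. The adaptive variant used by ADMTSK derives λ from the feature dimension and the CGMF lower bound, controlled by the `adaptive`, `lower_bound`, and `K` parameters documented below.

$$
T_\lambda(a, b) = \frac{1}{1 + \left[\left(\tfrac{1-a}{a}\right)^{\lambda} + \left(\tfrac{1-b}{b}\right)^{\lambda}\right]^{1/\lambda}},
\qquad a, b \in (0, 1],\ \lambda > 0 .
$$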

Initialize the ADMTSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of membership functions. | *required* |
| `n_classes` | `int` | Number of output classes. Must be >= 2. | *required* |
| `rule_base` | `str` | Rule base strategy, either `"coco"` or `"cartesian"`. | `'coco'` |
| `t_norm` | `str` | T-norm identifier. Defaults to `"dombi"`. | `'dombi'` |
| `adaptive` | `bool` | If True, compute adaptive lambda using the feature dimension and membership lower bound. | `True` |
| `lambda_` | `float` | Fixed Dombi parameter λ > 0 when `adaptive` is False. | `1.0` |
| `lower_bound` | `float` | The lower bound for Composite GMF values. | `1.0 / math.e` |
| `K` | `float` | Heuristic constant used to compute adaptive lambda. | `10.0` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom T-norm implementation. Overrides `adaptive` and `lambda_` when provided. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices for custom rule bases. | `None` |
| `defuzzifier` | `nn.Module \| None` | Optional defuzzifier module. | `None` |
| `consequent_batch_norm` | `bool` | If True, apply batch normalization to consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2` or if `lambda_` is invalid when `adaptive` is False. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    t_norm: str = "dombi",
    adaptive: bool = True,
    lambda_: float = 1.0,
    lower_bound: float = 1.0 / math.e,
    K: float = 10.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the ADMTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            membership functions.
        n_classes: Number of output classes. Must be >= 2.
        rule_base: Rule base strategy, either ``"coco"`` or
            ``"cartesian"``.
        t_norm: T-norm identifier. Defaults to ``"dombi"``.
        adaptive: If True, compute adaptive lambda using the feature
            dimension and membership lower bound.
        lambda_: Fixed Dombi parameter ``λ > 0`` when adaptive is False.
        lower_bound: The lower bound for Composite GMF values.
        K: Heuristic constant used to compute adaptive lambda.
        t_norm_fn: Optional custom T-norm implementation. Overrides
            ``adaptive`` and ``lambda_`` when provided.
        rules: Explicit rule antecedent indices for custom rule bases.
        defuzzifier: Optional defuzzifier module.
        consequent_batch_norm: If True, apply batch normalization to
            consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2`` or if ``lambda_`` is invalid
            when adaptive is False.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if not adaptive and lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.n_classes = int(n_classes)
    self.adaptive = bool(adaptive)
    self.lambda_ = float(lambda_)
    self.lower_bound = float(lower_bound)
    self.K = float(K)

    if t_norm_fn is None:
        if self.adaptive:
            t_norm_fn = AdaptiveDombiTNorm(
                dimension=len(input_mfs),
                lower_bound=self.lower_bound,
                K=self.K,
            )
        else:
            t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
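
A hedged usage sketch of the training loop above. It assumes `model` is an already-constructed `ADMTSKClassifier` (building the `input_mfs` mapping is covered by the constructor documentation and the membership-function module); the tensor shapes and hyperparameter values are illustrative only.

```python
import torch

# Assume `model` is an already-constructed ADMTSKClassifier with 20 inputs
# and 3 classes; the data below is random and purely illustrative.
x_train, y_train = torch.randn(200, 20), torch.randint(0, 3, (200,))
x_val, y_val = torch.randn(50, 20), torch.randint(0, 3, (50,))

history = model.fit(
    x_train,
    y_train,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val,
    y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # roll back to the best validation epoch
    verbose=1,          # progress bar
)

print(history["train"][-1], history["val"][-1], history["stopped_epoch"])
```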

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
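
Since all variants normalize firing strengths across rules (see the module notes), `forward_antecedents` is a convenient hook for inspecting which rules fire for a given batch. A small hedged sketch, again assuming an already-constructed `model` and illustrative data:

```python
import torch

x = torch.randn(8, 20)                 # batch of 8 samples, 20 input features
norm_w = model.forward_antecedents(x)  # normalized firing strengths, one column per rule
top_rule = norm_w.argmax(dim=1)        # index of the most active rule for each sample
print(top_rule)
```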

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

ADMTSKRegressor

Bases: BaseTSKRegressor

Adaptive Dombi TSK regressor with Composite Gaussian membership functions.

ADMTSK is an adaptive Dombi TSK fuzzy system designed for high-dimensional inference. It combines a Dombi T-norm antecedent with a positive lower-bound Composite Gaussian membership function (CGMF) and normalized first-order consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialize the ADMTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of membership functions. | *required* |
| `rule_base` | `str` | Rule base strategy, either `"coco"` or `"cartesian"`. | `'coco'` |
| `t_norm` | `str` | T-norm identifier. Defaults to `"dombi"`. | `'dombi'` |
| `adaptive` | `bool` | If True, compute adaptive lambda using the feature dimension and membership lower bound. | `True` |
| `lambda_` | `float` | Fixed Dombi parameter λ > 0 when `adaptive` is False. | `1.0` |
| `lower_bound` | `float` | The lower bound for Composite GMF values. | `1.0 / math.e` |
| `K` | `float` | Heuristic constant used to compute adaptive lambda. | `10.0` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom T-norm implementation. Overrides `adaptive` and `lambda_` when provided. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices for custom rule bases. | `None` |
| `defuzzifier` | `nn.Module \| None` | Optional defuzzifier module. | `None` |
| `consequent_batch_norm` | `bool` | If True, apply batch normalization to consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `lambda_` is invalid when `adaptive` is False. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    t_norm: str = "dombi",
    adaptive: bool = True,
    lambda_: float = 1.0,
    lower_bound: float = 1.0 / math.e,
    K: float = 10.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the ADMTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            membership functions.
        rule_base: Rule base strategy, either ``"coco"`` or
            ``"cartesian"``.
        t_norm: T-norm identifier. Defaults to ``"dombi"``.
        adaptive: If True, compute adaptive lambda using the feature
            dimension and membership lower bound.
        lambda_: Fixed Dombi parameter ``λ > 0`` when adaptive is False.
        lower_bound: The lower bound for Composite GMF values.
        K: Heuristic constant used to compute adaptive lambda.
        t_norm_fn: Optional custom T-norm implementation. Overrides
            ``adaptive`` and ``lambda_`` when provided.
        rules: Explicit rule antecedent indices for custom rule bases.
        defuzzifier: Optional defuzzifier module.
        consequent_batch_norm: If True, apply batch normalization to
            consequent inputs.

    Raises:
        ValueError: If ``lambda_`` is invalid when adaptive is False.
    """
    if not adaptive and lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.adaptive = bool(adaptive)
    self.lambda_ = float(lambda_)
    self.lower_bound = float(lower_bound)
    self.K = float(K)

    if t_norm_fn is None:
        if self.adaptive:
            t_norm_fn = AdaptiveDombiTNorm(
                dimension=len(input_mfs),
                lower_bound=self.lower_bound,
                K=self.K,
            )
        else:
            t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
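
The regressor shares the same training loop; the main differences on the user side are the 1-D float targets and the squeezed 1-D predictions shown above. A brief hedged sketch, again assuming a pre-built `model` (here an `ADMTSKRegressor`) and illustrative data:

```python
import torch

# Assume `model` is an already-constructed ADMTSKRegressor with 20 inputs.
x, y = torch.randn(300, 20), torch.randn(300)  # targets have shape (N,)

history = model.fit(x, y, epochs=200, batch_size=64)
y_hat = model.predict(x)                       # shape (N,), squeezed from (N, 1)
print(history["train"][-1], y_hat.shape)
```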

ADPTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with adaptive double-parameter softmin antecedent (ADPTSK).

The firing strength of each rule is computed with the ADP-softmin operator, and the membership functions are wrapped as Gaussian PIMFs to preserve a positive infimum during high-dimensional training.

Reference

Ma, M., Qian, L., Zhang, Y., Fang, Q., & Xue, G. (2025). An adaptive double-parameter softmin based Takagi-Sugeno-Kang fuzzy system for high-dimensional data. Fuzzy Sets and Systems, 521, 109582. https://doi.org/10.1016/j.fss.2025.109582
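
As background for the antecedent above: a softmin is a smooth, differentiable surrogate for the hard minimum over per-feature memberships. The sketch below shows one generic exponentially weighted softmin; it is illustrative only and is not the ADP double-parameter form, whose exact parameterisation (with `kappa` and `xi`) is defined in the cited paper.

```python
import torch

def generic_softmin(u: torch.Tensor, q: float = 50.0, dim: int = -1) -> torch.Tensor:
    """Exponentially weighted softmin: approaches min(u) as q grows."""
    weights = torch.softmax(-q * u, dim=dim)  # emphasizes the smallest entries
    return (weights * u).sum(dim=dim)

mu = torch.tensor([[0.9, 0.2, 0.7, 0.95]])    # per-feature memberships of one rule
print(generic_softmin(mu, q=50.0), mu.min())  # the soft value is close to 0.2
```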

Initialise the ADPTSK classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    kappa: float = 690.0,
    xi: float = 730.0,
    eps: float | None = None,
) -> None:
    """Initialise the ADPTSK classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.kappa = float(kappa)
    self.xi = float(xi)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = ADPSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        kappa=self.kappa,
        xi=self.xi,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

ADPTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with adaptive double-parameter softmin antecedent (ADPTSK).

The firing strength of each rule is computed with the ADP-softmin operator, and the membership functions are wrapped as Gaussian PIMFs to preserve a positive infimum during high-dimensional training.

Reference

Ma, M., Qian, L., Zhang, Y., Fang, Q., & Xue, G. (2025). An adaptive double-parameter softmin based Takagi-Sugeno-Kang fuzzy system for high-dimensional data. Fuzzy Sets and Systems, 521, 109582. https://doi.org/10.1016/j.fss.2025.109582

Initialise the ADPTSK regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    kappa: float = 690.0,
    xi: float = 730.0,
    eps: float | None = None,
) -> None:
    """Initialise the ADPTSK regressor."""
    self.kappa = float(kappa)
    self.xi = float(xi)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = ADPSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        kappa=self.kappa,
        xi=self.xi,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default), the weights from the best validation epoch are restored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If True, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If True (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
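
Example usage (a minimal sketch): the tensors below are random stand-ins for real data, and model denotes any classifier or regressor from this module that has already been constructed (see the constructor examples later on this page).

import torch

# Hedged sketch: `model` is an already-constructed TSK variant from this module;
# the tensors are illustrative random data, not a real dataset.
x_train = torch.randn(256, model.n_inputs)
y_train = torch.randint(0, 2, (256,))          # 1-D targets of shape (N,)
x_val = torch.randn(64, model.n_inputs)
y_val = torch.randint(0, 2, (64,))

history = model.fit(
    x_train, y_train,
    epochs=200,
    batch_size=32,
    ur_weight=0.05,          # optional uniform-rule regularization; 0.0 disables it
    x_val=x_val, y_val=y_val,
    patience=20,             # early stopping on the validation metric
    restore_best=True,
    verbose=1,               # progress bar
)

# Per-epoch losses and the epoch at which training stopped.
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])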

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
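
Because forward_antecedents returns the normalized firing strengths, it can be used to inspect which rules dominate a prediction. A minimal sketch follows; the output shape (batch, n_rules) is an assumption based on the docstring, and `model` denotes an already-constructed instance.

import torch

# Hedged sketch: inspect normalized rule activations for a small batch.
x = torch.randn(8, model.n_inputs)
with torch.no_grad():
    norm_w = model.forward_antecedents(x)    # assumed shape: (batch, n_rules)

# Rank rules by mean activation over the batch.
mean_activation = norm_w.mean(dim=0)
top_rules = torch.argsort(mean_activation, descending=True)
print(top_rules[:5], mean_activation[top_rules[:5]])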

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

AYATSKClassifier

Bases: BaseTSKClassifier

TSK classifier with an adaptive Yager T-norm in the antecedent.

AYATSK extends TSK by using an adaptive Yager T-norm aggregation and optional positive lower-bound membership functions to improve stability and performance in high-dimensional settings.

Reference

G. Xue, Y. Yang and J. Wang, "Adaptive Yager T-Norm-Based Takagi-Sugeno-Kang Fuzzy Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 12, pp. 9802-9815, Dec. 2025, doi: 10.1109/TSMC.2025.3621346.

Initialise the AYATSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
t_norm str

T-norm identifier (default "yager").

'yager'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    t_norm: str = "yager",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the AYATSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        t_norm: T-norm identifier (default ``"yager"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
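
Example construction (a minimal sketch): GaussianMF and its (center, sigma) signature are assumptions used purely for illustration; substitute whichever MembershipFunction subclasses highfis.memberships actually provides.

import torch
from highfis.models import AYATSKClassifier
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

# Two fuzzy sets ("low"/"high") per feature, three features, binary classification.
input_mfs = {
    name: [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for name in ("f0", "f1", "f2")
}

clf = AYATSKClassifier(
    input_mfs,
    n_classes=2,
    rule_base="coco",    # default; "cartesian" enumerates all MF combinations
    t_norm="yager",      # default adaptive Yager t-norm
)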

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
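
A short prediction sketch: `clf` stands for any trained classifier from this module; x_new is illustrative random data.

import torch

x_new = torch.randn(5, clf.n_inputs)
proba = clf.predict_proba(x_new)   # shape (5, n_classes); rows sum to 1
labels = clf.predict(x_new)        # shape (5,); argmax over classes
print(proba, labels)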

AYATSKRegressor

Bases: BaseTSKRegressor

TSK regressor with an adaptive Yager T-norm in the antecedent.

AYATSK extends TSK by using an adaptive Yager T-norm aggregation and optional positive lower-bound membership functions to improve stability and performance in high-dimensional settings.

Reference

G. Xue, Y. Yang and J. Wang, "Adaptive Yager T-Norm-Based Takagi-Sugeno-Kang Fuzzy Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 12, pp. 9802-9815, Dec. 2025, doi: 10.1109/TSMC.2025.3621346.

Initialise the AYATSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
t_norm str

T-norm identifier (default "yager").

'yager'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    t_norm: str = "yager",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the AYATSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        t_norm: T-norm identifier (default ``"yager"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
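
Example regression workflow (a minimal sketch): as above, GaussianMF is an illustrative stand-in for a concrete MembershipFunction subclass from highfis.memberships, and the data are random placeholders.

import torch
from highfis.models import AYATSKRegressor
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

input_mfs = {
    f"x{i}": [GaussianMF(center=c, sigma=1.0) for c in (-1.0, 0.0, 1.0)]
    for i in range(4)
}
reg = AYATSKRegressor(input_mfs)     # defaults: rule_base="coco", t_norm="yager"

x = torch.randn(128, reg.n_inputs)
y = torch.randn(128)                 # 1-D regression targets
reg.fit(x, y, epochs=50, verbose=0)
y_hat = reg.predict(x)               # 1-D tensor of predictions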

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

AdaTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with adaptive softmin antecedent (AdaTSK).

The firing strength of each rule is computed with the Ada-softmin operator.
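
To convey the idea, the sketch below shows a plain fixed-exponent softmin, which smoothly approximates the minimum of the rule's membership values; it is not the library's Ada-softmin, whose adaptive exponent is chosen per input to avoid numerical under- and overflow.

import torch

# Illustrative only: plain softmin with a fixed negative exponent q.
# As q -> -inf this tends to the exact minimum over the antecedents.
def softmin(mu: torch.Tensor, q: float = -20.0) -> torch.Tensor:
    """mu: membership values, shape (batch, n_rules, n_antecedents)."""
    return mu.pow(q).mean(dim=-1).pow(1.0 / q)   # -> (batch, n_rules)

mu = torch.tensor([[[0.9, 0.2, 0.7]]])
print(softmin(mu))   # close to min(0.9, 0.2, 0.7) = 0.2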

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the AdaTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
) -> None:
    """Initialise the AdaTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        eps=self.eps,
    )
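
Example construction (a minimal sketch): unlike the t-norm-based variants, AdaTSK fixes the antecedent to the Ada-softmin rule layer, so there is no t_norm argument. GaussianMF is again an illustrative stand-in for a highfis.memberships class.

import torch
from highfis.models import AdaTSKClassifier
# Assumed for illustration only; replace with a real highfis.memberships class.
from highfis.memberships import GaussianMF

input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(10)
}

# eps, when given, is the numerical-stability epsilon of the Ada-softmin operator.
clf = AdaTSKClassifier(input_mfs, n_classes=3, eps=1e-12)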

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

AdaTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with adaptive softmin antecedent (AdaTSK).

The firing strength of each rule is computed with the Ada-softmin operator.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the AdaTSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
) -> None:
    """Initialise the AdaTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
    """
    self.eps = eps

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules,
        rule_base=rule_base,
        eps=self.eps,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default, the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
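
As a usage sketch (all tensor names are placeholders; shapes follow the documented contract), a typical call with validation-based early stopping looks like:

history = model.fit(
    x_train, y_train,              # (N, n_inputs) features, (N,) targets
    epochs=200,
    learning_rate=1e-3,
    batch_size=64,
    x_val=x_val, y_val=y_val,      # enables per-epoch validation
    patience=20,                   # stop after 20 epochs without improvement
    restore_best=True,             # reload the best validation weights
    verbose=1,                     # progress bar
)

print(history["stopped_epoch"])                  # last epoch that ran
print(history["train"][-1], history["val"][-1])  # final per-epoch losses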

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

BaseTSKClassifier

Bases: BaseTSK

Abstract classifier base that provides task-specific training and inference helpers.

Initialize the TSK pipeline layers.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature names to sequences of :class:~highfis.memberships.MembershipFunction objects. Must not be empty.

required
rule_base str

Rule-base construction strategy. Supported values: "cartesian" (all MF combinations), "coco" (same-index compact), "en" (enhanced FRB), or "custom" (explicit rules via rules).

'cartesian'
t_norm str

Built-in T-norm name. Ignored when t_norm_fn is provided. Common values: "prod", "gmean", "min", "dombi", "yager".

'gmean'
t_norm_fn TNormFn | None

Optional custom T-norm callable. When provided, t_norm is internally set to "prod" and the rule layer applies this function instead.

None
rules Sequence[Sequence[int]] | None

Explicit rule index sequences. Required when rule_base is "custom".

None
defuzzifier nn.Module | None

Normalization module applied to raw rule firing strengths. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

If True, insert a :class:~torch.nn.BatchNorm1d layer on the inputs before the consequent computation.

False

Raises:

Type Description
ValueError

If input_mfs is empty.

Source code in highfis/base.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    *,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the TSK pipeline layers.

    Args:
        input_mfs: Mapping from feature names to sequences of
            :class:`~highfis.memberships.MembershipFunction` objects.
            Must not be empty.
        rule_base: Rule-base construction strategy.  Supported values:
            ``"cartesian"`` (all MF combinations), ``"coco"``
            (same-index compact), ``"en"`` (enhanced FRB), or
            ``"custom"`` (explicit rules via *rules*).
        t_norm: Built-in T-norm name.  Ignored when *t_norm_fn* is
            provided.  Common values: ``"prod"``, ``"gmean"``,
            ``"min"``, ``"dombi"``, ``"yager"``.
        t_norm_fn: Optional custom T-norm callable.  When provided,
            *t_norm* is internally set to ``"prod"`` and the rule
            layer applies this function instead.
        rules: Explicit rule index sequences.  Required when
            *rule_base* is ``"custom"``.
        defuzzifier: Normalization module applied to raw rule firing
            strengths.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: If ``True``, insert a
            :class:`~torch.nn.BatchNorm1d` layer on the inputs before
            the consequent computation.

    Raises:
        ValueError: If *input_mfs* is empty.
    """
    super().__init__()
    if not input_mfs:
        raise ValueError("input_mfs must not be empty")

    self.input_mfs = input_mfs
    self.input_names = list(input_mfs.keys())
    self.n_inputs = len(self.input_names)
    mf_per_input = [len(input_mfs[name]) for name in self.input_names]

    self.membership_layer = MembershipLayer(input_mfs)
    self.rule_layer = RuleLayer(
        self.input_names,
        mf_per_input,
        rules=rules,
        rule_base=rule_base,
        t_norm=t_norm if t_norm_fn is None else "prod",
        t_norm_fn=t_norm_fn,
    )
    self.n_rules = self.rule_layer.n_rules
    self.defuzzifier = defuzzifier or SoftmaxLogDefuzzifier()
    self.consequent_batch_norm = bool(consequent_batch_norm)
    self.consequent_bn = nn.BatchNorm1d(self.n_inputs) if self.consequent_batch_norm else None
    self.consequent_layer = self._build_consequent_layer()
    self.logger = logging.getLogger(f"{self.__class__.__module__}.{self.__class__.__name__}")
    if not self.logger.handlers:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(stream_handler)
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False
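
Once any concrete subclass has been constructed, the attributes set up above summarise the pipeline. A quick inspection sketch (model is a placeholder for an already built classifier or regressor):

print(model.n_inputs)     # number of input features
print(model.input_names)  # feature names, in input_mfs insertion order
print(model.n_rules)      # rules generated by the chosen rule_base strategy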

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
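
Because every variant normalizes firing strengths across rules, the output of forward_antecedents can be inspected directly. A small sketch (x_batch is a placeholder tensor of shape (batch, n_inputs)):

norm_w = model.forward_antecedents(x_batch)  # shape (batch, n_rules)
print(norm_w.sum(dim=1))                     # each row sums to (approximately) 1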

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
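
Inference on a trained classifier then reduces to two calls (x_test is a placeholder of shape (M, n_inputs)):

proba = model.predict_proba(x_test)  # (M, n_classes) softmax probabilities
labels = model.predict(x_test)       # (M,) integer class indices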

BaseTSKRegressor

Bases: BaseTSK

Abstract regressor base that provides task-specific training and inference helpers.

Initialize the TSK pipeline layers.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature names to sequences of :class:~highfis.memberships.MembershipFunction objects. Must not be empty.

required
rule_base str

Rule-base construction strategy. Supported values: "cartesian" (all MF combinations), "coco" (same-index compact), "en" (enhanced FRB), or "custom" (explicit rules via rules).

'cartesian'
t_norm str

Built-in T-norm name. Ignored when t_norm_fn is provided. Common values: "prod", "gmean", "min", "dombi", "yager".

'gmean'
t_norm_fn TNormFn | None

Optional custom T-norm callable. When provided, t_norm is internally set to "prod" and the rule layer applies this function instead.

None
rules Sequence[Sequence[int]] | None

Explicit rule index sequences. Required when rule_base is "custom".

None
defuzzifier nn.Module | None

Normalization module applied to raw rule firing strengths. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

If True, insert a :class:~torch.nn.BatchNorm1d layer on the inputs before the consequent computation.

False

Raises:

Type Description
ValueError

If input_mfs is empty.

Source code in highfis/base.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    *,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the TSK pipeline layers.

    Args:
        input_mfs: Mapping from feature names to sequences of
            :class:`~highfis.memberships.MembershipFunction` objects.
            Must not be empty.
        rule_base: Rule-base construction strategy.  Supported values:
            ``"cartesian"`` (all MF combinations), ``"coco"``
            (same-index compact), ``"en"`` (enhanced FRB), or
            ``"custom"`` (explicit rules via *rules*).
        t_norm: Built-in T-norm name.  Ignored when *t_norm_fn* is
            provided.  Common values: ``"prod"``, ``"gmean"``,
            ``"min"``, ``"dombi"``, ``"yager"``.
        t_norm_fn: Optional custom T-norm callable.  When provided,
            *t_norm* is internally set to ``"prod"`` and the rule
            layer applies this function instead.
        rules: Explicit rule index sequences.  Required when
            *rule_base* is ``"custom"``.
        defuzzifier: Normalization module applied to raw rule firing
            strengths.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: If ``True``, insert a
            :class:`~torch.nn.BatchNorm1d` layer on the inputs before
            the consequent computation.

    Raises:
        ValueError: If *input_mfs* is empty.
    """
    super().__init__()
    if not input_mfs:
        raise ValueError("input_mfs must not be empty")

    self.input_mfs = input_mfs
    self.input_names = list(input_mfs.keys())
    self.n_inputs = len(self.input_names)
    mf_per_input = [len(input_mfs[name]) for name in self.input_names]

    self.membership_layer = MembershipLayer(input_mfs)
    self.rule_layer = RuleLayer(
        self.input_names,
        mf_per_input,
        rules=rules,
        rule_base=rule_base,
        t_norm=t_norm if t_norm_fn is None else "prod",
        t_norm_fn=t_norm_fn,
    )
    self.n_rules = self.rule_layer.n_rules
    self.defuzzifier = defuzzifier or SoftmaxLogDefuzzifier()
    self.consequent_batch_norm = bool(consequent_batch_norm)
    self.consequent_bn = nn.BatchNorm1d(self.n_inputs) if self.consequent_batch_norm else None
    self.consequent_layer = self._build_consequent_layer()
    self.logger = logging.getLogger(f"{self.__class__.__module__}.{self.__class__.__name__}")
    if not self.logger.handlers:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(logging.Formatter("%(message)s"))
        self.logger.addHandler(stream_handler)
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

DGALETSKClassifier

Bases: BaseTSKClassifier

DG-ALETSK classifier with ALE-softmin antecedent and double-group gates.

DG-ALETSK extends FSRE-AdaTSK by replacing the adaptive softmin with the Adaptive Ln-Exp (ALE) softmin — a smoother variant with improved numerical stability. It also uses a zero-order consequent in the DG (data-guided) training phase and optionally converts to first-order after gate-based pruning.

Reference

G. Xue, J. Wang, B. Yuan and C. Dai, "DG-ALETSK: A High-Dimensional Fuzzy Approach With Simultaneous Feature Selection and Rule Extraction," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 11, pp. 3866-3880, Nov. 2023, doi: 10.1109/TFUZZ.2023.3270445.

Initialise the DG-ALETSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
lambda_init float

Initial ALE-softmin parameter alpha > 0 (default 1.0).

1.0
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the ALE-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB).

False

Raises:

Type Description
ValueError

If n_classes < 2 or lambda_init <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    lambda_init: float = 1.0,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-ALETSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        lambda_init: Initial ALE-softmin parameter ``alpha > 0``
            (default ``1.0``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the ALE-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).

    Raises:
        ValueError: If ``n_classes < 2`` or ``lambda_init <= 0``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if lambda_init <= 0.0:
        raise ValueError("lambda_init must be > 0")

    self.n_classes = int(n_classes)
    self.lambda_init = float(lambda_init)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGALETSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        alpha_init=self.lambda_init,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
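
A construction sketch, reusing the illustrative input_mfs mapping from the AdaTSK example above (all values shown are placeholders, not recommendations):

from highfis.models import DGALETSKClassifier

clf = DGALETSKClassifier(
    input_mfs,
    n_classes=3,       # must be >= 2
    rule_base="coco",
    lambda_init=1.0,   # initial ALE-softmin alpha, must be > 0
    use_en_frb=False,  # set True to start directly from the Enhanced FRB
)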

apply_thresholds

Apply threshold pruning to feature and rule gates.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Apply threshold pruning to feature and rule gates."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    rule_layer = self.rule_layer
    cast(Tensor, rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    consequent = self.consequent_layer
    cast(Tensor, consequent.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute feature and rule thresholds from gate values and coefficient pairs.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute feature and rule thresholds from gate values and coefficient pairs."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta

convert_to_first_order

Convert the DG phase zero-order consequent to first-order form.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG phase zero-order consequent to first-order form."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedClassificationZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent
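
Putting the three helpers together, a plausible end-of-training pruning pass might look like the following (the zeta coefficients are illustrative only):

# 1. Derive pruning thresholds from the current gate values.
tau_lambda, tau_theta = clf.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)

# 2. Zero out feature gates <= tau_lambda and rule gates <= tau_theta.
clf.apply_thresholds(tau_lambda, tau_theta)

# 3. Optionally promote the pruned zero-order consequent to first order
#    before a final fine-tuning fit(...).
clf.convert_to_first_order()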

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
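
A minimal usage sketch of fit with a validation split and early stopping. It assumes a DG-ALETSK classifier instance named model (constructed as documented for this class) and synthetic tensors of the documented shapes; the sizes are illustrative only:

import torch

# Shapes must match the model: x is (N, n_inputs), y is (N,) with integer class labels.
x_train = torch.randn(200, model.n_inputs)
y_train = torch.randint(0, model.n_classes, (200,))
x_val = torch.randn(50, model.n_inputs)
y_val = torch.randint(0, model.n_classes, (50,))

history = model.fit(
    x_train,
    y_train,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val,
    y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # roll back to the weights of the best validation epoch
    verbose=1,          # progress bar
)
print(history["stopped_epoch"], history["train"][-1], history["val"][-1])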

fit_dg_phase

Train the DG phase using zero-order TSK and joint FS+RE.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG phase using zero-order TSK and joint FS+RE."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-ALETSK model after converting to first-order TSK.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-ALETSK model after converting to first-order TSK."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized antecedent feature gate values for the DG phase.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized antecedent feature gate values for the DG phase."""
    rule_layer = self.rule_layer
    return _gate_activation(rule_layer.lambda_gates)

get_rule_gate_values

Return normalized consequent rule gate values for the DG phase.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized consequent rule gate values for the DG phase."""
    consequent = self.consequent_layer
    return _gate_activation(consequent.theta_gates)

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
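
A short inference sketch continuing from the fitted model above: predict_proba returns one probability row per sample, and predict returns the argmax class index.

proba = model.predict_proba(x_val)   # shape (M, n_classes); rows sum to 1 via softmax
labels = model.predict(x_val)        # shape (M,); argmax class indices
print(proba[0], labels[0])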

search_thresholds

Search threshold coefficients for feature and rule pruning.

The search follows the DG-ALETSK paper strategy: thresholds are computed from gate values, applied to prune gates, and the first-order consequent parameters are refit with antecedents fixed.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search threshold coefficients for feature and rule pruning.

    The search follows the DG-ALETSK paper strategy: thresholds are
    computed from gate values, applied to prune gates, and the first-order
    consequent parameters are refit with antecedents fixed.
    """
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
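
Taken together, fit_dg_phase, search_thresholds, and fit_finetune give the two-phase DG-ALETSK recipe. A sketch with the same tensors as above; the epoch budgets are illustrative, not values prescribed by the paper:

# Phase 1: data-guided training with the zero-order consequent and both gate groups.
model.fit_dg_phase(x_train, y_train, epochs=100, x_val=x_val, y_val=y_val)

# Grid-search the pruning coefficients. With inplace=True (the default) the best
# candidate is applied: the consequent is converted to first order if needed and
# the pruned state is loaded into the model.
result = model.search_thresholds(x_train, y_train, x_val=x_val, y_val=y_val)
print(result["best_zeta_lambda"], result["best_zeta_theta"], result["best_score"])

# Phase 2: fine-tune the pruned, first-order model.
model.fit_finetune(x_train, y_train, epochs=100, x_val=x_val, y_val=y_val)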

DGALETSKRegressor

Bases: BaseTSKRegressor

DG-ALETSK regressor with ALE-softmin antecedent and double-group gates.

DG-ALETSK extends FSRE-AdaTSK by replacing the adaptive softmin with the Adaptive Ln-Exp (ALE) softmin — a smoother variant with improved numerical stability. It also uses a zero-order consequent in the DG (data-guided) training phase and optionally converts to first-order after gate-based pruning.

Reference

G. Xue, J. Wang, B. Yuan and C. Dai, "DG-ALETSK: A High-Dimensional Fuzzy Approach With Simultaneous Feature Selection and Rule Extraction," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 11, pp. 3866-3880, Nov. 2023, doi: 10.1109/TFUZZ.2023.3270445.

Initialise the DG-ALETSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
lambda_init float

Initial ALE-softmin parameter alpha > 0 (default 1.0).

1.0
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the ALE-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB).

False

Raises:

Type Description
ValueError

If lambda_init <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    lambda_init: float = 1.0,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-ALETSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        lambda_init: Initial ALE-softmin parameter ``alpha > 0``
            (default ``1.0``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the ALE-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).

    Raises:
        ValueError: If ``lambda_init <= 0``.
    """
    if lambda_init <= 0.0:
        raise ValueError("lambda_init must be > 0")

    self.lambda_init = float(lambda_init)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGALETSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        alpha_init=self.lambda_init,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
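
A construction sketch for the regressor. The build_input_mfs helper below is a hypothetical placeholder for however you assemble the per-feature MembershipFunction sequences (see highfis.memberships); everything else follows the signature above:

from highfis.models import DGALETSKRegressor

# Hypothetical helper, not part of highfis: returns a dict mapping each feature
# name to a sequence of MembershipFunction objects covering that feature's range.
input_mfs = build_input_mfs()

reg = DGALETSKRegressor(
    input_mfs,
    rule_base="coco",   # default; "cartesian" is the alternative
    lambda_init=1.0,    # initial ALE-softmin parameter, must be > 0
    use_en_frb=False,   # set True to start directly from the Enhanced FRB
)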

apply_thresholds

Apply threshold pruning to feature and rule gates.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Apply threshold pruning to feature and rule gates."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    rule_layer = self.rule_layer
    cast(Tensor, rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    consequent = self.consequent_layer
    cast(Tensor, consequent.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute feature and rule thresholds from gate values and coefficient pairs.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute feature and rule thresholds from gate values and coefficient pairs."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta
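
compute_thresholds and apply_thresholds can also be driven manually with a fixed pair of zeta coefficients instead of the grid search; a sketch assuming the reg instance from the construction example, already trained with fit_dg_phase:

# Thresholds derived from the current gate values and fixed coefficients.
tau_lambda, tau_theta = reg.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
print(f"feature threshold={tau_lambda:.4f}, rule threshold={tau_theta:.4f}")

# Gate parameters whose activation is at or below the threshold are zeroed out.
reg.apply_thresholds(tau_lambda, tau_theta)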

convert_to_first_order

Convert the DG phase zero-order consequent to first-order form.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG phase zero-order consequent to first-order form."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedRegressionZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
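
The uniform rule (UR) regularization is controlled through ur_weight and ur_target, and its per-epoch loss is tracked separately in the returned history; a sketch on the regressor, assuming reg from above and regression targets of the documented shape:

import torch

x = torch.randn(256, reg.n_inputs)
y = torch.randn(256)   # regression targets, shape (N,)

history = reg.fit(
    x,
    y,
    epochs=100,
    batch_size=64,
    ur_weight=0.1,      # weight of the uniform rule regularization term
    ur_target=None,     # None defaults to 1 / n_rules
    weight_decay=1e-8,  # L2 decay applied to consequent parameters only
    verbose=2,          # summary logging
)
print(history["ur"][-1])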

fit_dg_phase

Train the DG phase using zero-order TSK and joint FS+RE.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG phase using zero-order TSK and joint FS+RE."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-ALETSK model after converting to first-order TSK.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-ALETSK model after converting to first-order TSK."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized antecedent feature gate values for the DG phase.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized antecedent feature gate values for the DG phase."""
    rule_layer = self.rule_layer
    return _gate_activation(rule_layer.lambda_gates)

get_rule_gate_values

Return normalized consequent rule gate values for the DG phase.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized consequent rule gate values for the DG phase."""
    consequent = self.consequent_layer
    return _gate_activation(consequent.theta_gates)

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

search_thresholds

Search threshold coefficients for feature and rule pruning.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search threshold coefficients for feature and rule pruning."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
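
search_thresholds can also be run non-destructively with inplace=False, in which case the model is left untouched and only the best coefficients and score are reported; a sketch reusing the tensors from the earlier regressor examples:

result = reg.search_thresholds(
    x,
    y,
    zeta_lambda=[0.0, 0.5, 1.0],  # coarser grid than the default five values
    zeta_theta=[0.0, 0.5, 1.0],
    use_lse=True,                 # refit first-order consequents (LSE) after pruning
    inplace=False,                # report the best candidate without modifying the model
)
print(result)  # best_score, best_zeta_lambda, best_zeta_theta, tau_lambda, tau_theta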

DGTSKClassifier

Bases: BaseTSKClassifier

DG-TSK classifier with M-gate antecedent and point-based FRB (P-FRB).

DG-TSK uses a data-guided M-gate function to automatically select relevant features and rules.

Reference

Guangdong Xue, Jian Wang, Bingjie Zhang, Bin Yuan, Caili Dai, Double groups of gates based Takagi-Sugeno-Kang (DG-TSK) fuzzy system for simultaneous feature selection and rule extraction, Fuzzy Sets and Systems, Volume 469, 2023, 108627, ISSN 0165-0114, https://doi.org/10.1016/j.fss.2023.108627.

Initialise the DG-TSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
gate_fea str | Callable[[Tensor], Tensor] | None

Gate function for antecedent feature selection. "gate_m" (default) uses the M-gate from the DG-TSK paper. Can also be any callable Tensor → Tensor.

'gate_m'
gate_rule str | Callable[[Tensor], Tensor] | None

Gate function for consequent rule selection. Same options as gate_fea.

'gate_m'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon.

None
use_en_frb bool

Use the Enhanced FRB (P-FRB) rule base.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    gate_fea: str | Callable[[Tensor], Tensor] | None = "gate_m",
    gate_rule: str | Callable[[Tensor], Tensor] | None = "gate_m",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        gate_fea: Gate function for antecedent feature selection.
            ``"gate_m"`` (default) uses the M-gate from the DG-TSK paper.
            Can also be any callable ``Tensor → Tensor``.
        gate_rule: Gate function for consequent rule selection.
            Same options as ``gate_fea``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon.
        use_en_frb: Use the Enhanced FRB (P-FRB) rule base.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.gate_fea = gate_fea
    self.gate_rule = gate_rule
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGTSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        gate_fea=self.gate_fea,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()
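
gate_fea and gate_rule accept either the "gate_m" identifier or any callable mapping a tensor to a tensor; a construction sketch with a custom sigmoid feature gate (build_input_mfs is the same hypothetical helper as in the earlier sketch):

import torch
from highfis.models import DGTSKClassifier

def sigmoid_gate(values: torch.Tensor) -> torch.Tensor:
    # Any callable Tensor -> Tensor is accepted as a gate function.
    return torch.sigmoid(values)

clf = DGTSKClassifier(
    input_mfs=build_input_mfs(),  # hypothetical helper, not part of highfis
    n_classes=3,
    gate_fea=sigmoid_gate,        # custom feature gate instead of the default "gate_m"
    gate_rule="gate_m",           # keep the M-gate for rule selection
)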

apply_thresholds

Prune DG-TSK feature and rule gates using the computed thresholds.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Prune DG-TSK feature and rule gates using the computed thresholds."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    cast(Tensor, self.rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    cast(Tensor, self.consequent_layer.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute DG-TSK pruning thresholds from gate values and zeta parameters.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute DG-TSK pruning thresholds from gate values and zeta parameters."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta

convert_to_first_order

Convert the DG-TSK model from zero-order to first-order consequent.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG-TSK model from zero-order to first-order consequent."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedClassificationZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping when the metric has not improved for patience consecutive epochs. By default the best model weights from validation are restored when restore_best=True.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

fit_dg_phase

Train the DG-TSK zero-order phase before first-order conversion.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG-TSK zero-order phase before first-order conversion."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-TSK classifier after conversion to first-order consequents.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-TSK classifier after conversion to first-order consequents."""
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized DG-TSK feature gate activations from lambda values.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized DG-TSK feature gate activations from lambda values."""
    return self.rule_layer.gate_fn(self.rule_layer.lambda_gates)

get_rule_gate_values

Return normalized DG-TSK rule gate activations from theta values.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized DG-TSK rule gate activations from theta values."""
    return self.consequent_layer.gate_fn(self.consequent_layer.theta_gates)
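
After training, the two gate accessors expose how strongly each feature and rule is retained, and compute_thresholds turns those activations into pruning cut-offs; a small inspection sketch assuming a trained clf:

feature_gates = clf.get_feature_gate_values()  # one activation per antecedent feature gate
rule_gates = clf.get_rule_gate_values()        # one activation per consequent rule gate

tau_lambda, tau_theta = clf.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
print("feature gates above threshold:", int((feature_gates > tau_lambda).sum()))
print("rule gates above threshold:", int((rule_gates > tau_theta).sum()))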

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

search_thresholds

Search DG-TSK threshold combinations and optionally apply the best candidate.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search DG-TSK threshold combinations and optionally apply the best candidate."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedClassificationZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result

DGTSKRegressor

Bases: BaseTSKRegressor

DG-TSK regressor with M-gate antecedent and point-based FRB (P-FRB).

DG-TSK uses a data-guided M-gate function to automatically select relevant features and rules.

Reference

Guangdong Xue, Jian Wang, Bingjie Zhang, Bin Yuan, Caili Dai, Double groups of gates based Takagi-Sugeno-Kang (DG-TSK) fuzzy system for simultaneous feature selection and rule extraction, Fuzzy Sets and Systems, Volume 469, 2023, 108627, ISSN 0165-0114, https://doi.org/10.1016/j.fss.2023.108627.

Initialise the DG-TSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"coco" (default) or "cartesian".

'coco'
gate_fea str | Callable[[Tensor], Tensor] | None

Gate function for antecedent feature selection (default "gate_m").

'gate_m'
gate_rule str | Callable[[Tensor], Tensor] | None

Gate function for consequent rule selection (default "gate_m").

'gate_m'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon.

None
use_en_frb bool

Use the Enhanced FRB (P-FRB) rule base.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    gate_fea: str | Callable[[Tensor], Tensor] | None = "gate_m",
    gate_rule: str | Callable[[Tensor], Tensor] | None = "gate_m",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the DG-TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        gate_fea: Gate function for antecedent feature selection
            (default ``"gate_m"``).
        gate_rule: Gate function for consequent rule selection
            (default ``"gate_m"``).
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon.
        use_en_frb: Use the Enhanced FRB (P-FRB) rule base.
    """
    self.gate_fea = gate_fea
    self.gate_rule = gate_rule
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = DGTSKRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        gate_fea=self.gate_fea,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_zero_order_consequent_layer()

apply_thresholds

Prune DG-TSK feature and rule gates using the computed thresholds.

Source code in highfis/models.py
def apply_thresholds(self, tau_lambda: float, tau_theta: float) -> None:
    """Prune DG-TSK feature and rule gates using the computed thresholds."""
    if not torch.isfinite(torch.tensor(tau_lambda)) or not torch.isfinite(torch.tensor(tau_theta)):
        raise ValueError("thresholds must be finite")

    feature_gate_values = self.get_feature_gate_values()
    pruned_features = feature_gate_values <= tau_lambda
    cast(Tensor, self.rule_layer.lambda_gates.data)[pruned_features] = 0.0

    rule_gate_values = self.get_rule_gate_values()
    pruned_rules = rule_gate_values <= tau_theta
    cast(Tensor, self.consequent_layer.theta_gates.data)[pruned_rules] = 0.0

compute_thresholds

Compute DG-TSK pruning thresholds from gate values and zeta parameters.

Source code in highfis/models.py
def compute_thresholds(self, zeta_lambda: float, zeta_theta: float) -> tuple[float, float]:
    """Compute DG-TSK pruning thresholds from gate values and zeta parameters."""
    tau_lambda = _threshold_from_zeta(self.get_feature_gate_values(), zeta_lambda)
    tau_theta = _threshold_from_zeta(self.get_rule_gate_values(), zeta_theta)
    return tau_lambda, tau_theta
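
Example

compute_thresholds and apply_thresholds can be used together when a single, fixed zeta setting is preferred over the grid search performed by search_thresholds. A sketch, assuming model is a trained DG-TSK model as constructed above:

# Derive pruning thresholds from the current gate values, then zero out
# every gate at or below its threshold.
tau_lambda, tau_theta = model.compute_thresholds(zeta_lambda=0.5, zeta_theta=0.5)
model.apply_thresholds(tau_lambda, tau_theta)

# Gates pruned to zero no longer contribute; count what survived.
kept_features = int((model.get_feature_gate_values() > tau_lambda).sum())
kept_rules = int((model.get_rule_gate_values() > tau_theta).sum())
print(f"kept {kept_features} features and {kept_rules} rules")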

convert_to_first_order

Convert the DG-TSK regressor from zero-order to first-order consequent.

Source code in highfis/models.py
def convert_to_first_order(self) -> None:
    """Convert the DG-TSK regressor from zero-order to first-order consequent."""
    previous = self.consequent_layer
    new_consequent = self._build_consequent_layer()
    if isinstance(previous, GatedRegressionZeroOrderConsequentLayer):
        new_consequent.theta_gates.data.copy_(previous.theta_gates.data)
    self.consequent_layer = new_consequent

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
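
Example

A usage sketch with synthetic data: a regression fit with a validation split, early stopping, and restoration of the best weights. The shapes follow the docstring above; model is the DGTSKRegressor constructed earlier.

import torch

x = torch.randn(256, 3)                    # (N, n_inputs)
y = x.sum(dim=1) + 0.1 * torch.randn(256)  # (N,)
x_val = torch.randn(64, 3)
y_val = x_val.sum(dim=1) + 0.1 * torch.randn(64)

history = model.fit(
    x, y,
    epochs=200,
    learning_rate=1e-3,
    batch_size=32,
    x_val=x_val, y_val=y_val,  # enables per-epoch validation and early stopping
    patience=20,               # stop after 20 epochs without improvement
    restore_best=True,         # reload the best validation-epoch weights
)
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])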

fit_dg_phase

Train the DG-TSK regression zero-order phase before first-order conversion.

Source code in highfis/models.py
def fit_dg_phase(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the DG-TSK regression zero-order phase before first-order conversion."""
    return self.fit(x, y, **kwargs)

fit_finetune

Fine-tune the DG-TSK regression model after converting to first order.

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune the DG-TSK regression model after converting to first order."""
    return self.fit(x, y, **kwargs)
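
Example

fit_dg_phase and fit_finetune are thin wrappers around fit; one plausible DG-TSK regression sequence (a sketch, not a prescribed schedule) is a zero-order gating phase, conversion to first order, and a fine-tuning phase. It reuses model, x, y, x_val, and y_val from the previous examples.

# Phase 1: train the gated zero-order model (feature and rule gates learn here).
model.fit_dg_phase(x, y, epochs=100, x_val=x_val, y_val=y_val)

# Replace the zero-order consequent with a first-order one, carrying the
# learned theta gates over (see convert_to_first_order above).
model.convert_to_first_order()

# Phase 2: fine-tune the first-order model.
model.fit_finetune(x, y, epochs=100, x_val=x_val, y_val=y_val)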

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

get_feature_gate_values

Return normalized DG-TSK feature gate activations from lambda values.

Source code in highfis/models.py
def get_feature_gate_values(self) -> Tensor:
    """Return normalized DG-TSK feature gate activations from lambda values."""
    return self.rule_layer.gate_fn(self.rule_layer.lambda_gates)

get_rule_gate_values

Return normalized DG-TSK rule gate activations from theta values.

Source code in highfis/models.py
def get_rule_gate_values(self) -> Tensor:
    """Return normalized DG-TSK rule gate activations from theta values."""
    return self.consequent_layer.gate_fn(self.consequent_layer.theta_gates)
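
Example

The two getters expose the current gate activations, which is convenient for ranking features after training. A sketch continuing the earlier examples; the per-feature and per-rule shapes are assumptions based on the gate definitions above.

feature_gates = model.get_feature_gate_values()  # one activation per feature (assumed)
rule_gates = model.get_rule_gate_values()        # one activation per rule (assumed)

# Rank features by gate strength (higher = more strongly selected).
order = torch.argsort(feature_gates, descending=True)
for idx in order.tolist():
    print(f"feature {idx}: gate={feature_gates[idx].item():.4f}")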

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

search_thresholds

Search DG-TSK regression threshold combinations and optionally apply the best candidate.

Source code in highfis/models.py
def search_thresholds(
    self,
    x: Tensor,
    y: Tensor,
    zeta_lambda: Sequence[float] | None = None,
    zeta_theta: Sequence[float] | None = None,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    use_lse: bool = True,
    inplace: bool = True,
    verbose: bool = False,
) -> dict[str, Any]:
    """Search DG-TSK regression threshold combinations and optionally apply the best candidate."""
    if zeta_lambda is None:
        zeta_lambda = [0.0, 0.25, 0.5, 0.75, 1.0]
    if zeta_theta is None:
        zeta_theta = [0.0, 0.25, 0.5, 0.75, 1.0]

    x_eval = x_val if x_val is not None else x
    y_eval = y_val if y_val is not None else y

    best_score = float("-inf")
    best_state: dict[str, Any] | None = None
    best_tau_lambda = 0.0
    best_tau_theta = 0.0
    best_zeta_lambda = 0.0
    best_zeta_theta = 0.0

    for zeta_l in zeta_lambda:
        for zeta_t in zeta_theta:
            candidate = copy.deepcopy(self)
            if isinstance(candidate.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
                candidate.convert_to_first_order()

            tau_l, tau_t = candidate.compute_thresholds(zeta_l, zeta_t)
            candidate.apply_thresholds(tau_l, tau_t)
            if use_lse:
                candidate._fit_first_order_consequents_lse(x, y)

            score = candidate._evaluate_threshold_score(x_eval, y_eval)
            if verbose:
                self._log("zeta_lambda=%s zeta_theta=%s score=%.6f", zeta_l, zeta_t, score, verbose=True)

            if score > best_score:
                best_score = score
                best_state = copy.deepcopy(candidate.state_dict())
                best_tau_lambda = tau_l
                best_tau_theta = tau_t
                best_zeta_lambda = zeta_l
                best_zeta_theta = zeta_t

    if best_state is None:
        raise RuntimeError("threshold search did not yield a valid candidate")

    result = {
        "best_score": best_score,
        "best_zeta_lambda": best_zeta_lambda,
        "best_zeta_theta": best_zeta_theta,
        "tau_lambda": best_tau_lambda,
        "tau_theta": best_tau_theta,
    }

    if inplace:
        if isinstance(self.consequent_layer, GatedRegressionZeroOrderConsequentLayer):
            self.convert_to_first_order()
        self.load_state_dict(best_state)

    return result
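
Example

Threshold search is typically the last step: it grid-searches the zeta pairs, optionally refits the first-order consequents by least squares (use_lse=True), and, with inplace=True, loads the best candidate into the model. A sketch continuing the earlier examples:

result = model.search_thresholds(
    x, y,
    zeta_lambda=[0.0, 0.25, 0.5, 0.75, 1.0],  # the default grid, shown explicitly
    zeta_theta=[0.0, 0.25, 0.5, 0.75, 1.0],
    x_val=x_val, y_val=y_val,  # score candidates on held-out data
    use_lse=True,
    inplace=True,
)
print(result["best_score"], result["best_zeta_lambda"], result["best_zeta_theta"])
print(result["tau_lambda"], result["tau_theta"])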

DombiTSKClassifier

Bases: BaseTSKClassifier

TSK classifier with a fixed Dombi T-norm in the antecedent.

DombiTSK extends TSK fuzzy inference by using a Dombi t-norm for antecedent aggregation while keeping first-order linear consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialise the Dombi TSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"cartesian" or "coco" rule-base strategy.

'cartesian'
t_norm str

T-norm identifier (default "dombi").

'dombi'
lambda_ float

Dombi parameter λ > 0. λ = 1 gives the algebraic product.

1.0
t_norm_fn TNormFn | None

Optional custom t-norm callable; overrides lambda_ and t_norm when provided.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SumBasedDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2 or lambda_ <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "dombi",
    lambda_: float = 1.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the Dombi TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: T-norm identifier (default ``"dombi"``).
        lambda_: Dombi parameter ``λ > 0``.  ``λ = 1`` gives the
            algebraic product.
        t_norm_fn: Optional custom t-norm callable; overrides
            ``lambda_`` and ``t_norm`` when provided.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2`` or ``lambda_ <= 0``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    if lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.n_classes = int(n_classes)
    self.lambda_ = float(lambda_)
    if t_norm_fn is None:
        t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
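
Example

A construction sketch; input_mfs is the same feature-name to membership-function mapping used in the earlier examples (built with a hypothetical MF class), and only the DombiTSKClassifier signature is taken from this page.

from highfis.models import DombiTSKClassifier

model = DombiTSKClassifier(
    input_mfs,
    n_classes=3,            # must be >= 2
    rule_base="cartesian",  # default rule-base strategy for this class
    lambda_=2.0,            # Dombi parameter, must be > 0
)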

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
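
Example

A prediction sketch for a fitted classifier, continuing the construction sketch above; the input width must match n_inputs.

x_new = torch.randn(8, 3)            # (batch, n_inputs)
proba = model.predict_proba(x_new)   # (batch, n_classes); rows sum to 1
labels = model.predict(x_new)        # (batch,) integer class indices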

DombiTSKRegressor

Bases: BaseTSKRegressor

TSK regressor with a fixed Dombi T-norm in the antecedent.

DombiTSK extends TSK fuzzy inference by using a Dombi t-norm for antecedent aggregation while keeping first-order linear consequents.

Reference

G. Xue, L. Hu, J. Wang and S. Ablameyko, "ADMTSK: A High-Dimensional Takagi-Sugeno-Kang Fuzzy System Based on Adaptive Dombi T-Norm," in IEEE Transactions on Fuzzy Systems, vol. 33, no. 6, pp. 1767-1780, June 2025, doi: 10.1109/TFUZZ.2025.3535640.

Initialise the Dombi TSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
rule_base str

"cartesian" or "coco" rule-base strategy.

'cartesian'
t_norm str

T-norm identifier (default "dombi").

'dombi'
lambda_ float

Dombi parameter λ > 0.

1.0
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SumBasedDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False

Raises:

Type Description
ValueError

If lambda_ <= 0.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "dombi",
    lambda_: float = 1.0,
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the Dombi TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: T-norm identifier (default ``"dombi"``).
        lambda_: Dombi parameter ``λ > 0``.
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``lambda_ <= 0``.
    """
    if lambda_ <= 0.0:
        raise ValueError("lambda_ must be > 0")

    self.lambda_ = float(lambda_)
    if t_norm_fn is None:
        t_norm_fn = DombiTNorm(lambda_=self.lambda_)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

FSREAdaTSKClassifier

Bases: BaseTSKClassifier

FSRE-AdaTSK classifier with adaptive softmin antecedent and gated consequents.

FSRE-AdaTSK (Feature Selection and Rule Extraction) extends AdaTSK with gated antecedents and consequents, trained in a feature-selection (FS) phase on the CoCo-FRB followed by a rule-extraction (RE) phase on an Enhanced FRB.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the FSRE-AdaTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of :class:~highfis.memberships.MembershipFunction objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

"coco" (default) or "cartesian".

'coco'
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices; ignored when use_en_frb=True.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to :class:~highfis.defuzzifiers.SoftmaxLogDefuzzifier.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
eps float | None

Numerical stability epsilon for the Ada-softmin operator.

None
use_en_frb bool

Start directly from the Enhanced FRB (En-FRB) instead of CoCo-FRB.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the FSRE-AdaTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB)
            instead of CoCo-FRB.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")

    self.n_classes = int(n_classes)
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()

expand_to_en_frb

Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase.

Source code in highfis/models.py
def expand_to_en_frb(self) -> None:
    """Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase."""
    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(self.input_mfs[name]) for name in self.input_names],
        rule_base="en",
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()
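
Example

One plausible FS-then-RE sequencing (a sketch, not the paper's exact schedule): train on the CoCo-FRB first, then switch to the Enhanced FRB and continue training. Note that expand_to_en_frb rebuilds the rule and consequent layers, so the consequent parameters restart from scratch. Reuses input_mfs from the earlier construction sketch.

import torch
from highfis.models import FSREAdaTSKClassifier

x = torch.randn(256, 3)           # (N, n_inputs)
y = torch.randint(0, 3, (256,))   # (N,) integer class labels

model = FSREAdaTSKClassifier(input_mfs, n_classes=3)  # starts on the CoCo-FRB

# FS phase: train with the compact CoCo rule base.
model.fit(x, y, epochs=100)

# RE phase: expand to the Enhanced FRB, then train again.
model.expand_to_en_frb()
model.fit(x, y, epochs=100)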

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via :meth:_evaluate_validation) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. When restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to :meth:_default_criterion.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
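
For orientation, a minimal training sketch with validation-based early stopping; `model` is assumed to be an already-constructed FSREAdaTSKClassifier and the four tensors below are placeholders (float features, integer class labels), not part of this API:

```python
# Placeholders: x_train, y_train, x_valid, y_valid are assumed to exist.
history = model.fit(
    x_train, y_train,
    epochs=200,
    batch_size=64,
    learning_rate=1e-3,
    x_val=x_valid,      # providing a validation split enables the metric
    y_val=y_valid,
    patience=20,        # stop after 20 epochs without improvement
    restore_best=True,  # reload the weights from the best validation epoch
    verbose=1,          # tqdm-style progress bar
)

print(history["stopped_epoch"])
print(history["train"][-1], history["val"][-1])
```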

fit_finetune

Fine-tune with no gates — plain TSK consequent (eq. 5).

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune with no gates — plain TSK consequent (eq. 5)."""
    self.consequent_layer.mode = "finetune"
    return self.fit(x, y, **kwargs)

fit_fs

Train the FS phase: only feature gates M(λ_d) are active (eq. 21).

Source code in highfis/models.py
def fit_fs(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the FS phase: only feature gates M(λ_d) are active (eq. 21)."""
    self.consequent_layer.mode = "fs"
    return self.fit(x, y, **kwargs)

fit_re

Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22).

Source code in highfis/models.py
def fit_re(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22)."""
    self.expand_to_en_frb()
    self.consequent_layer.mode = "re"
    return self.fit(x, y, **kwargs)
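
Taken together, `fit_fs`, `fit_re`, and `fit_finetune` implement the staged FSRE training procedure. A hedged sketch of the call order suggested by the method docstrings (feature selection, then rule extraction on the Enhanced FRB, then fine-tuning; the data tensors are placeholders):

```python
# Phase 1 (FS): only the feature gates M(λ_d) are active (eq. 21).
model.fit_fs(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)

# Phase 2 (RE): only the rule gates M(θ_r) are active (eq. 22).
# fit_re calls expand_to_en_frb() internally before training.
model.fit_re(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)

# Phase 3: fine-tune the plain TSK consequent with the gates removed (eq. 5).
model.fit_finetune(x_train, y_train, epochs=100, x_val=x_valid, y_val=y_valid)
```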

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))
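
For rule-level inspection, the normalized firing strengths can be pulled out directly; a small sketch (`model` and `x_test` are placeholders, and the `(batch, n_rules)` shape is an assumption based on the per-rule normalization described above):

```python
import torch

with torch.no_grad():
    norm_w = model.forward_antecedents(x_test)  # assumed shape (batch, n_rules)
top_rule = norm_w.argmax(dim=1)                 # dominant rule index per sample
```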

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
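
Inference then follows the usual classifier pattern (`x_test` is a placeholder feature tensor of shape `(M, n_inputs)`):

```python
proba = model.predict_proba(x_test)  # (M, n_classes); softmax over the class logits
labels = model.predict(x_test)       # (M,); argmax of the probabilities
```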

FSREAdaTSKRegressor

Bases: BaseTSKRegressor

FSRE-AdaTSK regressor with adaptive softmin antecedent and gated consequents.

FSRE-AdaTSK (Feature Selection and Rule Extraction) extends AdaTSK with gated consequents that integrate feature selection and rule extraction into training.

Reference

G. Xue, Q. Chang, J. Wang, K. Zhang and N. R. Pal, "An Adaptive Neuro-Fuzzy System With Integrated Feature Selection and Rule Extraction for High-Dimensional Classification Problems," in IEEE Transactions on Fuzzy Systems, vol. 31, no. 7, pp. 2167-2181, July 2023, doi: 10.1109/TFUZZ.2022.3220950.

Initialise the FSRE-AdaTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of :class:`~highfis.memberships.MembershipFunction` objects. | *required* |
| `rule_base` | `str` | `"coco"` (default) or `"cartesian"`. | `'coco'` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices; ignored when `use_en_frb=True`. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |
| `eps` | `float \| None` | Numerical stability epsilon for the Ada-softmin operator. | `None` |
| `use_en_frb` | `bool` | Start directly from the Enhanced FRB (En-FRB). | `False` |
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "coco",
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
    eps: float | None = None,
    use_en_frb: bool = False,
) -> None:
    """Initialise the FSRE-AdaTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"coco"`` (default) or ``"cartesian"``.
        rules: Explicit rule antecedent indices; ignored when
            ``use_en_frb=True``.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
        eps: Numerical stability epsilon for the Ada-softmin operator.
        use_en_frb: Start directly from the Enhanced FRB (En-FRB).
    """
    self.eps = eps
    self.use_en_frb = bool(use_en_frb)

    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm="prod",
        t_norm_fn=None,
        rules=rules,
        defuzzifier=defuzzifier or SoftmaxLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(input_mfs[name]) for name in self.input_names],
        rules=rules if not self.use_en_frb else None,
        rule_base="en" if self.use_en_frb else rule_base,
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()
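
A construction sketch contrasting the two routes to the Enhanced FRB; `input_mfs` is assumed to have been built elsewhere as a `Mapping[str, Sequence[MembershipFunction]]`:

```python
# Route 1: start from the default "coco" rule base and expand later
# (fit_re performs this expansion automatically before the RE phase).
reg = FSREAdaTSKRegressor(input_mfs)
reg.expand_to_en_frb()

# Route 2: start directly from the Enhanced FRB.
# Note that any explicit `rules` argument is ignored in this case.
reg_en = FSREAdaTSKRegressor(input_mfs, use_en_frb=True)
```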

expand_to_en_frb

Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase.

Source code in highfis/models.py
def expand_to_en_frb(self) -> None:
    """Switch the rule layer to an Enhanced Fuzzy Rule Base for RE phase."""
    self.rule_layer = AdaSoftminRuleLayer(
        self.input_names,
        [len(self.input_mfs[name]) for name in self.input_names],
        rule_base="en",
        eps=self.eps,
    )
    self.n_rules = self.rule_layer.n_rules
    self.consequent_layer = self._build_consequent_layer()

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

fit_finetune

Fine-tune with no gates — plain TSK consequent (eq. 5).

Source code in highfis/models.py
def fit_finetune(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Fine-tune with no gates — plain TSK consequent (eq. 5)."""
    self.consequent_layer.mode = "finetune"
    return self.fit(x, y, **kwargs)

fit_fs

Train the FS phase: only feature gates M(λ_d) are active (eq. 21).

Source code in highfis/models.py
def fit_fs(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Train the FS phase: only feature gates M(λ_d) are active (eq. 21)."""
    self.consequent_layer.mode = "fs"
    return self.fit(x, y, **kwargs)

fit_re

Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22).

Source code in highfis/models.py
def fit_re(self, x: Tensor, y: Tensor, **kwargs: Any) -> dict[str, Any]:
    """Expand to En-FRB and train the RE phase: only rule gates M(θ_r) active (eq. 22)."""
    self.expand_to_en_frb()
    self.consequent_layer.mode = "re"
    return self.fit(x, y, **kwargs)

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HDFISMinClassifier

Bases: BaseTSKClassifier

HDFIS-min classifier with frozen antecedents and minimum aggregation.

HDFIS-min uses the minimum T-norm in the antecedent and only optimizes consequent parameters, which avoids the nondifferentiability of the minimum operator during training.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-min classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "min",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-min classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
    for param in self.membership_layer.parameters():
        param.requires_grad = False
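
Since the constructor disables gradients on the membership layer, only the remaining layers are trained; a quick sanity-check sketch (`input_mfs` is assumed to be built elsewhere):

```python
clf = HDFISMinClassifier(input_mfs, n_classes=2)

# All antecedent MF parameters are frozen ...
assert all(not p.requires_grad for p in clf.membership_layer.parameters())

# ... so the default optimizer built by fit() effectively updates only the
# remaining (rule/consequent) parameter groups.
trainable = [name for name, p in clf.named_parameters() if p.requires_grad]
print(trainable)
```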

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
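
The uniform rule (UR) regularizer is off by default; a hedged example of turning it on during training (the data tensors are placeholders, and `clf` is the classifier constructed above):

```python
history = clf.fit(
    x_train, y_train,
    epochs=150,
    ur_weight=0.1,   # weight of the uniform rule regularization term
    ur_target=None,  # None falls back to a target activation of 1 / n_rules
)
print(history["ur"][-1])  # the UR loss is tracked per epoch alongside "train"
```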

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

HDFISMinRegressor

Bases: BaseTSKRegressor

HDFIS-min regressor with frozen antecedents and minimum aggregation.

HDFIS-min uses the minimum T-norm in the antecedent and only optimizes consequent parameters, which avoids the nondifferentiability of the minimum operator during training.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-min regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "min",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-min regressor."""
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
    for param in self.membership_layer.parameters():
        param.requires_grad = False

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided the model evaluates a task-specific metric (via :meth:`_evaluate_validation`) after every epoch and applies early stopping when the metric has not improved for `patience` consecutive epochs. By default the best model weights from validation are restored when `restore_best=True`.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | *required* |
| `y` | `Tensor` | Training targets of shape `(N,)`. | *required* |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to :meth:`_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level. `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. Set to `None` to disable early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HDFISProdClassifier

Bases: BaseTSKClassifier

HDFIS-prod classifier with dimension-dependent Gaussian MFs.

HDFIS-prod combines the standard product T-norm with a dimension-dependent Gaussian membership function (DMF) to avoid numeric underflow in very high-dimensional feature spaces while preserving first-order TSK consequents.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-prod classifier.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-prod classifier."""
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
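
A hedged construction sketch (not taken from the library's own examples): the `GaussianMF` name, its import path, and its `center`/`sigma` arguments are assumptions standing in for whatever membership-function classes `highfis.memberships` actually provides; only the `HDFISProdClassifier` signature above is documented.

# Hypothetical usage sketch -- GaussianMF and its constructor are assumptions.
from highfis.memberships import GaussianMF  # assumed import path

input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(20)
}
model = HDFISProdClassifier(input_mfs, n_classes=3, rule_base="coco")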

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
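
A hedged training sketch continuing the hypothetical `model` above. The fit arguments follow the signature documented here; the data tensors are synthetic placeholders.

# Continues the hypothetical sketch above; x/y tensors are synthetic placeholders.
import torch

x_train, y_train = torch.randn(200, 20), torch.randint(0, 3, (200,))
x_val, y_val = torch.randn(50, 20), torch.randint(0, 3, (50,))

history = model.fit(
    x_train, y_train,
    epochs=200, learning_rate=1e-3, batch_size=32,
    x_val=x_val, y_val=y_val,        # enables per-epoch validation
    patience=20, restore_best=True,  # early stopping on the validation metric
    verbose=1,                       # tqdm progress bar
)
print(history["stopped_epoch"], history["train"][-1], history["val"][-1])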

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
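
Inference on the fitted classifier, again continuing the hypothetical sketch above:

# predict_proba returns softmax probabilities; predict returns argmax class indices.
proba = model.predict_proba(x_val)     # shape (50, 3), rows sum to 1
labels = model.predict(x_val)          # shape (50,)
assert torch.equal(labels, proba.argmax(dim=1))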

HDFISProdRegressor

Bases: BaseTSKRegressor

HDFIS-prod regressor with dimension-dependent Gaussian MFs.

HDFIS-prod combines the standard product T-norm with a dimension-dependent Gaussian membership function (DMF) to avoid numeric underflow in very high-dimensional feature spaces while preserving first-order TSK consequents.

References

G. Xue, J. Wang, K. Zhang and N. R. Pal, "High-Dimensional Fuzzy Inference Systems," in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 507-519, Jan. 2024, doi: 10.1109/TSMC.2023.3311475.

Initialize the HDFIS-prod regressor.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialize the HDFIS-prod regressor."""
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)

HTSKClassifier

Bases: BaseTSKClassifier

HTSK classifier for high-dimensional TSK inference.

HTSK replaces the standard product t-norm with a geometric mean over membership values and performs rule normalization in log-space.
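
A minimal PyTorch sketch of that computation (illustrative only, not the library's internals): with memberships `mu` of shape (batch, n_rules, D), the normalized rule strengths are the softmax over the per-rule mean of log-memberships, i.e. softmax(log(w^{1/D})).

import torch

def htsk_rule_strengths(mu: torch.Tensor) -> torch.Tensor:
    # mu: (batch, n_rules, D) membership values in (0, 1]
    log_w = mu.clamp_min(1e-12).log().mean(dim=-1)   # (1/D) * sum_d log mu == log(w^(1/D))
    return torch.softmax(log_w, dim=-1)              # normalize across rules

mu = torch.rand(4, 8, 500).clamp_min(1e-3)           # stays stable even with D = 500 inputs
print(htsk_rule_strengths(mu).sum(dim=-1))           # each row sums to 1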

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the HTSK classifier.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
n_classes int

Number of output classes (must be ≥ 2).

required
rule_base str

Rule-base construction strategy. "cartesian" builds the full Cartesian product; "coco" uses a one-cluster-per-rule scheme.

'cartesian'
t_norm str

Antecedent aggregation operator name (default "gmean" for HTSK).

'gmean'
t_norm_fn TNormFn | None

Optional custom t-norm callable; overrides t_norm when provided.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices. If None, rules are inferred from rule_base.

None
defuzzifier nn.Module | None

Custom defuzzifier module. Defaults to `highfis.defuzzifiers.SoftmaxLogDefuzzifier`.

None
consequent_batch_norm bool

Apply batch normalisation to the consequent layer inputs.

False

Raises:

Type Description
ValueError

If n_classes < 2.

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the HTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: Rule-base construction strategy.  ``"cartesian"``
            builds the full Cartesian product; ``"coco"`` uses a
            one-cluster-per-rule scheme.
        t_norm: Antecedent aggregation operator name (default
            ``"gmean"`` for HTSK).
        t_norm_fn: Optional custom t-norm callable; overrides
            ``t_norm`` when provided.
        rules: Explicit rule antecedent indices.  If ``None``, rules
            are inferred from ``rule_base``.
        defuzzifier: Custom defuzzifier module.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Apply batch normalisation to the
            consequent layer inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier,
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
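
The uniform rule (UR) regularizer is only described by its interface here (ur_weight, ur_target, with a default target of 1 / n_rules). One plausible form, stated purely as an assumption about what `_uniform_regularization_loss` might compute, is a mean-squared penalty pulling each rule's average normalized firing strength toward the target:

import torch

def uniform_rule_loss(norm_w: torch.Tensor, target: float | None = None) -> torch.Tensor:
    # Assumed form, not the library's actual implementation.
    # norm_w: (batch, n_rules) normalized firing strengths.
    n_rules = norm_w.shape[-1]
    t = 1.0 / n_rules if target is None else target   # documented default: 1 / n_rules
    return ((norm_w.mean(dim=0) - t) ** 2).mean()

w = torch.softmax(torch.randn(32, 8), dim=-1)
print(uniform_rule_loss(w))   # scalar penalty, added to the main loss as ur_weight * loss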

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)

HTSKRegressor

Bases: BaseTSKRegressor

HTSK regressor for high-dimensional TSK inference.

HTSK replaces the standard product t-norm with a geometric mean over membership values and performs rule normalization in log-space.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the HTSK regressor.

Parameters:

Name Type Description Default
input_mfs Mapping[str, Sequence[MembershipFunction]]

Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects.

required
rule_base str

Rule-base construction strategy ("cartesian" or "coco").

'cartesian'
t_norm str

Antecedent aggregation operator (default "gmean").

'gmean'
t_norm_fn TNormFn | None

Optional custom t-norm callable.

None
rules Sequence[Sequence[int]] | None

Explicit rule antecedent indices.

None
defuzzifier nn.Module | None

Custom defuzzifier. Defaults to `highfis.defuzzifiers.SoftmaxLogDefuzzifier`.

None
consequent_batch_norm bool

Batch normalisation on consequent inputs.

False
Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "gmean",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the HTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: Rule-base construction strategy (``"cartesian"`` or
            ``"coco"``).
        t_norm: Antecedent aggregation operator (default ``"gmean"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SoftmaxLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier,
        consequent_batch_norm=consequent_batch_norm,
    )

fit

Train the model with optional early stopping.

When x_val and y_val are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for patience consecutive epochs. With restore_best=True (the default), the weights from the best validation epoch are restored.

Parameters:

Name Type Description Default
x Tensor

Training features of shape (N, n_inputs).

required
y Tensor

Training targets of shape (N,).

required
epochs int

Maximum number of training epochs.

200
learning_rate float

Learning rate for the default AdamW optimizer.

0.001
criterion Callable[[Tensor, Tensor], Tensor] | None

Optional loss function. Defaults to `_default_criterion`.

None
optimizer torch.optim.Optimizer | None

Optional pre-built optimizer. When None, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (weight_decay) layers.

None
batch_size int | None

Mini-batch size. None uses the full dataset.

None
shuffle bool

If True, reshuffle sample indices each epoch.

True
ur_weight float

Non-negative weight for the uniform rule regularization term. 0.0 disables it.

0.0
ur_target float | None

Target uniform activation for UR. Must be in (0, 1] when provided. None defaults to 1 / n_rules.

None
verbose bool | int

Verbosity level. 0 = quiet, 1 = progress bar, 2 = per-epoch summary logging, 3 = per-epoch detailed logging. True is accepted as an alias for 2.

False
x_val Tensor | None

Optional validation features of shape (M, n_inputs).

None
y_val Tensor | None

Optional validation targets of shape (M,).

None
patience int | None

Number of consecutive epochs without improvement before early stopping. Set to None to disable early stopping. Only active when x_val and y_val are given.

20
restore_best bool

If True (default), restore the model weights from the best validation epoch when early stopping is used.

True
weight_decay float

L2 weight decay applied to consequent parameters by the default AdamW optimizer.

1e-08

Returns:

Type Description
dict[str, Any]

A dictionary with keys "train", "ur", and "val" containing per-epoch loss lists.

Raises:

Type Description
ValueError

If shapes of x, y, x_val, or y_val are incompatible, or if ur_weight < 0 or ur_target is outside (0, 1].

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
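
For reference, a minimal sketch of consuming the history dictionary returned by `fit`; `model` and the data tensors are placeholders, and only the documented keys (`"train"`, `"ur"`, `"val"`, `"stopped_epoch"`) are used.

```python
# Sketch (placeholder model/data names): inspecting the history dict returned
# by fit(). "val" is populated only when x_val/y_val were supplied.
history = model.fit(x_train, y_train, x_val=x_val, y_val=y_val, epochs=200, patience=20)

best_epoch = min(range(len(history["val"])), key=history["val"].__getitem__) + 1
print(
    f"stopped after {history['stopped_epoch']} epochs; "
    f"best validation loss {min(history['val']):.4f} at epoch {best_epoch}"
)
```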

LogTSKClassifier

Bases: BaseTSKClassifier

LogTSK classifier with inverse-log normalization of log-domain rules.

Firing strengths are normalized with the inverse-log formula, which does not saturate the way softmax-based normalization does in high-dimensional input spaces.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the LogTSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `n_classes` | `int` | Number of output classes (must be ≥ 2). | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.InvLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2`. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the LogTSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.InvLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or InvLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
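
For orientation, a minimal construction sketch. `GaussianMF(center, sigma)` is a hypothetical name and signature used only for illustration; substitute the actual membership-function classes from `highfis.memberships`. Everything else follows the constructor signature above.

```python
from highfis.models import LogTSKClassifier
# Hypothetical membership-function class and signature, for illustration only;
# replace with the real classes from highfis.memberships.
from highfis.memberships import GaussianMF  # assumed name/signature

# Two fuzzy sets ("low", "high") per feature, three features, binary target.
input_mfs = {
    f"x{i}": [GaussianMF(center=-1.0, sigma=1.0), GaussianMF(center=1.0, sigma=1.0)]
    for i in range(3)
}

clf = LogTSKClassifier(
    input_mfs,
    n_classes=2,
    rule_base="cartesian",  # full grid: 2^3 = 8 rules
    t_norm="prod",          # product t-norm, per the LogTSK configuration
)
```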

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
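
A typical training call for the classifier sketched above; the data tensors are placeholders (`y_train`/`y_val` assumed to hold integer class labels), and only documented arguments are used.

```python
# Sketch: training with validation-based early stopping.
history = clf.fit(
    x_train, y_train,
    epochs=300,
    learning_rate=1e-3,
    batch_size=64,
    x_val=x_val, y_val=y_val,
    patience=20,        # stop after 20 epochs without validation improvement
    restore_best=True,  # reload the weights from the best validation epoch
    verbose=1,          # progress bar
)
print(f"stopped at epoch {history['stopped_epoch']}, "
      f"final train loss {history['train'][-1]:.4f}")
```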

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
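
An inference sketch using the two prediction helpers above; `x_test`/`y_test` are placeholders.

```python
# Class probabilities and hard labels for a batch of inputs.
probs = clf.predict_proba(x_test)   # shape (M, n_classes), rows sum to 1
labels = clf.predict(x_test)        # shape (M,), argmax over the probabilities
accuracy = (labels == y_test).float().mean().item()
print(f"test accuracy: {accuracy:.3f}")
```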

LogTSKRegressor

Bases: BaseTSKRegressor

LogTSK regressor with inverse-log normalization of log-domain rules.

Firing strengths are normalized with the inverse-log formula, which does not saturate the way softmax-based normalization does in high-dimensional input spaces.

References

Y. Cui, D. Wu and Y. Xu, "Curse of Dimensionality for TSK Fuzzy Neural Networks: Explanation and Solutions," 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534265.

Initialise the LogTSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.InvLogDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the LogTSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.InvLogDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or InvLogDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
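
A corresponding regressor sketch, reusing the (hypothetical) `input_mfs` mapping from the classifier example above; only documented constructor arguments are used.

```python
from highfis.models import LogTSKRegressor

reg = LogTSKRegressor(
    input_mfs,                    # same feature -> membership-function mapping as above
    rule_base="coco",             # "coco" rule-base strategy (alternative to "cartesian")
    consequent_batch_norm=True,   # batch-normalise consequent inputs
)
```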

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
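
A sketch of overriding the default optimizer and loss via the `optimizer` and `criterion` arguments documented above. The `(prediction, target)` argument order for the criterion and the `.squeeze(-1)` guard are assumptions; data tensors are placeholders.

```python
import torch
import torch.nn.functional as F

# Pre-built optimizer: used as-is, so learning_rate/weight_decay from fit() are not applied.
opt = torch.optim.Adam(reg.parameters(), lr=5e-3)

history = reg.fit(
    x_train, y_train,
    epochs=150,
    optimizer=opt,
    # Assumed criterion call order (prediction, target); squeeze guards against
    # a (B, 1) prediction vs (B,) target mismatch.
    criterion=lambda pred, target: F.smooth_l1_loss(pred.squeeze(-1), target),
    batch_size=None,   # full-batch updates
)
```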

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
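
Since `predict` returns a 1-D tensor, a test-set RMSE can be computed directly; `x_test`/`y_test` are placeholders.

```python
import torch

y_pred = reg.predict(x_test)                              # shape (M,)
rmse = torch.sqrt(torch.mean((y_pred - y_test) ** 2)).item()
print(f"test RMSE: {rmse:.4f}")
```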

TSKClassifier

Bases: BaseTSKClassifier

Vanilla TSK classifier with sum-based rule normalization.

The vanilla Takagi-Sugeno-Kang inference computes rule firing strengths with the product t-norm and normalizes them by their total sum.

References

T. Takagi and M. Sugeno, "Fuzzy identification of systems and its applications to modeling and control," in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 1, pp. 116-132, Jan.-Feb. 1985, doi: 10.1109/TSMC.1985.6313399.

Initialise the vanilla TSK classifier.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `n_classes` | `int` | Number of output classes (must be ≥ 2). | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `n_classes < 2`. |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    n_classes: int,
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the vanilla TSK classifier.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        n_classes: Number of output classes (must be ≥ 2).
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.

    Raises:
        ValueError: If ``n_classes < 2``.
    """
    if n_classes < 2:
        raise ValueError("n_classes must be >= 2")
    self.n_classes = int(n_classes)
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
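
A sketch of supplying an explicit rule base. Reading each inner sequence as one membership-function index per input feature is an assumption for illustration, not something stated in the signature above.

```python
from highfis.models import TSKClassifier

# Assumption: each inner list picks one membership-function index per input
# feature (here three features, two fuzzy sets each).
rules = [
    [0, 0, 0],   # "all low"
    [1, 1, 1],   # "all high"
    [0, 1, 0],
]

clf = TSKClassifier(input_mfs, n_classes=2, rules=rules)
```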

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
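
A sketch of enabling the uniform rule (UR) regularization during training, using only the arguments documented above; data tensors are placeholders.

```python
# ur_weight scales the uniform-rule penalty; ur_target=None targets 1 / n_rules.
history = clf.fit(
    x_train, y_train,
    epochs=200,
    ur_weight=0.1,
    ur_target=None,
    verbose=2,       # per-epoch summary logging
)
print(f"final UR penalty: {history['ur'][-1]:.6f}")
```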

forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

predict

Return predicted class indices.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted class indices."""
    with torch.no_grad():
        return torch.argmax(self.predict_proba(x), dim=1)

predict_proba

Return class probabilities computed with softmax.

Source code in highfis/models.py
def predict_proba(self, x: Tensor) -> Tensor:
    """Return class probabilities computed with softmax."""
    with torch.no_grad():
        logits = self.forward(x)
        return torch.softmax(logits, dim=1)
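
A small interpretability sketch built on `forward_antecedents`; the `(batch, n_rules)` output shape is an assumption consistent with the description above, and `x_test` is a placeholder.

```python
import torch

# Inspect which rules fire for a single input (normalized rule strengths).
with torch.no_grad():
    norm_w = clf.forward_antecedents(x_test[:1])   # assumed shape (1, n_rules)

top = torch.topk(norm_w[0], k=min(3, norm_w.shape[1]))
for strength, rule_idx in zip(top.values.tolist(), top.indices.tolist()):
    print(f"rule {rule_idx}: normalized firing strength {strength:.3f}")
```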

TSKRegressor

Bases: BaseTSKRegressor

Vanilla TSK regressor with sum-based rule normalization.

The vanilla Takagi-Sugeno-Kang inference computes rule firing strengths with the product t-norm and normalizes them by their total sum.

References

T. Takagi and M. Sugeno, "Fuzzy identification of systems and its applications to modeling and control," in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 1, pp. 116-132, Jan.-Feb. 1985, doi: 10.1109/TSMC.1985.6313399.

Initialise the vanilla TSK regressor.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_mfs` | `Mapping[str, Sequence[MembershipFunction]]` | Mapping from feature name to a sequence of `highfis.memberships.MembershipFunction` objects. | required |
| `rule_base` | `str` | `"cartesian"` or `"coco"` rule-base strategy. | `'cartesian'` |
| `t_norm` | `str` | Antecedent aggregation operator. | `'prod'` |
| `t_norm_fn` | `TNormFn \| None` | Optional custom t-norm callable. | `None` |
| `rules` | `Sequence[Sequence[int]] \| None` | Explicit rule antecedent indices. | `None` |
| `defuzzifier` | `nn.Module \| None` | Custom defuzzifier. Defaults to `highfis.defuzzifiers.SumBasedDefuzzifier`. | `None` |
| `consequent_batch_norm` | `bool` | Batch normalisation on consequent inputs. | `False` |

Source code in highfis/models.py
def __init__(
    self,
    input_mfs: Mapping[str, Sequence[MembershipFunction]],
    rule_base: str = "cartesian",
    t_norm: str = "prod",
    t_norm_fn: TNormFn | None = None,
    rules: Sequence[Sequence[int]] | None = None,
    defuzzifier: nn.Module | None = None,
    consequent_batch_norm: bool = False,
) -> None:
    """Initialise the vanilla TSK regressor.

    Args:
        input_mfs: Mapping from feature name to a sequence of
            :class:`~highfis.memberships.MembershipFunction` objects.
        rule_base: ``"cartesian"`` or ``"coco"`` rule-base strategy.
        t_norm: Antecedent aggregation operator (default ``"prod"``).
        t_norm_fn: Optional custom t-norm callable.
        rules: Explicit rule antecedent indices.
        defuzzifier: Custom defuzzifier.  Defaults to
            :class:`~highfis.defuzzifiers.SumBasedDefuzzifier`.
        consequent_batch_norm: Batch normalisation on consequent inputs.
    """
    super().__init__(
        input_mfs,
        rule_base=rule_base,
        t_norm=t_norm,
        t_norm_fn=t_norm_fn,
        rules=rules,
        defuzzifier=defuzzifier or SumBasedDefuzzifier(),
        consequent_batch_norm=consequent_batch_norm,
    )
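
As a standalone numerical illustration of the inference described above (not library code), the following reproduces the product t-norm and sum-based normalization for a single sample with two rules:

```python
import torch

mu = torch.tensor([[0.9, 0.2],    # rule 1: memberships of x1, x2
                   [0.4, 0.7]])   # rule 2
w = mu.prod(dim=1)                        # product t-norm -> tensor([0.18, 0.28])
norm_w = w / w.sum()                      # sum-based normalization of firing strengths
rule_outputs = torch.tensor([1.5, -0.3])  # first-order consequent values for this input
y_hat = (norm_w * rule_outputs).sum()     # TSK output: weighted average of rule outputs
```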

fit

Train the model with optional early stopping.

When `x_val` and `y_val` are provided, the model evaluates a task-specific metric (via `_evaluate_validation`) after every epoch and applies early stopping once the metric has not improved for `patience` consecutive epochs. With `restore_best=True` (the default) the weights from the best validation epoch are restored at the end of training.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Training features of shape `(N, n_inputs)`. | required |
| `y` | `Tensor` | Training targets of shape `(N,)`. | required |
| `epochs` | `int` | Maximum number of training epochs. | `200` |
| `learning_rate` | `float` | Learning rate for the default AdamW optimizer. | `0.001` |
| `criterion` | `Callable[[Tensor, Tensor], Tensor] \| None` | Optional loss function. Defaults to `_default_criterion`. | `None` |
| `optimizer` | `torch.optim.Optimizer \| None` | Optional pre-built optimizer. When `None`, AdamW is constructed with separate parameter groups for antecedent (no weight decay) and consequent (`weight_decay`) layers. | `None` |
| `batch_size` | `int \| None` | Mini-batch size. `None` uses the full dataset. | `None` |
| `shuffle` | `bool` | If `True`, reshuffle sample indices each epoch. | `True` |
| `ur_weight` | `float` | Non-negative weight for the uniform rule regularization term. `0.0` disables it. | `0.0` |
| `ur_target` | `float \| None` | Target uniform activation for UR. Must be in `(0, 1]` when provided. `None` defaults to `1 / n_rules`. | `None` |
| `verbose` | `bool \| int` | Verbosity level: `0` = quiet, `1` = progress bar, `2` = per-epoch summary logging, `3` = per-epoch detailed logging. `True` is accepted as an alias for `2`. | `False` |
| `x_val` | `Tensor \| None` | Optional validation features of shape `(M, n_inputs)`. | `None` |
| `y_val` | `Tensor \| None` | Optional validation targets of shape `(M,)`. | `None` |
| `patience` | `int \| None` | Number of consecutive epochs without improvement before early stopping. `None` disables early stopping. Only active when `x_val` and `y_val` are given. | `20` |
| `restore_best` | `bool` | If `True` (default), restore the model weights from the best validation epoch when early stopping is used. | `True` |
| `weight_decay` | `float` | L2 weight decay applied to consequent parameters by the default AdamW optimizer. | `1e-08` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary with keys `"train"`, `"ur"`, and `"val"` containing per-epoch loss lists. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the shapes of `x`, `y`, `x_val`, or `y_val` are incompatible, or if `ur_weight < 0` or `ur_target` is outside `(0, 1]`. |

Source code in highfis/base.py
def fit(
    self,
    x: Tensor,
    y: Tensor,
    epochs: int = 200,
    learning_rate: float = 1e-3,
    criterion: Callable[[Tensor, Tensor], Tensor] | None = None,
    optimizer: torch.optim.Optimizer | None = None,
    batch_size: int | None = None,
    shuffle: bool = True,
    ur_weight: float = 0.0,
    ur_target: float | None = None,
    verbose: bool | int = False,
    x_val: Tensor | None = None,
    y_val: Tensor | None = None,
    patience: int | None = 20,
    restore_best: bool = True,
    weight_decay: float = 1e-8,
) -> dict[str, Any]:
    """Train the model with optional early stopping.

    When *x_val* and *y_val* are provided the model evaluates a
    task-specific metric (via :meth:`_evaluate_validation`) after every
    epoch and applies early stopping when the metric has not improved for
    *patience* consecutive epochs.
    By default the best model weights from validation are restored when
    ``restore_best=True``.

    Args:
        x: Training features of shape ``(N, n_inputs)``.
        y: Training targets of shape ``(N,)``.
        epochs: Maximum number of training epochs.
        learning_rate: Learning rate for the default AdamW optimizer.
        criterion: Optional loss function.  Defaults to
            :meth:`_default_criterion`.
        optimizer: Optional pre-built optimizer.  When ``None``, AdamW
            is constructed with separate parameter groups for antecedent
            (no weight decay) and consequent (*weight_decay*) layers.
        batch_size: Mini-batch size.  ``None`` uses the full dataset.
        shuffle: If ``True``, reshuffle sample indices each epoch.
        ur_weight: Non-negative weight for the uniform rule
            regularization term.  ``0.0`` disables it.
        ur_target: Target uniform activation for UR.  Must be in
            ``(0, 1]`` when provided.  ``None`` defaults to
            ``1 / n_rules``.
        verbose: Verbosity level. ``0`` = quiet, ``1`` = progress bar,
            ``2`` = per-epoch summary logging, ``3`` = per-epoch detailed
            logging. ``True`` is accepted as an alias for ``2``.
        x_val: Optional validation features of shape
            ``(M, n_inputs)``.
        y_val: Optional validation targets of shape ``(M,)``.
        patience: Number of consecutive epochs without improvement
            before early stopping.  Set to ``None`` to disable early
            stopping.  Only active when *x_val* and *y_val* are given.
        restore_best: If ``True`` (default), restore the model weights
            from the best validation epoch when early stopping is used.
        weight_decay: L2 weight decay applied to consequent parameters
            by the default AdamW optimizer.

    Returns:
        A dictionary with keys ``"train"``, ``"ur"``, and ``"val"``
        containing per-epoch loss lists.

    Raises:
        ValueError: If shapes of *x*, *y*, *x_val*, or *y_val* are
            incompatible, or if *ur_weight* < 0 or *ur_target* is
            outside ``(0, 1]``.
    """
    if x.ndim != 2 or x.shape[1] != self.n_inputs:
        raise ValueError(f"expected x shape (batch, {self.n_inputs}), got {tuple(x.shape)}")
    if y.ndim != 1:
        raise ValueError("expected y shape (batch,)")
    if ur_weight < 0.0:
        raise ValueError("ur_weight must be >= 0")
    if ur_target is not None and not (0.0 < ur_target <= 1.0):
        raise ValueError("ur_target must be in (0, 1] when provided")

    has_val = x_val is not None and y_val is not None
    if has_val:
        if x_val is None or y_val is None:  # pragma: no cover
            raise ValueError("x_val and y_val must both be provided")
        if x_val.ndim != 2 or x_val.shape[1] != self.n_inputs:
            raise ValueError(f"expected x_val shape (batch, {self.n_inputs}), got {tuple(x_val.shape)}")
        if y_val.ndim != 1:
            raise ValueError("expected y_val shape (batch,)")

    train_criterion = criterion or self._default_criterion()
    if optimizer is not None:
        train_optimizer = optimizer
    else:
        ante_params = list(self.membership_layer.parameters())
        rule_params = list(self.rule_layer.parameters())
        cons_params = list(self.consequent_layer.parameters())
        if self.consequent_bn is not None:
            cons_params.extend(self.consequent_bn.parameters())
        train_optimizer = torch.optim.AdamW(
            [
                {"params": ante_params, "weight_decay": 0.0},
                {"params": rule_params, "weight_decay": 0.0},
                {"params": cons_params, "weight_decay": weight_decay},
            ],
            lr=learning_rate,
        )

    history: dict[str, Any] = {"train": [], "ur": [], "val": []}
    best_metric = float("-inf")
    epochs_no_improve = 0
    best_state: dict[str, Any] | None = None
    verbose_level = self._resolve_verbose(verbose)

    self.train()
    pbar = None
    if verbose_level == 1:
        pbar = trange(epochs, desc="Training", leave=False)
        epoch_iterator = pbar
    else:
        epoch_iterator = range(epochs)

    for epoch in epoch_iterator:
        batch_losses: list[float] = []
        batch_ur_losses: list[float] = []
        for batch_idx in _iter_minibatch_indices(x.shape[0], batch_size=batch_size, shuffle=shuffle):
            x_b = x.index_select(0, batch_idx.to(device=x.device))
            y_b = y.index_select(0, batch_idx.to(device=y.device))

            train_optimizer.zero_grad(set_to_none=True)
            output, norm_w = self._forward_train(x_b)
            main_loss = self._compute_loss(train_criterion, output, y_b)

            ur_loss = _uniform_regularization_loss(norm_w, target=ur_target)
            loss = main_loss + (float(ur_weight) * ur_loss)
            loss.backward()
            train_optimizer.step()

            batch_losses.append(float(loss.detach().item()))
            batch_ur_losses.append(float(ur_loss.detach().item()))

        epoch_train_loss = float(sum(batch_losses) / max(len(batch_losses), 1))
        history["train"].append(epoch_train_loss)
        history["ur"].append(float(sum(batch_ur_losses) / max(len(batch_ur_losses), 1)))

        # --- validation & early stopping ---
        if has_val and x_val is not None and y_val is not None:
            self.eval()
            val_info = self._evaluate_validation(train_criterion, x_val, y_val)
            history["val"].append(val_info.get("val_loss", 0.0))
            # Store any extra keys (e.g. val_acc) in history
            for k, v in val_info.items():
                if k not in ("val_loss", "metric"):
                    history.setdefault(k, []).append(v)
            self.train()

            metric = val_info["metric"]
            if metric > best_metric:
                best_metric = metric
                epochs_no_improve = 0
                best_state = copy.deepcopy(self.state_dict())
            else:
                epochs_no_improve += 1

            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                postfix = [
                    f"train={epoch_train_loss:.4f}",
                    f"val={val_info.get('val_loss', 0.0):.4f}",
                ]
                pbar.set_postfix_str(" ".join(postfix))
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                log_parts = [
                    f"epoch={epoch + 1}/{epochs}",
                    f"train_loss={epoch_train_loss:.6f}",
                ]
                for k, v in val_info.items():
                    if k != "metric":
                        log_parts.append(f"{k}={v:.6f}" if isinstance(v, float) else f"{k}={v}")
                self._log(" ".join(log_parts), verbose=verbose_level)

            if patience is not None and epochs_no_improve >= patience:
                if verbose_level >= 2:
                    self._log(
                        "early stopping at epoch %s (patience=%s)",
                        epoch + 1,
                        patience,
                        verbose=verbose_level,
                    )
                break
        else:
            if verbose_level == 1:
                if pbar is None:  # pragma: no cover
                    raise RuntimeError("progress bar unavailable for verbose level 1")
                pbar.set_postfix_str(f"loss={epoch_train_loss:.4f}")
            if verbose_level >= 2 and (
                verbose_level == 3 or ((epoch + 1) % max(epochs // 10, 1) == 0 or epoch == 0)
            ):
                self._log(
                    "epoch=%s/%s loss=%.6f",
                    epoch + 1,
                    epochs,
                    epoch_train_loss,
                    verbose=verbose_level,
                )

    if pbar is not None:
        pbar.close()

    if restore_best and best_state is not None:
        self.load_state_dict(best_state)

    history["stopped_epoch"] = epoch + 1  # type: ignore[possibly-undefined]

    return history
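
A minimal usage sketch of the training method above. The method name (assumed here to be ``fit``), the ``model`` variable, the data tensors, and all hyper-parameter values are illustrative placeholders, not library defaults:

history = model.fit(
    x_train,                # placeholder tensor of shape (N, n_inputs)
    y_train,                # placeholder tensor of shape (N,)
    epochs=100,
    batch_size=64,
    ur_weight=0.1,          # weight of the uniform-regularization penalty
    ur_target=None,         # None -> defaults to 1 / n_rules
    x_val=x_val,            # optional validation split, shape (M, n_inputs)
    y_val=y_val,            # shape (M,)
    patience=10,            # early stopping on the validation metric
    restore_best=True,      # reload weights from the best validation epoch
    verbose=1,              # progress bar
)
print(history["train"][-1], history["val"][-1], history["stopped_epoch"])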

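For intuition, the uniform-regularization term penalizes normalized rule strengths that drift away from a uniform activation level. The sketch below is a hypothetical stand-in for ``_uniform_regularization_loss``, assuming a simple mean-squared-deviation form; the library's actual implementation may differ:

import torch
from torch import Tensor

def uniform_regularization_sketch(norm_w: Tensor, target: float | None = None) -> Tensor:
    # norm_w: normalized rule strengths of shape (batch, n_rules).
    # When target is None, fall back to the uniform activation 1 / n_rules,
    # matching the documented default for ur_target.
    n_rules = norm_w.shape[-1]
    if target is None:
        target = 1.0 / n_rules
    # Mean squared deviation from the uniform target, averaged over batch and rules.
    return ((norm_w - target) ** 2).mean()
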
forward

Full forward pass through the TSK pipeline.

Source code in highfis/base.py
def forward(self, x: Tensor) -> Tensor:
    """Full forward pass through the TSK pipeline."""
    output, _ = self._forward_train(x)
    return output

forward_antecedents

Compute normalized rule strengths from model antecedents.

Source code in highfis/base.py
def forward_antecedents(self, x: Tensor) -> Tensor:
    """Compute normalized rule strengths from model antecedents."""
    mu = self.membership_layer(x)
    w = self.rule_layer(mu)
    return cast(Tensor, self.defuzzifier(w))

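A short sketch of inspecting the antecedent side in isolation; ``model`` and ``x_batch`` are placeholders, and the sum-to-one check applies to sum-normalizing defuzzifiers:

import torch

with torch.no_grad():
    norm_w = model.forward_antecedents(x_batch)  # shape (batch, n_rules)
print(norm_w.sum(dim=-1))     # ~1 per sample for sum-normalizing defuzzifiers
print(norm_w.argmax(dim=-1))  # index of the dominant rule per sample
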
predict

Return predicted values as a 1-D tensor.

Source code in highfis/models.py
def predict(self, x: Tensor) -> Tensor:
    """Return predicted values as a 1-D tensor."""
    with torch.no_grad():
        return self.forward(x).squeeze(1)
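
Illustrative call on a fitted regressor; ``model`` and ``x_new`` (a float tensor of shape ``(batch, n_inputs)``) are placeholders:

y_hat = model.predict(x_new)            # 1-D tensor of shape (batch,), no gradients tracked
same = model.forward(x_new).squeeze(1)  # same values, but gradients are tracked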