Bases: BaseEstimator, TransformerMixin, ABC
Base class for everything in manify.embedders.
This is an abstract class that defines a common interface for all embedding methods. We assume only that a
ProductManifold object is given. We try to follow the scikit-learn API's fit/transform paradigm as closely as
possible, while accommodating the nuances of product manifold geometry and PyTorch/Geoopt.
Attributes:
- pm – ProductManifold object associated with the embedder.
- random_state – Random state for reproducibility.
- device – Device for tensor computations. If not provided, defaults to pm.device.
- loss_history_ (dict[str, list[float]]) – History of loss values during training.
- is_fitted_ (bool) – Boolean flag indicating if the embedder has been fitted.
Source code in manify/embedders/_base.py
def __init__(self, pm: ProductManifold, random_state: int | None = None, device: str | None = None) -> None:
    self.pm = pm
    self.random_state = random_state
    self.device = device or pm.device
    self.loss_history_: dict[str, list[float]] = {}
    self.is_fitted_: bool = False
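The device fallback in the constructor can be illustrated without installing manify. The sketch below is not the library's class: `StubManifold` and `StubEmbedder` are hypothetical stand-ins that copy only the constructor logic shown above, and the device strings are arbitrary examples.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StubManifold:
    """Hypothetical stand-in for ProductManifold; only `device` is modelled."""

    device: str = "cpu"


class StubEmbedder:
    """Copies the constructor shown above, without the scikit-learn base classes."""

    def __init__(self, pm: StubManifold, random_state: int | None = None, device: str | None = None) -> None:
        self.pm = pm
        self.random_state = random_state
        # Falls back to the manifold's device when `device` is None (or an empty string).
        self.device = device or pm.device
        self.loss_history_: dict[str, list[float]] = {}
        self.is_fitted_: bool = False


pm = StubManifold(device="cuda:0")
default_embedder = StubEmbedder(pm)  # inherits the manifold's device
explicit_embedder = StubEmbedder(pm, random_state=42, device="cpu")  # explicit override

print(default_embedder.device)
print(explicit_embedder.device)
```

Because the fallback uses `or`, any falsy `device` value (not only `None`) resolves to `pm.device`.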
fit(X=None, D=None, lr=0.01, burn_in_lr=0.001, curvature_lr=0.0, burn_in_iterations=2000, training_iterations=18000, loss_window_size=100, logging_interval=10)
abstractmethod
Abstract method to fit an embedder. Requires at least one of (features, distances).
Parameters:
- X (Float[Tensor, 'n_points n_features'] | None, default: None) – Features to embed. Used by Mixed-curvature VAE and Siamese Network classes.
- D (Float[Tensor, 'n_points n_points'] | None, default: None) – Distances to embed. Used by coordinate learning and Siamese Network classes.
- lr (float, default: 0.01) – Learning rate for the main training phase.
- burn_in_lr (float, default: 0.001) – Learning rate for the burn-in phase.
- curvature_lr (float, default: 0.0) – Learning rate for optimizing manifold scale factors. Off (no learning) by default.
- burn_in_iterations (int, default: 2000) – Number of iterations for the burn-in phase.
- training_iterations (int, default: 18000) – Number of iterations for the main training phase.
- loss_window_size (int, default: 100) – Window size for computing moving average loss.
- logging_interval (int, default: 10) – Interval for logging training progress.
Returns:
- self ('BaseEmbedder') – Fitted embedder instance.
Source code in manify/embedders/_base.py
@abstractmethod
def fit(
    self,
    X: Float[torch.Tensor, "n_points n_features"] | None = None,
    D: Float[torch.Tensor, "n_points n_points"] | None = None,
    lr: float = 1e-2,
    burn_in_lr: float = 1e-3,
    curvature_lr: float = 0.0,  # Off by default
    burn_in_iterations: int = 2_000,
    training_iterations: int = 18_000,
    loss_window_size: int = 100,
    logging_interval: int = 10,
) -> "BaseEmbedder":
    """Abstract method to fit an embedder. Requires at least one of (features, distances).

    Args:
        X: Features to embed. Used by Mixed-curvature VAE and Siamese Network classes.
        D: Distances to embed. Used by coordinate learning and Siamese Network classes.
        lr: Learning rate for the main training phase.
        burn_in_lr: Learning rate for the burn-in phase.
        curvature_lr: Learning rate for optimizing manifold scale factors. Off (no learning) by default.
        burn_in_iterations: Number of iterations for the burn-in phase.
        training_iterations: Number of iterations for the main training phase.
        loss_window_size: Window size for computing moving average loss.
        logging_interval: Interval for logging training progress.

    Returns:
        self: Fitted embedder instance.
    """
    pass
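Since `fit` is abstract, the base class does not show how the schedule parameters are consumed. The sketch below is one plausible reading, not manify's implementation: it assumes the burn-in phase runs before the main phase and that the moving average is a plain mean over the last `loss_window_size` losses. The helper names `phase_learning_rate` and `run_schedule` are hypothetical.

```python
from collections import deque


def phase_learning_rate(
    iteration: int,
    lr: float = 1e-2,
    burn_in_lr: float = 1e-3,
    burn_in_iterations: int = 2_000,
) -> float:
    """Return the learning rate for a 0-based iteration (assumes burn-in runs first)."""
    return burn_in_lr if iteration < burn_in_iterations else lr


def run_schedule(losses, loss_window_size=100, logging_interval=10, **phase_kwargs):
    """Walk over precomputed losses and collect (iteration, lr, moving_average) rows."""
    window = deque(maxlen=loss_window_size)  # keeps only the most recent losses
    log = []
    for i, loss in enumerate(losses):
        window.append(loss)
        if i % logging_interval == 0:
            moving_average = sum(window) / len(window)
            log.append((i, phase_learning_rate(i, **phase_kwargs), moving_average))
    return log


# Toy run: 3 burn-in iterations, a window of 2, and a log row every 2 iterations.
toy_losses = [1.0, 0.8, 0.6, 0.4, 0.2, 0.1]
rows = run_schedule(toy_losses, loss_window_size=2, logging_interval=2, burn_in_iterations=3)
for row in rows:
    print(row)
```

Under this reading, the defaults give 2,000 burn-in iterations at `burn_in_lr` followed by 18,000 iterations at `lr`, for 20,000 iterations in total.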
transform(X)
abstractmethod
Apply embedding to new data. Not defined for coordinate learning.
Parameters:
- X (Float[Tensor, 'n_points n_features'] | None) – New features to embed using the trained embedder.
Returns:
- X_embedded (Float[Tensor, 'n_points embedding_dim']) – Embedded representation of the input features.
Source code in manify/embedders/_base.py
@abstractmethod
def transform(
    self, X: Float[torch.Tensor, "n_points n_features"] | None
) -> Float[torch.Tensor, "n_points embedding_dim"]:
    """Apply embedding to new data. Not defined for coordinate learning.

    Args:
        X: New features to embed using the trained embedder.

    Returns:
        X_embedded: Embedded representation of the input features.
    """
    pass
fit_transform(X=None, D=None, **fit_kwargs)
Fit the embedder and transform the data in one step.
Parameters:
- X (Float[Tensor, 'n_points n_features'] | None, default: None) – Features to embed.
- D (Float[Tensor, 'n_points n_points'] | None, default: None) – Distances to embed.
- **fit_kwargs (Any, default: {}) – Additional keyword arguments for fitting.
Returns:
- X_embedded (Float[Tensor, 'n_points embedding_dim']) – Embedded representation of the input features.
Source code in manify/embedders/_base.py
def fit_transform(
    self,
    X: Float[torch.Tensor, "n_points n_features"] | None = None,
    D: Float[torch.Tensor, "n_points n_points"] | None = None,
    **fit_kwargs: Any,
) -> Float[torch.Tensor, "n_points embedding_dim"]:
    """Fit the embedder and transform the data in one step.

    Args:
        X: Features to embed.
        D: Distances to embed.
        **fit_kwargs: Additional keyword arguments for fitting.

    Returns:
        X_embedded: Embedded representation of the input features.
    """
    return self.fit(X=X, D=D, **fit_kwargs).transform(X=X)
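The chaining in `fit_transform` works because `fit` returns `self`. The sketch below reproduces that contract with plain Python lists in place of tensors. `SketchEmbedder` and `MeanCenterEmbedder` are hypothetical stand-ins, not manify classes, and the input checks in `fit` are an assumption about how a subclass might enforce the "at least one of X, D" requirement.

```python
from abc import ABC, abstractmethod


class SketchEmbedder(ABC):
    """Hypothetical stand-in for the base class; plain lists replace tensors."""

    def __init__(self):
        self.is_fitted_ = False

    @abstractmethod
    def fit(self, X=None, D=None, **fit_kwargs):
        ...

    @abstractmethod
    def transform(self, X):
        ...

    def fit_transform(self, X=None, D=None, **fit_kwargs):
        # Same chaining as the source: fit returns self, then transform runs on X.
        return self.fit(X=X, D=D, **fit_kwargs).transform(X=X)


class MeanCenterEmbedder(SketchEmbedder):
    """Toy feature-based embedder: subtracts the per-column mean learned in fit."""

    def fit(self, X=None, D=None, **fit_kwargs):
        if X is None and D is None:
            raise ValueError("Provide at least one of X (features) or D (distances).")
        if X is None:
            raise ValueError("This toy embedder needs features X.")
        n_points = len(X)
        self.means_ = [sum(column) / n_points for column in zip(*X)]
        self.is_fitted_ = True
        return self  # returning self is what makes the chaining possible

    def transform(self, X):
        if not self.is_fitted_:
            raise RuntimeError("Call fit before transform.")
        return [[value - mean for value, mean in zip(row, self.means_)] for row in X]


X = [[1.0, 10.0], [3.0, 30.0]]
embedded = MeanCenterEmbedder().fit_transform(X=X)
print(embedded)
```

Note that `fit_transform` always passes `X` to `transform`, so an embedder fitted from distances alone receives `X=None` there. This matches the note that `transform` is not defined for coordinate learning.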