Layers
manify.predictors.nn.layers
Neural network layers for product manifolds.
KappaGCNLayer(in_features, out_features, manifold, nonlinearity=torch.relu)
Bases: Module
Implementation for the Kappa GCN layer.
Source code in manify/predictors/nn/layers.py
forward(X, A_hat=None)
Forward pass for the Kappa GCN layer.
KappaSequential(*layers)
Bases: Module
Sequential container for κ-layers that properly handles adjacency matrices.
Similar to nn.Sequential but passes the adjacency matrix through each layer. All layers should accept (X, A_hat) and return X.
forward(X, A_hat=None)
Forward pass through all layers.
append(layer)
Add a layer to the end of the sequence.
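The contract described for KappaSequential (every layer accepts (X, A_hat) and returns X) can be illustrated with a dependency-free sketch. This is not manify's implementation: `SequentialWithAdjacency`, `aggregate`, and `scale_by_two` are hypothetical names, and nested lists stand in for tensors.

```python
class SequentialWithAdjacency:
    """Hypothetical stand-in: applies layers in order, passing A_hat to each."""

    def __init__(self, *layers):
        self.layers = list(layers)

    def append(self, layer):
        """Add a layer to the end of the sequence."""
        self.layers.append(layer)

    def forward(self, X, A_hat=None):
        # Unlike a plain sequential container, every layer also receives A_hat.
        for layer in self.layers:
            X = layer(X, A_hat)
        return X

    def __call__(self, X, A_hat=None):
        return self.forward(X, A_hat)


def aggregate(X, A_hat=None):
    """Toy layer: row i becomes sum_j A_hat[i][j] * X[j]; a no-op without A_hat."""
    if A_hat is None:
        return X
    dim = len(X[0])
    return [[sum(a * x[d] for a, x in zip(a_row, X)) for d in range(dim)]
            for a_row in A_hat]


def scale_by_two(X, A_hat=None):
    """Toy layer that ignores A_hat but still accepts it."""
    return [[2 * v for v in row] for row in X]


X = [[1.0, 0.0], [0.0, 1.0]]
A_hat = [[0.5, 0.5], [0.5, 0.5]]
model = SequentialWithAdjacency(aggregate)
model.append(scale_by_two)
out = model(X, A_hat)
```

The point of the design is that layers which do not need the adjacency matrix still take it as an argument, so the container never has to inspect layer types.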
StereographicLogits(out_features, manifold, apply_softmax=False)
Bases: Module
Stereographic logits layer for classification and regression on product manifolds.
Computes signed distances from hyperplanes in the product manifold space. Can optionally apply softmax for classification tasks.
forward(X, A_hat=None, aggregate_logits=False)
Forward pass through stereographic logits.
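For intuition about what StereographicLogits computes, the sketch below covers only the flat (curvature-zero) case, where the signed distance to a hyperplane has a simple closed form. It is an illustration under that assumption and does not reproduce the curved-space formula used by the layer; `signed_distance`, `softmax`, `normals`, and `offsets` are hypothetical names.

```python
import math


def signed_distance(x, normal, offset):
    """Signed Euclidean distance from x to the hyperplane {z : <z - offset, normal> = 0}."""
    norm = math.sqrt(sum(a * a for a in normal))
    return sum((xi - pi) * ai for xi, pi, ai in zip(x, offset, normal)) / norm


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# One hyperplane per output class (assumed parameterization: normal + offset).
normals = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
offsets = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

x = [0.6, -0.2]
logits = [signed_distance(x, a, p) for a, p in zip(normals, offsets)]
probs = softmax(logits)  # the optional softmax step for classification
```

Leaving the softmax off yields raw signed distances, which is the form one would feed to a loss that expects logits or use directly for regression.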
FermiDiracDecoder(manifold, learnable_params=True)
Bases: Module
Fermi-Dirac decoder for link prediction tasks.
Computes pairwise distances and applies Fermi-Dirac transformation to predict edge probabilities.
forward(X, A_hat=None)
Forward pass through Fermi-Dirac decoder.
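The Fermi-Dirac transformation used by FermiDiracDecoder can be sketched as follows. This is a minimal illustration, not the layer's code: it assumes the common form p = 1 / (exp((d - r) / t) + 1), uses Euclidean distance as a stand-in for the manifold distance, and the parameter names `r` and `t` are hypothetical. Some formulations apply the transformation to squared distance instead.

```python
import math


def fermi_dirac(dist, r=2.0, t=1.0):
    """Map a distance to an edge probability; r is the midpoint, t the temperature."""
    return 1.0 / (math.exp((dist - r) / t) + 1.0)


def pairwise_edge_probs(X, r=2.0, t=1.0):
    """Edge probabilities for all node pairs (Euclidean distance as a placeholder)."""
    n = len(X)
    return [[fermi_dirac(math.dist(X[i], X[j]), r, t) for j in range(n)]
            for i in range(n)]


X = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
probs = pairwise_edge_probs(X)
```

Nodes closer than `r` get probability above 0.5 and nodes farther away get less; lowering `t` sharpens the transition.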
StereographicLayerNorm(manifold, embedding_dim)
Bases: Module
Stereographic Layer Normalization.
Layer normalization is undefined directly on a curved manifold, so we apply an ordinary Euclidean
nn.LayerNorm in the tangent space at the origin (logmap0 -> LayerNorm -> expmap0).
For a stereographic ProductManifold the tangent space at the origin is Euclidean of dimension
manifold.dim and logmap0/expmap0 handle the per-component curvatures, so no explicit
curvature broadcasting is required. The output is re-projected onto the manifold for numerical
safety. In the curvature-zero limit this reduces to a plain LayerNorm.
forward(X)
Apply layer normalization on the stereographic manifold.
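The logmap0 -> LayerNorm -> expmap0 recipe of StereographicLayerNorm can be sketched for a single κ-stereographic component in plain Python. This is an illustration under stated assumptions, not the layer itself: it handles one curvature rather than a product manifold, omits the learnable affine parameters of nn.LayerNorm, and all function names are defined here for the example.

```python
import math


def tan_k(x, kappa):
    """Curvature-dependent tangent: tan for kappa > 0, tanh for kappa < 0, identity at 0."""
    if kappa > 0:
        s = math.sqrt(kappa)
        return math.tan(s * x) / s
    if kappa < 0:
        s = math.sqrt(-kappa)
        return math.tanh(s * x) / s
    return x


def artan_k(x, kappa):
    """Inverse of tan_k."""
    if kappa > 0:
        s = math.sqrt(kappa)
        return math.atan(s * x) / s
    if kappa < 0:
        s = math.sqrt(-kappa)
        return math.atanh(s * x) / s
    return x


def expmap0(v, kappa):
    """Exponential map at the origin: tangent vector -> point on the manifold."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return list(v)
    scale = tan_k(norm, kappa) / norm
    return [scale * c for c in v]


def logmap0(y, kappa):
    """Logarithmic map at the origin: point on the manifold -> tangent vector."""
    norm = math.sqrt(sum(c * c for c in y))
    if norm == 0.0:
        return list(y)
    scale = artan_k(norm, kappa) / norm
    return [scale * c for c in y]


def layer_norm(v, eps=1e-5):
    """Euclidean layer normalization without affine parameters."""
    mean = sum(v) / len(v)
    var = sum((c - mean) ** 2 for c in v) / len(v)
    return [(c - mean) / math.sqrt(var + eps) for c in v]


def stereographic_layer_norm(x, kappa):
    """Normalize in the tangent space at the origin, then map back."""
    return expmap0(layer_norm(logmap0(x, kappa)), kappa)


x = [0.1, -0.2, 0.3, 0.05]
flat = stereographic_layer_norm(x, 0.0)    # curvature-zero limit
curved = stereographic_layer_norm(x, -1.0)  # hyperbolic component
```

With `kappa = 0.0` both maps are the identity, so the result matches plain layer normalization; with `kappa = -1.0` the output stays inside the unit ball.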
GeometricLinearizedAttention(num_heads, head_dim)
Bases: Module
Faithful gyrovector linear attention (FPS-T, arXiv:2309.04082, Eqs 6, 7, 11).
This is the kernelized mixed-curvature attention from "Curve Your Attention: Mixed-Curvature
Transformers for Graph Representation Learning". It operates per head on its own
:math:`\kappa_h`-stereographic space. Following the rest of manify, it works on a single
graph (no batch dimension): inputs are [num_heads, n_nodes, head_dim] and the mask is the
[n_nodes, n_nodes] adjacency matrix (None means full attention).
Inputs
- V -- value points on the per-head :math:`\kappa_h`-stereographic manifold.
- Q, K -- query/key tangent vectors at the corresponding V_i (Eq 5).
Scores (Eq 6): parallel-transport Q_i and K_j to the origin (parallel_transport0back),
apply the feature map :math:`\phi(x)=\mathrm{ELU}(x)+1`, and take a Euclidean inner product there:
:math:`\alpha_{ij}\approx\phi(\tilde Q_i)^\top\phi(\tilde K_j)`.
Aggregation (Eq 7) is the Einstein midpoint, kernelized (Eq 11):
.. math::
    \mathrm{out}_i = \tfrac12\otimes_\kappa\left(\sum_j \frac{\alpha_{ij}\,\lambda^\kappa_{V_j}}{\sum_k \alpha_{ik}\,(\lambda^\kappa_{V_k}-1)}\,V_j\right)
where :math:`\lambda^\kappa` is the conformal factor and :math:`\tfrac12\otimes_\kappa` is Möbius
scalar multiplication. Writing :math:`\tilde V_i=\frac{\lambda^\kappa_{V_i}}{\lambda^\kappa_{V_i}-1}V_i`
and :math:`\phi'(\tilde K)_i=\phi(\tilde K_i)(\lambda^\kappa_{V_i}-1)`, the per-query output is
:math:`\tfrac12\otimes_\kappa\big[\phi(\tilde Q)\,(\phi'(\tilde K)^\top \tilde V)\big]_i`, with the
normalizer :math:`\sum_k \alpha_{ik}(\lambda^\kappa_{V_k}-1)=\phi(\tilde Q_i)^\top\sum_k\phi'(\tilde K)_k`
obtained from the same key-side sums. This is :math:`O(N+M)` in the full-attention case. As
:math:`\kappa\to0` the Einstein midpoint reduces to the ordinary (Euclidean) weighted mean.
forward(Q, K, V, curvatures, mask=None)
Forward pass for faithful gyrovector linear attention.
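The kernelization behind GeometricLinearizedAttention can be checked numerically with a small single-head sketch. It is an illustration under simplifying assumptions, not the layer's code: queries and keys are taken as already transported to the origin, attention is unmasked, the Einstein midpoint includes its usual normalizer, and every function name is defined here for the example. The quadratic version loops over all query/key pairs; the linear version accumulates the key-side sums once and reuses them for every query.

```python
import math


def phi(vec):
    """Feature map phi(x) = ELU(x) + 1, applied elementwise (strictly positive)."""
    return [c + 1.0 if c > 0 else math.exp(c) for c in vec]


def conformal_factor(v, kappa):
    """lambda^kappa_v = 2 / (1 + kappa * ||v||^2)."""
    return 2.0 / (1.0 + kappa * sum(c * c for c in v))


def mobius_half(v, kappa):
    """Mobius scalar multiplication of v by 1/2."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return list(v)
    if kappa < 0:
        s = math.sqrt(-kappa)
        new_norm = math.tanh(0.5 * math.atanh(s * norm)) / s
    elif kappa > 0:
        s = math.sqrt(kappa)
        new_norm = math.tan(0.5 * math.atan(s * norm)) / s
    else:
        new_norm = 0.5 * norm
    return [c * new_norm / norm for c in v]


def midpoint_attention_quadratic(Q, K, V, kappa):
    """Einstein midpoint with explicit alpha_ij for every query/key pair."""
    out = []
    for q in Q:
        phi_q = phi(q)
        num, den = [0.0] * len(V[0]), 0.0
        for k, v in zip(K, V):
            alpha = sum(a * b for a, b in zip(phi_q, phi(k)))
            lam = conformal_factor(v, kappa)
            num = [n + alpha * lam * c for n, c in zip(num, v)]
            den += alpha * (lam - 1.0)
        out.append(mobius_half([n / den for n in num], kappa))
    return out


def midpoint_attention_linear(Q, K, V, kappa):
    """Same result, but key-side sums are computed once and shared by all queries."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # sum_j phi'(K_j) V~_j^T
    z = [0.0] * d_k                        # sum_j phi'(K_j)
    for k, v in zip(K, V):
        lam = conformal_factor(v, kappa)
        phi_k = [c * (lam - 1.0) for c in phi(k)]
        v_tilde = [lam / (lam - 1.0) * c for c in v]
        for a in range(d_k):
            z[a] += phi_k[a]
            for b in range(d_v):
                S[a][b] += phi_k[a] * v_tilde[b]
    out = []
    for q in Q:
        phi_q = phi(q)
        num = [sum(phi_q[a] * S[a][b] for a in range(d_k)) for b in range(d_v)]
        den = sum(phi_q[a] * z[a] for a in range(d_k))
        out.append(mobius_half([n / den for n in num], kappa))
    return out


Q = [[0.2, -0.1], [0.0, 0.3]]
K = [[0.1, 0.4], [-0.3, 0.2], [0.5, -0.2]]
V = [[0.1, 0.2], [-0.2, 0.1], [0.3, -0.1]]
quad = midpoint_attention_quadratic(Q, K, V, -1.0)
lin = midpoint_attention_linear(Q, K, V, -1.0)
flat = midpoint_attention_linear(Q, K, V, 0.0)
```

With `kappa = 0.0` the conformal factor is 2 everywhere, so the result collapses to the ordinary weighted mean of the values, matching the limit stated in the docstring.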
StereographicAttention(manifold, num_heads, dim, head_dim, init_curvatures=None)
Bases: Module
Mixed-curvature multi-head attention for a single graph of [n_nodes, dim] tokens (FPS-T).
Faithful implementation of the multi-head attention from "Curve Your Attention" (arXiv:2309.04082).
Each head h operates on its own :math:`\kappa_h`-stereographic space with an independent
learnable curvature, and the multi-head output is the product over heads
(:math:`\bigotimes_h \mathrm{st}_{\kappa_h}`). Heads are product-manifold components, so per-head
outputs are concatenated and never reshaped across heads. The per-head curvatures are decoupled
from the input manifold.
Per head (Eq 5): values are points on :math:`\mathrm{st}_{\kappa_h}` and queries/keys live in the
tangent space at the corresponding value point. We obtain these by mapping the input to the tangent
space at the origin (logmap0), applying an ordinary Euclidean nn.Linear to the head
dimension, and exponentiating into :math:`\mathrm{st}_{\kappa_h}` for the values; queries/keys are
the (Euclidean) projections re-based to the tangent space at each value point via parallel
transport from the origin. Aggregation is the gyrovector Einstein midpoint
(:class:`GeometricLinearizedAttention`). The masked output product manifold is mapped back to the
input manifold by a tangent-space linear projection (logmap0 -> Linear -> expmap0).
forward(X, mask=None)
Forward pass for the mixed-curvature attention layer.
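The re-basing step described for StereographicAttention (moving query/key vectors between the tangent space at the origin and the tangent space at a value point) has a simple closed form in a κ-stereographic space: parallel transport to or from the origin is a rescaling by 1 + κ‖x‖². The sketch below illustrates this for a single curvature; the helper functions are defined here for the example and are not calls into manify.

```python
import math


def conformal_factor(x, kappa):
    """lambda^kappa_x = 2 / (1 + kappa * ||x||^2)."""
    return 2.0 / (1.0 + kappa * sum(c * c for c in x))


def parallel_transport0(y, v, kappa):
    """Transport tangent vector v from the origin to the point y."""
    scale = 1.0 + kappa * sum(c * c for c in y)
    return [scale * c for c in v]


def parallel_transport0back(x, v, kappa):
    """Transport tangent vector v from the point x back to the origin."""
    scale = 1.0 + kappa * sum(c * c for c in x)
    return [c / scale for c in v]


def riemannian_norm(x, v, kappa):
    """Length of tangent vector v at x under the conformal metric."""
    return conformal_factor(x, kappa) * math.sqrt(sum(c * c for c in v))


kappa = -1.0
value_point = [0.3, 0.4]        # a point inside the unit ball
query_at_origin = [1.0, -2.0]   # Euclidean projection, tangent at the origin

query_at_value = parallel_transport0(value_point, query_at_origin, kappa)
query_back = parallel_transport0back(value_point, query_at_value, kappa)
```

Parallel transport preserves the Riemannian length of the vector, which is why scores computed at the origin remain meaningful for vectors that live at different value points.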
StereographicTransformer(manifold, num_heads, dim, head_dim, use_layer_norm=True, init_curvatures=None)
Bases: Module
Mixed-curvature transformer block on a single graph of [n_nodes, dim] tokens (FPS-T).
A pre-norm transformer block adapted to a stereographic (product) manifold per "Curve Your
Attention" (arXiv:2309.04082): a mixed-curvature multi-head attention sublayer
(:class:`StereographicAttention`, faithful gyrovector Einstein-midpoint aggregation with learnable
per-head curvatures) followed by a manifold feedforward sublayer, each wrapped in a Möbius-addition
residual connection. Tokens are graph nodes; the mask is the adjacency matrix A_hat
(None for full attention). As the curvatures vanish, the block reduces to a standard Euclidean
linear-attention transformer block.
forward(X, mask=None)
Forward pass through the mixed-curvature transformer block.
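The Möbius-addition residual used by StereographicTransformer can be sketched for a single token and a single curvature. This is a hedged illustration rather than the block's code: `pre_norm_block` and its callable arguments are hypothetical, the sublayers act on one vector instead of a whole graph, and the operand order of the residual (input on the left) is an assumption.

```python
def mobius_add(x, y, kappa):
    """Mobius addition x (+)_kappa y in a kappa-stereographic space."""
    xy = sum(a * b for a, b in zip(x, y))
    x2 = sum(a * a for a in x)
    y2 = sum(b * b for b in y)
    denom = 1.0 - 2.0 * kappa * xy + kappa * kappa * x2 * y2
    cx = (1.0 - 2.0 * kappa * xy - kappa * y2) / denom
    cy = (1.0 + kappa * x2) / denom
    return [cx * a + cy * b for a, b in zip(x, y)]


def pre_norm_block(x, attention, feedforward, norm, kappa):
    """Pre-norm layout: normalize, apply the sublayer, then add back via Mobius addition."""
    x = mobius_add(x, attention(norm(x)), kappa)
    x = mobius_add(x, feedforward(norm(x)), kappa)
    return x


def identity(v):
    return list(v)


def zeros_like(v):
    return [0.0] * len(v)


x = [0.5, 0.0]
euclidean_sum = mobius_add([1.0, 2.0], [3.0, 4.0], 0.0)  # plain vector addition
hyperbolic_sum = mobius_add(x, x, -1.0)                  # stays inside the unit ball
unchanged = pre_norm_block(x, zeros_like, zeros_like, identity, -1.0)
```

At `kappa = 0.0` Möbius addition is ordinary vector addition, so the block falls back to the usual residual connection, consistent with the Euclidean limit mentioned in the docstring.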