Question about Predicted Label Dimensions and the Usages of Test Node Label #3

YanJiangJerry opened this issue Mar 2, 2025 · 1 comment

YanJiangJerry commented Mar 2, 2025

Thanks for your great work! I have a quick question regarding label dimensions and Figure 3 in your paper. In the inductive attention mechanism, for the t × c dimension, does the c dimension correspond to the label dimension of the training set or of the test set? Additionally, if the label dimensions are not consistent (for example, training on Cora and running inference on Arxiv), how does a LinearGNN trained on the training set produce outputs that match the label dimension of the test set? I understand that the distances between LinearGNN predictions are label-permutation invariant, but how can the LinearGNN predict in the exact test label dimension without any labelled test data? Does GraphAny need some test node labels to obtain the weight W for the LinearGNN?
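
For concreteness, here is a minimal sketch of what I imagine the analytical solution to be: a ridge-regression-style closed-form fit of W on the labeled nodes of whichever graph is being evaluated, so the output dimension c simply comes from those labels. All function names, the single propagation step, and the regularization term are my own assumptions rather than code from this repo; please correct me if the actual mechanism is different.

```python
# Hypothetical sketch (my own assumption, not the repo's code): solve the
# LinearGNN weight W in closed form on the labeled nodes of the target graph.
import torch

def linear_gnn_predict(A_norm, X, labeled_idx, Y_labeled, reg=1e-3):
    """A_norm: (n, n) normalized adjacency, X: (n, d) node features,
    labeled_idx: indices of the labeled nodes of the target graph,
    Y_labeled: (m, c) one-hot labels, with c = #classes of the target dataset."""
    H = A_norm @ X                      # one SGC-style propagation step
    H_l = H[labeled_idx]                # (m, d) features of the labeled nodes

    # Ridge-regression / least-squares solution: no gradient training, so the
    # output dimension is simply the label dimension c of Y_labeled.
    d = H_l.shape[1]
    W = torch.linalg.solve(H_l.T @ H_l + reg * torch.eye(d), H_l.T @ Y_labeled)
    return H @ W                        # (n, c) predictions in the target label space
```

If this is roughly right, the output dimension would automatically match whichever dataset provides the labeled nodes, but I would like to confirm whether that is actually how GraphAny obtains W at inference time.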
Thanks in advance.

YanJiangJerry changed the title from "Question about Inductive Predicted Label Attention Dimensions" to "Question about Predicted Label Dimensions" on Mar 2, 2025
YanJiangJerry changed the title from "Question about Predicted Label Dimensions" to "Question about Predicted Label Dimensions and the Usages of Test Node Label" on Mar 2, 2025
@DuanhaoranCC

Hello, I have similar questions regarding the authors' paper.

  1. As I understand it, the LinearGNN mentioned in the paper is essentially SGC, Laplacian transformations, and other message-passing schemes. However, I could not find the analytical solution explicitly stated. Could you please point out where it is discussed in the paper?

  2. Additionally, entropy normalization and distance normalization seem to serve the purpose of stabilizing and smoothing the model (a rough sketch of my understanding is included after this list).

  3. Finally, as you mentioned, the GraphAny model trained on Cora has only 7 classes, while the Arxiv dataset has 40 classes. How are the results on new datasets reported in the table obtained? Are they based on few-shot learning, in-context learning, or zero-shot evaluation?
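
Regarding point 2, below is a rough sketch of how I picture the entropy normalization step: rescale each node's distance-feature vector with a per-node temperature, found by binary search, so that the resulting softmax distribution hits a fixed target entropy. The function names, search bounds, and target value are my own guesses for illustration, not taken from the paper or this repository.

```python
# Rough sketch of entropy normalization as I understand it (names, search
# bounds, and the target entropy are my own guesses, not the repo's code).
import torch
import torch.nn.functional as F

def entropy(p, eps=1e-12):
    return -(p * (p + eps).log()).sum(dim=-1)

def entropy_normalize(dist_feat, target_entropy=1.0, iters=30):
    """dist_feat: (n, k) distance features, one row per node."""
    lo = torch.full((dist_feat.shape[0],), 1e-3)
    hi = torch.full((dist_feat.shape[0],), 1e3)
    for _ in range(iters):
        temp = (lo + hi) / 2
        h = entropy(F.softmax(dist_feat / temp.unsqueeze(-1), dim=-1))
        # A higher temperature flattens the softmax and raises its entropy,
        # so move whichever bound brings the entropy toward the target.
        lo = torch.where(h < target_entropy, temp, lo)
        hi = torch.where(h < target_entropy, hi, temp)
    temp = (lo + hi) / 2
    return F.softmax(dist_feat / temp.unsqueeze(-1), dim=-1)
```

Is this roughly the role these normalizations play, or am I misreading their purpose?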

Thank you in advance for your time and clarification!
