
Potential Semantic Misalignment in Temporal Embedding During Cross-Market Pretraining #168

@SoYuCry

🧩 Description

Hey team 👋 — I was reviewing the TemporalEmbedding part of Kronos and noticed something that might cause problems during multi-market pretraining.

Right now, the model always adds time embeddings (minute/hour/weekday/day/month) to the price features — both in pretraining and fine-tuning.
But these time features don’t mean the same thing across markets:

  • hour = 9 could be Tokyo’s market open,
  • while in New York that’s still pre-market,
  • and in London it’s mid-session.

So the same temporal token ends up representing completely different market states, injecting structured noise into the latent space.


⚠️ Why it Matters — Temporal Noise Contamination

Since the input is combined as
x_embed = price_embed + time_embed,
any mismatch in temporal semantics directly pollutes the feature space.
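A minimal sketch of the collision (module and field names here are hypothetical, not the actual Kronos implementation): two bars stamped hour = 9 in different markets receive the identical time embedding, so the temporal component of x_embed carries no market context at all.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

d_model = 8
hour_embed = nn.Embedding(24, d_model)   # learnable hour lookup, shared across markets
price_proj = nn.Linear(4, d_model)       # projects OHLC-style price features

# Same wall-clock hour, very different market states:
tokyo_open   = torch.tensor([[1.00, 1.02, 0.99, 1.01]])  # hour = 9, Tokyo: session open
ny_premarket = torch.tensor([[4.00, 4.01, 3.98, 4.00]])  # hour = 9, New York: pre-market
hour = torch.tensor([9])

t = hour_embed(hour)                      # identical vector for both markets
x_tokyo = price_proj(tokyo_open) + t      # x_embed = price_embed + time_embed
x_ny    = price_proj(ny_premarket) + t

# Subtracting the price part recovers the same time vector in both markets,
# i.e. the temporal signal cannot distinguish "open" from "pre-market":
assert torch.allclose(x_tokyo - price_proj(tokyo_open),
                      x_ny - price_proj(ny_premarket))
```

Whatever the encoder learns about "hour 9" has to average over all of these market states at once.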

That leads to a few issues:

  1. The encoder spends capacity trying to “de-noise” these temporal inconsistencies instead of learning clean price dynamics.
  2. Latent representations across markets get mixed up — embeddings no longer correspond to consistent time contexts.
  3. The decoder’s autoregressive predictions can become unstable or fail to converge, because it’s conditioned on noisy, misaligned latent codes.

In practice, this might explain why pretraining across very different markets sometimes gives unstable results or poor downstream transfer.


💡 Suggestions

To improve semantic consistency and robustness of pretraining:

  1. Disable or freeze TemporalEmbedding during multi-market pretraining
    (e.g., use fixed sinusoidal encodings or set stamp=None).
  2. Enable learnable TemporalEmbedding during single-market fine-tuning,
    where temporal semantics are consistent (e.g., daily/weekly cycles).
  3. Optionally, introduce a market_id embedding to let the model condition temporal semantics by market.
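A sketch of the third idea, under assumed names (MarketConditionedTemporalEmbedding is not an existing Kronos class, and the stamp layout is illustrative): a learnable market_id embedding offsets the temporal signal per market, while stamp=None skips time features entirely, as in suggestion 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class MarketConditionedTemporalEmbedding(nn.Module):
    """Hypothetical sketch: time embedding offset by a per-market embedding."""
    def __init__(self, d_model: int, n_markets: int):
        super().__init__()
        self.hour = nn.Embedding(24, d_model)
        self.weekday = nn.Embedding(7, d_model)
        self.market = nn.Embedding(n_markets, d_model)

    def forward(self, stamp, market_id):
        # stamp: (batch, seq, 2) holding [hour, weekday]; None disables time features
        if stamp is None:
            return 0.0
        t = self.hour(stamp[..., 0]) + self.weekday(stamp[..., 1])
        # Condition temporal semantics on the market: hour = 9 in Tokyo and
        # hour = 9 in New York now map to different vectors.
        return t + self.market(market_id).unsqueeze(1)

emb = MarketConditionedTemporalEmbedding(d_model=16, n_markets=3)
stamp = torch.randint(0, 7, (2, 5, 2))
stamp[..., 0] = 9                              # same hour in every market
tokyo = emb(stamp, torch.tensor([0, 0]))
nyse  = emb(stamp, torch.tensor([1, 1]))
assert not torch.allclose(tokyo, nyse)         # same hour, different market context
assert emb(None, torch.tensor([0, 0])) == 0.0  # stamp=None: no time signal added
```

During multi-market pretraining this keeps the additive structure of x_embed = price_embed + time_embed while letting the model separate "hour 9 in Tokyo" from "hour 9 in New York".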

