
Performance #228

@lidatong

Description

Performance in general is on my radar as one of the next things to tackle as this library gains traction, and it sits at the top of the 1.0 release checklist.

After some thought, I don't think caching / memoization is the right way to tackle this. A few reasons why:

  1. it requires careful thought about how it behaves under concurrency, specifically with respect to memory visibility
  2. it could have a big memory footprint on large codebases with a lot of composite dataclasses, and the cache is potentially duplicated across threads!
  3. immutability -- should the cached object be mutable / how can we protect it from changes?

Instead, I think an approach involving code generation is the way to go -- similar to how the core dataclasses module itself is implemented. When you think about it, a schema is generated only once and is known at "module-load time"; in other languages we might call this "compile time". We can see the code-generation approach used by codec/schema libraries in other languages, whether for JSON or for other data-interchange formats like protobuf.

Going this route, the schema is loaded as just more code, so to speak, instead of living in memory as a cached object.
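
For illustration, here is a minimal sketch of what that could look like -- not this library's actual implementation. The names (`jsonable`, `generate_to_dict`) are hypothetical; the point is that a specialized encoder's source is built and `exec`'d once at class-definition time, the same technique the dataclasses module uses to generate `__init__`.

```python
# Hypothetical sketch of the code-generation approach, for illustration only.
import dataclasses
import json


def generate_to_dict(cls):
    """Build a per-class `to_dict` from generated source code."""
    lines = ["def to_dict(self):", "    return {"]
    for f in dataclasses.fields(cls):
        # Each field becomes a literal dict entry; nested dataclasses would
        # need a recursive call, omitted here for brevity.
        lines.append(f"        {f.name!r}: self.{f.name},")
    lines.append("    }")
    namespace = {}
    # "Compiled" once, at module-load time -- no per-call schema lookup.
    exec("\n".join(lines), {}, namespace)
    return namespace["to_dict"]


def jsonable(cls):
    """Class decorator: attach the generated to_dict/to_json methods."""
    cls.to_dict = generate_to_dict(cls)
    cls.to_json = lambda self: json.dumps(self.to_dict())
    return cls


@jsonable
@dataclasses.dataclass
class Person:
    name: str
    age: int


print(Person("Ada", 36).to_json())  # {"name": "Ada", "age": 36}
```

The generated function is just module-level code bound to the class, so there is nothing to invalidate, no cross-thread cache to reason about, and no per-instance state to protect from mutation.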
