Expand PCA sizes #191

david-cortes-intel · 2025-10-28T17:17:18Z

Description

This PR expands the sizes of the synthetic datasets used to benchmark PCA.

Currently, these cases use 3 components, often out of thousands of features, which is not a representative application and thus not a good candidate for benchmarking. This PR raises the number of components to 20, which is more reasonable.

It also makes the synthetic datasets wider (more columns) and shorter (fewer rows), since large-scale PCA is for the most part meant to be applied to wide datasets. In addition, it substantially increases the sizes of the inputs for .transform(), as the current benchmarks for those cases run for only a very short time.

Note that this PR might increase the time it takes to execute a benchmark run, especially in the data generation step. I do not know how much the timings will change if this is merged.
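
For illustration, here is a minimal sketch of the kind of case described above: a wide synthetic dataset, 20 components, and a larger input for .transform() than for .fit(). The shapes, the `make_wide_data` helper, and the use of plain Gaussian noise are assumptions made for this example; they are not the values or the generator used in the benchmark configs.

```python
import time

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical shapes, chosen only for illustration (not the PR's values).
N_TRAIN_ROWS = 500        # "shorter": fewer rows
N_FEATURES = 1000         # "wider": more columns
N_TRANSFORM_ROWS = 2000   # larger input so .transform() runs long enough to time
N_COMPONENTS = 20         # instead of 3


def make_wide_data(n_rows, n_features, seed):
    """Hypothetical helper: dense Gaussian data of the requested shape."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_rows, n_features))


X_train = make_wide_data(N_TRAIN_ROWS, N_FEATURES, seed=0)
X_new = make_wide_data(N_TRANSFORM_ROWS, N_FEATURES, seed=1)

pca = PCA(n_components=N_COMPONENTS)

# Time fitting and transforming separately, as a benchmark would.
start = time.perf_counter()
pca.fit(X_train)
fit_seconds = time.perf_counter() - start

start = time.perf_counter()
X_reduced = pca.transform(X_new)
transform_seconds = time.perf_counter() - start

print(f"fit: {fit_seconds:.4f}s, transform: {transform_seconds:.4f}s")
print(X_reduced.shape)
```

The absolute timings depend on the machine and on the solver scikit-learn selects, so they are only useful for comparing configurations on the same setup.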

Checklist:

Completeness and readability

Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.

david-cortes-intel · 2025-10-28T17:56:17Z

CI errors should be fixed once this PR is merged in sklearnex: uxlfoundation/scikit-learn-intelex#2741

expand PCA sizes

1bcdccf

david-cortes-intel requested review from Vika-F and avolkov-intel October 28, 2025 17:17

david-cortes-intel requested a review from Alexsandruss as a code owner October 28, 2025 17:17

david-cortes-intel added the datasets Extension or fix load dataset label Oct 28, 2025
