
[🐛 Bug]: Running tests in parallel in docker causing tests to choose the wrong open window #15609

Closed
jonn-set opened this issue Apr 10, 2025 · 10 comments
Labels
A-needs-triaging A Selenium member will evaluate this soon! B-grid Everything grid and server related C-java Java Bindings D-chrome I-defect Something is not working as intended OS-linux

Comments

@jonn-set

jonn-set commented Apr 10, 2025

Description

I am running Selenium tests in parallel inside a Docker container (standalone-chrome) using the Surefire plugin. 80-90% of my tests fail, and the screenshots captured in the After hook seem to show that a different test was running: for example, the page is querying a customer that does not belong to the thread the test is running in. It is extremely difficult to pin down what is wrong. I also see a lot of org.openqa.selenium.StaleElementReferenceException: stale element reference: stale element not found in the current frame, which seems to indicate that the test is trying to access the wrong open browser window.

I also use Spring, with SimpleThreadScope() to create a 'thread' scope. The WebDriver and ScenarioContext are both marked as @Scope(scopeName = "thread"), so I believe the configuration is correct, since everything works as expected on Windows with Edge. However, when I run this in Docker, it does not work.

My project's code base is very big and can't be shared on the forum, but I will try to create a separate project and share the link here. In the meantime, if anyone notices anything obvious I am doing wrong, please comment and let me know.

Reproducible Code

These are the options I pass into the RemoteWebDriver Object when I build it:

--enable-automation
--disable-extensions
--disable-client-side-phishing-detection
--no-default-browser-check
--no-first-run
--enable-logging=stderr
--log-level=0
--headless
--disable-dev-shm-usage
--no-sandbox
--user-data-dir=/home/seluser/.config/google-chrome/default


I am starting Docker with the following parameters:

docker run -d \
    -p 4444:4444 \
    --name selenium-chrome \
    --env SE_NODE_GRID_URL=http://localhost:4444 \
    --env SE_NODE_MAX_SESSIONS=4 \
    --env NODE_CONNECTION_LIMIT_PER_SESSION=4 \
    --env SE_NODE_OVERRIDE_MAX_SESSIONS=true \
    selenium-chrome


This is my Surefire config for parallel running of tests:

    <parallel>methods</parallel>
    <threadCountMethods>4</threadCountMethods>
    <perCoreThreadCount>false</perCoreThreadCount>
@jonn-set jonn-set added A-needs-triaging A Selenium member will evaluate this soon! I-defect Something is not working as intended labels Apr 10, 2025
@selenium-ci
Member

@jonn-set, thank you for creating this issue. We will troubleshoot it as soon as we can.

Selenium Triage Team: remember to follow the Triage Guide

@github-actions github-actions bot added B-grid Everything grid and server related C-java Java Bindings D-chrome OS-linux labels Apr 10, 2025
@cgoldberg cgoldberg added the J-issue-template Applied to issues not following the template, or missing information. label Apr 10, 2025
@selenium-ci
Member

Hi, @jonn-set.
Please follow the issue template; we need more information to reproduce the issue.

Either a complete code snippet and URL/HTML (if more than one file is needed, provide a GitHub repo and instructions to run the code), the specific versions used, or a more detailed description to help us understand the issue.

Note: If you cannot share your code and URL/HTML, any complete code snippet and URL/HTML that reproduces the issue is good enough.

Reply to this issue when all information is provided. Thank you.

@cgoldberg
Contributor

We would need to see code that can reproduce this along with some logging to even begin to troubleshoot this.

@jonn-set
Author

I will create a project and share it here; please give me some time.

@cgoldberg cgoldberg added the J-awaiting answer Question asked of user; a reply moves it to triage again label Apr 10, 2025
@github-actions github-actions bot removed J-issue-template Applied to issues not following the template, or missing information. J-awaiting answer Question asked of user; a reply moves it to triage again labels Apr 10, 2025
@jonn-set
Author

Sorry, I haven't been able to create a separate project to isolate this issue. But after doing some debugging, I noticed that when the container is started on a RHEL 8 agent (which has Podman and not Docker), it is unable to create more than one browser window inside the container.

I am passing the options below to the docker run command, but they do not seem to be working; it always opens only one window:

    --env SE_NODE_MAX_SESSIONS=4 \
    --env NODE_CONNECTION_LIMIT_PER_SESSION=4 \
    --env SE_NODE_OVERRIDE_MAX_SESSIONS=true \

Screenshot from logs showing multiple driver instances being created but with the same Window handle:

Image

Can someone advise whether I need to pass some other parameter to ensure it opens more than one browser window? Could it be a processor limitation?

What is surprising is that with Podman running in WSL on Windows 10 (my local dev machine), it successfully opens 4 browser windows inside the container and the tests work. The issue is only seen on RHEL 8 agents on CI.

@diemol
Member

diemol commented Apr 23, 2025

You are most likely reusing the driver, probably using the static keyword in Java.

@jonn-set
Author

You are most likely reusing the driver, probably using the static keyword in Java.

No, I am not. I am using Spring to autowire the driver instance, and my driver instances are not static. Moreover, it works locally on Windows, but not on a RHEL 8 agent running on TeamCity in a container:

  @Bean(name = "driver", destroyMethod = "")
  @Scope(scopeName = "thread")
  public WebDriver getDriver() {
    WebDriver driver = setupChrome();
    String windowHandle = driver.getWindowHandle();
    log.info("Opened window with handle--> {}", windowHandle);
    Runtime.getRuntime()
        .addShutdownHook(
            new Thread(
                () -> {
                  driver.quit();
                  log.info("Thread {}: Shutdown WebDriver", getThreadNumber());
                }));
    return driver;
  }

This bean is thread-scoped. My thread scope config:

@Configuration
@ComponentScan(basePackages = "com.xxx.xxx")
@Slf4j
public class ScopeConfig {
  @Bean
  public CustomScopeConfigurer threadScopeConfigurer() {
    CustomScopeConfigurer configurer = new CustomScopeConfigurer();
    SimpleThreadScope threadScope = new SimpleThreadScope();
    configurer.setScopes(Map.of("thread", threadScope));
    return configurer;
  }
}

All my test classes have this:

Image

@titusfortner
Member

A few things:
You shouldn't use the same user-data-dir with multiple sessions in parallel. Chrome doesn't support sharing profiles safely, and this can definitely cause crashes or state leakage.
I'm not sure how many tests you are running and how many in parallel, but this code won't quit any drivers until the JVM starts its shutdown process, which doesn't seem ideal.

TestNG is notoriously difficult to run in parallel. Your scope looks OK, unless you need to add a proxyMode because you are injecting the driver into multiple classes. Regardless, double-check whether each thread really gets its own driver with something like this in the getDriver() method:
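
To illustrate the first point, here is a minimal sketch of building the Chrome arguments so that each session gets its own profile directory. The class name ChromeArgs, the method forNewSession() and the /tmp/chrome-profiles root are hypothetical, and the sketch assumes that path is writable inside the container. The resulting list could be passed to ChromeOptions.addArguments(List<String>). Leaving the --user-data-dir flag out entirely should also avoid the sharing, since a fresh temporary profile is then used for each session.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class ChromeArgs {
    // Hypothetical base directory inside the container; assumed to be writable.
    static final String PROFILE_ROOT = "/tmp/chrome-profiles";

    /** Returns a few shared flags plus a user-data-dir that is unique to this call. */
    static List<String> forNewSession() {
        List<String> args = new ArrayList<>();
        args.add("--headless");
        args.add("--disable-dev-shm-usage");
        args.add("--no-sandbox");
        // One profile directory per session, so parallel sessions never share state.
        args.add("--user-data-dir=" + PROFILE_ROOT + "/" + UUID.randomUUID());
        return args;
    }

    public static void main(String[] args) {
        // Two calls produce two different profile directories.
        System.out.println(forNewSession());
        System.out.println(forNewSession());
    }
}
```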

log.info("Thread {} -> WebDriver instance: {}, Window handle: {}",
    Thread.currentThread().getId(),
    System.identityHashCode(driver),
    driver.getWindowHandle());
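
If reading the log lines by eye is error-prone, the same check can be automated. The following is only a sketch: HandleTracker and register() are hypothetical names, not part of Selenium, and it assumes each thread reports the value of driver.getWindowHandle() once its driver has been created.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HandleTracker {
    // Window handle -> id of the first thread that reported it.
    private final Map<String, Long> ownerByHandle = new ConcurrentHashMap<>();

    /**
     * Records the handle for the given thread.
     * Returns false if a different thread already reported the same handle,
     * which would suggest that two threads are sharing one browser window.
     */
    public boolean register(long threadId, String windowHandle) {
        Long previous = ownerByHandle.putIfAbsent(windowHandle, threadId);
        return previous == null || previous == threadId;
    }

    public static void main(String[] args) {
        HandleTracker tracker = new HandleTracker();
        System.out.println(tracker.register(1L, "handle-A")); // true: first report
        System.out.println(tracker.register(2L, "handle-B")); // true: distinct handle
        System.out.println(tracker.register(3L, "handle-A")); // false: handle already owned by thread 1
    }
}
```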

@titusfortner titusfortner added J-awaiting answer Question asked of user; a reply moves it to triage again and removed A-needs-triaging A Selenium member will evaluate this soon! labels Apr 23, 2025
@jonn-set
Author

@titusfortner thanks for your reply. I am running around 90 tests with 4 threads in parallel. I had to use the addShutdownHook because the SimpleThreadScope does not support registerDestructionCallback and registering one manually was proving to be very difficult. I am not using TestNG, just JUnit 4+Cucumber.
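
One possible alternative to the shutdown hook, sketched here in plain Java without Spring, is to track every per-thread instance in a registry and release them all explicitly once the run is over. PerThreadResources, get() and closeAll() are hypothetical names. With Selenium, the factory would create the driver and the closer could be WebDriver::quit; where closeAll() gets called (for example from a hook that runs once after all scenarios) is an assumption that depends on the test runner.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class PerThreadResources<T> {
    // Every instance ever created, so it can be released from any thread.
    private final Queue<T> created = new ConcurrentLinkedQueue<>();
    private final ThreadLocal<T> current;
    private final Consumer<T> closer;

    public PerThreadResources(Supplier<T> factory, Consumer<T> closer) {
        this.closer = closer;
        // Lazily creates one instance per thread and remembers it for cleanup.
        this.current = ThreadLocal.withInitial(() -> {
            T resource = factory.get();
            created.add(resource);
            return resource;
        });
    }

    /** Returns the calling thread's instance, creating it on first use. */
    public T get() {
        return current.get();
    }

    /**
     * Releases every tracked instance and returns how many were released.
     * Intended to be called once, after all worker threads have finished.
     */
    public int closeAll() {
        int closed = 0;
        T resource;
        while ((resource = created.poll()) != null) {
            try {
                closer.accept(resource);
                closed++;
            } catch (RuntimeException e) {
                // Keep going so one failure does not leak the remaining instances.
            }
        }
        return closed;
    }
}
```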

I was explicitly setting chromiumOptions.add("--user-data-dir=/home/seluser/.config/google-chrome/default");

After removing this, my tests are working fine. Thanks for your help, I really appreciate it.

Logs showing different handles now:
Image

I will run the suite a few times and, if everything is fine, close this issue.

@github-actions github-actions bot added A-needs-triaging A Selenium member will evaluate this soon! and removed J-awaiting answer Question asked of user; a reply moves it to triage again labels Apr 24, 2025
@jonn-set
Author

It works now, closing the issue.
