Recover when connection cannot be established straight at startup #415

Merged

@yann-soubeyrand yann-soubeyrand commented Jul 20, 2020

When the connection to the PostgreSQL instance cannot be established at startup, a race condition can occur if autoDiscoverDatabases is true. If discoverDatabaseDSNs fails, no DSN is set as the master database; if scrapeDSN then succeeds, checkMapVersions will have omitted the default metrics from the server metric map. The metric map is not updated again unless the version returned by the PostgreSQL instance changes. With this patch, scrapeDSN is not run unless discoverDatabaseDSNs has succeeded, which eliminates the race condition.

Signed-off-by: Yann Soubeyrand <[email protected]>
@coveralls

Coverage Status

Coverage decreased (-0.1%) to 64.573% when pulling 91c5150 on camptocamp:fix-race-condition into e2df41f on wrouesnel:master.

@wrouesnel wrouesnel merged commit aea6fae into prometheus-community:master Dec 24, 2020
@yann-soubeyrand yann-soubeyrand deleted the fix-race-condition branch January 25, 2021 17:49
SuperQ added a commit that referenced this pull request Feb 26, 2021
* Add CHANGELOG from existing tags.

Now released under Prometheus Community

* [CHANGE] Update build to use standard Prometheus promu/Dockerfile
* [ENHANCEMENT] Remove duplicate column in queries.yml #433
* [ENHANCEMENT] Add query for 'pg_replication_slots' #465
* [ENHANCEMENT] Allow a custom prefix for metric namespace #387
* [ENHANCEMENT] Improve PostgreSQL replication lag detection #395
* [ENHANCEMENT] Support connstring syntax when discovering databases #473
* [ENHANCEMENT] Detect SIReadLock locks in the pg_locks metric #421
* [BUGFIX] Fix pg_database_size_bytes metric in queries.yaml #357
* [BUGFIX] Don't ignore errors in parseUserQueries #362
* [BUGFIX] Fix queries.yaml for AWS RDS #370
* [BUGFIX] Recover when connection cannot be established at startup #415
* [BUGFIX] Don't retry if an error occurs #426
* [BUGFIX] Do not panic on incorrect env #457

Signed-off-by: Ben Kochie <[email protected]>
@SuperQ SuperQ mentioned this pull request Feb 26, 2021
SuperQ added a commit that referenced this pull request Mar 1, 2021
* Add CHANGELOG from existing tags.

First release under the Prometheus Community organisation.

* [CHANGE] Update build to use standard Prometheus promu/Dockerfile
* [ENHANCEMENT] Remove duplicate column in queries.yml #433
* [ENHANCEMENT] Add query for 'pg_replication_slots' #465
* [ENHANCEMENT] Allow a custom prefix for metric namespace #387
* [ENHANCEMENT] Improve PostgreSQL replication lag detection #395
* [ENHANCEMENT] Support connstring syntax when discovering databases #473
* [ENHANCEMENT] Detect SIReadLock locks in the pg_locks metric #421
* [BUGFIX] Fix pg_database_size_bytes metric in queries.yaml #357
* [BUGFIX] Don't ignore errors in parseUserQueries #362
* [BUGFIX] Fix queries.yaml for AWS RDS #370
* [BUGFIX] Recover when connection cannot be established at startup #415
* [BUGFIX] Don't retry if an error occurs #426
* [BUGFIX] Do not panic on incorrect env #457

Signed-off-by: Ben Kochie <[email protected]>
angaz pushed a commit to angaz/postgres_exporter that referenced this pull request Mar 3, 2022
ritbl pushed a commit to heniek/postgres_exporter that referenced this pull request Mar 19, 2023
…ometheus-community#415)

ritbl pushed a commit to heniek/postgres_exporter that referenced this pull request Mar 19, 2023