Collection structure migration #903

pat-s · 2025-03-03T23:10:46Z

Changes

Moved vars into defaults/
Removed inventory -> Should be defined by users in the global inventory
Moved all tasks of the config_pbcluster.yml playbook into a role named user_pgcluster_config
Removed group_vars. Vars should be defined in the global user inventory

Advantages

Users can declare variables which deviate from the defaults in group_vars. This allows for a much better overview which variables have been altered in contrast to the defaults
The collection can be referenced in requirements.yml and installed via ansible-galaxy (when published there)
Removal of many redundant Include main variables and similar tasks as these are implicitly defined in defaults/
The "action" roles can applied by using a simple playbook like

- name: Autobase
  hosts: etcd_cluster # or any other group name which fits
  become: true
  gather_facts: true
  any_errors_fatal: true
  become_user: root

  roles:
    # (namespace and connection name are abitrary)
    - role: autobase.autobase.user_pgcluster_config

Notes

I only converted config_pbcluster.yml. All other "action" playbooks can/should be converted in the same way.

The naming of the new role is arbitrary, I only used a prefix to distinguish it somehow from the roles which should not be applied directly. Feel free to propose a another naming scheme.

I've successfully applied this new structure in a test project. It yields the same results as when applying the current playbook config_pbcluster.yml. Nevertheless, a thorough check/review should be applied, given all the changes and restructurings.

The migration of vars/ into defaults/ must happen for every role which should be applied by the user as otherwise these are not defined. If they stay in vars/, they will override user-defined group_vars due to Ansible variable precedence. This is the only downside I currently see, but during development, you should be able to use a global replace-all to modify all redundant entries in the defaults/ files.

UPD: #903 (comment)

pat-s · 2025-03-05T08:05:34Z

@vitabaks While this is labelled "WIP", I am waiting for a comment of yours here before doing additional work. In the end, you must work with it :) (and it would also be great if I wouldn't need to do all the refactoring myself).

Let me know what you think.

vitabaks · 2025-03-05T09:38:38Z

Let me know what you think.

I’m still unsure about this because I don’t see much benefit in it, but at the same time, I recognize how important it is to you and possibly to others as well.

I’ve reread it multiple times to think it through. As I understand it, you’re suggesting wrapping the playbook functionality into a role that calls other roles, similar to how playbooks previously worked, so that the playbook can be executed as a role.

However, I’m not a fan of duplicating variables. Previously, the vars directory served as a single point for parameter management, but now it would need to be duplicated for each playbook role. Another important consideration is that the number of playbooks will continue to grow—for example, I’ll soon be adding playbooks for scaling the etcd cluster.

pat-s · 2025-03-05T10:20:36Z

I’m still unsure about this because I don’t see much benefit in it, but at the same time, I recognize how important it is to you and possibly to others as well.

Probably, also given the discussion has already 6 upvotes. I'd argue that for all folks using ansible in a professional way, having a "sound" role/collection structure is very important.

I’ve reread it multiple times to think it through. As I understand it, you’re suggesting wrapping the playbook functionality into a role that calls other roles, similar to how playbooks previously worked, so that the playbook can be executed as a role.

Essentially, yes. There would be other ways also but this would result in even more file movements. I actually think it is a quite elegant way given the current structure and doesn't result in too much disturbance.

However, I’m not a fan of duplicating variables. Previously, the vars directory served as a single point for parameter management, but now it would need to be duplicated for each playbook role.

This is true but the main reason for that is that you're overusing roles in general. Many of the roles should be actually tasks within a role. Only the (currently) user-facing playbooks should be roles IMO, everything else should be tasks. In this structure, the duplication of vars/ would be reduced.

Another important consideration is that the number of playbooks will continue to grow—for example, I’ll soon be adding playbooks for scaling the etcd cluster.

Yeah but this would happen anyhow, wouldn't it? I.e. no matter if its a role or a task, the amount of YAML files is independent from it. Essentially it is up to you how much content and tasks you put into one. In the end, it is only about file organization.

A big upside to me is the fact that group_vars overrides are honored, which allows only putting variables which differ from the defaults into a file. Right now, users need to alter vars/ which all predefined vars in it, making it hard to almost impossible to keep a good overview of which vars where altered.

Again, this is just a suggestion which would make it a lot easier for users and make the project look more "professional" as it adheres to a "best practice" ansible structure. This would also allow publishing an official collection on Ansible Galaxy and making it easier to use for people preferring the "command line" approach.

vitabaks · 2025-03-05T10:59:12Z

and make the project look more "professional" as it adheres to a "best practice" ansible structure.

I'm not an Ansible expert, I'm an expert in Postgres databases, so I'm interested in help achieving this.

P.S. Maybe you have ideas on how to avoid duplicating defaults for each paybooks-role, maybe publish them in group_vars?

vitabaks · 2025-03-05T11:05:30Z

@pat-s For example, is the Kubespray project also considered poorly organized and unprofessional? Could you provide examples of best practices that we should aim for?

pat-s · 2025-03-05T11:55:29Z

For example, is the Kubespray project also considered poorly organized and unprofessional?

I just skimmed over it quickly. There each role seems to have a defaults/ with vars relating to it and no global vars/. This is good.

In general, there shouldn't be any group_vars/ or host_vars/ within a role/collection.
If you find yourself duplicating many entries, you can use a common role, put default vars there and source it from other roles (by default these vars would otherwise only be read from .

The problem with vars/ within a role/collection is that it takes precedence over user-defined vars.

In your case here, you have super many tasks which are split and re-used throughout different actions. Placing them in their own roles to have a good structure is fine, but it should be publicly stated in the README (or by using a prefix) that only specific ones should be "executed" and others are essentially just "helper" roles. This is why I used the user_ prefix.

Overall, Ansible allows for many structures (on a technical level) to work. This is both good and bad, as when starting out with a non-standard one, it will work (somehow) but will have its drawbacks.

As a user, you usually want to be able to call a role "as-is" and only specify variables which diverge from the defaults (either in group_vars or in vars: within the playbook).

Could you provide examples of best practices that we should aim for?

I guess most collections simply adhere to the recommended structures. Not so many need that amount of helper roles as yours does, so often enough, all roles in a collection are actually "callable" by the user.

A totally different approach would be to keep the playbooks as is and force users to call them via ansible-playbook namespace.collection.playbookName. Yet even with this approach you need to move the vars/ files into defaults/ or a common shared role, as otherwise users cannot override them (+ add required assets to make it a valid collection in the first place).

This approach has the downside that users cannot directly use their own playbook to just call a role but they are forced to call ansible-playbook directly or use ansible.builtin.import_playbook within a playbook.

The upside (in your specific case) would be to not having to rewrite your playbooks into roles.

vitabaks · 2025-03-05T12:15:26Z

Using the common role seems to look better, because I would like to have one path to the parameters.

pat-s · 2025-03-05T16:53:46Z

OK. I think the best middleway might be to stick with the playbooks and only move vars/ to commmon/defaults and source this. This avoids having to restructure many files while still allowing for a proper collection structure.

pat-s · 2025-03-05T17:58:27Z

@vitabaks This new structure now works allows users to

Install the collection via ansible-galaxy install
Specify it in a requirements.yml
Set group_vars/ which override the implicit defaults

On the user level, setting custom vars in group_vars/ is actually required now. I.e. this should be highlighted in the release notes. It is a breaking change anyhow.

Users also need to set the inventory themselves now. When installing a collection, there shouldn't be a bundled inventory.
This means there should be an example inventory file (inventory.example) which shows how the inventory can be defined. The docs should also state which ones are mandatory (to deploy the essential stack) and which ones are optional.

The probably trickiest part are your custom group_vars/ definitions (like is_master). These can be worked around by requiring users to add the following definitions into their inventory:

[master:vars]
is_master=true

[replica:vars]
is_master=false

I added galaxy.yml which you need to populate. This is needed if you want to publish to ansible galaxy (which I would recommend).

In summary:

Users install the collection via ansible-galaxy
Users populate their inventory following the example inventory
Users define custom var settings in their group_vars/
Users call the playbooks in a playbook which looks like

- name: Playbook
  hosts: etcd_cluster # or other group name

  tasks:
    - name: Run playbook
      ansible.builtin.include_playbook: autobase.autobase.configure_pgcluster

…ks#904)

Co-authored-by: Artem Safiiulin <[email protected]>

vitabaks · 2025-03-06T14:00:30Z

The probably trickiest part are your custom group_vars/ definitions (like is_master). These can be worked around by requiring users to add the following definitions into their inventory

I'm thinking of getting rid of the is_master variable in favor of conditions for host groups, this will simplify the configuration.

when: inventory_hostname == groups['master'][0]

when: inventory_hostname in groups['replica']

pat-s · 2025-03-06T16:01:05Z

Yes, this should simplify this and remove the need for the custom inventory entry.

pat-s · 2025-03-06T16:02:56Z

Btw there is also this GHA here to automatically publish a collection for releases. HTH in case there are issues/questions.

README.md

pat-s · 2025-03-23T14:25:28Z

It may also be worth adding a README to the automation catalog with a description of the details of the methods of use.

Indeed, listing all available playbooks including a short description when to use which would be very helpful. Do you want to do this including additional cleanup? From my side, the PR is ready.
It was still WIP as galaxy.yml still needs to be adjusted, but this should also be done by you.
WRT to that, usually the namespace is different from the collection name. Not sure if that makes sense in your case here, using the same name for both is also theoretically possible.

remove "via git"

pat-s · 2025-04-01T20:17:30Z

Thanks for cleaning up and finishing the remaining tasks.

You might wanna consider using https://github.com/buluma/ansible_galaxy_collection or a similar action to automate deployments to Ansible Galaxy for every autobase release.

vitabaks · 2025-04-02T16:33:40Z

You might wanna consider using https://github.com/buluma/ansible_galaxy_collection or a similar action to automate deployments to Ansible Galaxy for every autobase release.

I'll take a look at it, thanks.

So far, I have manually uploaded for version 2.2.0 -> https://galaxy.ansible.com/ui/repo/published/vitabaks/autobase/

@pat-s Everything seems to be in order, could you also test it?

pat-s · 2025-04-03T17:55:50Z

I've been using this structure in my envs since I opened the PR. So you can consider it to be tested.

Anything that comes up afterwards can also be fixed in subsequent patch PRs.

Good thing this is happening overall, I think it's a big day for this project 👍️

pat-s added 2 commits March 3, 2025 23:37

use collection structure

43cd964

add debian vars

37d2e0f

pat-s added 5 commits March 5, 2025 18:26

move defaults vars into roles/common

b3a9a44

remove custom role

fbb9941

add required collection items

67c2cc0

clean

c6190b6

rename inventory to .example

a129709

vitabaks and others added 8 commits March 5, 2025 18:59

Install Netdata monitoring by default; update config template (vitaba…

20453b5

…ks#904)

Change HTTP links to HTTPS. (vitabaks#905)

bf3eea1

fix for "Connection ifno" task (vitabaks#909)

020062b

Co-authored-by: Artem Safiiulin <[email protected]>

set default is_master vars in inventory

dcd11c8

Merge branch 'master' into collection

ecf03ae

add changelog.yaml

7c75eb8

add header

017b88e

Remove group_vars from molecule config

ffc44ea

vitabaks added 3 commits March 6, 2025 21:08

Molecule: Update path for dcs_type.yml file

eb407ad

tests: update path to defaults

216993d

Update README.md

5346a6b

vitabaks reviewed Mar 23, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

Merge branch 'master' into collection

36f0ba1

pat-s marked this pull request as ready for review March 23, 2025 14:28

pat-s mentioned this pull request Mar 31, 2025

Alex #935

Closed

vitabaks added 19 commits April 1, 2025 12:33

Update system.yml

6e7e26e

Update README.md

9047049

update path to upgrade.yml vars file

3a287ab

Update main.yml

090c810

Update pre_checks.yml

70ae047

Update README

fb346b9

Update requires_ansible

887c0c5

Update main.yml

e7e871a

Update galaxy.yml

35374c0

update galaxy tags

fa78c58

fix document-start

cf95bb7

Fix "Could not find file README.md"

bce96c4

Update README

53ab520

Fix "role name invalid format"

d529fae

io-scheduler

701d00f

Update main.yml

3641dbb

fix typo in tests

dcc4802

Fix "No role readme found"

af3bcc9

Update README

450edc7

remove "via git"

vitabaks merged commit 879f8a0 into vitabaks:master Apr 4, 2025
24 of 26 checks passed

This was referenced May 11, 2025

Move to a role-based collection structure #991

Closed

Add common role as dependency #992

Merged

Uh oh!

Collection structure migration #903

Collection structure migration #903

Uh oh!

Conversation

pat-s commented Mar 3, 2025 • edited by vitabaks Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Advantages

Notes

Uh oh!

pat-s commented Mar 5, 2025

Uh oh!

vitabaks commented Mar 5, 2025

Uh oh!

pat-s commented Mar 5, 2025

Uh oh!

vitabaks commented Mar 5, 2025

Uh oh!

vitabaks commented Mar 5, 2025

Uh oh!

pat-s commented Mar 5, 2025

Uh oh!

vitabaks commented Mar 5, 2025

Uh oh!

pat-s commented Mar 5, 2025

Uh oh!

pat-s commented Mar 5, 2025

Uh oh!

vitabaks commented Mar 6, 2025

Uh oh!

pat-s commented Mar 6, 2025

Uh oh!

pat-s commented Mar 6, 2025

Uh oh!

Uh oh!

pat-s commented Mar 23, 2025

Uh oh!

pat-s commented Apr 1, 2025

Uh oh!

vitabaks commented Apr 2, 2025

Uh oh!

pat-s commented Apr 3, 2025

Uh oh!

Uh oh!

Uh oh!

pat-s commented Mar 3, 2025 •

edited by vitabaks

Loading