Skip to content

Collection structure migration #903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 51 commits into from
Apr 4, 2025
Merged

Collection structure migration #903

merged 51 commits into from
Apr 4, 2025

Conversation

pat-s
Copy link
Contributor

@pat-s pat-s commented Mar 3, 2025

ref #272

Changes

  • Moved vars into defaults/
  • Removed inventory -> Should be defined by users in the global inventory
  • Moved all tasks of the config_pbcluster.yml playbook into a role named user_pgcluster_config
  • Removed group_vars. Vars should be defined in the global user inventory

Advantages

  • Users can declare variables which deviate from the defaults in group_vars. This allows for a much better overview which variables have been altered in contrast to the defaults
  • The collection can be referenced in requirements.yml and installed via ansible-galaxy (when published there)
  • Removal of many redundant Include main variables and similar tasks as these are implicitly defined in defaults/
  • The "action" roles can applied by using a simple playbook like
- name: Autobase
  hosts: etcd_cluster # or any other group name which fits
  become: true
  gather_facts: true
  any_errors_fatal: true
  become_user: root

  roles:
    # (namespace and connection name are abitrary)
    - role: autobase.autobase.user_pgcluster_config

Notes

I only converted config_pbcluster.yml. All other "action" playbooks can/should be converted in the same way.

The naming of the new role is arbitrary, I only used a prefix to distinguish it somehow from the roles which should not be applied directly. Feel free to propose a another naming scheme.

I've successfully applied this new structure in a test project. It yields the same results as when applying the current playbook config_pbcluster.yml. Nevertheless, a thorough check/review should be applied, given all the changes and restructurings.

The migration of vars/ into defaults/ must happen for every role which should be applied by the user as otherwise these are not defined. If they stay in vars/, they will override user-defined group_vars due to Ansible variable precedence. This is the only downside I currently see, but during development, you should be able to use a global replace-all to modify all redundant entries in the defaults/ files.

UPD: #903 (comment)

@pat-s
Copy link
Contributor Author

pat-s commented Mar 5, 2025

@vitabaks While this is labelled "WIP", I am waiting for a comment of yours here before doing additional work. In the end, you must work with it :) (and it would also be great if I wouldn't need to do all the refactoring myself).

Let me know what you think.

@vitabaks
Copy link
Owner

vitabaks commented Mar 5, 2025

Let me know what you think.

I’m still unsure about this because I don’t see much benefit in it, but at the same time, I recognize how important it is to you and possibly to others as well.

I’ve reread it multiple times to think it through. As I understand it, you’re suggesting wrapping the playbook functionality into a role that calls other roles, similar to how playbooks previously worked, so that the playbook can be executed as a role.

However, I’m not a fan of duplicating variables. Previously, the vars directory served as a single point for parameter management, but now it would need to be duplicated for each playbook role. Another important consideration is that the number of playbooks will continue to grow—for example, I’ll soon be adding playbooks for scaling the etcd cluster.

@pat-s
Copy link
Contributor Author

pat-s commented Mar 5, 2025

I’m still unsure about this because I don’t see much benefit in it, but at the same time, I recognize how important it is to you and possibly to others as well.

Probably, also given the discussion has already 6 upvotes. I'd argue that for all folks using ansible in a professional way, having a "sound" role/collection structure is very important.

I’ve reread it multiple times to think it through. As I understand it, you’re suggesting wrapping the playbook functionality into a role that calls other roles, similar to how playbooks previously worked, so that the playbook can be executed as a role.

Essentially, yes. There would be other ways also but this would result in even more file movements. I actually think it is a quite elegant way given the current structure and doesn't result in too much disturbance.

However, I’m not a fan of duplicating variables. Previously, the vars directory served as a single point for parameter management, but now it would need to be duplicated for each playbook role.

This is true but the main reason for that is that you're overusing roles in general. Many of the roles should be actually tasks within a role. Only the (currently) user-facing playbooks should be roles IMO, everything else should be tasks. In this structure, the duplication of vars/ would be reduced.

Another important consideration is that the number of playbooks will continue to grow—for example, I’ll soon be adding playbooks for scaling the etcd cluster.

Yeah but this would happen anyhow, wouldn't it? I.e. no matter if its a role or a task, the amount of YAML files is independent from it. Essentially it is up to you how much content and tasks you put into one. In the end, it is only about file organization.

A big upside to me is the fact that group_vars overrides are honored, which allows only putting variables which differ from the defaults into a file. Right now, users need to alter vars/ which all predefined vars in it, making it hard to almost impossible to keep a good overview of which vars where altered.


Again, this is just a suggestion which would make it a lot easier for users and make the project look more "professional" as it adheres to a "best practice" ansible structure. This would also allow publishing an official collection on Ansible Galaxy and making it easier to use for people preferring the "command line" approach.

@vitabaks
Copy link
Owner

vitabaks commented Mar 5, 2025

and make the project look more "professional" as it adheres to a "best practice" ansible structure.

I'm not an Ansible expert, I'm an expert in Postgres databases, so I'm interested in help achieving this.

P.S. Maybe you have ideas on how to avoid duplicating defaults for each paybooks-role, maybe publish them in group_vars?

@vitabaks
Copy link
Owner

vitabaks commented Mar 5, 2025

@pat-s For example, is the Kubespray project also considered poorly organized and unprofessional? Could you provide examples of best practices that we should aim for?

@pat-s
Copy link
Contributor Author

pat-s commented Mar 5, 2025

For example, is the Kubespray project also considered poorly organized and unprofessional?

I just skimmed over it quickly. There each role seems to have a defaults/ with vars relating to it and no global vars/. This is good.

In general, there shouldn't be any group_vars/ or host_vars/ within a role/collection.
If you find yourself duplicating many entries, you can use a common role, put default vars there and source it from other roles (by default these vars would otherwise only be read from .

The problem with vars/ within a role/collection is that it takes precedence over user-defined vars.

In your case here, you have super many tasks which are split and re-used throughout different actions. Placing them in their own roles to have a good structure is fine, but it should be publicly stated in the README (or by using a prefix) that only specific ones should be "executed" and others are essentially just "helper" roles. This is why I used the user_ prefix.

Overall, Ansible allows for many structures (on a technical level) to work. This is both good and bad, as when starting out with a non-standard one, it will work (somehow) but will have its drawbacks.

As a user, you usually want to be able to call a role "as-is" and only specify variables which diverge from the defaults (either in group_vars or in vars: within the playbook).

Could you provide examples of best practices that we should aim for?

I guess most collections simply adhere to the recommended structures. Not so many need that amount of helper roles as yours does, so often enough, all roles in a collection are actually "callable" by the user.


A totally different approach would be to keep the playbooks as is and force users to call them via ansible-playbook namespace.collection.playbookName. Yet even with this approach you need to move the vars/ files into defaults/ or a common shared role, as otherwise users cannot override them (+ add required assets to make it a valid collection in the first place).

This approach has the downside that users cannot directly use their own playbook to just call a role but they are forced to call ansible-playbook directly or use ansible.builtin.import_playbook within a playbook.

The upside (in your specific case) would be to not having to rewrite your playbooks into roles.

@vitabaks
Copy link
Owner

vitabaks commented Mar 5, 2025

Using the common role seems to look better, because I would like to have one path to the parameters.

@pat-s
Copy link
Contributor Author

pat-s commented Mar 5, 2025

OK. I think the best middleway might be to stick with the playbooks and only move vars/ to commmon/defaults and source this. This avoids having to restructure many files while still allowing for a proper collection structure.

@pat-s
Copy link
Contributor Author

pat-s commented Mar 5, 2025

@vitabaks This new structure now works allows users to

  • Install the collection via ansible-galaxy install
  • Specify it in a requirements.yml
  • Set group_vars/ which override the implicit defaults

On the user level, setting custom vars in group_vars/ is actually required now. I.e. this should be highlighted in the release notes. It is a breaking change anyhow.

Users also need to set the inventory themselves now. When installing a collection, there shouldn't be a bundled inventory.
This means there should be an example inventory file (inventory.example) which shows how the inventory can be defined. The docs should also state which ones are mandatory (to deploy the essential stack) and which ones are optional.

The probably trickiest part are your custom group_vars/ definitions (like is_master). These can be worked around by requiring users to add the following definitions into their inventory:

[master:vars]
is_master=true

[replica:vars]
is_master=false

I added galaxy.yml which you need to populate. This is needed if you want to publish to ansible galaxy (which I would recommend).


In summary:

  • Users install the collection via ansible-galaxy
  • Users populate their inventory following the example inventory
  • Users define custom var settings in their group_vars/
  • Users call the playbooks in a playbook which looks like
- name: Playbook
  hosts: etcd_cluster # or other group name

  tasks:
    - name: Run playbook
      ansible.builtin.include_playbook: autobase.autobase.configure_pgcluster

@vitabaks
Copy link
Owner

vitabaks commented Mar 6, 2025

The probably trickiest part are your custom group_vars/ definitions (like is_master). These can be worked around by requiring users to add the following definitions into their inventory

I'm thinking of getting rid of the is_master variable in favor of conditions for host groups, this will simplify the configuration.

when: inventory_hostname == groups['master'][0]
when: inventory_hostname in groups['replica']

@pat-s
Copy link
Contributor Author

pat-s commented Mar 6, 2025

Yes, this should simplify this and remove the need for the custom inventory entry.

@pat-s
Copy link
Contributor Author

pat-s commented Mar 6, 2025

Btw there is also this GHA here to automatically publish a collection for releases. HTH in case there are issues/questions.

@pat-s
Copy link
Contributor Author

pat-s commented Mar 23, 2025

It may also be worth adding a README to the automation catalog with a description of the details of the methods of use.

Indeed, listing all available playbooks including a short description when to use which would be very helpful. Do you want to do this including additional cleanup? From my side, the PR is ready.
It was still WIP as galaxy.yml still needs to be adjusted, but this should also be done by you.
WRT to that, usually the namespace is different from the collection name. Not sure if that makes sense in your case here, using the same name for both is also theoretically possible.

@pat-s pat-s marked this pull request as ready for review March 23, 2025 14:28
@pat-s pat-s mentioned this pull request Mar 31, 2025
@pat-s
Copy link
Contributor Author

pat-s commented Apr 1, 2025

Thanks for cleaning up and finishing the remaining tasks.

You might wanna consider using https://github.com/buluma/ansible_galaxy_collection or a similar action to automate deployments to Ansible Galaxy for every autobase release.

@vitabaks
Copy link
Owner

vitabaks commented Apr 2, 2025

You might wanna consider using https://github.com/buluma/ansible_galaxy_collection or a similar action to automate deployments to Ansible Galaxy for every autobase release.

I'll take a look at it, thanks.

So far, I have manually uploaded for version 2.2.0 -> https://galaxy.ansible.com/ui/repo/published/vitabaks/autobase/

@pat-s Everything seems to be in order, could you also test it?

@pat-s
Copy link
Contributor Author

pat-s commented Apr 3, 2025

I've been using this structure in my envs since I opened the PR. So you can consider it to be tested.

Anything that comes up afterwards can also be fixed in subsequent patch PRs.

Good thing this is happening overall, I think it's a big day for this project 👍️

@vitabaks vitabaks merged commit 879f8a0 into vitabaks:master Apr 4, 2025
24 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automation Automation functionality using Ansible enhancement Improvement of the current functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants