# cifmw_cephadm

Deploys a Ceph cluster on a set of EDPM nodes using [cephadm](https://docs.ceph.com/en/latest/cephadm).

The [openstack-k8s-operators HCI documentation](https://github.com/openstack-k8s-operators/docs/blob/main/hci.md) describes how to run Ceph on EDPM nodes but leaves it to the reader to install Ceph with `cephadm`. The `cifmw_cephadm` role and `hooks/playbooks/ceph.yml` hook playbook may be used to automate the Ceph installation.

Before this role is run, the following roles should be run.

- `cifmw_create_admin`: creates a user for `cephadm`
- `cifmw_block_device`: creates a virtual disk to store data
- `cifmw_ceph_spec`: defines the Ceph cluster layout

After this role is run, the `cifmw_ceph_client` role can generate a k8s CR which OpenStack can use to connect to the deployed Ceph cluster.

The `ceph.yml` hook playbook in the `hooks/playbooks` directory provides a complete working example which does all of the above and has been tested on a three EDPM node deployment from [install_yamls](https://github.com/openstack-k8s-operators/install_yamls).

## Privilege escalation

Requires an Ansible user who can become root to install the Ceph server.

## Parameters

The `hooks/playbooks/ceph.yml` hook playbook defaults these parameters so that they do not need to be changed for a typical EDPM deployment.

* `cifmw_cephadm_basedir`: (String) Base directory for artifacts and logs. Defaults to `cifmw_basedir`, which defaults to `{{ ansible_user_dir ~ '/ci-framework-data' }}`.
* `cifmw_cephadm_default_container`: If this value is `true`, then `cephadm bootstrap` is not passed the `--image` parameter and whatever default Ceph container is defined inside of `cephadm` is used. Otherwise, `cifmw_cephadm_container_ns` (e.g. "quay.io/ceph"), `cifmw_cephadm_container_image` (e.g. "ceph") and `cifmw_cephadm_container_tag` (e.g. "v18") are used.
* `cifmw_cephadm_spec_ansible_host`: the path to the Ceph spec generated by the `cifmw_ceph_spec` role (e.g. `/tmp/ceph_spec.yml`).
* `cifmw_cephadm_bootstrap_conf`: the path to the initial Ceph configuration file generated by the `cifmw_ceph_spec` role (e.g. `/tmp/initial_ceph.conf`).
* `cifmw_ceph_client_vars`: the path to the Ceph client variables passed as input to the `cifmw_ceph_client` role (e.g. `/tmp/ceph_client.yml`).
* `cifmw_cephadm_pools`: see below
* `cifmw_cephadm_keys`: see below
* `cifmw_cephadm_certs`: The path on the Ceph host where TLS/SSL certificates are located. It points to `/etc/pki/tls`.
* `cifmw_cephadm_certificate`: (Optional) The SSL/TLS certificate signed by a CA. If it is provided, the Ceph dashboard and RGW will be configured for SSL automatically. The certificate should be made available only in the `cifmw_cephadm_certs` path. To enable SSL for the dashboard, both `cifmw_cephadm_certificate` and `cifmw_cephadm_key` are needed. These certificates can be generated automatically by setting `cifmw_cephadm_certificate` and `cifmw_cephadm_key` to the desired path, provided that `cifmw_openshift_kubeconfig` is set correctly so that Ansible can request that k8s create a certificate using an existing root CA.
* `cifmw_cephadm_key`: (Optional) The SSL/TLS certificate key. If it is provided, the Ceph dashboard and RGW will be configured for SSL automatically.
* `cifmw_cephadm_monitoring_network`: the Ceph `public_network` where the dashboard monitoring stack instances should be bound. The network range is gathered from the `cifmw_cephadm_bootstrap_conf` file, which represents the initial Ceph configuration file passed at bootstrap time.
* `cifmw_cephadm_rgw_network`: the Ceph `public_network` where the `radosgw` instances should be bound. The network range is gathered from the `cifmw_cephadm_bootstrap_conf` file, which represents the initial Ceph configuration file passed at bootstrap time.
* `cifmw_cephadm_rgw_vip`: the ingress daemon deployed along with `radosgw` requires a `VIP` that will be owned by `keepalived`.
  This IP address will be used as the entry point to reach the `radosgw backends` through `haproxy`.
* `cifmw_cephadm_nfs_vip`: the ingress daemon deployed along with the `nfs` cluster requires a `VIP` that will be owned by `keepalived`. This IP address is the same one used for RGW unless an override is passed, and it is used as the entry point to reach the `ganesha backends` through an `haproxy` instance where proxy-protocol is enabled.
* `cifmw_cephadm_ns`: Name of the OpenStack controlplane namespace used in configuring swift objects.
* `cifmw_cephadm_config_key_set_ssl_option`: Optional colon-separated list of SSL context options (default: `no_sslv2:sslv3:no_tlsv1:no_tlsv1_1`).
* `cifmw_rgw_ssl_backward_compatibility`: This option is true by default because this role is able to manage older Ceph releases (starting from Squid). Set it to false if the target Ceph release is equal to or greater than Tentacle.
* `cifmw_cephadm_rgw_s3_glance`: (Bool) If this value is `true`, then cephadm will create glance secrets using the discovered RGW settings.
* `cifmw_cephadm_nfsv3`: (Bool) If this value is `true`, cephadm enables `NFSv3` during Ceph `NFS` cluster creation. This option is required to explicitly enable `NFSv3` for Ceph releases starting with Tentacle. For releases prior to Tentacle, this option is not required, as both `NFSv3` and `NFSv4` are enabled by default.
* `cifmw_cephadm_auth_allowed_ciphers`: (String) When set, runs `ceph mon set auth_allowed_ciphers` with the given value during cluster configuration. Example values are `"aes,aes256k"` or `"aes256k"` or `"aes"`. Defaults to `""` (unset, no command is run).

Use the `cifmw_cephadm_pools` list of dictionaries to define pools for Nova (vms), Cinder (volumes), Cinder-backups (backups), and Glance (images).
```
cifmw_cephadm_pools:
- name: vms
  pg_autoscale_mode: True
  target_size_ratio: 0.3
  application: rbd
- name: volumes
  pg_autoscale_mode: True
  target_size_ratio: 0.3
  application: rbd
- name: backups
  pg_autoscale_mode: True
  target_size_ratio: 0.2
  application: rbd
- name: images
  target_size_ratio: 0.2
  pg_autoscale_mode: True
  application: rbd
```

Use the `cifmw_cephadm_keys` list of dictionaries to define a CephX key which OpenStack can use to authenticate to Ceph. The `cephx_key` Ansible module will generate a random value to pass for the key value.

```
cifmw_cephadm_keys:
  - name: client.openstack
    key: "{{ cephx.key }}"
    mode: '0600'
    caps:
      mgr: allow *
      mon: profile rbd
      osd: profile rbd pool=vms, profile rbd pool=volumes, profile rbd pool=backups, profile rbd pool=images
```

## Examples

See `ceph.yml` in the `hooks/playbooks` directory.

## Tips for using standalone

### Pick the appropriate storage network

In the `hooks/playbooks/ceph.yml` hook playbook, set the `storage_network_range` variable.

* If network isolation is not being used, then set the `storage_network_range` variable to `192.168.122.0/24` (the default EDPM IP address range).
* If network isolation is used, then as per the [openstack-k8s-operators networking documentation](https://github.com/openstack-k8s-operators/docs/blob/main/networking.md), the default storage network is `172.18.0.0/24` and the `storage_network_range` variable should be set accordingly. As per the [openstack-k8s-operators HCI documentation](https://github.com/openstack-k8s-operators/docs/blob/main/hci.md), a shortened `OpenStackDataPlane` services list can be used to configure the storage network before Ceph and OpenStack are deployed.

See the README of the `cifmw_ceph_spec` role for more details on how the `storage_network_range` variable is used.
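
The choice between the two ranges can be captured in a small shell helper. This is only a sketch: the `storage_network_range` shell function is hypothetical (named after the playbook variable, not shipped with ci-framework), and its `true`/`false` argument is an assumed way of stating whether network isolation is in use.

```shell
# Hypothetical helper, not part of ci-framework: print the storage network
# range documented above. Pass "true" when network isolation is used.
storage_network_range() {
  if [ "${1:-false}" = "true" ]; then
    # Default storage network from the networking documentation
    echo "172.18.0.0/24"
  else
    # Default EDPM IP address range
    echo "192.168.122.0/24"
  fi
}
```

As an alternative to editing the playbook, the result could be passed as an Ansible extra variable, for example `ansible-playbook hooks/playbooks/ceph.yml -e storage_network_range=$(storage_network_range true)`.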
### Update the Ansible inventory and environment variables

This example assumes the [ci-framework](https://github.com/openstack-k8s-operators/ci-framework) and [install_yamls](https://github.com/openstack-k8s-operators/install_yamls) git repositories are in `$HOME` and that EDPM nodes have been provisioned.

The `cifmw_cephadm`, `cifmw_create_admin`, and `cifmw_block_device` roles need to be able to SSH into all EDPM nodes, but the default inventory only has localhost. The devsetup process in [install_yamls](https://github.com/openstack-k8s-operators/install_yamls) generates each EDPM node and its IP address sequentially starting at 192.168.122.100. The following command may be used to create an inventory with the group `edpm` containing `N` EDPM nodes.

```
export N=2
echo -e "localhost ansible_connection=local\n[edpm]" > ~/ci-framework/inventory.yml
# Addresses start at .100, so N nodes end at .$((N+99))
for I in $(seq 100 $((N+99))); do
  echo 192.168.122.${I} >> ~/ci-framework/inventory.yml
done
```

[install_yamls](https://github.com/openstack-k8s-operators/install_yamls) generates an SSH key `install_yamls/out/edpm/ansibleee-ssh-key-id_rsa` for root on every EDPM node. Configure the Ansible environment to use this user and key.
```
export ANSIBLE_REMOTE_USER=cloud-admin
export ANSIBLE_SSH_PRIVATE_KEY=~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa
export ANSIBLE_HOST_KEY_CHECKING=False
```

### Run the Ceph playbook

#### Direct playbook execution using ansible-playbook

```
cd ~/ci-framework/
ansible-playbook hooks/playbooks/ceph.yml
```

#### Using run_hook role

```
- name: Deploy ceph
  hosts: localhost
  vars:
    post_ceph:
      - name: Run ceph hook playbook
        type: playbook
        source: ceph.yml
  tasks:
    - name: Run post_ceph hook
      vars:
        step: post_ceph
      ansible.builtin.import_role:
        name: run_hook
```

## Regarding the disks used as OSDs

By default, the `hooks/playbooks/ceph.yml` hook playbook assumes there are no block devices for Ceph to use. It calls the `cifmw_block_device` role to create block devices and has the `cifmw_ceph_spec` role configure a spec to use the created block devices.

If `cifmw_ceph_spec_data_devices` is passed to the `hooks/playbooks/ceph.yml` hook playbook, then the `cifmw_block_device` role is not called and the spec created by the `cifmw_ceph_spec` role will use whatever block devices were passed by `cifmw_ceph_spec_data_devices`. Use of `cifmw_ceph_spec_data_devices` implies that the block devices already exist on the nodes to be deployed.

If the ci-framework is run in a reproducer scenario and the following parameter is passed, then the `libvirt_manager` role will deploy compute nodes with three 30G disks (`/dev/vd{b,c,d}`).

```yaml
cifmw_libvirt_manager_configuration_patch_01_add_compute_volumes:
  vms:
    compute:
      extra_disks_num: 3
      extra_disks_size: 30G
```

Assuming those disks are on the nodes as created by the above, the following parameter may be passed.

```yaml
cifmw_ceph_spec_data_devices: >-
  data_devices:
    all: true
```

The above sets the `cifmw_ceph_spec_data_devices` parameter. The `>-` is necessary so that the YAML block underneath it is treated as a string.
This string is then passed to the `cifmw_ceph_spec` role which will create a Ceph spec containing the following:

```
service_type: osd
service_id: default_drive_group
data_devices:
  all: true
```

The above will result in Ceph using all disks that it judges to be available. Other embedded YAML options may be passed for the `cifmw_ceph_spec_data_devices` as described in the [Advanced OSD Service Specifications](https://docs.ceph.com/en/octopus/cephadm/drivegroups).
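
Because `cifmw_ceph_spec_data_devices` assumes the disks already exist, it can help to check for them on each node before running the playbook. The following is a minimal sketch, assuming the extra disks are named `/dev/vdb`, `/dev/vdc`, and so on, as in the reproducer example above. The function names `expected_extra_disks` and `missing_disks` are hypothetical and not part of ci-framework.

```shell
# Hypothetical helper: print the device paths expected for a given number of
# extra virtio disks, assuming they are named /dev/vdb, /dev/vdc, ...
# Supports at most seven disks (letters b through h).
expected_extra_disks() {
  count="$1"
  i=0
  for letter in b c d e f g h; do
    # Stop once the requested number of paths has been printed
    [ "$i" -ge "$count" ] && break
    echo "/dev/vd${letter}"
    i=$((i + 1))
  done
}

# Hypothetical helper: print each expected path that is not a block device
# on the node where this runs. Prints nothing when all disks are present.
missing_disks() {
  for dev in $(expected_extra_disks "$1"); do
    [ -b "$dev" ] || echo "$dev"
  done
}
```

Running `missing_disks 3` on a compute node prints nothing when `/dev/vdb`, `/dev/vdc`, and `/dev/vdd` are all present as block devices.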