sleighzy.kafka
Apache Kafka
This Ansible role installs and configures Apache Kafka 3.5.1.

Apache Kafka is a distributed event streaming platform. Applications publish messages to topics and consume them from topics, decoupling producers from consumers. Kafka delivers high throughput to large numbers of concurrent clients, keeps messages durable by persisting and replicating them across brokers, and lets you scale streams out without downtime.
WARNING
This Ansible role does not handle upgrades between Kafka versions. Read the upgrade documentation and update the necessary configuration files before running this role against an existing installation.
https://kafka.apache.org/35/documentation.html#upgrade
When upgrading, you may need to set the following properties in the server.properties file before running this playbook:

- inter.broker.protocol.version
- log.message.format.version
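As a sketch of what that looks like, the fragment below pins both properties during a rolling upgrade. The version value shown (2.8) is only an example; use the version you are actually upgrading from, per the upgrade documentation.

```
# server.properties — temporary settings during a rolling upgrade
# (example assumes an upgrade from 2.8; adjust to your current version)
inter.broker.protocol.version=2.8
log.message.format.version=2.8
```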
Supported Platforms
- RedHat 6
- RedHat 7
- RedHat 8
- Debian 10.x
- Ubuntu 18.04.x
- Ubuntu 20.04.x
Requirements
- Apache ZooKeeper
- Java versions 8 (deprecated), 11, or 17
You can use the Apache ZooKeeper role from Ansible Galaxy if needed:
ansible-galaxy install sleighzy.zookeeper
Ansible 2.9.16 or 2.10.4 (or later) is required to avoid a service-start issue on some systems: older versions can report a failure even when the Kafka service started successfully. If you encounter this, verify the service state on the host and start it manually if needed. See https://github.com/ansible/ansible/issues/71528 for details.
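Both roles can also be pinned in a requirements.yml file and installed in one step. This is a minimal sketch; the role names are as published on Galaxy:

```yaml
# requirements.yml — install with: ansible-galaxy install -r requirements.yml
- name: sleighzy.zookeeper
- name: sleighzy.kafka
```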
Role Variables
Variable | Default | Comments |
---|---|---|
kafka_download_base_url | https://downloads.apache.org/kafka | |
kafka_download_validate_certs | yes | |
kafka_version | 3.5.1 | |
kafka_scala_version | 2.13 | |
kafka_create_user_group | true | |
kafka_user | kafka | |
kafka_group | kafka | |
kafka_root_dir | /opt | |
kafka_dir | {{ kafka_root_dir }}/kafka | |
kafka_start | yes | |
kafka_restart | yes | |
kafka_log_dir | /var/log/kafka | |
kafka_broker_id | 0 | |
kafka_java_heap | -Xms1G -Xmx1G | |
kafka_background_threads | 10 | |
kafka_listeners | PLAINTEXT://:9092 | |
kafka_num_network_threads | 3 | |
kafka_num_io_threads | 8 | |
kafka_num_replica_fetchers | 1 | |
kafka_socket_send_buffer_bytes | 102400 | |
kafka_socket_receive_buffer_bytes | 102400 | |
kafka_socket_request_max_bytes | 104857600 | |
kafka_replica_socket_receive_buffer_bytes | 65536 | |
kafka_data_log_dirs | /var/lib/kafka/logs | |
kafka_num_partitions | 1 | |
kafka_num_recovery_threads_per_data_dir | 1 | |
kafka_log_cleaner_threads | 1 | |
kafka_offsets_topic_replication_factor | 1 | |
kafka_transaction_state_log_replication_factor | 1 | |
kafka_transaction_state_log_min_isr | 1 | |
kafka_log_retention_hours | 168 | |
kafka_log_segment_bytes | 1073741824 | |
kafka_log_retention_check_interval_ms | 300000 | |
kafka_auto_create_topics_enable | false | |
kafka_delete_topic_enable | true | |
kafka_default_replication_factor | 1 | |
kafka_group_initial_rebalance_delay_ms | 0 | |
kafka_zookeeper_connect | localhost:2181 | |
kafka_zookeeper_connection_timeout | 6000 | |
kafka_bootstrap_servers | localhost:9092 | |
kafka_consumer_group_id | kafka-consumer-group | |
kafka_server_config_params | | General parameters for server.properties |
Check log4j.yml for available log4j-related variables.
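As a sketch, a group_vars file overriding a few of the defaults above might look like the following. The hostnames and retention value are placeholders, not recommendations:

```yaml
# group_vars/kafka-nodes.yml — example overrides (hostnames are placeholders)
kafka_version: 3.5.1
kafka_broker_id: 0
kafka_zookeeper_connect: "zk-1:2181,zk-2:2181,zk-3:2181"
kafka_log_retention_hours: 72
kafka_auto_create_topics_enable: false
```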
Starting and Stopping Kafka Services
Using systemd:
- Start:
systemctl start kafka
- Stop:
systemctl stop kafka
Using initd:
- Start:
service kafka start
- Stop:
service kafka stop
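If you manage the service from another playbook rather than from a shell, the same start can be expressed with the ansible.builtin.systemd module. This is a sketch assuming the systemd unit installed by this role:

```yaml
- name: Ensure the Kafka broker is running
  ansible.builtin.systemd:
    name: kafka
    state: started
    enabled: true
  become: true
```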
Default Properties
Property | Value |
---|---|
ZooKeeper connection | localhost:2181 |
Kafka bootstrap servers | localhost:9092 |
Kafka consumer group ID | kafka-consumer-group |
Kafka broker ID | 0 |
Number of partitions | 1 |
Data log file retention period | 168 hours |
Enable auto topic creation | false |
Enable topic deletion | true |
Ports
Port | Description |
---|---|
9092 | Kafka listener port |
Directories and Files
Directory / File | Path |
---|---|
Kafka installation directory | /opt/kafka |
Kafka configuration directory | /etc/kafka |
Directory for data files | /var/lib/kafka/logs |
Directory for logs | /var/log/kafka |
Kafka service | /usr/lib/systemd/system/kafka.service |
Example Playbook
To run this role on hosts in the kafka-nodes group, add the following to your playbook:

    - hosts: kafka-nodes
      roles:
        - sleighzy.kafka
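Because kafka_broker_id must be unique per broker, a multi-node inventory typically sets it per host. A minimal sketch (hostnames are placeholders):

```ini
; inventory — per-host broker IDs (hostnames are placeholders)
[kafka-nodes]
kafka-1 kafka_broker_id=0
kafka-2 kafka_broker_id=1
kafka-3 kafka_broker_id=2
```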
Linting
Use ansible-lint for linting.
pip3 install ansible-lint --user
ansible-lint -c ./.ansible-lint .
Testing
This role uses Ansible Molecule for testing. It sets up a Kafka and ZooKeeper test cluster with three nodes in Docker containers, each running a different OS.
Follow the Molecule Installation guide to set it up in a virtual environment:
$ python3 -m venv molecule-venv
$ source molecule-venv/bin/activate
(molecule-venv) $ pip3 install ansible docker "molecule-plugins[docker]"
Run the playbook and tests. Linting errors must be fixed before any tests will execute. The following command runs all tests and then removes the Docker containers:
molecule test
To run the playbook without tests and check for idempotence, use:
molecule converge
To run tests without tearing everything down, use:
molecule create
molecule converge
molecule verify
To clean up the testing environment and remove Docker containers:
molecule destroy
License