Configuring a Kafka Cluster Exposed Externally in Azure Kubernetes Service Using a Load Balancer
Kafka has become an integral part of most distributed application designs. It is an event-streaming platform that helps decouple modules.
Setting it up can get tricky in a cloud environment where multiple clients want to use the same event bus as part of the application architecture. This post walks through setting up Kafka in AKS; the same approach works in GCP and AWS. For this tutorial, we will assume there are 3 brokers. We will not cover setting up the Kafka cluster itself; we will only make the changes needed for it to work with external clients.
Problem:
The image above is taken from a Confluent blog post showing communication between various Kafka clients and brokers; it explains the issue in great depth. The link is below.
Scenario 1: Accessing Kafka brokers within the cluster.
Kafka clients want to connect to the brokers using an internally exposed service (at the Kubernetes cluster level), as shown in the image below. It can be reached at kafka:9092 irrespective of the number of underlying brokers.
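For context, a minimal sketch of what such an internal Service might look like; the app: kafka selector is an assumption and must match whatever labels your broker pods actually carry:

apiVersion: v1
kind: Service
metadata:
  name: kafka
  namespace: appref-uat-namespace
spec:
  type: ClusterIP
  ports:
  - port: 9092
    protocol: TCP
    targetPort: 9092
  selector:
    app: kafka   # assumed label; adjust to your broker pods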
Scenario 2: Accessing Kafka from outside the host cluster.
Clients outside the cluster (not just outside the namespace) won't be able to connect to the Kafka cluster using kafka:9092, since that name is resolvable only at the cluster level.
Issue:
The issue is that, irrespective of the client's location (inside or outside the cluster), the metadata for topics and their leaders returns locally resolvable DNS names, as shown below.
Clients outside the cluster hosting Kafka will try to resolve these names in their own DNS context and won't be able to produce or consume messages.
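You can see this for yourself by listing the cluster metadata from inside the cluster; a quick sketch, assuming a pod with kafkacat available and the internal kafka:9092 service from Scenario 1:

# Run from any pod inside the cluster; the broker addresses printed
# here are exactly what clients will be told to connect to
kafkacat -b kafka:9092 -L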
Solution:
Step 1:
We need to set the advertised listeners as part of the Kafka startup command; these are the addresses that get handed out to external clients. In the sample below, Kafka is started with advertised.listeners=PLAINTEXT://${HOSTNAME}-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092, where the def.intranet.asia domain should be replaced according to your organization or private cloud.
exec kafka-server-start.sh /opt/kafka/config/server.properties \
  --override delete.topic.enable=true \
  --override advertised.listeners=PLAINTEXT://${HOSTNAME}-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092 \
  --override advertised.host.name=${HOSTNAME}.kafka \
  --override log.dirs=/kafka/kafka-logs-${HOSTNAME} \
  --override reserved.broker.max.id=1007 \
  --override broker.id=-1 \
  --override zookeeper.connect=zk-cs:2181 \
  --override message.max.bytes=15728640 \
  --override replica.fetch.max.bytes=15728640
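Because ${HOSTNAME} expands to the pod name, each broker advertises its own unique, externally resolvable address. For example, the broker running in pod kafka-0 advertises:

PLAINTEXT://kafka-0-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092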
Step 2:
Once the step above is done, we need to set up a Service for each Kafka broker, as shown in the YAML below. There are three things we must take care of.
- externalTrafficPolicy must be Cluster; otherwise this will be exposed to the internet, and given typical org policies it won't work.
- The selector must target one particular broker pod, e.g. statefulset.kubernetes.io/pod-name: kafka-0. If you have 3 brokers, there will be 3 separate Services.
- The type of the Service must be LoadBalancer.
apiVersion: v1
kind: Service
metadata:
  name: kafka-ext-port-0
  namespace: appref-uat-namespace
spec:
  externalTrafficPolicy: Cluster
  ports:
  - nodePort: 30722
    port: 9092
    protocol: TCP
    targetPort: 9092
  selector:
    statefulset.kubernetes.io/pod-name: kafka-0
  sessionAffinity: None
  type: LoadBalancer
Step 3:
We need to create DNS entries mapping the external hostnames (Step 1) to the load balancer IPs (Step 2) so that the URLs become resolvable for all external clients.
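The IPs in question are the EXTERNAL-IP values Azure assigns to each LoadBalancer Service; a quick way to read them, using the namespace from the YAML above:

kubectl get svc -n appref-uat-namespace
# Map the EXTERNAL-IP of kafka-ext-port-0 to
# kafka-0-uat.aks-tenant-uat-az1-appref.def.intranet.asia,
# and likewise for each remaining broker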
Step 4:
You can now test connectivity with kafkacat, as shown below.
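A minimal sketch from an external machine, assuming the hostname pattern from Step 1 and a hypothetical topic named test:

# List brokers and topics via the external address; the broker addresses
# returned in the metadata should now be the externally resolvable
# hostnames advertised in Step 1
kafkacat -b kafka-0-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092 -L

# Produce a message to the (hypothetical) topic "test"
echo "hello" | kafkacat -b kafka-0-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092 -t test -P

# Consume one message back
kafkacat -b kafka-0-uat.aks-tenant-uat-az1-appref.def.intranet.asia:9092 -t test -C -c 1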
Conclusion:
We have successfully exposed a Kafka cluster to clients outside the cluster hosting it, using a Load Balancer in AKS. Feel free to reach out in the comments in case of any doubts.