Service discovery in ZooKeeper

NOTE

The concepts and main components of ZooKeeper are described in the ZooKeeper article.

Overview

Service Discovery is a ZooKeeper-based mechanism that acts as a registry: it keeps track of the network addresses of all service instances. These addresses are assigned dynamically, so clients cannot rely on fixed locations.

The Service Discovery system provides a mechanism for:

  • registration of services and instances of services;

  • search for instances of a particular service;

  • notifications about changes to service instances.
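The three capabilities listed above can be illustrated with a minimal in-memory sketch. It is a simplified model, not the Curator or ZooKeeper API: the class name InMemoryRegistry and all of its methods are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Hypothetical in-memory model of a service registry (not the Curator API).
class InMemoryRegistry {
    // service name -> (instance id -> "host:port"), instances sorted by id
    private final Map<String, Map<String, String>> services = new HashMap<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();

    // Registration: add an instance under the name of its service.
    void register(String service, String instanceId, String hostPort) {
        services.computeIfAbsent(service, k -> new TreeMap<>()).put(instanceId, hostPort);
        notifyListeners(service);
    }

    // Removal of an instance, for example after it has failed.
    void unregister(String service, String instanceId) {
        Map<String, String> instances = services.get(service);
        if (instances != null && instances.remove(instanceId) != null) {
            notifyListeners(service);
        }
    }

    // Search: return the addresses of all instances of a service.
    List<String> lookup(String service) {
        Map<String, String> instances = services.get(service);
        return instances == null ? new ArrayList<>() : new ArrayList<>(instances.values());
    }

    // Notification: the listener receives the name of the service that changed.
    void onChange(Consumer<String> listener) {
        listeners.add(listener);
    }

    private void notifyListeners(String service) {
        for (Consumer<String> listener : listeners) {
            listener.accept(service);
        }
    }
}
```

In this sketch a client would call onChange once and then call lookup again whenever its listener fires. A real deployment would typically rely on ZooKeeper watches rather than an in-process callback list.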

The entire Service Discovery procedure consists of two phases:

  1. Service Registration.

  2. Service Discovery.

The Service Discovery mechanism with load balancing uses the curator-x-discovery package from Apache Curator, a Java/JVM client library for Apache ZooKeeper.

The Service Discovery mechanism is shown in the following diagram.

Service Discovery in ZooKeeper

The stages of Service Discovery are described below.

Service Registration

When a Kafka service instance starts, it registers with the ZooKeeper-based Service Registry by adding an ephemeral znode in the namespace of its service. The znode stores the host:port of the server as its data. The service instance sends heartbeat requests to keep its registration active.

Service Registry is a component that contains a database of all available service instances. It stores information about the currently available instances of each service and their network data for establishing a connection.

The Service Registry monitors running instances for changes by polling the deployment environment or by subscribing to events. When the Service Registry detects that a new service instance is available, it adds the instance to its database. The Service Registry also unregisters failed (disabled) service instances.
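The link between heartbeats and automatic unregistration can be modeled with a short sketch. This is an assumption-laden simplification, not ZooKeeper's implementation: the class EphemeralRegistrations, its methods, and the timeout handling are hypothetical, and time is passed in explicitly so that the behavior is easy to follow.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of heartbeat-based registration (not ZooKeeper's implementation).
class EphemeralRegistrations {
    private final long sessionTimeoutMillis;
    // instance id -> time of the last heartbeat, in milliseconds
    private final Map<String, Long> lastHeartbeat = new TreeMap<>();

    EphemeralRegistrations(long sessionTimeoutMillis) {
        this.sessionTimeoutMillis = sessionTimeoutMillis;
    }

    // The first heartbeat registers the instance; later ones refresh it.
    void heartbeat(String instanceId, long nowMillis) {
        lastHeartbeat.put(instanceId, nowMillis);
    }

    // Remove every instance whose last heartbeat is older than the timeout.
    void expire(long nowMillis) {
        lastHeartbeat.values().removeIf(last -> nowMillis - last > sessionTimeoutMillis);
    }

    boolean isRegistered(String instanceId) {
        return lastHeartbeat.containsKey(instanceId);
    }
}
```

For example, with a timeout of 1000 ms, an instance that last sent a heartbeat at 900 ms is still registered when expire(1500) runs, while an instance whose last heartbeat was at 0 ms is removed.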

Apache Curator represents a service instance with the ServiceInstance class. A ServiceInstance has a name, ID, address, port and/or SSL port, and an optional (user-defined) payload. ServiceInstances are serialized and stored in ZooKeeper as follows:

base path
       |_______ service A name
                    |__________ instance 1 id --> (serialized ServiceInstance)
                    |__________ instance 2 id --> (serialized ServiceInstance)
                    |__________ ...
       |_______ service B name
                    |__________ instance 1 id --> (serialized ServiceInstance)
                    |__________ instance 2 id --> (serialized ServiceInstance)
                    |__________ ...
       |_______ ...
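The layout above implies that the znode of each instance is addressed by three components: base path, service name, and instance ID. A minimal sketch of that mapping follows; the helper DiscoveryPaths and the example values are hypothetical, and the sketch assumes a base path without a trailing slash.

```java
// Hypothetical helper that mirrors the znode layout shown above.
class DiscoveryPaths {
    // Builds "<base path>/<service name>/<instance id>".
    // Assumes basePath starts with "/" and has no trailing slash.
    static String instancePath(String basePath, String serviceName, String instanceId) {
        return basePath + "/" + serviceName + "/" + instanceId;
    }
}
```

With these assumptions, instancePath("/services", "kafka", "broker-1") yields "/services/kafka/broker-1".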

Service Discovery

After the service instances are registered, service discovery proceeds in the following sequence:

  1. The client connects to the ZooKeeper ensemble and requests the desired service.

  2. Through the ServiceProvider class of Apache Curator, a load balancing algorithm selects one of the registered service instances and returns its host:port to the client.

  3. The client connects to the service instance.
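The selection in step 2 can be sketched with a round-robin strategy, which is one common load balancing algorithm. The source does not state which algorithm is configured, so round-robin is an assumption here, and the class RoundRobinSelector is a hypothetical stand-in rather than Curator's own implementation.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector over "host:port" strings (not the Curator API).
class RoundRobinSelector {
    private final AtomicInteger next = new AtomicInteger();

    // Returns the next instance in turn, or null when none is registered.
    String select(List<String> instances) {
        if (instances.isEmpty()) {
            return null;
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Given two registered instances, three consecutive calls to select return the first, the second, and then the first again, so requests are spread evenly across the instances.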
