SkyDNS: A Method of DNS Service Discovery



Service discovery plays an important role in Service Oriented Architecture and in distributed systems.

So what is meant by service discovery in this context?
In brief, service discovery means finding the services that are running in a large environment. Several problems arise here. One is that clients must determine the IP address and port of a service that exists on multiple hosts. In a live system, service locations change frequently because of automatic or manual scaling, new deployments of services, and hosts failing and being replaced. Service discovery is therefore an important problem to handle in distributed systems.

There are two main areas where issues can occur in locating services:
              - Service registration
              - Service discovery
Service registration is the process of registering a service in a central service registry. A service is registered with details such as its name, host, port, version, region, protocols and environment.
Service discovery is the process by which a client application queries the service registry to learn the location of the services.  

There are several aspects to consider in service registration and service discovery, in particular load balancing and runtime dependencies. Many open source solutions have been developed to address them. In this blog, I'm going to describe one of those solutions: SkyDNS.


SkyDNS

There are several ways to do service discovery in Kubernetes. One approach is through environment variables, which are made available to Docker containers. Another approach is through DNS. SkyDNS is a DNS server that works on top of etcd. SkyDNS relies on the Raft consensus protocol: one server runs as the 'master' at any time, and when the master becomes unavailable for any reason, the other servers elect a new master and operations continue without interruption. SkyDNS only returns SRV records.

There are two main functionalities of the SkyDNS server.
          - Accept registrations from services.
          - Respond to DNS queries from clients.

Each service has a property called TTL (Time To Live), which is the period for which its registration remains active. Every registered service is represented by an SRV record. Services use a simple JSON API to register and to update their TTL: they announce their availability by sending a POST with a small JSON payload.
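As a rough illustration of the "small JSON payload" mentioned above, the sketch below builds a registration body and a heartbeat body. The field names (Name, Version, Environment, Region, Host, Port, TTL) and the helper names are assumptions chosen to mirror the registration details listed earlier; check them against the SkyDNS version you run before relying on them. Nothing is sent over the network here.

```python
import json


def registration_payload(name, version, environment, region, host, port, ttl):
    """Build the JSON body a service might send when it registers.

    The field names are illustrative assumptions, not a confirmed schema.
    """
    if ttl <= 0:
        raise ValueError("ttl must be a positive number of seconds")
    record = {
        "Name": name,
        "Version": version,
        "Environment": environment,
        "Region": region,
        "Host": host,
        "Port": port,
        "TTL": ttl,
    }
    return json.dumps(record, sort_keys=True)


def heartbeat_payload(ttl):
    """Build the smaller body a service might send to extend its TTL."""
    if ttl <= 0:
        raise ValueError("ttl must be a positive number of seconds")
    return json.dumps({"TTL": ttl})


if __name__ == "__main__":
    print(registration_payload("testservice", "1-0-0", "production",
                               "east", "web1.example.com", 9000, 30))
    print(heartbeat_payload(30))
```

A service would send the first body once at startup and the second one periodically, at an interval shorter than its TTL.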

The response to a client query is given as an SRV (service) record. SkyDNS serves SRV records only for services that are registered with it. An SRV record includes a priority that tells the client which services are closest to it, and a weight that tells the client which services are under the lowest load. When a client queries for a service, SkyDNS returns DNS entries only for services whose TTL has not run out, either because it has not yet elapsed or because the service has extended it. SkyDNS expires the records of services that have not updated their availability within the TTL window. Services should therefore maintain a heartbeat, keeping themselves in the pool by periodically sending a POST to SkyDNS that updates their TTL. This benefits clients, because they no longer waste time making requests to services that are not running or not reachable. In brief, this is what happens.
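The TTL and heartbeat behaviour described above can be sketched with a small in-memory registry. This is only a model of the idea, not how SkyDNS stores its data; the class name TTLRegistry and its methods are hypothetical, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time


class TTLRegistry:
    """Toy registry: entries disappear unless refreshed within their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expires_at = {}  # service uuid -> time at which it expires

    def register(self, uuid, ttl):
        """Add a service that stays live for ttl seconds."""
        self._expires_at[uuid] = self._clock() + ttl

    def live(self):
        """Drop expired entries and return the uuids that are still live."""
        now = self._clock()
        expired = [u for u, t in self._expires_at.items() if t <= now]
        for uuid in expired:
            del self._expires_at[uuid]
        return sorted(self._expires_at)

    def heartbeat(self, uuid, ttl):
        """Extend the TTL of a live service; expired services must re-register."""
        if uuid not in self.live():
            raise KeyError(uuid)
        self._expires_at[uuid] = self._clock() + ttl


if __name__ == "__main__":
    now = [0.0]  # fake clock so the example runs instantly
    registry = TTLRegistry(clock=lambda: now[0])
    registry.register("1001", ttl=10)
    registry.register("1002", ttl=10)
    now[0] = 8.0
    registry.heartbeat("1001", ttl=10)  # only 1001 keeps itself in the pool
    now[0] = 12.0
    print(registry.live())
```

In this model a query would only ever be answered from live(), which is why clients never see services that stopped sending heartbeats.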

A client looking for a particular service uses DNS requests to locate it. The DNS request issued to SkyDNS includes the particulars of the required service.

SkyDNS uses several fields of the SRV record to help return the service that is best suited to the client's request.
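To make the role of the priority and weight fields concrete, here is a minimal sketch of how a client might choose a target from a set of SRV answers. It follows the general SRV convention (lowest priority number first, then a weighted random choice) rather than anything specific to SkyDNS; the function name pick_target and the tuple layout are assumptions for the example.

```python
import random


def pick_target(records, rng=random):
    """Choose (target, port) from SRV-like tuples.

    Each record is (priority, weight, target, port). The lowest priority
    number wins; within that group a higher weight is more likely to be
    chosen.
    """
    if not records:
        raise ValueError("no SRV records to choose from")
    best = min(record[0] for record in records)
    candidates = [record for record in records if record[0] == best]
    weights = [record[1] for record in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        chosen = rng.choice(candidates)
    else:
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen[2], chosen[3]


if __name__ == "__main__":
    answers = [
        (10, 50, "web1.example.com", 9000),
        (10, 50, "web2.example.com", 9000),
        (20, 100, "web3.example.com", 9000),  # other region, used as fallback
    ]
    print(pick_target(answers))
```

The priority-20 record is only considered once no priority-10 record is present in the answer.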

SkyDNS recognizes several parameters in the DNS query. They are as follows.


DNS Query Parameters


  1. Environment : This is the last parameter of the DNS query, and it is the only required field in any DNS request. It can take values such as development, staging, production or integration. This parameter is useful for segregating services, so that a single SkyDNS cluster can serve all of your environments.
  2. Service : This is the name of the running service. It should be a unique name that identifies the service and what it does.
  3. Version : This allows the client to request a specific version of the service.
  4. Region : This is the zone in which the service is available. Services are often run in multiple data centers, and SkyDNS understands the locality of a service. When services register with SkyDNS, they are registered as running in a specific region, which allows clients to request a service running in the same region. This is where the priority field comes in. If the same service runs in multiple regions, SkyDNS uses the priority field in the SRV record to return the services in the requested region first, all with the same priority. Services running in other regions are then returned at a lower priority.
  5. Host : The client can request a service on a particular host.
  6. UUID : Each service has a unique identifier, which allows multiple services at the same version to run on the same host. If the client requires a specific instance of a service, it can specify the UUID in the DNS query and is then guaranteed to receive only that instance.
Of these parameters, environment is the only required one. Missing fields are interpreted as 'any'.
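The parameter rules above can be sketched as a small matcher. It assumes the query name lists the fields in the order UUID, host, region, version, service, environment (environment last, as described), that omitted leading fields mean 'any', and that names end in a domain such as skydns.local. The domain, the priority numbers 10 and 20, the hyphenated version format and the function names are all assumptions made for this example.

```python
FIELDS = ("uuid", "host", "region", "version", "service", "environment")


def parse_query(name, domain="skydns.local"):
    """Split a query name into the six fields; omitted fields become 'any'."""
    suffix = "." + domain
    if not name.endswith(suffix):
        raise ValueError("query is outside the SkyDNS domain")
    labels = name[: -len(suffix)].split(".")
    if len(labels) > len(FIELDS) or any(not label for label in labels):
        raise ValueError("malformed query name")
    padded = ["any"] * (len(FIELDS) - len(labels)) + labels
    query = dict(zip(FIELDS, padded))
    if query["environment"] == "any":
        raise ValueError("environment is the one required field")
    return query


def resolve(name, services):
    """Return sorted (priority, uuid) pairs for the services matching a query.

    Region does not filter results: services in the requested region get
    the better (numerically lower) priority, the rest follow as fallbacks.
    """
    query = parse_query(name)
    results = []
    for svc in services:
        others_match = all(
            query[f] in ("any", svc[f]) for f in FIELDS if f != "region"
        )
        if others_match:
            local = query["region"] in ("any", svc["region"])
            results.append((10 if local else 20, svc["uuid"]))
    return sorted(results)


if __name__ == "__main__":
    services = [
        dict(uuid="1001", host="web1", region="east", version="1-0-0",
             service="testservice", environment="production"),
        dict(uuid="1002", host="web2", region="west", version="1-0-0",
             service="testservice", environment="production"),
    ]
    print(resolve("east.any.testservice.production.skydns.local", services))
```

Here the query asks for any version of testservice in production, preferring the east region, so the instance in east is listed ahead of the one in west.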

SkyDNS works together with Kube2sky. Kube2sky listens to the Kubernetes API for new services and adds their information to etcd, whereas SkyDNS listens for DNS queries from clients and responds based on the information in etcd. Whenever services change in the Kubernetes API, Kube2sky publishes those changes to SkyDNS through etcd. Kube2sky therefore acts as a bridge between Kubernetes and SkyDNS. For now, Kube2sky runs in a pod along with the etcd and SkyDNS containers.

