NTP Strategies for IBM Storage Ceph (and Beyond!)

By Anthony D'Atri posted Mon May 19, 2025 02:02 PM

Introduction

Ceph is a distributed system, and delivers the best experience when all server and client systems have closely synchronized clocks.

Notably, the Paxos consensus algorithm used by Ceph Monitors to establish and maintain quorum requires that Monitor clocks be synchronized to within 50 milliseconds of each other.
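
To make the 50 millisecond budget concrete, here is a minimal Python sketch that checks whether a set of Monitor clock offsets fits within it. The function name, host names, and offset values are hypothetical. How you collect the offsets (for example, from your monitoring system) is left as an assumption. Only the 0.05 second threshold comes from the text above.

```python
from itertools import combinations

ALLOWED_SKEW_S = 0.050  # the 50 ms Monitor budget quoted above


def worst_pairwise_skew(offsets_s):
    """Return (skew, host_a, host_b) for the most divergent pair of clocks.

    offsets_s maps a host name to its clock offset in seconds relative to
    any common reference; only the differences between hosts matter.
    At least two hosts are required.
    """
    return max(
        (abs(offsets_s[a] - offsets_s[b]), a, b)
        for a, b in combinations(sorted(offsets_s), 2)
    )


# Hypothetical offsets for three Monitors, in seconds.
offsets = {"mon-a": 0.0004, "mon-b": -0.0011, "mon-c": 0.0620}
skew, a, b = worst_pairwise_skew(offsets)
status = "OK" if skew <= ALLOWED_SKEW_S else "TOO FAR APART"
print(f"{a} vs {b}: {skew * 1000:.1f} ms ({status})")
```

The check is pairwise because what matters to quorum is how far the Monitors are from each other, not how far each is from reference time.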

It is straightforward to design and implement an NTP architecture that easily achieves and maintains sub-millisecond accuracy, but we often see implementations that do not provide sufficient diversity and resilience. In this document we will briefly explore common pitfalls and strategies to mitigate them. Note that we do not attempt to provide a complete reference for configuring and running NTP services, but rather concentrate on architectural decisions and certain important configuration choices.

NTP Overview

The Network Time Protocol provides a mechanism for synchronizing the clocks of computers and other devices over LAN or WAN network connections.

There are three popular NTP implementations that we see on Linux systems:

Classic (legacy) ntpd

Modern chrony

systemd-timesyncd

Of these, systemd-timesyncd should be ruled out immediately: it syncs only intermittently and is suited only to keeping a desktop or laptop clock vaguely accurate. Linux servers need stronger, ongoing synchronization, and systemd-timesyncd is fundamentally inadequate for IBM Storage Ceph systems.

The classic ntpd is sometimes referred to simply as NTP, but this is imprecise and can lead to confusion: NTP is the protocol, not an implementation of it. ntpd suffices for server timekeeping, but is saddled with considerable legacy baggage. It is the default on RHEL 7.

The best NTP timekeeping daemon for Linux servers is chrony. It is configured in much the same way as the classic ntpd and operates in a similar fashion, but is more efficient and will often converge system times more quickly. While out of scope for IBM Storage Ceph, RHEL 7 systems can easily be switched from the legacy ntpd to chrony for a homogeneous fleet.

NTP Sources

Your enterprise’s systems sync against reference time sources. These include appliances that receive super-accurate time signaling from GPS satellites as well as public servers available over the Internet. Other systems within your organization can act as servers as well, including in some cases network routers or switches. All are valuable, with caveats.

Resilience and Diversity are Crucial

In order to implement a quality, resilient NTP service for your IBM Storage Ceph deployment (and your enterprise in general) you must adhere to the below design principles:

More sources are better. The NTP protocol is lightweight from compute and network perspectives. There is no need to limit the number of configured sources out of concern for resource consumption. At any given time a system’s NTP daemon will select the single configured source that it considers the best available to which to synchronize. It is entirely possible that no configured source will be considered acceptable, which we must avoid. It is very acceptable to have as many as twenty sources configured.

Quality vs Proximity. Enterprise-quality NTP daemons measure and can adjust for sources that are accurate yet are relatively network-distant.

Public NTP pools † are fine things, but their quality varies widely especially in certain geographical regions. They are valuable components of your NTP scheme, but are ideally not the only upstream sources in the mix. This time-series from a real-world enterprise Ceph cluster tells the tale:

Here we see periods of reasonably precise synchronization of system clocks interspersed with times of severe divergence. The root cause of these wild fluctuations was inconsistent quality of the servers in a certain regional public NTP pool. The public NTP pools enact primitive load-balancing by periodically rotating the participating time source servers to which the advertised, abstracted DNS records point. At the time of writing, for example, us.pool.ntp.org rotates among nearly 600 backing servers, though only four are exposed at any given time.

Enterprise NTP daemons value stability, and during intervals when the public pool's point-in-time selection of DNS record targets does not contain any quality sources, system times will skew wildly and rapidly as shown. Remember that Ceph Monitors want no worse than 50 millisecond synchronization among themselves: the above graph shows the time skew of each non-lead Monitor relative to the lead Monitor.
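
The DNS rotation described above can be observed with a short Python sketch that resolves a pool name several times and accumulates the distinct addresses returned. The function names and the injectable resolve and sleep parameters are our own invention for illustration. The demonstration runs offline against a stub resolver with made-up documentation addresses, so its numbers say nothing about any real pool.

```python
import socket
import time


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, 123, family=socket.AF_INET, type=socket.SOCK_DGRAM)
    return {info[4][0] for info in infos}


def sample_pool(hostname, rounds, pause_s, resolve=resolve_ipv4, sleep=time.sleep):
    """Resolve hostname several times and report how the answer set changes.

    Returns (per_round, seen): the address set observed in each round, and
    the union of every address seen across all rounds.
    """
    per_round, seen = [], set()
    for i in range(rounds):
        addrs = set(resolve(hostname))
        per_round.append(addrs)
        seen |= addrs
        if i < rounds - 1:
            sleep(pause_s)
    return per_round, seen


# Offline demonstration: a stub resolver rotating through made-up
# documentation addresses (RFC 5737). Drop the resolve= and sleep=
# arguments, and use a pause long enough for DNS caches to expire,
# to query real DNS instead.
fake_answers = iter([
    {"192.0.2.1", "192.0.2.2"},
    {"192.0.2.2", "192.0.2.3"},
    {"192.0.2.4", "192.0.2.1"},
])
per_round, seen = sample_pool(
    "us.pool.ntp.org", rounds=3, pause_s=0,
    resolve=lambda host: next(fake_answers), sleep=lambda s: None,
)
print(f"{len(seen)} distinct addresses over {len(per_round)} rounds")
```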

Network routers or switches may have the ability to serve as NTP sources, but may have limited precision and / or capacity. Thus they can contribute to an NTP mesh’s diversity and resilience but likely are best not relied upon as the only sources.

Virtual machines (VMs) are usually not quality time sources, as virtualized clocks often lack appropriate precision and stability. Physical, bare-metal servers are the best choices.

Modern NTP daemons implement adaptive backoff of the interval between probes of configured time sources. This helps reduce load and network traffic as a system’s clock stabilizes. The iburst attribute when configuring sources is useful for speeding initial synchronization by sending a small number of frequent time probes at startup, then falling back to less-frequent probes. This is advised for all time sources.
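
As a rough illustration of this probing pattern, the Python sketch below models a startup burst followed by a lengthening poll interval. It is a deliberate simplification: the burst count and spacing are assumed values, the interval simply doubles from 2^6 to 2^10 seconds (a commonly used polling range, treated here as an assumption), and real daemons raise or lower the interval based on measured clock stability.

```python
def probe_times(duration_s, burst_count=4, burst_gap_s=2, minpoll=6, maxpoll=10):
    """Return the times (seconds since start) at which probes are sent.

    Simplified model: a short startup burst, then the polling interval
    doubles from 2**minpoll up to 2**maxpoll seconds and stays there.
    """
    times = [i * burst_gap_s for i in range(burst_count)]
    t = times[-1] if times else 0
    poll = minpoll
    while True:
        t += 2 ** poll
        if t > duration_s:
            break
        times.append(t)
        poll = min(poll + 1, maxpoll)
    return times


# One hour of probes under the assumed parameters.
schedule = probe_times(duration_s=3600)
print(schedule)
```

Even in this crude model, only a handful of probes are sent per source in the first hour, which is why configuring many sources costs so little.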

Resilient NTP Architecture

The below diagram shows a generalized, highly available and highly resilient datacenter NTP architecture. Not all components of this architecture are necessary, but the more you can implement, the better results you may have. We will briefly discuss each component.

A depiction of an NTP topology that is resilient and efficient and does not DoS public pools

Local Geo Pool
This refers to public NTP pool servers abstracted through rotating DNS. For example, a server in Boring, Oregon or Intercourse, Pennsylvania might configure the below:

server 0.us.pool.ntp.org iburst
server 1.us.pool.ntp.org iburst
server 2.us.pool.ntp.org iburst
server 3.us.pool.ntp.org iburst

Public Linux Pool
server 0.rhel.pool.ntp.org iburst
server 1.rhel.pool.ntp.org iburst
server 2.rhel.pool.ntp.org iburst
server 3.rhel.pool.ntp.org iburst

Hand-picked public servers
This might include known-quality specific source FQDNs or IP addresses, or sources provided by your organization or an associate’s company. One might run chronyc sources and chronyc sourcestats when configuring public pools to select a specific server or two with consistently low Stratum, Freq Skew, Offset, and Std Dev values and high Reach. We do not list any examples here because the best choices will vary based on your location and situation. Note as well that this approach is acceptable for a very small number of distribution servers but should not be applied directly to a large number of your internal systems.
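
The selection criteria above can be sketched as a small ranking function in Python. The record layout, the sort order, and the min_successes threshold are assumptions made for illustration, and the sample values are invented and use the reserved .example domain. The one detail taken from NTP tooling is that Reach is displayed as an octal register of the last eight polls, so 377 means all eight were answered.

```python
def reach_successes(reach_octal):
    """Count answered polls in the reach register, shown in octal (377 = 8 of 8)."""
    return bin(int(reach_octal, 8)).count("1")


def rank_sources(candidates, min_successes=7):
    """Keep candidates with healthy reach and sort them best first.

    Assumed ordering: lower stratum wins, then smaller absolute offset,
    then smaller standard deviation.
    """
    usable = [c for c in candidates if reach_successes(c["reach"]) >= min_successes]
    return sorted(usable, key=lambda c: (c["stratum"], abs(c["offset_s"]), c["std_dev_s"]))


# Hypothetical observations transcribed by hand; not real servers or data.
observed = [
    {"name": "ntp-a.example", "stratum": 2, "reach": "377", "offset_s": 0.00021, "std_dev_s": 0.00009},
    {"name": "ntp-b.example", "stratum": 1, "reach": "377", "offset_s": -0.00035, "std_dev_s": 0.00012},
    {"name": "ntp-c.example", "stratum": 2, "reach": "017", "offset_s": 0.00005, "std_dev_s": 0.00004},
]
best = [c["name"] for c in rank_sources(observed)[:2]]
print(best)
```

Judge candidates over days rather than a single snapshot, since the point is consistency.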

That said, additional, static choices for diversity might be the servers run by NIST: https://tf.nist.gov/tf-cgi/servers.cgi

Distant Geo Pool
If your organization runs servers in Africa, Latin America, or APAC regions †† it may be especially valuable to add two entries for public servers in the US zone in addition to those in your local zone:
server 0.asia.pool.ntp.org iburst
server 1.asia.pool.ntp.org iburst
server 2.asia.pool.ntp.org iburst
server 3.asia.pool.ntp.org iburst
server 0.us.pool.ntp.org iburst
server 1.us.pool.ntp.org iburst
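
If you template these entries across sites, a small Python helper can generate them. This sketch assumes the chrony/ntpd server directive followed by the numbered pool host names described in the first footnote. The function name and its parameters are hypothetical.

```python
def pool_server_lines(zone, count=4, options="iburst"):
    """Build `server` lines for the numbered hosts of one public pool zone."""
    return [f"server {n}.{zone}.pool.ntp.org {options}" for n in range(count)]


# Four entries from the local zone plus two from the US zone, as described above.
lines = pool_server_lines("asia") + pool_server_lines("us", count=2)
print("\n".join(lines))
```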

GPS Appliance
Old-school GPS appliances are dedicated hardware, often with a coax run to the data center’s roof where a specialized antenna receives signals from the constellation of visible GPS satellites. These can require expensive and lengthy site arrangements but cannot be beat for capacity and precision.

In recent years small appliances have become available for as little as USD 500. These generally can only serve a modest number of clients, but they can sit on a windowsill with line-of-sight to the sky and provide an inexpensive low-stratum and high-quality source for your distribution layer, which will share the temporal love with all your internal systems. In order to remain vendor-neutral and avoid stale advice we do not list specific appliances here but a web search engine will quickly find multiple options.

Internal Distribution Server
It is bad netizenship to have more than a few servers directly query external, public time sources. Larger numbers of servers doing this would present inappropriate, abusive load to these sources that provide a valuable service free of charge. Implementing an internal distribution layer respects external resources that are provided out of the goodness of someone’s heart, keeps the network traffic off of your congested WAN, and presents much lower network RTT and jitter for internal clients.

Not pictured on the above diagram but quite valuable are the below three strategies, which reflect that with IBM Storage Ceph servers and clients, close synchronization is often more important than tight adherence to reference time, though staying very close to reference time has additional benefits.

- The internal distribution servers should sync against each other as well as to reference sources

- IBM Storage Ceph Monitors should all sync against each other as well as to the distribution layer

- IBM Storage Ceph clients should also sync against the Monitors
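
One way to express the second strategy is sketched below in Python, which renders the source lines for a single Monitor. It assumes chrony-style server and peer directives. Using peer between Monitors, rather than plain server lines, is our assumption and not something prescribed above, and all host names are hypothetical names under the reserved .example domain.

```python
def monitor_time_config(self_name, monitors, distribution_servers):
    """Render time source lines for one Monitor.

    Each distribution server becomes a `server` line; every other Monitor
    becomes a `peer` line so that the Monitors also track one another.
    """
    lines = [f"server {host} iburst" for host in distribution_servers]
    lines += [f"peer {host}" for host in monitors if host != self_name]
    return "\n".join(lines) + "\n"


# Hypothetical host names; substitute your own inventory.
mons = ["mon1.example", "mon2.example", "mon3.example"]
dist = ["ntp1.example", "ntp2.example"]
mon1_conf = monitor_time_config("mon1.example", mons, dist)
print(mon1_conf, end="")
```

For clients to sync against the Monitors, the Monitors must also be configured to answer time queries, which in chrony means an allow directive covering the client networks.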

Note that the chrony stock config file includes a makestep line. You likely want to disable this to prevent service blips from making large adjustments to system time.
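
If you manage configuration with scripts, the change can be made mechanically. The Python sketch below comments out any active makestep directive in the text of a chrony.conf. The function name is hypothetical, and the sample fragment only resembles a stock file, whose exact contents vary by distribution.

```python
def disable_makestep(conf_text):
    """Comment out every active `makestep` directive, leaving other lines untouched."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        words = line.split()
        if words and words[0] == "makestep":
            line = "#" + line
        out.append(line)
    return "".join(out)


# A fragment resembling a stock chrony.conf (an assumption, not a verbatim copy).
stock = "driftfile /var/lib/chrony/drift\nmakestep 1.0 3\nrtcsync\n"
patched = disable_makestep(stock)
print(patched, end="")
```

Running the function on already patched text changes nothing, so it is safe to apply repeatedly.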

† https://www.ntppool.org/en/ , [0123].rhel.pool.ntp.org

†† Or Antarctica!

Permalink

https://community.ibm.com/community/user/blogs/anthony-datri/2025/05/19/ntpftw
