Clock going nuts

Hello,
Not grid related, but I have two Windows servers on the internet. One is quite old and its clock drifts quite a bit. I was thinking I could set the good one to sync to NTP time servers occasionally, then use PTPSync to keep the drifting one synced very closely to the other across my private management LAN back end.

I decided to test on a couple of local Win 10 PCs before deploying to my servers to get a feel for the software. I set up one machine (let’s call it PC1) without the -g switch so, if I understand correctly, it would broadcast its time to the rest of the network. I set up the other PC (let’s call it PC2) with default settings. I also turned Windows default time syncing off.

At first, PC2 synced as expected to PC1. Later I noticed PC2 seemed to be drifting, about a second slow after only a few hours, and didn’t seem to be correcting itself. I checked my settings and initialized - no effect. I decided to shut down the service on both PCs and restart to see if I could force a sync to get the PCs back close - this is when the clock on PC2 started freaking out and flying into the future. I shut down the PTP service on both machines, but it did not affect PC2: the clock kept flying crazily into the future while Task Manager showed “system interrupts” using 100% CPU.

The night before I did the same test but used a VMware Win10 virtual machine as “PC2” - it did the same thing after messing with the PTP service a bit. I had chalked that up to maybe something iffy with the VMware clock, as I’ve run into issues with that before with Linux instances.

The PC2 Machine is a Dell Precision 7510 laptop running an updated copy of Win10 if that helps. I uploaded a video of the time issue on PC2 - a reboot did stop the issue, luckily. https://www.youtube.com/watch?v=b8a66ghhGyQ

There’s a minute possibility that PTP and NTP are competing - you might need to turn off NTP clock synchronizations.

That said, Windows 10 already includes native PTP support - have you tried enabling native PTP? See the following:

Thanks,
Ritchie

Hi Ritchie,

Thanks for the response. I was only using these two win10 boxes to become familiar with the software - and to test to see how it worked before loading to my Server 2008 and Server 2012 R2 boxes.

I had the windows “Set time automatically” turned off when testing PTP.

What was really crazy was that when the clock started warping into the future, killing the PTPSync service did not fix the issue; a reboot was required.

So this tool has mainly been tested with shared broadcast type synchronizations, although I expect clock to single machine would work OK.

Looks like there’s a bug in the forwarding feature for sure. Do you have other options for distribution of PTP signal, such as UDP broadcast, multipoint unicast, or multicast?

I’ll have to look into your issue when I have a moment…

Thanks,
Ritchie

I’m not sure I fully understand the question.

My setup: I have two servers sitting in a datacenter with two NICs each. One NIC is connected to the internet with a public IP, and one is connected to the other server via a crossover cable with a private IP in the 192 block. I use the private LAN for server-to-server communication (i.e. database queries, file transfers, etc.) and management traffic - anything I can keep off the public side.

I was hoping to use one server as a “master” that broadcasts synchronization messages to the other server so they stay closely in sync - especially considering the older of the two servers seems to drift quite a bit.
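For context, a PTP slave normally derives its offset from the master using four timestamps from the standard IEEE 1588 delay request-response exchange. The sketch below shows only that textbook calculation. How PTPSync computes it internally is not known here, and the function name and sample numbers are made up for illustration.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Textbook IEEE 1588 delay request-response calculation.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)

    Returns (offset_from_master, mean_path_delay) in the same unit as the
    inputs. Assumes the network delay is the same in both directions.
    """
    master_to_slave = t2 - t1
    slave_to_master = t4 - t3
    offset = (master_to_slave - slave_to_master) / 2.0
    delay = (master_to_slave + slave_to_master) / 2.0
    return offset, delay


if __name__ == "__main__":
    # Hypothetical numbers: slave clock 120 s ahead, 0.5 ms one-way delay.
    offset, delay = ptp_offset_and_delay(1000.0, 1120.0005, 1121.0, 1001.0005)
    print(f"offset from master: {offset:.4f} s, path delay: {delay:.4f} s")
```

A positive offset here means the slave clock is ahead of the master and needs to be pulled back.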

What I did for my test was:
Set up two Win 10 boxes on a local private LAN in my office.
Turned off their automatic time sync.
Installed PTPSync on both machines
Made sure the -b switch was set to the correct interface GUID on each machine
Left default settings on both, but removed the -g switch on one
I introduced a couple of minutes of error into the clock of the machine that had its -g switch on
Started the PTPSync service on both machines

The result was:
The machines synced their times as expected.

The unexpected result was:
After a short time, the “slave” machine seemed to be noticeably drifting out of sync.
To try to force a re-sync, I restarted the PTPSync service a couple of times on both machines - it was sometime around this point that I started seeing the issue with the clock on the slave PC.

I did have the console running and although I’m not 100% sure how to read the messages, it did appear that the slave machine was receiving messages from the master.

It is entirely possible I either misunderstand the function of the -g switch or have something set up wrong, although I did leave all other options as default.

What about your PTP clock source? This is usually a hardware clock with an internal GPS for keeping good time.

If you don’t have a PTP clock, you can always fall back on NTP with a more aggressive synchronization schedule, e.g., by changing the following registry keys and restarting the NTP time service, i.e., W32Time (labeled as “Windows Time”):

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config]
"FrequencyCorrectRate"=dword:00000003
"PollAdjustFactor"=dword:00000001
"SpikeWatchPeriod"=dword:00000002
"HoldPeriod"=dword:00000005
"MaxPollInterval"=dword:00000008
"PhaseCorrectRate"=dword:00000001
"MinPollInterval"=dword:00000006
"UpdateInterval"=dword:00000064
"MaxNegPhaseCorrection"=dword:ffffffff
"MaxPosPhaseCorrection"=dword:ffffffff

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters]
"NtpServer"="rocppmuspdc,0x01 socppmuspdc,0x02"
"Type"="NTP"
"Log"=dword:00000064
"WriteLog"="True"

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient]
"Enabled"=dword:00000001
"SpecialPollInterval"=dword:0000003c

Note the primary and backup NtpServer settings, also, you can set one machine as the authoritative “NTP Server”:

Also: On a Windows machine, you can run this from an admin command prompt:

w32tm /stripchart /computer:othermachine

To check time differential.
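
The registry values earlier in this post are hexadecimal DWORDs, which makes the actual poll intervals hard to read at a glance. Here is a small sketch that decodes the three poll-related ones. It assumes, per my understanding of the W32Time documentation, that MinPollInterval and MaxPollInterval are stored as log2 seconds and SpecialPollInterval as plain seconds. The helper name is made up.

```python
# Values copied from the registry settings earlier in this post.
W32TIME_SETTINGS = {
    "MinPollInterval": 0x00000006,
    "MaxPollInterval": 0x00000008,
    "SpecialPollInterval": 0x0000003C,
}


def poll_seconds(name, raw):
    """Convert a raw W32Time poll DWORD to seconds.

    Assumption: Min/MaxPollInterval are log2 seconds, while
    SpecialPollInterval is already expressed in seconds.
    """
    if name in ("MinPollInterval", "MaxPollInterval"):
        return 2 ** raw
    if name == "SpecialPollInterval":
        return raw
    raise ValueError(f"not a poll interval setting: {name}")


decoded = {name: poll_seconds(name, raw) for name, raw in W32TIME_SETTINGS.items()}

if __name__ == "__main__":
    for name, seconds in decoded.items():
        print(f"{name}: {seconds} s")
```

Under those assumptions, the settings poll the manually configured peer every 60 seconds, with the adaptive interval bounded between 64 and 256 seconds.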

This may be where the breakdown is? I don’t have a hardware PTP source. I was under the impression that leaving the -g switch off turned that box into a source based upon its own clock. My need is that the two machines are closely synced, more so than that absolute time is correct, so my thought was to sync the primary to a public NTP source, then use PTP locally to sync the second machine to the first. Maybe that plan was flawed?

If PTPSync wasn’t written to work this way, maybe your suggestion of simply using my primary server as a primary NTP source and syncing at an aggressive interval is the best choice. I didn’t really want to sync any faster than every 30 min to public servers in order to not abuse them, but if I sync across my mgmt LAN to my own server, that will do the trick and I can do it as often as I like, at minimal cost.

I just configured this setup and my old server is fluctuating between 0ms and -3ms off with a 60 sec sync time to the other server. This is probably as good as it’s going to get for that old box - and is probably good enough.
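
As a rough back-of-the-envelope check on those numbers: if the full 3 ms accumulates linearly within one 60 s period (an assumption, not something measured here), the drift is about 50 ppm, and the same clock left alone for 30 minutes would be off by roughly 90 ms. The function names below are made up for illustration.

```python
def drift_ppm(offset_seconds, interval_seconds):
    """Drift rate in parts per million, assuming linear drift over the interval."""
    return offset_seconds / interval_seconds * 1e6


def worst_case_offset(ppm, poll_interval_seconds):
    """Offset (in seconds) accumulated between two syncs at a given drift rate."""
    return ppm * 1e-6 * poll_interval_seconds


if __name__ == "__main__":
    rate = drift_ppm(0.003, 60)  # 3 ms observed over a 60 s sync period
    print(f"drift rate: about {rate:.0f} ppm")
    for interval in (60, 30 * 60):
        offset_ms = worst_case_offset(rate, interval) * 1000
        print(f"poll every {interval} s -> up to about {offset_ms:.0f} ms off")
```

This is why polling a local server every 60 seconds helps so much more than polling a public server every 30 minutes on a box that drifts this fast.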

I appreciate your time and help. I would still be interested in knowing if PTPSync is able to work how I had initially planned, in case I need a better solution at some point.

Yeah PTP is the “new-ish” protocol for network-based time synchronization. AFAIK, there aren’t any publicly available PTP clock sources yet, so people have to buy a hardware clock (or already have a clock that supports the protocol). Here are a few I am aware of that support PTP:

Per your scenario, the primary machine should only “forward PTP packets it has already properly received and parsed” - so that could explain some of the craziness we saw in your YouTube video, but I am still not convinced about what it was sending. Maybe picking up random traffic on the PTP UDP channel on your network? Cosmic background radiation? Not really sure, neat trick nonetheless.

Best of luck!

Thanks,
Ritchie

FYI: https://www.cnet.com/news/facebook-develops-super-precise-public-time-keeping-service-for-the-internet/

Thanks - I put it in my available servers list. If nothing else, it cut latency from 40ms to 19ms.

Another quick question - not a huge issue but more of a curiosity. I set up NTP from my 2nd server back to my first across my mgmt LAN with a sync time of 60s. I ran the command line w32tm tool for hours yesterday and was getting back mostly sub-millisecond variances. Toward the end of each 60 second sync period, the difference would grow to negative one or maybe two ms before re-syncing, and it was always negative. The actual w32tm values looked something like this: -00.0002858s. Today I’m getting positive 4 to 6ms differences across 60 seconds, but more concerning is that my w32tm times look like this: +00.0010000s - every single one has all zeros behind the thousandths position instead of random numbers (with the exception of a few that have + or -1 in the ten millionths position). It looks like I am only getting ms precision now. Nothing changed between today and yesterday. From my logs, the change in behavior happened between 8:27:14AM CST and 8:28:14AM CST.
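
To pin down when the switch to millisecond precision happens in a log, a check like the following could be run over the logged offsets. This is only a sketch: it assumes the log contains offset tokens in the form quoted above (sign, seconds, seven decimals, trailing "s"), the function names are made up, and the tolerance allows for the occasional +/-1 in the ten-millionths position.

```python
import re
from decimal import Decimal

# Matches tokens such as -00.0002858s or +00.0010000s (format assumed from the post).
OFFSET_RE = re.compile(r"[+-]\d+\.\d+(?=s)")


def parse_offsets(text):
    """Extract every signed offset token from the text as an exact Decimal."""
    return [Decimal(token) for token in OFFSET_RE.findall(text)]


def looks_millisecond_quantized(offsets, tolerance=Decimal("0.0000001")):
    """True if every offset lies within `tolerance` of a whole millisecond."""
    if not offsets:
        return False
    step = Decimal("0.001")
    for value in offsets:
        nearest = (value / step).to_integral_value() * step
        if abs(value - nearest) > tolerance:
            return False
    return True


if __name__ == "__main__":
    yesterday = "-00.0002858s"
    today = "+00.0010000s"
    print(looks_millisecond_quantized(parse_offsets(yesterday)))
    print(looks_millisecond_quantized(parse_offsets(today)))
```

Decimal is used instead of float so the trailing digits are compared exactly as logged.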

I think the NTP settings I provided only target millisecond resolution; there are lots of settings to “tweak” there if you want to dial in the accuracy more.

FYI, we use those settings to try to keep machine time synchronized to under a second – this matters much with the type of work we do, but we can usually manage with less than a second tolerance.

What you learn when you start messing with this stuff is that computer clocks float wildly unless explicitly tamed.

If you come up with some better settings, I would like to know what they were in case we encounter a need in the future.

Thanks,
Ritchie