Two Shakes, an RS3D running .20 and an RS4D running .21, randomly stop reporting data on StationView. They do, however, continue to produce data locally, and both show active server connections. The co-located Boom running .21 does not exhibit this behavior.
All three machines have active NTP sync, albeit to differing pools of servers.
I note the machines are uploading via a Starlink connection that is subject to random interruptions, but I’d expect that the links would recover, particularly since the internet and connection statuses are showing as good.
Any hints on where to start attacking this problem would be appreciated. Nothing glaringly obvious jumps out of the logs.
Could you post the logs from all three instruments when you can? Having one that doesn’t show these transmission issues could give us a clue for the other two.
Sure thing! Should be attached, upload gods willing.
A few things:
The 3D that fell off StationView magically showed up again. This is the same 3D that’s steadfastly refusing to update to .21 that’s reported in another ticket, so I won’t upload the same logs over on that ticket.
The 4D .21 splash page indicated that services were down. I’ve attached logs taken at that time, and then again after reboot.
The Boom log is for the instrument that is always happy with life.
Thank you for all the logs wiley42, they were very informative!
Indeed, the RBOOM is just coasting through its great life; the others should take an example!
Going into what I’ve found:
All Shakes occasionally lose their connection to the NTP time synchronization server. This doesn’t usually last long, and the units always reconnect, so it could be related to the Starlink connection, since it affects all three Shakes together.
Other than this, the RBOOM is fine.
However, the logs for both the RS3D and the RS4D show some file corruption. For the 3D, this is likely the reason why the Shake doesn’t manage to update on its own.
Thus, for both the 3D and the 4D, I recommend re-burning the microSD card. You can find instructions on how to do so here.
If the microSD cards continue to show inconsistent behavior, then it may be worth using different ones.
Once the reburning is done, simply reinsert the cards into the Shakes and turn them on, and everything should be fine.
I have local stratum-one NTP servers, so I’ll just whack all three of them to point to one of those rather than whatever pool they’re pointing to by default. I’ll need to pull the 3D and 4D out of their vault and I’m traveling most of next week, so I’ll update my observations sometime the week of the 8th. Hopefully a couple of new MLC cards and refreshed images will make them as happy as the RBOOM.
Replying to my own message is bad form but, well, here I go.
Turns out the 4D eventually decided to report that local services were down. Logs are attached (hopefully). RSH.R4EFC.2026-02-15T18_56_21.logs.tar (2.9 MB)
This (Monday) UTC morning I’m seeing all three units not transmitting live data, so I don’t know if you have temporarily pulled them or if you’re having an outage.
However, while I can confirm (as you have seen) that the 4D has been correctly updated to v0.21.1, the logs still show continued NTP errors such as the one below:
2026 046 10:38:34>> Time adjustment M0: HARD RESET. This will result in a one-time time-tear.
These resets occur on average about once every 1-2 minutes, so the NTP connection (or the overall internet connection) is behaving quite erratically, and the result is stations dropping out of StationView.
If possible, it would be interesting to see what happens on a standard (non-Starlink) internet connection, so that we can isolate the possible cause.
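For anyone who wants to check the reset rate in their own logs, here is a minimal sketch. It assumes each log line starts with a `YYYY DDD HH:MM:SS` stamp (year, day of year, time) followed by `>>`, as in the line quoted above. The function names are made up for illustration, and every sample line except the first is hypothetical.

```python
from datetime import datetime, timedelta


def parse_stamp(line):
    """Parse the leading 'YYYY DDD HH:MM:SS' stamp that precedes '>>'."""
    stamp = line.split(">>", 1)[0].strip()
    return datetime.strptime(stamp, "%Y %j %H:%M:%S")


def hard_reset_intervals(lines):
    """Return the gaps between consecutive HARD RESET log lines."""
    times = sorted(parse_stamp(l) for l in lines if "HARD RESET" in l)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def mean_interval(intervals):
    """Average gap as a timedelta, or None when there are fewer than two resets."""
    if not intervals:
        return None
    return sum(intervals, timedelta()) / len(intervals)


# First line is quoted from the thread; the rest are hypothetical test data.
SAMPLE = [
    "2026 046 10:38:34>> Time adjustment M0: HARD RESET. This will result in a one-time time-tear.",
    "2026 046 10:39:34>> Time adjustment M0: HARD RESET. (hypothetical)",
    "2026 046 10:40:00>> some unrelated message (hypothetical)",
    "2026 046 10:41:34>> Time adjustment M0: HARD RESET. (hypothetical)",
]

if __name__ == "__main__":
    gaps = hard_reset_intervals(SAMPLE)
    print("resets:", len(gaps) + 1, "mean gap:", mean_interval(gaps))
```

Feeding it the lines of an extracted log file instead of `SAMPLE` gives a quick figure to compare between the three units.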
Also, if you can attach the logs from the other two instruments you have, I can confirm if this is the reason why all three are not reporting, or if there are other factors at play.
I’m increasingly skeptical that loss of NTP sync explains the issue, given that all three units are on the same switch, with the same backhaul, to the same switch, to the same Starlink terminal. It also doesn’t explain why the 4D’s local status page occasionally reports that services have died.
If I had an alternative, I wouldn’t be using Starlink
Now, on the NTP front, I did observe that the Rboom was configured to use a different pool than the 3D and 4D. However, to factor this mess out entirely, all three are now talking to a pair of stratum one servers installed on the same switch as the Shakes; all three have sync with the local stratum-one servers as well as various pool servers. Currently the 3D and 4D are reporting data locally and claim connections to the server, but are not reporting on Station View; the Boom is showing a server connection and is reporting on Station View, but is showing nothing in the local activity preview…
I have been monitoring your Shakes for the past 24 hours or so, trying to gather more information about your situation:
The Boom continued without issues, as far as I could see
The 3D appears to be behind real-time, but is slowly catching up
The 4D, instead, has transmitted for a bit, and then seemingly stopped
So, overall, at the logs snapshot, the Shakes were connected to the server and could communicate with it. I have asked our server team to check things on our side, and I’ll let you know if they find anything.
In the meantime, could I ask you to SSH into the Shake and execute these commands without rebooting the instruments or your Starlink router?
SHM(0) .GPS. 0 l 10 16 377 0.000 408.520 1.569
SHM(1) .PPS. 0 l - 16 0 0.000 0.000 0.000
*tick.mainecoon. .PPS. 1 u 170 512 377 0.292 0.992 2.624
+tock.mainecoon. .PPS. 1 u 541 512 377 0.274 -0.822 1.742
+usscz2-ntp-002. .GPSs. 1 u 60 64 377 22.750 2.338 8.135
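For reading peer lines like the ones above, here is a small sketch that splits them into fields. It assumes the standard `ntpq -p` column order (remote, refid, st, t, when, poll, reach, delay, offset, jitter), where `reach` is an octal 8-bit register of the last eight polls and offsets are in milliseconds. The helper names are invented for illustration.

```python
# Peer lines as pasted in the thread (header row not included).
PEERS = [
    "SHM(0) .GPS. 0 l 10 16 377 0.000 408.520 1.569",
    "SHM(1) .PPS. 0 l - 16 0 0.000 0.000 0.000",
    "*tick.mainecoon. .PPS. 1 u 170 512 377 0.292 0.992 2.624",
    "+tock.mainecoon. .PPS. 1 u 541 512 377 0.274 -0.822 1.742",
    "+usscz2-ntp-002. .GPSs. 1 u 60 64 377 22.750 2.338 8.135",
]


def parse_peer(line):
    """Split one peer line into named fields."""
    line = line.strip()
    tally = ""
    # Only the unambiguous tally marks are handled. The leading-space column
    # is lost in the pasted text, so 'x', 'o' and '.' cannot be told apart
    # from the first character of a host name here.
    if line[0] in "*+-#":
        tally, line = line[0], line[1:]
    remote, refid, st, kind, when, poll, reach, delay, offset, jitter = line.split()
    reach_bits = int(reach, 8)  # octal shift register of the last 8 polls
    return {
        "tally": tally,
        "remote": remote,
        "refid": refid,
        "stratum": int(st),
        "polls_ok": bin(reach_bits).count("1"),
        "offset_ms": float(offset),
        "jitter_ms": float(jitter),
    }


def unreachable(rows):
    """Names of peers with no successful poll in the reach register."""
    return [r["remote"] for r in rows if r["polls_ok"] == 0]


if __name__ == "__main__":
    rows = [parse_peer(l) for l in PEERS]
    for r in rows:
        print(r["tally"] or " ", r["remote"], r["polls_ok"], r["offset_ms"])
    print("never reached:", unreachable(rows))
```

A reach of 377 means all eight most recent polls succeeded; a reach of 0 means none did.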
At the moment, all three are up, all three are finding each other on the splash screen, but only the Boom’s data is being reported by Station View. Rebooting the 3D/4D will result in their data being reported (despite an NTP solution not yet being computed), but eventually they fall off. It’s a bit like the wire protocol doesn’t recover nicely from network interruptions.
Thank you for all the command-line reports, wiley42.
Our server team has checked things on our side, and I can report that there are no server-side issues at the moment. So, we need to focus on why the connection is behaving in this way.
Our servers see many data transmission timeouts, and these are reflected client-side in your logs as the many broken pipeline lines. This means that the Shake is actively trying to transmit, but then the interruptions you’ve seen happen.
Let’s try a test to see what happens when the Shakes are connected one at a time:
Disconnect the 3D and 4D, and leave the Boom on for 24h
Then disconnect it, and reconnect only the 3D for 24h
And, last, disconnect that and connect only the 4D for 24h
This is to check whether the data volume that the three instruments are trying to upload is behind what we are seeing. We already know that the Boom works fine, but it will be interesting to see the result of the experiment above.
Additionally, another element you can check (if you haven’t already) is this.
Yep; interruptions are expected; the vault is located in mountainous terrain with visibility to the north partially obscured by trees and rising terrain, which means transient drop-outs, up to several minutes in duration, are normal. There’s a 40 meter tower waiting to be erected that should get the terminal above the trees (if only just), but there’s not much we can do about terrain but hope that build-out of the constellation eventually fills the gaps.
Currently, everything in the vault is down thanks to storms and the UPS eventually running dry. Everything is under two meters of snow, but clearing is expected soon, at which point we’ll excavate things and get back on the air – at which point we’ll run the proposed experiment.
While the Shakes are backhauling via Starlink, it’s not “naked” Starlink; they live on a RFC1918 network with a firewall at the edge that builds a tunnel via Starlink to a data center in a more -er- civilized part of the world. This gives us a way around the sometimes unexpected behavior of the Starlink terminal, and allows us to have a fixed external IP address.
I’ll be back once I have more data to share. As ever, thank you for the support!
Quite understandable, thank you for all the additional details wiley42!
Probably, the additional height provided by the tower will address most of the issues, as the terminal should also be able to connect to more satellites (while waiting for the network up there to densify).
It’s no trouble at all. If you need further assistance when the snow melts, just reach out to us!