Shutdown, remove UPS, reboot - Server Connection: Not Connected

My UPS whined (loudly and steadily) that it’s time for a new battery. Maybe something about recent -27°C low temperatures outside the unheated garage. My RSB was running fine. Go inside, shut down the RSB via web interface, go to garage, remove UPS, plug router and RSB into wall outlet.

Go inside, RSB up, helicorder display works, Dataview shows no new data. The RSB had come up before the router, so reboot RSB via web page again - no change.

From the RSB today:

$ ping -c 2 raspberryshakedata.com
PING raspberryshakedata.com (144.91.66.87) 56(84) bytes of data.
64 bytes from ip-87-66-91-144.static.contabo.net (144.91.66.87): icmp_seq=1 ttl=54 time=113 ms
64 bytes from ip-87-66-91-144.static.contabo.net (144.91.66.87): icmp_seq=2 ttl=54 time=109 ms

Web home page shows:

|Data Forwarding|:|ON|
|Server Connection|:|Not Connected|

Best clue I can find in log files is in odf_SL_plugin.log where it shows the DDS Destination is gone:

2021 294 13:47:49>> odf_SL_plugin: Program Starting…
2021 294 13:47:49>> Version: 2020.045
2021 294 13:47:49>> Run-time parameters:
2021 294 13:47:49>> Data Source: INSTRUMENT
2021 294 13:47:49>> Network Code: ‘AM’
2021 294 13:47:49>> Station Code: ‘RB3C2’
2021 294 13:47:49>> Serial Port: ‘/dev/ttyS0’
2021 294 13:47:49>> VEL SR: 100 hz
2021 294 13:47:49>> ACC SR: 0 hz
2021 294 13:47:49>> IS SR: 100 hz
2021 294 13:47:49>> Data Destination 1: DDS
2021 294 13:47:49>> Host: raspberryshakedata.com
2021 294 13:47:49>> Port: 55555
2021 294 13:47:49>> Data Destination 2: UDP
2021 294 13:47:49>> Data Destination 3: SEISCOMP

2022 028 21:54:09>> odf_SL_plugin: Program Starting…
2022 028 21:54:09>> Version: 2020.045
2022 028 21:54:09>> Run-time parameters:
2022 028 21:54:09>> Data Source: INSTRUMENT
2022 028 21:54:09>> Network Code: ‘AM’
2022 028 21:54:09>> Station Code: ‘RB3C2’
2022 028 21:54:09>> Serial Port: ‘/dev/ttyS0’
2022 028 21:54:09>> VEL SR: 100 hz
2022 028 21:54:09>> ACC SR: 0 hz
2022 028 21:54:09>> IS SR: 100 hz
2022 028 21:54:09>> Data Destination 1: UDP
2022 028 21:54:09>> Data Destination 2: SEISCOMP
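The comparison above can be automated by looking only at the most recent startup block and checking whether a DDS destination is listed. This is a minimal sketch, assuming the log keeps the "Program Starting" and "Data Destination N: ..." wording shown in these excerpts; the function name `last_start_has_dds` is made up for illustration.

```shell
# Hypothetical helper: succeed if the LAST startup block in the log
# lists a DDS data destination, fail otherwise.
last_start_has_dds() {
    awk '
        /Program Starting/              { found = 0 }  # new startup block: forget earlier ones
        /Data Destination [0-9]+: *DDS/ { found = 1 }  # DDS listed in the current block
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

Possible usage, with the path to the log adjusted to wherever the unpacked logs live: `last_start_has_dds odf_SL_plugin.log && echo "DDS configured" || echo "DDS missing"`.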

From odf_SL_plugin.err:

2022 028 09:56:34>> sendDClientDP(): Error sending data …
2022 028 09:56:35>> sendDClientDP(): Error sending data …
2022 028 09:56:35>> Connection request (raspberryshakedata.com:55555) failed with error code: Connection refused
2022 028 09:56:35>> sendDClientDP(): Error sending data …
2022 028 09:56:35>> sendDClientDP(): Error sending data …

Hmm. The garage router uses WiFi to my main router, I wonder if its NAT data is confused.

Is there a safe way to try connecting to the server? In the meantime I’ll stick to observing and not poke things outside of my LAN.
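One low-impact check is a bare TCP connect that opens the port and closes it again without sending any data. The sketch below assumes `bash` (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available; `probe_tcp` is a hypothetical name, and the host and port are simply the ones from the log above. It only shows whether the TCP handshake succeeds and says nothing about the data protocol spoken on that port.

```shell
# Hypothetical helper: probe_tcp HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection can be opened, fails on "connection refused" or timeout.
probe_tcp() {
    # The inner bash opens the socket on file descriptor 3 and exits at once,
    # which closes the connection again. No payload is sent.
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example: `probe_tcp raspberryshakedata.com 55555 && echo "port reachable" || echo "refused or timed out"`.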

I have rebooted my RSB before without incident, I was not expecting difficulty this time. I’ll figure out how to upload the current log tar file.

Oh, there’s an upload icon in the text editor…

RSH.RB3C2.2022-01-31T17_50_05.logs.tar (2.0 MB)

Hello RicWerme,

Thank you for posting the logs of your Shake and all the details of the operations that you have made. Nothing seems to appear out of place during the booting process, so we can be reassured that the unit is operating as it should.

This is, however, out of place:

Station Info
------------

Data-Sharing Mode : OFF
 Data Server Conn : OFF

Could you please reboot the Shake again and then try the following procedure to see if it solves the connectivity issue?

Please access your rs.local/ page, go to Settings (the gear icon high on the left), and then the Data tab. Make sure that the Forward Data box is checked, and then click Save and Restart.

The station should now be able to connect again.

If it does, then this error is caused by a known bug that we are examining and that will (hopefully) be solved in the next Shake OS v0.20 release.

Thank you.

Thanks,

Bottom line:
> Station Info
> ------------
>
> Data-Sharing Mode : ON
> Data Server Conn : OFF

The main status page says:

|Data Forwarding|:|ON|
|Server Connection|:|Not Connected|

Discussion:
So, progress. Not without confusion.

From another problem report, please-help-server-connection-not-connected-issue/2658, I had already tried the suggestion to uncheck “Forward Data,” waste time on FaceBook, check “Forward Data,” and restart. Didn’t seem to help.

This time, merely verified the box was checked and restarted. And couldn’t reconnect to the server. I checked the RSB, the green light was flashing, so I didn’t want to unplug it, but looked at control pages for both routers, had lunch, went on a wild goose chase because I saw another IP address with client name “rs local” which turned out to be my TV’s set-top box.

Somewhere along the line, I could reconnect. Helicorder data showed a 30 second gap. I hate computers.

Got logs, I see in odf_SL_plugin.log that the server is back:

2022 032 17:16:48>>             Data Destination 1: DDS
2022 032 17:16:48>>                     Host:   raspberryshakedata.com
2022 032 17:16:48>>                     Port:   55555
2022 032 17:16:48>>             Data Destination 2: UDP
2022 032 17:16:48>>             Data Destination 3: SEISCOMP
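The timestamps in these log excerpts appear to be year plus day-of-year (the "2022 032" entries line up with the 2022-02-01 log archive). A small helper can turn them into calendar dates. This is a sketch under that assumption; `doy_to_date` is a made-up name and it relies on GNU `date`.

```shell
# Hypothetical helper: convert "YEAR DAY-OF-YEAR" (as in "2022 032") to YYYY-MM-DD.
# Assumes GNU date (the -d option), as found on most Linux systems.
doy_to_date() {
    year=$1
    doy=$2
    # Strip up to two leading zeros so "032" is not read as an octal number.
    doy=${doy#0}
    doy=${doy#0}
    date -u -d "${year}-01-01 +$((doy - 1)) days" +%F
}
```

Called as `doy_to_date 2022 032`, it prints the date for day 32 of 2022.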

Meanwhile, dataview.ras… has dropped RB3C2.

I assume it will come back in a while - I’m not touching anything for a while! Maybe I’ll wait for the next OS release before reinstalling the UPS.

Question:
Will the cloud viewer start working on its own, or should I touch something to get my RSB to reconnect to the server?

RSH.RB3C2.2022-02-01T19_30_17.logs.tar (2.1 MB)

Hello RicWerme,

Thank you very much for providing your new logs and for describing all the steps that you have done while trying to work around this issue.

I can see that the data had started flowing again, but then stopped another time. You did well in not unplugging it while still on, because it could have caused issues with the microSD card. We advise doing so only as a last resort.

A question: since there seems to be some kind of local network conflict with your TV box (it seems very strange that the rs.local/ address appears for that device), have you tried shutting off the Shake, disconnecting your TV box momentarily, turning the Shake on again, and seeing if it manages to connect?

Another test that you can do, since you said that you have two different routers (I have the same setup at home), is to connect the Shake to the router directly connected to the internet line, and see if in that case it manages to connect. I am suggesting this since I had to do some work on my second router to get the Shake connected, so this could be a similar case.

Our servers will show the unit again after a stable connection has been regained. As of now (morning UTC time) the Shake still seems disconnected.

I’m going to take some time to learn more about the environment. My career has been mostly on Unix networking and file systems. It even goes back to writing one of the first FTP clients on the ARPAnet.

I don’t know much about WiFi or Linux internals despite using Linux desktops for ages. I want to see if I can capture the network dialog between my RSnB and elsewhere if one of my routers supports that “properly,” then study with tcpdump or wireshark. Coax ethernet was so much better for such study! Or maybe run tcpdump on the RSnB if I get frustrated enough…

It wasn’t until I got the RSnB that I realized things like “rs.local” were available. Didn’t work for me, so I use a fixed IP address and have an entry in /etc/hosts files for “myshake.” I suspect the TV box got it when DHCP assigned 192.168.0.14 to the RSnB when I first had it, then 192.168.12 later, maybe after a power failure, that TV box may have just picked it up when it was given …14. I’m nearly certain that’s a red herring. It wasn’t until a couple days ago that I even realized it was related to the client ID I saw on the routers. I see the two routers have a different idea of those client names. Odd. A lot to learn.

Hello RicWerme,

First, the good news! I can now see your RSBOOM live again on our services, here: RS StationView

I assume then that whatever you have done in the meanwhile has caused the Shake to be able to connect again, so thank you for persisting on this.

It is possible that the issue with the TV box was like you have thought, but now all works well, so let’s hope that it continues like this. And yes, I agree, there is always a lot to learn even when one thinks that everything has been learned.

I din’t do nuttin’! Honest! At least I don’t know what I might have done. Sigh.

I noticed late last night that a tcpdump command listening to “ether host b8:27:eb:57:b3:c2”, my RSnB, had captured several broadcast messages to it. They were all sensible DHCP replies, which added to some uncertainty about “If I gave it a fixed IP address, why did I use …14 like DHCP uses and not something like …250 like I did for the printer and similar stuff?” and “I don’t remember giving it a fixed address, but I do remember adding it to /etc/hosts.”

Turns out it is using DHCP. That’s pretty simple in concept, but it seems to be involved in some weird config problems. I may have confused things when I took out the UPS and the RSnB booted before the garage router.

Then I saw that the RSnB status page said it was connected to the server and that the dataview page was [mostly] working. So I just went to bed instead of figuring out why.

I don’t know why I’m seeing more DHCP messages than I think are necessary, but it’s legal. Not worth pursuing.

There’s some point on the EHZ trace for 2022-02-02 23:19:52 UTC that looks okay on the local helicorder traces, is a huge spike at dataview, and if I zoom in on it, I get a “Connection request failed” error at that moment in time. Very weird, not worth pursuing.

I have the new battery for the UPS, I want to play with that a bit as the UPS may be looking for too high a voltage for charge control/worn out battery. I expect to install it. I will be making some “single step” changes, starting with a fixed IP address, then reinstall the UPS and power up the garage router, and look at its status. Then restart the RSnB and commune with that for a while.

If it all goes well I’ll declare this fixed with the diagnosis that “Even my RSnB doesn’t like me.”

On a whole different topic, the dataview page stopped working on my Google Chrome browser (on Linux desktop). Now it doesn’t work on Chromium. I’m running it ok on a Windows laptop with Google Chrome. I get the message:

This page isn’t working
If the problem continues, contact the site owner.
HTTP ERROR 413

It may mean the browser is sending a lot of stale cookies, I haven’t dug into it yet. Should I open a new service request when I learn more?

Hello RicWerme,

Yes, what you have explained may have been the reason behind all the issues that you have experienced. And I also agree with the step-by-step approach that you have in mind for when you will decide to reintroduce the UPS in the mix, it definitely is the best method to do it.

Thank you for reporting the HTTP 413 issue with our DataView portal. I have tried to replicate it on one of my Linux machines but without success. Nevertheless, I will open a ticket for our software team, and they will take a look at it.

RicWerme,

For some reason, I think our paths may have crossed before, on some completely different website …
Anyway, just a couple of comments on your observations.

Firstly, you will not see connections on your device by using netstat. The RS code runs in a container, and the way that Linux networking works these days, that means that it is likely to be more complicated to see connections on the host OS. For better or worse, Linux now uses a quite different TCP/IP stack than the old Sun one, which was pretty much the de facto standard for Unix and Unix-like systems for many years. The old commands (like netstat) are now actually wrappers around other commands. Unfortunately, there isn’t any one simple command to see all connections.

Another issue you might bump into with Raspbian is that the maintainers of that system have some rather odd ideas. Especially around what a static IP is. I ran into this when I was trying to use an R-PI as a controller for an astronomical telescope tracker. I wanted to just set up a tiny private network, using static IPs and view the controller dashboard that way. Doesn’t work. They have the idea that there is ALWAYS a DHCP server on every network (they obviously live very sheltered lives), so you can configure whatever you want, but it doesn’t take effect until the R-PI contacts a DHCP server to “register” its static IP.

As for things stopping working on Chrome … It is the way … at least it is now. Code under development is deployed directly to end users. We become alpha testers, like it or not. It is not at all unusual for things to break silently as new chrome versions are forced upon you. Some will (silently) get fixed a while later. This is why you keep a few other browser types around. One will usually work ok while the broken one gets (silently) fixed.

[Reply to PhilipPeake, I’ve been remiss in posting an update, but I can post a reply to Phil]

Google suggests our paths may have crossed. I do spend too much time on the 'net. :)

Yeah, saw the docker/container stuff early on. I don’t have much experience or appreciation of VMs, and understanding how (and why!) they’re used in RS is on the todo list. Also, while my Internet background includes a lot of BSD/SunOS/Tru64/AIX, I have some exposure to Guelph/Linux (I don’t know if they’re connected!) and have consorted with the NFS community in the past, including Linux kernel NFS authors. Curiously, I did note that tcpdump is on my RSnB but it doesn’t display data from eth0. It does display data from docker0, so another task is to see if there’s some config variable that would let me capture eth0 data. Sigh. BTW, one reason for buying the RSnB was to learn about Raspberry Pis. I would have started with a bare system and added a camera or weather interfaces, but the RSnB was irresistible as I also want to record and muck with wind turbine infrasound.

Thank you for the comments on always DHCP. That may help me describe the “impossible” state of affairs when I set a static IP address. (Why, oh why did I fix something that wasn’t broken at the moment?) My RSnB is running, I do not know WTF it’s doing networkwise. I’ve turned my attention for a few days to more enjoyable tasks like taxes, dealing with sub-zero (Fahrenheit) temperatures, and vacuuming the whole house.

As for Chrome/Chromium, similar things in the past have been tied to very long requests due to huge amounts of cookie data. May not be the case here. I’m also way behind in updating my main Linux system and browsers (maybe if I just create a buncha containers…). I think Chromium died from some site that ate up all the RAM, restarting didn’t fix things. I did have the brilliant thought of trying an Incognito window - stationview and dataview both worked fine in it. I’ll have to try that with FaceBook too. (Speaking of things that work in mysterious ways - and fail in equally mysterious ways.)

Ric - I don’t know if you know this stuff, sorry if you already do …
But I found this to be helpful in understanding more about modern Linux networking when I was trying to understand why I couldn’t see the default route taking traffic over a wireguard link.
It’s really a document to help explain that, but it also explains a lot more…

https://ro-che.info/articles/2021-02-27-linux-routing

Wow, I’m amazed, and somewhat chagrined, at how informative that page is. That’s a whole bunch of stuff I haven’t had to fight with for a few years. Heck, I didn’t even remember if my current desktop has a DHCP or static address (it’s DHCP).

That’s one of the reasons I want to do some serious playing with Raspberry Pis.

I’ll look at it more closely tomorrow, it will offer some good guidance to trying to understand some of the weirder stuff going on.

Wow, the end of April. What a bunch of distractions. However, my neglected RSnB has burbled to the top of the stack, pending the next distraction.

Since Feb 8th, the poor thing has endured a power failure or two (I haven’t plugged the UPS back in). I’ve brought it inside and will power it up in my office, shaky 1860 wooden floor and all that. I have pulled out the micro SD card and copied the files from both /boot and / to a file server, and will install per directions on to a new card so I can compare the two. Then boot the old and experiment a bit.

One thing I found before and verified from the copied data is that the sole file related to DHCP stuff is

sl:~$ cd /nfs/myshake/main/etc/

sl:etc$ ls -l dhcpcd.conf
-rw-rw-r-- 1 root syslog 1350 Feb  3 15:47 dhcpcd.conf

sl:etc$ head dhcpcd.conf
interface eth0
static ip_address=192.168.0.253/
static routers=
static domain_name_servers=192.168.0.1
# A sample configuration for dhcpcd.
# See dhcpcd.conf(5) for details.

# Allow users of this group to interact with dhcpcd via the control socket.
#controlgroup wheel
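The first four lines of that file are the interesting part: the address ends in a bare "/" with no prefix length, and the routers entry has no value at all. A small check can flag entries like these. This is only a sketch; `lint_static_lines` is a hypothetical name, and the pattern covers just the two problems visible in this excerpt, not everything dhcpcd.conf(5) allows.

```shell
# Hypothetical helper: print (with line numbers) any "static" entries in a
# dhcpcd.conf whose value is empty or ends in a bare "/" (missing prefix length).
lint_static_lines() {
    grep -nE '^[[:space:]]*static[[:space:]]+[a-z_]+=([[:space:]]*$|.*/[[:space:]]*$)' "$1"
}
```

Run as `lint_static_lines /etc/dhcpcd.conf`; no output (and a non-zero exit status from grep) means no such entries were found.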

What does that file begin with on a well behaved system?

This is mine - using DHCP, not a static IP:

 # A sample configuration for dhcpcd.
 # See dhcpcd.conf(5) for details.

 # Allow users of this group to interact with dhcpcd via the control socket.
 #controlgroup wheel

 # Inform the DHCP server of our hostname for DDNS.

 # Use the hardware address of the interface for the Client ID.
 clientid
 # or
 # Use the same DUID + IAID as set in DHCPv6 for DHCPv4 ClientID as per RFC4361.
 #duid

 # Persist interface configuration when dhcpcd exits.
 persistent

 # Rapid commit support.
 # Safe to enable by default because it requires the equivalent option set
 # on the server to actually work.
 option rapid_commit

 # A list of options to request from the DHCP server.
 option domain_name_servers, domain_name, domain_search, host_name
 option classless_static_routes
 # Most distributions have NTP support.
 option ntp_servers
 # Respect the network MTU.
 # Some interface drivers reset when changing the MTU so disabled by default.
 #option interface_mtu

 # A ServerID is required by RFC2131.
 require dhcp_server_identifier

 # Generate Stable Private IPv6 Addresses instead of hardware based ones
 slaac private

 # A hook script is provided to lookup the hostname if not set by the DHCP
 # server, but it should not be run by default.
 #nohook loo
 denyinterfaces wlan0

That’s exactly what I hoped to see, thanks.

I rebooted my RSnB in my office, and verified that my file had my first four lines but everything else was the same. I commented those out and rebooted, now its default route is going through WAN gateway instead of virtual system vethff7a8a2.
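To see which interface the default route uses before and after a change like this, the output of `ip route show` can be filtered. The helper below is a minimal sketch; `default_route_dev` is a made-up name, and it assumes the usual iproute2 output where the interface name follows the word "dev".

```shell
# Hypothetical helper: read `ip route show` output on stdin and print the
# interface name of the first default route.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") { print $(i + 1); exit }
    }'
}
```

Typical use is `ip route show | default_route_dev`. A physical interface such as eth0 is what one would hope to see; a veth name would point at the container-side problem described above.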

It can contact systems at raspberryshake.org again. No sign of my system at stationview.raspberryshake.org, but I’ve read that takes a while. So maybe it will be back tomorrow AM at RS DataView BETA

Yay, RS DataView BETA is back.

It’s now at a different local IP address, but no surprise there. I still want to give it a static IP address, but that task is much lower priority than others I’ve been putting off. At least I know I have to understand why Raspians are so insistent that people use DHCP. I’ll have to learn more about that before I take the unit out to record infrasound from industrial wind turbines - the main reason I bought the RSnB in the first place.

The unit is also in my 2nd floor office (shaky wooden floor), I’ll move the UPS and RSnB back to the (detached) garage floor later today where it has been very happy.

I did a bit more digging. Turns out that (as usual) this all boils down to using systemd. Trying to dumb down complex things to make them easier for the majority of dumb users to use, and breaking stuff in the process - you know, like putting a DNS server into systemd, so that if you try to run your own DNS, you find that the ports are in use.

Well, they also decided that everyone uses DHCP and that “Network Administrators” would not approve of allowing user assigned static IPs. (Well, I am my network admin, and I DO approve.).

The Raspbian documentation (what there is of it) suggests pretty strongly that the old /etc/network code doesn’t work, and that it has been replaced by dhcpcd. Well, that’s only sort of true.

Read this: How do I set up networking/WiFi/static IP address on Raspbian/Raspberry Pi OS? - Raspberry Pi Stack Exchange

Particularly the SECOND answer (with 104 up-votes)

Turns out that the /etc/network config can be activated (after disabling dhcpcd). Then you have no dependency at all on DHCP.

NOTE that this probably means that the rs.local form of addressing it no longer works. But since you know the IP, that should not be a problem.

Thanks, another great page to read - some day. For the time being, I’ll learn more about DHCP servers.

The next high priority tasks are to see what changes may have been made to the protocol used to talk to my Davis VP Weather station; my old VP is tired, time for a VP2. There may be none, or it may mean moderate hassle and phone calls to support. Serial line protocol. Also have to clean the house before some in-laws visit.

Thanks for all your pointers and insights. One thing I may do is use a Raspberry Pi as a “range extender” for the weather station - keep the console closer to the outdoor unit and use a Raspberry Pi to do all the data collection and save it over WiFi. Might be a good exercise to keep things simple - like with a static IP address…

I followed this discussion very closely, thank you for the literal mine of information that you have provided!

I am sure it will be useful for future (or even current) Shakers that are in a similar situation.

Thank you again!