Seedlink data latency - TCP protocol

I am trying to determine data latency for Raspberry Shake 3D (RS3D) and 4D (RS4D) devices that are forwarding data to our Earthworm server (Linux) via a SeedLink connection. We have some remote stations, and we are trying to determine what range of latency we could expect to see, given a specific site's bandwidth, with either the UDP or TCP/SeedLink protocols. In reading through the RS documentation and forum, I have seen some discussion of data latency, but only related to UDP. It appears that with UDP, the RS3D and RS4D ship data packets at a strict quarter-second interval, regardless of the size of each packet. That means that if our RS4D were sending maxed-out data packets during intense shaking, we would expect to see 4 packets a second with a total size of 5680 bytes/sec, and 4260 bytes/sec for the RS3D (Upload data volume - #11 by iannesbitt). This is great information for UDP, but it doesn't help with TCP-style connections.
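As a sanity check for bandwidth planning, those worst-case UDP figures can be turned into a rough link-speed requirement. A minimal sketch (the payload rates and packet counts are the figures quoted above; the 28-byte per-packet overhead is the standard IPv4 + UDP header size, added by me for illustration):

```python
# Rough bandwidth check for the worst-case UDP rates quoted above.
# 28 bytes = standard IPv4 (20) + UDP (8) header overhead per packet;
# payload rates and packet counts are the figures from the forum post.

IP_UDP_OVERHEAD = 28  # bytes of IPv4 + UDP headers per packet

def required_kbps(payload_bytes_per_sec, packets_per_sec):
    """Payload rate plus per-packet header overhead, in kilobits per second."""
    total_bytes = payload_bytes_per_sec + packets_per_sec * IP_UDP_OVERHEAD
    return total_bytes * 8 / 1000

print(round(required_kbps(5680, 4), 1))  # RS4D worst case -> 46.3
print(round(required_kbps(4260, 4), 1))  # RS3D worst case -> 35.0
```

So even the maxed-out case stays below roughly 50 kbps per instrument, which helps frame what "limited bandwidth" actually means for a given site.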
Running the sniffwave command on our Earthworm server with the slink2ew (SeedLink) module, I see the following output:

Four lines of example sniffwave output for RS3D RA192

RA192.EHZ.AM.00 (0x32 0x30) 0 i4 338 100.0 2023/12/01 17:45:03.87 (1701452703.8670) 2023/12/01 17:45:07.24 (1701452707.2370) 0x00 0x00 i66 m204 t19 len1416 [D: 0.9s F: 3.6s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 310 100.0 2023/12/01 17:45:07.25 (1701452707.2470) 2023/12/01 17:45:10.34 (1701452710.3370) 0x00 0x00 i66 m204 t19 len1304 [D: 0.8s F: 3.0s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 350 100.0 2023/12/01 17:45:10.35 (1701452710.3470) 2023/12/01 17:45:13.84 (1701452713.8370) 0x00 0x00 i66 m204 t19 len1464 [D: 0.7s F: 3.4s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 370 100.0 2023/12/01 17:45:13.85 (1701452713.8470) 2023/12/01 17:45:17.54 (1701452717.5370) 0x00 0x00 i66 m204 t19 len1544 [D: 0.8s F: 3.8s]

Header information for this output can be seen here:
(Earthworm Program: sniffwave overview)

According to sniffwave, TCP packet sizes vary and are not static (e.g., len1416 on the first line). The number of samples per packet (e.g., 338 samples in the first packet) is also inconsistent across packets from the same channel stream. When it comes to how RS devices run the SeedLink server, is the data compressed, or is it sent in an uncompressed format? What is the RS logic for when the SeedLink server decides to send a packet? It doesn't seem to be configured to send based on a total number of samples or on packet length. Any insight into how the SeedLink server handles data would be appreciated.
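For anyone wanting to compare stations, the D/F trailer on each sniffwave line can be pulled out programmatically to track latency over time. A minimal sketch (the regex is mine, based on the `[D: 0.9s F: 3.6s]` format shown in the output above):

```python
import re

# Sketch: extract the feed (F) latency from sniffwave lines like those
# above, using the "[D: 0.9s F: 3.6s]" trailer shown in the output.
LAT_RE = re.compile(r"\[D:\s*([\d.]+)s\s+F:\s*([\d.]+)s\]")

def feed_latencies(lines):
    """Return the feed (F) latencies, in seconds, found in sniffwave output."""
    return [float(m.group(2)) for line in lines if (m := LAT_RE.search(line))]

line = "RA192.EHZ.AM.00 ... len1416 [D: 0.9s F: 3.6s]"
print(feed_latencies([line]))  # [3.6]
```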

Understandably, when it comes to Earthquake Early Warning (EEW), the priority is usually to get data as fast as possible to minimize latency, but UDP does open the door to possible data loss/gaps if bandwidth is limited. It would be nice to know how TCP performs in comparison.

On a related topic, if there were a way to configure the SeedLink logic to send a packet every second, regardless of packet size, that would be a nice feature. I know that during the 2019 Ridgecrest EQ sequence in CA, SCSN implemented a 1 Hz LZ1 channel that measured data latency. Being able to configure SeedLink to send every second would give users an easy way to replicate such a channel on their server to monitor latency SOH.

hello,

first, i can indeed confirm your observations regarding the differences in data packet contents that you are seeing:

  • using the UDP protocol, data packets are delivered at a constant rate of one packet every 1/4 second, containing 25 data points per packet
  • while using the seedlink protocol, the data packets delivered are not constant, but variable: both in how often they are delivered and in the number of data points contained within each

the reason the seedlink data packets are variable is historical: seedlink has always used the mini-SEED seismic data format. this format is indeed (very) compressed, using an algorithm that is specific to seismic data. because it was originally designed to minimize the amount of disk space needed to store the data (long before there was live data streaming and, probably, long before anyone was even thinking about such things as EEW services), it does not perform well when the requirement is “give me the data as quickly as it comes off the ADC”.

in brief: seedlink data packets are delivered only when a mini-SEED data packet has been completely filled. because of how the mini-SEED compression algorithm works, this means that more data points from quiet data will fit into a single data packet than from noisy data. in other words, latency between data packets is greater during quieter times than during noisier times.
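the effect can be illustrated with a toy delta-encoding sketch. to be clear, this is NOT the actual Steim algorithm that mini-SEED uses, only an analogy i am constructing for illustration:

```python
# toy illustration (NOT the real Steim algorithm mini-SEED uses) of why
# quiet data packs more samples into a fixed-size record: delta-encoding
# a quiet signal yields small differences that need fewer bits each.

def samples_per_record(samples, record_bits=512 * 8):
    """count how many delta-encoded samples fit in one fixed-size record."""
    used = count = prev = 0
    for s in samples:
        delta = s - prev
        bits = max(delta.bit_length() + 1, 8)  # sign bit; 8-bit minimum
        if used + bits > record_bits:
            break
        used += bits
        count += 1
        prev = s
    return count

quiet = [16800 + (i % 3) for i in range(2000)]                       # tiny deltas
noisy = [16800 + ((-1) ** i) * 5000 * (i % 7) for i in range(2000)]  # big deltas
print(samples_per_record(quiet) > samples_per_record(noisy))  # True
```

the same fixed-size record holds far more quiet samples than noisy ones, so during quiet periods the record takes longer to fill, and latency goes up.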

since EEW systems are interested in the noisier moments, this is good. however, this (slight?) decrease in data latency during an event of interest will never be enough to satisfy the requirement to have the data delivered as quickly as possible.

in short, if EEW services are your ultimate goal, where data latency must be minimized to the greatest extent possible, i cannot recommend using the seedlink protocol; it simply cannot provide the type of performance required.

that said, while “instant data” is very sexy, maybe the question can be turned around, to: “how much latency could actually be tolerated?” of course, this is also tricky, since the answer very much depends on the distance between the event source and the people needing to be warned of impending shaking: the closer they are to the source, the less tolerable any data latency will be.

moving back to UDP, your concern regarding packet loss is real, since UDP makes no delivery guarantees by design. however, it is entirely possible that, although there is no guaranteed delivery, very few data packets will be dropped in practice; this can and should be measured (see the sample python programs in the /opt/settings/user directory for examples you can use for this). as well, even though TCP guarantees delivery, it does not guarantee timely delivery; it can ultimately be as late as it wants to be.
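as a starting point for such a measurement, here is a minimal sketch of my own (not one of the shipped example programs), assuming you have collected the epoch timestamps of one channel's UDP packets, which nominally arrive every 0.25 seconds:

```python
# minimal sketch (my own, not one of the shipped examples) of measuring
# UDP packet loss offline: given the epoch timestamps collected from one
# channel's UDP packets, which nominally arrive every 0.25 s, count the
# spacings that exceed the nominal interval.

def count_gaps(timestamps, interval=0.25, tolerance=0.05):
    """count consecutive-packet spacings larger than interval + tolerance."""
    ts = sorted(timestamps)
    return sum(1 for a, b in zip(ts, ts[1:]) if (b - a) > interval + tolerance)

# three packets with one missing in between -> one gap
print(count_gaps([1701883395.705, 1701883395.955, 1701883396.455]))  # 1
```

run it over a week or so of collected timestamps and divide the gap count by the expected packet count to get a loss percentage.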

because i am unable to provide any absolute answers here, i would suggest setting up your own tests to measure each solution’s performance to determine what could work for your use-case:

  • what percentage of UDP data packets are lost in a real-world scenario? (carried out for a relatively long period of time, say, at least a week)
  • what is the average data latency when transferring data via seedlink?
  • how much data latency can be tolerated?
  • is there any possibility to use both protocols to deliver the data to your server?

and while i’m here, this seems like a good time to remind everyone reading this thread that while we at Raspberry Shake have no control over how you use our instruments, nor how the data generated by them is ultimately used, we very explicitly absolve ourselves of any responsibility and liability when they are used for any purpose beyond that of hobby. this is stated in the usage license here, in the last section, titled “LIMITED LIABILITY & UPTIME”.

to be even clearer, the last paragraph of this section of the license states:

“… the Shake network is not to be used as an early warning system. While we cannot stop you from using it in such a way, we must state that using the network for reasons associated with saving human lives is not how it should be used. Furthermore, we refuse to accept responsibility for any negative outcomes due to the network being unavailable at any given moment.”

i hope that i have sufficiently addressed your questions; please let me know if not.

cheers,
richard


@ivor, Thank you so much for the thorough rundown of mini-SEED compression and TCP; this gives me a lot to work with in my testing of the pros and cons of each method for my use case.


A quick comment on UDP packet loss:

On a local network (or better still, a single network segment) where neither the network nor the machines are overloaded, UDP packet loss is going to be zero (or as close to that as makes no difference).

Losses increase as the number of networks and network devices transited increases, but are still (IMHO) amazingly low.

But if you want to improve things, using TCP is a better solution.
To do this, choose another system on the same network as your shake.
As an example, my shake has the IP: 10.0.0.125
I set up UDP data to 10.0.0.4 port 8888.
On 10.0.0.4 I ran:

$ nc -u -l 8888 | nc 10.0.0.143 8889

On 10.0.0.143:

$ nc -l 8889
{'EHZ', 1701883395.705, 16872, 16903, 16851, 16942, 16859, 16785, 16823, 16770, 16712, 16856, 16900, 16836, 16784, 16743, 16860, 16950, 17023, 16887, 16839, 16869, 16863, 17018, 16846, 16872, 16961}{'HDF', 1701883395.705, 10575, 11565, 11127, 9750, 11369, 11585, 11102, 12517, 12209, 11276, 12011, 12081, 11486, 12472, 12267, 10906, 11475, 11455, 11573, 11980, 11671, 11089, 10957, 12160, 11309}{'EHZ', 1701883395.955, 16951, 16923, 16894, 16919, 16925, 17021, 16840, 16875, 16912, 16841, 16898, 16879, 16757, 16772, 16855, 16729, 16745, 16773, 16721, 16820, 16892, 16834, 16819, 16671, 16612}{'HDF', 1701883395.955, 10854, 11832, 10975, 11078, 11135, 11121, 12224, 12186, 11598, 11449, 11402, 10982, 11635, 11438, 11313, 12160, 12099, 11529, 11738, 12659, 12071, 12089, 13038, 12894, 12475}{'EHZ', 1701883396.205, 16791, 16895, 16841, 16751, 16744, 16780, 16831, 16783, 16672, 16692, 16719, 16796, 16764, 16649, 16765, 16865, 16890, 16849, 16727, 16707, 16678, 16666, 16762, 16728, 16702}{'HDF', 1701883396.205, 12422, 11681, 11629, 12859, 12344, 11656, 12185, 11646, 11446, 12582, 12711, 12206, 12499, 12300, 12444, 12648, 11859, 11415, 11487, 11041, 11646, 12836, 12064, 11648, 12015}{'EHZ', 1701883396.455, 16694, 16717, 16709, 16610, 16531, 16569, 16738, 16655, 16513, 16498, 16410, 16528, 16631, 16577, 16565, 16603, 16632, 16570, 16598, 16580, 16556, 16607, 16602, 16551, 16516}{'HDF', 1701883396.455, 11420, 11506, 12311, 11510, 11164, 11705, 11844, 11837, 12430, 12130, 11627, 12946, 12688, 12447, 12556, 11585, 11723, 12174, 12110, 12085, 12189, 12255, 13326, 12800, 11228}{'EHZ', 1701883396.705, 16489, 16508, 16527, 16464, 16389, 16510, 16513, 16459, 16522, 16325, 16377, 16423, 16410, 16454, 16417, 16479, 16552, 16619, 16423, 16427, 16484, 16508, 16600, 16534, 16530}{'HDF', 1701883396.705, 12302, 12953, 12197, 11760, 11993, 12050, 11885, 11653, 11599, 11762, 12920, 12865, 11649, 12248, 12313, 11866, 11303, 11369, 11651, 11736, 11858, 11712, 11323, 11119, 11521}{'EHZ', 
1701883396.955, 16474, 16469, 16447, 16445, 16522, 16340, 16337, 16478, 16429, 16453, 16484, 16417, 16399, 16452, 16436, 16381, 16315, 16476, 16476, 16378, 16412, 16496, 16546, 16542, 16499}{'HDF', 1701883396.955, 11725, 11960, 12294, 12466, 11463, 12500, 13424, 11702, 12178, 12270, 11445, 11314, 11681, 12009, 11306, 11745, 12273, 12614, 12504, 11239, 11564, 12244, 11650, 11066, 11071}{'EHZ', 1701883397.205, 16467, 16602, 16535, 16500, 16517, .......

Using nc to strip out the UDP data and re-encapsulate/address it in TCP may be good enough for your purposes. I would put each end in a loop so that if anything breaks, it is automatically re-started. Or you might want to code a specific application to do this - it wouldn’t be hard…

You could also write a small application (or maybe even use a bit of shell script and nc) running on the 'shake itself … but personally, I prefer to keep custom code off my shake.
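For what it's worth, a minimal Python sketch of such a dedicated application (the addresses below are the examples from this thread, and `parse_dest`/`relay` are names I made up, not an existing API):

```python
import socket
import time

def parse_dest(spec):
    """Parse a 'host:port' string into the (host, port) tuple sockets expect."""
    host, _, port = spec.rpartition(":")
    return host, int(port)

def relay(udp_port, tcp_dest):
    """Forward each UDP datagram to tcp_dest; redial if the TCP side drops."""
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind(("", udp_port))
    while True:                                   # outer loop: auto-reconnect
        try:
            with socket.create_connection(tcp_dest, timeout=10) as tcp:
                while True:
                    data, _ = udp.recvfrom(4096)  # one Shake UDP packet
                    tcp.sendall(data)
        except OSError:
            time.sleep(1)                         # brief pause, then redial

# usage (example addresses from this thread; adjust to your network):
#   relay(8888, parse_dest("10.0.0.143:8889"))
```

Unlike the nc pipeline, the reconnect loop is built in, so a dropped TCP connection doesn't need an external restart wrapper.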


Hey @ivor, thanks again for the thorough response on the TCP and UDP pros and cons. One last question related to this topic. I have noticed in practice, with a couple of our RS3D's, that some stations send with higher or lower TCP latencies than others. In sniffwave examples using the Earthworm SeedLink connection, I find that I get data "Feed" latencies from RA192 of between 2.2 s and 3.5 s, whereas our REDAF station shows a consistent 2.0-2.5 s feed latency between packets.

RA192 Station:

>> sniffwave WAVE_RING_SLINK RA192 EHZ wild wild
...
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 324 100.0 2024/02/19 14:59:22.80 (1708354762.8020) 2024/02/19 14:59:26.03 (1708354766.0320) 0x00 0x00 i66 m204 t19 len1360 [D: 0.9s F: 3.2s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 300 100.0 2024/02/19 14:59:26.04 (1708354766.0420) 2024/02/19 14:59:29.03 (1708354769.0320) 0x00 0x00 i66 m204 t19 len1264 [D: 0.9s F: 3.0s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 295 100.0 2024/02/19 14:59:29.04 (1708354769.0420) 2024/02/19 14:59:31.98 (1708354771.9820) 0x00 0x00 i66 m204 t19 len1244 [D: 0.5s F: 2.6s]
RA192.EHZ.AM.00 (0x32 0x30) 0 i4 218 100.0 2024/02/19 14:59:31.99 (1708354771.9920) 2024/02/19 14:59:34.16 (1708354774.1620) 0x00 0x00 i66 m204 t19 len 936 [D: 0.7s F: 2.4s]
...

REDAF Station:

>> sniffwave WAVE_RING_SLINK REDAF EHZ wild wild
...
REDAF.EHZ.AM.00 (0x32 0x30) 0 i4 206 100.0 2024/02/19 14:57:58.77 (1708354678.7740) 2024/02/19 14:58:00.82 (1708354680.8240) 0x00 0x00 i66 m211 t19 len 888 [D: 0.6s F: 2.0s]
REDAF.EHZ.AM.00 (0x32 0x30) 0 i4 206 100.0 2024/02/19 14:58:00.83 (1708354680.8340) 2024/02/19 14:58:02.88 (1708354682.8840) 0x00 0x00 i66 m211 t19 len 888 [D: 0.9s F: 2.4s]
REDAF.EHZ.AM.00 (0x32 0x30) 0 i4 206 100.0 2024/02/19 14:58:02.89 (1708354682.8940) 2024/02/19 14:58:04.94 (1708354684.9440) 0x00 0x00 i66 m211 t19 len 888 [D: 0.9s F: 2.0s]
REDAF.EHZ.AM.00 (0x32 0x30) 0 i4 206 100.0 2024/02/19 14:58:04.95 (1708354684.9540) 2024/02/19 14:58:07.00 (1708354687.0040) 0x00 0x00 i66 m211 t19 len 888 [D: 0.8s F: 2.0s]
...

I will note that near REDAF there is a motor that runs constantly in the background at about 28 Hz. I imagine (related to your comments) that the motor strongly affects the sensor, likely resulting in poorer data compression and causing packets to be sent more frequently than at our RA192 station.
My question is: is there a way I could modify the SeedLink configuration on the RS itself to decrease the packet size required before sending, so that packets are sent more frequently? A packet sent every 1-2 seconds over TCP would be ideal, rather than the current 3-4 second delay.