Upload data volume

I know this was asked on the prior forum, but is there any update on reducing the bandwidth for uploading data? I expect I’ll be on a cell based hotspot in the near future, so I have a real motivation to help out if I can. Is there a way to participate in the development of the rshake software? This sort of thing is kind-of up my alley. My alternative is going off the air, so to speak.
Thx, Chris

Hi Chris,

Moving to binary data transmission is in our development plans for 2019. Richard estimates an update between the end of this quarter and the end of the fourth quarter of the year with this improvement included.

I know that’s a pretty big range and might not be quick enough for your transition to hotspot internet. We appreciate your patience; there are a number of moving parts that we have to be careful not to break in the process.

As stated in a previous post, the current network data usage is ~2 GB per channel per month. Upon implementation of binary data transmission:

Ian

Hi Ian, I’m sure it’s a big change, but I’m a little surprised there’d only be a 4x reduction. I’d captured a few transmissions a while ago, but can’t find them at the moment, but I’ll bet there’s more cheese to be found here. :^)
Thx, Chris

I get about 350 MB/month on the RShake, but more like 570 MB/month for RBoom. This is due to the (numerically) large amount of noise on the signal, which results from basically “too much gain” on the input side, plus the fact that the mini-seed compression is linear and lossless.

If you were to take the analog values and right shift them 7 places for a divide-by 128 effect, then I would bet the RBoom files would go down to something similar to RShake.

I think nothing would be lost in scaling. Background noise would shrink from about 1000 counts to 8 counts - quite the same as the Rshake data. The calibration constant would go from a nominal value of 4000 to 31.25.
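For anyone who wants to try the scaling idea, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the counts are plain signed integers; it is not how the Shake firmware handles the data.

```python
def rescale_counts(counts, shift=7):
    """Right-shift each count by `shift` bits, i.e. divide by 2**shift (floored)."""
    return [c >> shift for c in counts]


def rescale_calibration(counts_per_unit, shift=7):
    """Scale the calibration constant by the same factor as the data."""
    return counts_per_unit / (1 << shift)


noise = [1000, -1000, 128]
print(rescale_counts(noise))        # [7, -8, 1]
print(rescale_calibration(4000.0))  # 31.25
```

Note that `>>` floors negative values (-1000 becomes -8, not -7), so a real implementation might prefer rounding to nearest.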

Ken

Hi Ken, I’m reasonably sure the problem isn’t the amount of noise. Due to the nature of the data, there are 100 samples/sec per sensor to send, as JSON (text), and for RBoom, there are two streams, so ~double the data makes sense. It’s easy enough to capture the actual data with tcpdump to see what I mean. I’ve been thinking about this for a while, and you have to consider how you’d encode the data in a more compact way. The sample values have 24 bits of resolution each. Basic compression might help, but given the randomness of the values, they don’t compress all that well. Not sure if floating point would help either, since it has to convert back to absolute values at the other end (lossy). I’ve considered .wav files, since it is “50 Hz audio” for all intents and purposes. No idea if that’s any better either.
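To put rough numbers on the text-versus-binary point, here is a small Python sketch comparing encodings of one second of made-up 24-bit samples. The `{"data": [...]}` packet layout is purely hypothetical and is not the actual Shake wire format; the sample values are synthetic.

```python
import json
import struct


def payload_sizes(samples):
    """Return the byte count of several encodings of the same samples."""
    packet = {"data": samples}  # hypothetical packet layout
    as_json = json.dumps(packet).encode("utf-8")
    # Compact separators drop the spaces after "," and ":".
    as_json_compact = json.dumps(packet, separators=(",", ":")).encode("utf-8")
    # "<i" is a little-endian signed 32-bit integer; 24-bit samples fit in it.
    as_int32 = struct.pack(f"<{len(samples)}i", *samples)
    # Three bytes per sample is the tightest fixed-width fit for 24 bits.
    as_int24 = b"".join(s.to_bytes(3, "little", signed=True) for s in samples)
    return {
        "json": len(as_json),
        "json_compact": len(as_json_compact),
        "int32": len(as_int32),
        "int24": len(as_int24),
    }


# 100 synthetic samples (one second at 100 samples/sec), alternating in sign.
samples = [(-1) ** i * (100000 + 37 * i) for i in range(100)]
for name, size in payload_sizes(samples).items():
    print(f"{name:>13}: {size} bytes")
```

Even before any compression, fixed-width binary is a few times smaller than the text form, and stripping whitespace from the JSON already helps a little.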
Chris

I started a careful analysis of the data transmission and found that, with the Forward data option activated, the transmission volume is much higher than declared. I used the ifconfig command on the router to estimate the RX+TX volume: activating the Forward option takes it from 2.3 GB/month to 20.5 GB/month, an excess of more than 18 GB/month, which is way beyond our limit of 5 GB/month.
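For anyone wanting to repeat this kind of measurement, a rough Python sketch follows. It assumes a Linux box that exposes the standard sysfs byte counters, and the interface name `eth0` is only a placeholder. It counts all traffic on the interface, not only what the Shake sends.

```python
from pathlib import Path


def read_counters(iface="eth0"):
    """Return (rx_bytes, tx_bytes) from the Linux sysfs counters for `iface`."""
    stats = Path("/sys/class/net") / iface / "statistics"
    rx = int((stats / "rx_bytes").read_text())
    tx = int((stats / "tx_bytes").read_text())
    return rx, tx


def monthly_gb(bytes_start, bytes_end, elapsed_seconds, days_per_month=30):
    """Extrapolate a byte-counter difference to GB (10**9 bytes) per month."""
    rate = (bytes_end - bytes_start) / elapsed_seconds  # bytes per second
    return rate * 86400 * days_per_month / 1e9


# Example with made-up readings: 1,000,000 bytes in 1000 seconds.
print(monthly_gb(0, 1_000_000, 1000))
```

Take two readings of `sum(read_counters())` a few hours apart, once with forwarding on and once with it off, and pass each pair to `monthly_gb` along with the elapsed time.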

We also have a second Raspishake, but we can no longer access its HTML interface (it is in the field). I wanted to make a comparison, but I can no longer activate Forward on it. I asked in another post how to activate/deactivate it from the command line, and I got a reply that a program will be provided to do that…

Thanks for the reply. This issue (for RBoom) has been a bit like pushing on a string. But I think I am right.

My RBOOM is infrasound only - so there is but one data stream.

My RShake is a separate 1D unit - so again 1 data stream only.

The SEED compression algorithm converts constant length data into variable length data. It does this by computing simple arithmetic differences between samples and seeing how many bits are required to encode the difference. If the signal is wandering around the range of, say, +/-10 then only 5 bits are required to record each sample of the signal. If the signal has differences of 1000 between adjacent samples then 11 bits are required for each sample. So the data portion of the resulting compressed record is roughly twice as large. See appendix B of this reference:

It’s not just data storage - the responsiveness of SWARM is noticeably slower when working with RBOOM data.
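The bit-count argument about sample differences can be checked with a short Python sketch. This is a simplification: it only computes the smallest two's-complement width for each difference, and it does not reproduce the actual frame packing used by the SEED compression, which allows only a few fixed widths.

```python
def bits_for_signed(value):
    """Smallest two's-complement width (in bits) that can hold `value`."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def max_difference_bits(samples):
    """Width needed for the largest first difference between adjacent samples."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return max(bits_for_signed(d) for d in diffs)


print(bits_for_signed(10), bits_for_signed(-10))  # 5 5
print(bits_for_signed(1000))                      # 11
print(max_difference_bits([0, 1000, 0]))          # 11
```

So a signal whose sample-to-sample differences stay within +/-10 needs 5 bits per difference, while differences of around 1000 need 11, which matches the "about twice as large" estimate.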

I did not mention that I have a 4D, 4 channels.

I also tried the second Raspishake and the result was the same. With both of them active on the same router, the equivalent consumption rose to about 38 GB/month. Is there something I can do to further verify the data consumption? We cannot keep it on because it exhausts the credit we have (5 GB/month), but if a log file can help you understand whether there are problems in our version, let me know.

There was no further reaction to my posts where I identified a very large data consumption (up to 18 GB/month for a 4D device). Is there any measurement I can make to check why I get such a huge difference with respect to what the developers suggested (2 GB/channel/month, so 8 GB per month in my case)?

hi there,

sorry for the delay, we will be getting back to you, by tomorrow, if not today.

cheers,

richard

As we’ve noted in the past, the upload volume for an RS4D is typically within a range of 8-14 GB/mo. That’s not a guess; it’s based on our calculations and verified by measurements we’ve made here in the office and in the field. If you are seeing something larger than that, then there must be additional traffic being included in the usage computations for that connection. This may have happened as a result of:

  • Direct data link(s) from the outside via e.g. SWARM or FileZilla, or another Seedlink/wave server client
  • UDP data forwarding from the Shake to an outside address
  • Another program using the Pi’s uplink

Here are the calculations for the Shake 4D’s data forwarding usage. These have been tested and confirmed:

Max data / second:
= number of bytes for headers + number of bytes for data points

chn = total number of channels
dpf = data packet frequency, i.e., number of data packets / second
dph = data packet header size
sps = sample rate, as samples / second
bdp = max bytes / data point

= chn * dpf * dph + chn * sps * bdp
= 4 * 4 * 80 + 4 * 100 * 11
= 1280 + 4400
= 5680 bytes / second

Max data per day:
= 5680 * 86400
= 490752000 bytes
= 468 MB

Average data per day (assuming 5 bytes / data point):
= 270MB / day / 100Hz 4D

Max data per month:
468 MB * 30 days = 14040 MB/mo

Average data per month:
270 MB * 30 days = 8100 MB/mo
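The arithmetic above can be reproduced with a few lines of Python, using the same variable names as the definitions. This is only a sketch of the calculation as posted; it takes 1 MB as 1024 * 1024 bytes, which is what the 468 MB figure implies.

```python
def bytes_per_second(chn, dpf, dph, sps, bdp):
    """Header bytes plus data-point bytes sent each second."""
    return chn * dpf * dph + chn * sps * bdp


def mb_per_day(rate_bytes_per_second):
    """Convert a byte rate to MB per day, with 1 MB = 1024 * 1024 bytes."""
    return rate_bytes_per_second * 86400 / (1024 * 1024)


max_rate = bytes_per_second(chn=4, dpf=4, dph=80, sps=100, bdp=11)
avg_rate = bytes_per_second(chn=4, dpf=4, dph=80, sps=100, bdp=5)

print(max_rate)                        # 5680
print(f"{mb_per_day(max_rate):.0f}")   # 468
print(f"{mb_per_day(avg_rate):.0f}")   # 270
```

Changing `chn` to 1 gives the corresponding estimate for a single-channel unit under the same assumptions.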

Thanks for that, Ian. For comparison, my /measured/ upload from a 1D is
5330 bits/second, or 1.68 GB/month:

https://www.satsignal.eu/mrtg/raspi14.html

Cheers,
David

I love those (gnuplot?) graphics @GM8ARV. Thank you for posting them and illustrating the measurements!

A log file would at least help us identify if there was some RShake data activity other than the data forwarding. Post it and we’ll take a look.

Hi folks
Wondered if there is an update on the binary data upload feature?
With my shake4D, I’m seeing about 468MB per day, which agrees with the maximum expected value posted earlier (I guess this means the compression isn’t effective for some reason…?). Anyway, would love to have the data usage reduced to match the size on disk.

hello,

the first step in reducing the data payload size was fixing a bug where the white space was not being removed before sending; this was done in v0.16. the next step is to convert the stream completely to compressed binary, which is underway. this will be completed in Q2 this year and released as part of a normal system update.

apologies for the continued delay.

richard

Hi Folks
Checking back again on the upload data reduction question. To circle back around, we ended up with unlimited VDSL service (15Mb/s down, 5Mb/s up). I’m still interested in reducing that usage, and I’ve been off-the-air for quite some time. I’m finally moving my 4D to a detached garage and plan to re-enable data upload now that the data should exclude our regular daily activity :^). It’s quite impressive how trouble free this unit has been. I do keep it on a UPS and have an SSD using a USB <=> SATA converter.
Thx, Chris

Hello Chris, and welcome back to our community!

We have implemented a data protocol that, as of now, has reduced the amount of transmitted data down to a third of what it was before. Regarding binary compression, we have been gradually improving our server infrastructure for it to be able to sustain this new method of data transmission, and it will thus be implemented in the near future.

We will, as always, provide updates once everything is in place. Thank you for your continued interest!

Time to repeat my old suggestion that, on the RBOOM side, you could divide all the data values by 10 (or 8 if more convenient) without losing anything. This is because of the amount of instrument noise in the signal. See this posting for the noise with the air “short-circuited”.

The no-signal reading is +/- 1000 counts and this is faithfully recorded by the lossless compression scheme that Raspberry uses. A lossless compression scheme achieves zero compression on a signal that is all noise.

I note that originally it was thought that the gain of the instrument was 4000 counts per pascal. It turned out to be 56000 counts per pascal. Maybe that is why there is 10 dB too much gain/noise in the signal.

The problem would be how to handle the change in the instrument response …