benjamin.computer

Running BirdNET 24/7 in a different way.

09-01-2024

I've been using the AI program BirdNET for some time now. I use the smartphone app quite often when I'm out and about. We live rurally, so we're lucky to hear bird calls quite often. Thanks to BirdNET, I now know what a pheasant sounds like (spoiler: it's pretty weird and unique!). It's one of those examples of a genuinely helpful and benevolent AI. I really like it!

I wanted to get it up and running 24/7, but using the gear that I currently have. I didn't want to buy anything new for this project. Rather, I thought it might be nice to learn a few new things and attempt to build something a bit more personal.

The existing hardware

I bought a UniFi UVC G3 Flex camera a while back. I wanted to monitor our chickens out the front. It has some infrared in there for seeing things at night. It also, crucially, has a microphone. For a while, I had the microphone turned off, as mostly I'd just hear road noise.

The second piece of the puzzle is the home server I've had set up for a while. It's a small Dell Optiplex machine - nothing too fancy - running FreeBSD. I chose this operating system for a number of reasons, many of which are discussed in this excellent blog post at https://unixsheikh.com/articles/technical-reasons-to-choose-freebsd-over-linux.html. The tl;dr? It's a bit tighter than Linux, doesn't use systemd and, most importantly, has excellent support for ZFS. A while back I had a look at which filesystem I wanted to use for all my data - Btrfs, ext4, NTFS and so on. I found that ZFS had everything I wanted, such as encryption, RAID and mechanisms for recovery. I've been running it for a while now and I'm quite happy. However, the more conservative system does mean running cutting-edge Linux programs isn't always possible.

I have a few old Raspberry Pi Model B+ boards lying around too! They're much too slow to do much with normally, but for this project they'll be perfect. One of them has a screen attached to it, which makes it ideal for the final part - we'll get to that in a bit!

Osprey
One of the larger birds that live near us - an Osprey!

BirdNET

BirdNET itself is a TensorFlow-based model which you can download for free from its GitHub page. I've been using the phone-based version for some time now. The project seems to have grown quite a bit since I last looked at it. You can now buy specific devices running BirdNET - something called a PUC - and the site BirdWeather has been going for some time now. Simply hook up your Raspberry Pi running BirdNET-Pi to a microphone and you are away.

I didn't want to publish my findings to the internet just yet - probably I will in the future. I wanted something a bit more personal, using the existing hardware I have lying around. Since BirdNET runs on a phone, I figured it can't be that demanding, so I set to work getting it running on our home server.

RTSP

I wondered how I could access the microphone on the camera. It turns out I needed to learn a bit about the Real-Time Streaming Protocol (RTSP), which has been around for a while. Remember Netscape or RealPlayer? Yep, it was written (in part) by those folks! Incidentally, I hear RealPlayer is still going. Good for them!

On its own, though, RTSP isn't much use to me. I had tried to get the video feed from the camera up on a webpage, but it was quite a pain. I did try a Go program over on GitHub, but I found it a bit flaky; possibly that's down to the various browsers, but I couldn't be sure. In the end, I resorted to everyone's favourite program, FFmpeg, to transcode the stream into a better format that I could display via a webserver. If FFmpeg can cope with the video, I'm sure it'll be fine for the audio.

The command I used looked a bit like this:

ffmpeg -i rtsp://192.168.2.156/s1 -vn -ar 44100 -ac 1 -t "00:00:05" output.wav

This grabs a five-second WAV file from the camera. Sorted!

I use Python for gluing things together. I'm a tad bored of it by now, but it certainly is useful. It turns out there's a Python library for FFmpeg (of course there is!) called ffmpegio. Perfect! With this, we can call ffmpeg from within a Python script and easily move our data around.
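As a quick sketch of how that looks (same stream URL, throwaway output name - the full script comes later), the shell command above becomes a couple of lines of Python:

# The same five-second grab as the shell command above, driven from Python.
from ffmpegio import ffmpeg

ffmpeg(['-i', 'rtsp://192.168.2.156/s1', '-vn', '-ar', '44100',
        '-ac', '1', '-t', '00:00:05', 'output.wav'])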

FreeBSD and TensorFlow

BirdNET can run in what it calls 'server mode'. It sits on a port and listens for HTTP POST requests consisting of multipart form data. Essentially, it's like a little webserver, only it's waiting for a stack of bytes representing the sound. This is the perfect mode for us. The only problem is, we need to get it working on our FreeBSD box.

TensorFlow is the big component here. It's not a straightforward library to build, and build it I must, as FreeBSD is not one of the officially supported platforms - I can't just use pip.

TensorFlow uses Bazel as its build system - not something I've used before. Fortunately, so long as I install it through FreeBSD's package manager, everything is fine. There are a number of other things that need installing too, such as protobuf and LLVM. I found that only Bazel 5 works on FreeBSD, which isn't the latest version.

FreeBSD has something called ports - a collection of third-party software you can build from source for FreeBSD, including things that haven't made their way into the official pkg system. It turns out there is a port of TensorFlow for Python. Fortunately, I don't need GPU support this time (my server doesn't have a GPU), so there's no need to install all the NVIDIA stuff like CUDA and cuDNN.

To start the build, I needed to run the following:

cd /usr/ports/science/py-tensorflow
make DISABLE_VULNERABILITIES=yes

Now, I found there were a couple of problems. One of the supporting libraries Bazel pulled in - re2 - has a bug in it. I needed to alter the code by hand in vim (external/com_googlesource_code_re2/re2/dfa.cc) and remove a preprocessor statement, I believe.

The program gfortran needs installing, but because FreeBSD appends version numbers to the names of these executables, Bazel can't find them; a symlink to the plain name needs creating. The same was true for llvm-config.
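For me, that meant something along these lines - the version suffixes here are only examples, as they depend on which gcc and llvm packages pkg has installed:

# Example only - adjust the version suffixes to whatever pkg installed.
ln -s /usr/local/bin/gfortran13 /usr/local/bin/gfortran
ln -s /usr/local/bin/llvm-config15 /usr/local/bin/llvm-config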

Once I'd gotten around all these hiccups, I managed to get TensorFlow to build. It took a long time - a couple of days in the end - and a lot of faffing, but it is possible. Good news for future neural nets at home!

The next thing was to test BirdNET. Setting it up involves creating a Python virtual environment and installing some libraries, pretty much as follows:

cd birdnet
python -m venv venv
source ./venv/bin/activate
pip install librosa resampy bottle
sudo pkg install ffmpeg    # the upstream instructions say apt-get, but this is FreeBSD

One can then test BirdNET using the following command:

 python analyze.py --i ./example/soundscape.wav --o ~/tmp/birdnet.csv

If it's all set up and working, the results should appear in the birdnet.csv file.

For the final step, I needed to run BirdNET in server mode. In this configuration, BirdNET waits for a POST request of multipart form data containing the WAV file in question. The BirdNET repo contains the example code to do this, so I just copied it from there. To run BirdNET in server mode, the following command worked for me:

 python server.py --host 127.0.0.1 --port 9013 --spath /usr/local/www/birdnet

The server now listens on port 9013. I have an nginx instance that forwards requests for a particular URL on to this service. All the parts are now in place. We just need a program that grabs the WAV file, sends it to the server and presents the result.

D-Bus

The first idea I had was to use a D-Bus alert. D-Bus is a form of interprocess communication on Linux. You can post messages to the bus, and any program listening will handle the messages sent for it. In my case, I want a small Python script to create the WAV file, get the result from the server, then send the birds it found to D-Bus, so an alert appears on my desktop. Perfect! This blog post has an excellent write-up on how to do this.

The final result looks a bit like this:

"" Record the audio from an RTSP stream using
ffmpegio, then send to BirdNet for analysis."""

import os
from ffmpegio import ffmpeg
import json
import time
import requests
import dbus

def main(args):
    # Delete the wav file if it exists already.
    if os.path.exists(args.out):
        os.remove(args.out)

    # Call FFMPEG with the options we want
    duration = str(args.duration).zfill(2)
    command = ['-i', 'rtsp://192.168.2.156/s1' ,'-vn', '-ar', '44100', '-ac', '1', '-t', "00:00:" + duration, str(args.out)]

    if args.verbose:
        print("Excuting ffmpeg command: ffmpeg", " ".join(command))

    ffmpeg(command)

    # Now format the request and send it to the server
    url = f"{args.server}/analyze"

    # Make payload to send to the server

    mdata = {
        "sf_thresh": 0.03,
        "pmode": "avg",
        "num_results": 5,
    }

    audio_bytes = None

    with open(args.out, 'rb') as f:
        audio_bytes = f.read()

    filename = args.out.rsplit(os.sep, 1)[-1]
    multipart_form_data = {"audio": (filename, audio_bytes), "meta": (None, json.dumps(mdata))}

    # Send request
    start_time = time.time()
    response = requests.post(url, files=multipart_form_data)
    end_time = time.time()

    print("Response: {}, Time: {:.4f}s".format(response.text, end_time - start_time), flush=True)

    # Convert to dict
    data = json.loads(response.text)
    heard = []

    for bird_name, confidence in data["results"]:
        if confidence >= args.confidence:
            tokens = bird_name.split("_")
            heard.append(tokens[1])

    if len(heard) > 0:
        bird_text = ", ".join(heard)
        item = "org.freedesktop.Notifications"

        notify_intf = dbus.Interface(
            dbus.SessionBus().get_object(item, "/" + item.replace(".", "/")), item)

        notify_intf.Notify(
            "", 0, "", "BirdNET", bird_text,
            [], {"urgency": 1}, 3000)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description='RTSP Reader for BirdNet')
    parser.add_argument('--out', default="chickens.wav")
    parser.add_argument('--duration', type=int, default=5)
    parser.add_argument('--confidence', type=float, default=0.2)
    parser.add_argument('--server', default="http://192.168.2.4/birdnet")
    parser.add_argument("-v", "--verbose", help="increase output verbosity",
                    action="store_true")
    args = parser.parse_args()

    main(args)

There are more things I can send in the request, such as confidence levels and location. I'll be experimenting with these in the future.
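I haven't tried these yet, but I'd expect the metadata payload to grow roughly like this - the lat/lon/week field names are an assumption on my part, so check them against the BirdNET server code before relying on them:

# Hypothetical extension of the metadata payload - the lat/lon/week field
# names are an assumption and need checking against the BirdNET server code.
mdata = {
    "lat": 52.2,        # rough latitude of the garden
    "lon": -3.9,        # rough longitude
    "week": 2,          # week of the year, to narrow down the species list
    "sf_thresh": 0.03,
    "pmode": "avg",
    "num_results": 5,
}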

Old Pi, new tricks!

birdnet raspberry pi.
The final setup - a little Raspberry Pi saying which bird it has heard.

But really, a D-Bus alert is just the beginning. I wanted something that would sit on a shelf and quietly post a message showing what it had heard. I figured this should be possible. I have a bunch of really old Raspberry Pis lying around (who doesn't, right?! Probably in the same drawer as all those conference badges, I'll wager! :D). I'd rather at least one of them did something useful. I have an HDMI screen for one that sits nice and snug on the front. It's only a Raspberry Pi B+, but it's fast enough for this job. I installed the legacy Raspbian OS with no Xorg or X11; it just has the console and framebuffer.

I found a case on Thingiverse that fits both the Pi and screen perfectly. With that assembled, I think it looks quite neat.

The final bit is to get the code running. I used the Python library asciimatics to get some nice text on screen (having given up on ncurses for now - maybe later).
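The heart of it is not much more than this kind of thing - a minimal sketch rather than the exact code I run, with the text hard-coded just to show the idea:

# Minimal sketch: draw a detection on the console using asciimatics.
# In the real script the text would come from the BirdNET results.
from asciimatics.screen import Screen

def show(screen):
    screen.clear()
    screen.print_at("Heard: Common Pheasant", 2, 2)
    screen.refresh()
    screen.wait_for_input(10)  # keep it on screen for ten seconds

Screen.wrapper(show)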

Why do this?

One of the things that doesn't come up that often when technical folk talk about AI (or any tech really!) - at least from what I've seen - is the idea of spirit. Does this new technology make your life better - and I don't mean more efficient? Does it put a smile on your face? Does it surprise you? Think of the first time you got hold of a Lego set, maybe, or the first time you saw a planet through a telescope. Sometimes, technology can really brighten up your day. I think this is one of those times. Every time I see the little screen, I'm reminded that there are birds outside. I know which kinds and, more importantly, I stop what I'm doing and just listen out for a little while. It takes me out of myself for a short time. I sometimes even go outside - shock - when a message catches me at just the right time.

I consider this sort of thing to be a win for AI - not the nonsense that tries to control your life, replace jobs or reinforce some stereotype. In this regard, AI is the same as any other technology or tool.

