I have a small wrapper around rip2, aliased to `recycle`; files go to a `graveyard` ZFS dataset. I deny `rm` usage for agents; a simple (global) instruction pointing to `recycle` seems to do the trick for Claude.
Seems like a quick win to remove some downside risk and make me a bit more comfortable letting agents run wild in local workspaces.
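A rough sketch of the general idea in Python, for illustration only: move the file into a graveyard directory rather than unlinking it. This is not rip2's implementation or CLI; the `recycle` function signature, the nanosecond-timestamp layout, and the graveyard path are all hypothetical.

```python
import shutil
import time
from pathlib import Path


def recycle(path, graveyard):
    """Move `path` under `graveyard` instead of deleting it; return the new location."""
    # absolute() (not resolve()) so a symlink is moved itself, not its target.
    src = Path(path).absolute()
    # Hypothetical layout: <graveyard>/<timestamp>/<original absolute path>.
    # The timestamp makes repeated deletions of the same path unlikely to collide,
    # and mirroring the original path keeps the origin recoverable.
    stamp = str(time.time_ns())
    dest = Path(graveyard) / stamp / src.relative_to(src.anchor)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    return dest
```

If `graveyard` lives on its own dataset, retention can then be handled separately (snapshots, a periodic cleanup job) without touching the workspace.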
oh man that's awesome. I have been working for quite some time on big taxonomy/classification models for field research, especially for my old research area (pollination stuff). the #1 capability that I want to build is an audio input modality. it would just be so useful in the field, not only for low-resource (audio-only) field sensors, but also as a supplemental modality for measuring activity outside the FoV of an image sensor.
but as you mention, labeled data is the bottleneck. eventually I'll be able to skirt around this by just capturing more video data myself and learning sound features from the video component, but I have a hard time imagining how I can get the global coverage that I have in visual datasets. I would give anything to trade half of my labeled image data for labeled audio data!
Hi Caleb, thanks for the kind words and enthusiasm! You're absolutely right, audio provides that crucial omnidirectional coverage that can supplement fixed field-of-view sensors like cameras. We actually collect images too and have explored fusion approaches, though they definitely come with their own set of challenges, as you can imagine.
On the labeled audio data front: our Arctic dataset (EDANSA, linked in my original post) is open source. We've actually updated it with more samples since the initial release, and getting the new version out is on my to-do list.
Polli.ai looks fantastic! It's genuinely exciting to see more people tackling the ecological monitoring challenge with hardware/software solutions. While I know the startup path in this space can be tough financially, the work is incredibly important for understanding and protecting biodiversity. Keep up the great work!
I've owned 17 Seagate ST12000NM001G (12TB SATA) drives over the last 24mos in a big raidz3 pool. My personal stats, grouping by the first 3-4 SN characters:
- 5/8 ZLW2s failed
- 1/4 ZL2s
- 1/2 ZS80
- 0/2 ZTN
- 0/1 ZLW0
All drives were refurbs: two from the Seagate eBay store, all others from ServerPartDeals. 7/15 of the drives I purchased from ServerPartDeals have failed, and at least four of those failures came within 6 weeks of installation.
I originally used the Backblaze drive stats when selecting the drive I'd build my storage pool around. Every time the updated stats pop up in my inbox, I check the table and double-check that my drives are in fact the 001Gs, the drives that Backblaze reports as having a 0.99% AFR. I guess the lesson is that YMMV.
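For a rough sense of how far my sample sits from that number, here is a back-of-envelope check in Python. It assumes, generously, that all 17 drives ran for the full 24 months and failed independently at a 0.99% AFR. Neither is true for my pool (replacements ran for less time, and refurbs from one seller are hardly independent), so treat it as a sketch rather than a real reliability analysis.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


afr = 0.0099   # Backblaze's reported annualized failure rate for this model
years = 2      # assumption: every drive was in service for the full 24 months
drives = 17
failures = 7   # 5 + 1 + 1 from the serial-number groups above

# Probability that a single drive fails at least once over the whole period.
p_fail = 1 - (1 - afr) ** years

observed = failures / drives
tail = binomial_tail(drives, failures, p_fail)

print(f"observed failure fraction: {observed:.1%}")
print(f"P(at least {failures} of {drives} fail at {afr:.2%} AFR): {tail:.1e}")
```

Under those assumptions the observed fraction (about 41%) is far outside what a ~1% AFR would produce by chance, which fits the idea that something about these particular refurbs differs from Backblaze's fleet.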
I think impact can have a big influence on mechanical hard drive longevity, so it could be that the way the ServerPartDeals drives were sourced, handled or shipped compromised them.
Sure. The talk about 8-bit refers to quantization-aware training (QAT). It's pretty common in image models these days as a way to reduce the impact of quantization on accuracy.
Typically this means that you simulate an 8-bit forward pass to ensure that the model is robust to quantization ‘noise’. You still use FP16/32 for the backward pass & weight updates for numerical stability.
It’s just a way to optimize the model in anticipation of future quantization. The experience of using an 8-bit Nemo quant should mirror that of using the full-fat bf16 model more closely than it would if they hadn’t used QAT.
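To make the forward/backward split concrete, here is a toy sketch in plain Python. It is not how Nemo was actually trained; `fake_quantize` and `train_step` are hypothetical names, and the symmetric per-tensor scale plus straight-through gradient are my assumptions about one common QAT recipe.

```python
def fake_quantize(values, num_bits=8):
    """Snap floats onto a symmetric signed integer grid, then map them back to floats."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax                      # one scale for the whole tensor
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def train_step(weights, x, target, lr=0.05):
    """One SGD step on a linear model with a simulated 8-bit forward pass."""
    q_weights = fake_quantize(weights)          # forward pass sees quantized weights
    pred = sum(w * xi for w, xi in zip(q_weights, x))
    error = pred - target
    # Straight-through estimator: treat the rounding as identity in the backward
    # pass, so gradients of the squared error flow to the full-precision weights.
    grads = [2 * error * xi for xi in x]
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, error ** 2


weights = [0.3, -0.2]                           # full-precision "master" weights
losses = []
for _ in range(20):
    weights, loss = train_step(weights, x=[1.0, 2.0], target=1.0)
    losses.append(loss)

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

The point is that the loss is always measured through the quantized weights, so the full-precision weights settle somewhere that still works after rounding.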
I haven't managed to successfully export my custom ViT model yet, but I've not had an issue accessing the export methods in torch 2.3 within the nvcr.io/nvidia/pytorch:24.02-py3 container.
I may have some more time to debug my trace tonight (i.e. remove conditionals from model + make sure everything is on CPU) and will update if I have any new insights.
I've pasted QR codes on the backs of bees before! Unfortunately we started this project a week before COVID so we didn't get to see it through, but the actual gluing is easier than you might think. Doesn't look like they used it here, but we planned on using James Crall's BEEtag repo [0].
I'm working on something to let you measure animal activity without pasting QRs [1]. I've been running a casual study of my bird feeders since September, and my system will be field-deployed by a few labs this summer. My background is in pollination ecology, so bee/pollinator tracking is a top priority. If you want to study your own backyard, you can use polliOS with your own IP cameras. Targeting March 1 for beta release, but you can submit your email to get notified.
YMMV, but I have tried this several times and have found it very difficult without microsoldering equipment.
Some ESP variants have a built-in U.FL connector which makes things easier (e.g. AITHINKER ESP32-Cam and its many clones), but you still have to remove the 0Ω resistor, which I find very difficult with even the smallest tips on my Hakko iron.
Could be a skill issue on my end; should be much easier if you have a simple reflow/rework station.
Thank you, it is really good. I was able to find the wind shear forecast pretty easily, and cycle through a lot of detailed images. (obviously, understanding that the further out I go, the wider the error bars will be)
Here's hoping that the wind shear continues to stay strong enough to protect people from any bad hurricanes. Although I don't know what will happen if that ocean energy isn't "bled off" by a few big storms?
Various hibernation modes are great, but 300mA @ 5V sometimes feels too hungry for the amount of computation in there. I bet things would be better if it wasn’t 45nm ‘pitch’. They are great chips, but require more power than you might expect if you’re doing something network-heavy or, god forbid, using the camera. Unfortunately there are no competitive alternatives for the hobbyist.