Tangentially related (and also using the ubiquitous MNIST dataset), Sebastian Lague started a brilliant, but unfortunately unfinished video series on building neural networks from scratch.
This video was an absolute eye-opener for me [1] on what classification is, how it works, and why a non-linear activation function is required. I probably learned more in the 5 minutes of watching this than from doing multiple Coursera courses on the subject.
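To make the point about non-linear activations concrete, here is a small sketch in plain Python (not taken from the video; the weights `W1` and `W2` are hand-picked for illustration). Without an activation, two stacked linear layers collapse into a single matrix, so the network cannot represent XOR. With a ReLU between them, the same weights compute |x1 - x2|, which is XOR on 0/1 inputs.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply matrix a by matrix b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def relu(v):
    """Element-wise ReLU, the non-linear activation used here."""
    return [max(0.0, x) for x in v]

# Hand-picked illustrative weights (an assumption, not learned values).
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0]]

def linear_net(x):
    # Two linear layers with no activation in between.
    return matvec(W2, matvec(W1, x))

def relu_net(x):
    # Same weights, but with a ReLU after the first layer.
    return matvec(W2, relu(matvec(W1, x)))

# The two linear layers are equivalent to this single matrix.
collapsed = matmul(W2, W1)

inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
for x in inputs:
    print(x, "linear:", linear_net(x)[0], "relu:", relu_net(x)[0])
```

With these weights the linear network outputs 0 for every input, while the ReLU network separates the XOR cases.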
As a researcher I always appreciate seeing new opportunities to organize my work and improve my routines. This system does seem interesting.
To this day, though, I've found that Zotero is unbeaten for keeping everything organized. I collect all my notes and documents in the Zotero library, and I sync it across multiple devices by placing the Zotero files in a Dropbox folder.
This way I can use whatever app I want to write the actual notes (MD, txt, docx, whatever). I organize the notes in Zotero "folders", and the documents for each note are stored in "sub-folders". The best thing is that these are not actual folders, so the same document, if relevant to multiple research projects, can be placed in two or more folders/sub-folders.
This setup has worked for me for nearly 8 years, with over 50 publications and over 5000 documents in my Zotero library. And best of all, the only thing I'm actually paying for is Dropbox, which I would pay for anyway and which, IMHO, is totally worth it. But that's another story. More importantly, to get started one can rely on the free tier of Dropbox, so even that's free.
So as a researcher (which translates to little money to spare and high volumes of documents to manage), I have yet to find a solution that beats my configuration. I would love to discover new opportunities, though!
> My wife and I are completing our annual reviews of each other. One of the challenges I’m facing this year is that we didn’t agree on OKRs, and I have a lot of qualitative feedback but don’t like how the KPIs look. I’m worried her annual review of me will be similar. We both aligned that we’d max out 401ks, move for promotions and end a car lease. But her career moved faster and we don’t have joint accounts, so I failed to enter a savings goal. I’m going to suggest she didn’t save enough to see if I can get insight. Also, the household itself did well this year: we decreased our order-in rate by 15% while moving to a good mix of organics and non-frozen items (+20%).
A few weeks ago, we wrote about how we implemented SIMD instructions to aggregate a billion rows in milliseconds [1] thanks in great part to Agner Fog’s VCL library [2]. Although the initial scope was limited to table-wide aggregates into a unique scalar value, this was a first step towards very promising results on more complex aggregations. With the latest release of QuestDB, we are extending this level of performance to key-based aggregations.
To do this, we implemented Google’s fast hash table aka “Swisstable” [3] which can be found in the Abseil library [4]. In all modesty, we also found room to slightly accelerate it for our use case. Our version of Swisstable is dubbed “rosti”, after the traditional Swiss dish [5]. There were also a number of improvements thanks to techniques suggested by the community such as prefetch (which interestingly turned out to have no effect in the map code itself) [6]. Besides C++, we used our very own queue system written in Java to parallelise the execution [7].
The results are remarkable: millisecond latency on keyed aggregations that span over billions of rows.
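As a rough illustration of what a keyed aggregation does, here is a minimal Python sketch. It is not QuestDB's C++ code and does not use Swisstable or rosti: a plain `dict` stands in for the hash table, the chunks are processed one after another rather than on worker threads, and the function names and sample keys are hypothetical. The shape is the same, though: each chunk builds a partial key-to-sum table, and the partial tables are merged at the end.

```python
from collections import defaultdict

def aggregate_chunk(rows):
    """Sum `value` per `key` for one chunk of (key, value) rows."""
    partial = defaultdict(float)
    for key, value in rows:
        partial[key] += value
    return partial

def merge_partials(partials):
    """Combine per-chunk hash tables into one final result."""
    total = defaultdict(float)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)

def keyed_sum(rows, n_chunks=4):
    """Split rows into chunks, aggregate each, then merge.

    In a real engine each chunk would go to its own worker thread;
    here they run sequentially to keep the sketch simple.
    """
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    return merge_partials(aggregate_chunk(c) for c in chunks)

# Hypothetical sample data: (cab type, fare amount).
rows = [("yellow", 10.0), ("green", 4.0), ("yellow", 2.5),
        ("green", 1.0), ("fhv", 7.0)]
result = keyed_sum(rows, n_chunks=2)
print(result)
```

Merging partial tables is cheap when the number of distinct keys is small compared with the number of rows, which is the typical case for a GROUP BY over a large table.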
We thought it could be a good occasion to show our progress by making this latest release available to try online with a pre-loaded dataset. It runs on an AWS instance using 23 threads. The data is stored on disk and includes a 1.6 billion row NYC taxi dataset, 10 years of weather data with around 30-minute resolution and weekly gas prices over the last decade. The instance is located in London, so folks outside of Europe may experience different network latencies. The server-side time is reported as “Execute”.
We provide sample queries to get started, but you are encouraged to modify them. However, please be aware that not every type of query is fast yet. Some are still running under an old single-threaded model. If you find one of these, you’ll know: it will take minutes instead of milliseconds. But bear with us: it is just a matter of time before we make these instantaneous as well. Next in our crosshairs is time-bucket aggregations using the SAMPLE BY clause.
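For readers unfamiliar with time-bucket aggregation, the idea can be sketched in a few lines of Python. This is only a conceptual illustration, not how QuestDB executes SAMPLE BY: the function name `sample_by`, the epoch-second timestamps, and the sample readings are all assumptions made for the example. Each row's timestamp is floored to the start of its bucket, and values are averaged per bucket.

```python
from collections import defaultdict

def sample_by(rows, bucket_seconds):
    """Average `value` per fixed-width time bucket.

    `rows` is an iterable of (epoch_seconds, value) pairs.
    Returns a list of (bucket_start, average) sorted by time.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in rows:
        bucket_start = ts - ts % bucket_seconds  # floor to bucket boundary
        sums[bucket_start] += value
        counts[bucket_start] += 1
    return [(b, sums[b] / counts[b]) for b in sorted(sums)]

# Hypothetical readings at 30-minute resolution: (epoch seconds, temperature).
readings = [(0, 10.0), (1800, 12.0), (3600, 11.0), (5400, 15.0), (7200, 9.0)]
hourly = sample_by(readings, 3600)
print(hourly)
```

Because the bucket start is just another key, this is structurally the same keyed aggregation as before, with the key derived from the timestamp.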
If you are interested in checking out how we did this, our code is available open-source [8]. We look forward to receiving your feedback on our work so far. Even better, we would love to hear more ideas to further improve performance. Even after decades in high performance computing, we are still learning something new every day.
They should start with Rand Waltzman's presentation at DEFCON (link above). In sum, USGOV is still totally unprepared for information warfare ops, both ongoing and intended.
From about halfway through, this video is the best content I've ever watched from DEFCON, period.
Man, this seems like such a great resource; wish I had found it when I first got into SDR. I wrote a blog post about controlling ceiling fans using SDR (https://blog.hmac.io/2019/10/25/making-dumb-fans-smart-using...), but I glossed over my initial struggles just getting up to speed with this stuff. All the concepts of SDR are straightforward and fairly intuitive, but it's the software stack and actually using the tools that's hard. The whole field is niche enough that you end up stubbing your toe with every step you take in that world.
Googling around to figure out where to even begin turns up so many fragmented, unhelpful pieces of information. You either end up being pointed at GNU Radio, which amounts to an incomprehensible behemoth for a newbie, or you find the numerous lighter-weight pieces of software, which aren't very clear on what exactly they're good for and are often unmaintained.
Luckily my first project was rather simple; ceiling fan remotes don't exactly use the most advanced protocols. Once I found CubicSDR and fiddled around with it enough I was able to dump the radio signals to an audio file and just tease the rest out in Audacity. My blog post mostly covers the nightmare that was the TX side of things.
If you want to nitpick on the source, this is another source with a full publication:
> Variation at the rs2494732 locus of the AKT1 gene predicted acute psychotic response to cannabis along with dependence on the drug and baseline schizotypal symptoms.
The whole premise reminds me of the infamous "impossible is nothing" resume submitted to UBS back in 2006. Sadly, the author died. For those unfamiliar, it included:
* Cover letter
* Resume: one and a half pages
* Writing sample: eight pages
* A glamour shot of Vayner
* Seven-minute video that features the following feats by Vayner:
  * Interview: gives advice for achieving life goals
  * Bench press: 495 pounds (225 kilograms)
  * Downhill skiing with jumps
  * Tennis serve: 140 miles per hour (225 km/h or 63 m/s)
[1] https://www.youtube.com/watch?v=bVQUSndDllU