An email thread, to me, is much preferable to any IM, unless participants know to send those IMs in a thread as well.
All the context is in there in the rawest of forms; you can run through it with your eyeballs or have your tools do the summarisation there and then. Most of the IMs I receive don't even quote the original message they are responding to, and I end up spending time jumping up and down the channel or group to get the whole context. Not to mention folks sending me a link to a Slack message, which, depending on the mood of the almighty Slack god, may or may not open in the app / current Slack session.
That’s fantastic. In your opinion, what are some of the best books / resources you used to gain this kind of understanding of LLMs and the underlying deep learning algorithms?
To approach it from first principles, I didn’t really follow any one specific tutorial or course. I really just started from the bottom up. Starting with my Tensor module, I familiarized myself with the math operations, backprop, computation graphs, autodiff… etc. That was the hardest part honestly, but it set the foundation for everything else in the system.
Once I had that working, the rest (activations, loss functions, optimizers, layers, transformers) started to make a lot more sense. Writing it all myself gave me full control, removed the abstraction, and helped me to really internalize how each part of the system fits together, and why it works - not just how.
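To make the "computation graph + backprop" idea concrete, here is a minimal scalar sketch in the spirit of micrograd. It is not the project's actual Tensor module; the `Value` class and its attribute names are hypothetical, chosen only for illustration, and it handles just addition and multiplication.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._backward = lambda: None    # filled in by the op that creates the node

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node runs after everything that uses it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0                  # seed: d(output)/d(output) = 1
        for node in reversed(order):
            node._backward()


# Usage: z = x * y + x, so dz/dx = y + 1 and dz/dy = x.
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
```

A real Tensor module does the same bookkeeping over arrays instead of scalars, which is where broadcasting and shape handling make things harder.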
Here are some resources I found helpful; I also have some links to additional resources in the project readme:
- Deep Learning: Foundations and Concepts: Book by Christopher Bishop (with Hugh Bishop). Mainly covers theory and statistical ML.
- Natural Language Processing with Transformers: Book by Lewis Tunstall, Leandro von Werra, and Thomas Wolf (Hugging Face). Good for understanding real-world NLP / LLMs.
- UvA Deep Learning Tutorials: Website for building and understanding DL modules, has a lot of project-based notebooks (tutorial 7 on GNNs was very helpful).
- Deep Learning: Book by Goodfellow, Bengio & Courville. Covers a lot of foundational theory and math.
- Stanford’s CS231n Course: This is fully available online, with lecture videos and coding walk-throughs. Super helpful for learning about backprop, CNNs, deep nets, etc.
- The Annotated Transformer: Website by Harvard NLP. This goes over the ‘Attention Is All You Need’ paper in an understandable way, with example code.
- codingvidya.com: A good aggregator for finding ML books and learning resources.
- Andrej Karpathy’s YouTube channel and his micrograd repo, along with the tinygrad repo, are super helpful resources, especially when learning by building from scratch.
I’m still learning as I go, but I’m happy to share what’s worked for me so far! Hope this helps!
I’m building an on-device AI app that supports a good variety of open-weight LLMs. It’s mostly so that I can gain an intuition for what goes on during inference, but my longer-term plan is to:
1. Support on-device RAG to allow chatting with your own documents on mobile offline
2. Support MCP on-device, taking advantage of information that’s (only) available on your phone, like calendar events, health data, etc. These shouldn’t need to be anywhere but on-device.
3. Allow on-device AI to use shortcuts(?)
I think most of this functionality is already well served on the desktop by Ollama and LM Studio, but moving it to mobile offers a great learning opportunity.
Folks in China are hungrier than most in the West, with the exception of Silicon Valley and New York, and China isn't "burdened" by quality of life and work-life balance. That isn't healthy, but it's one of their edges.
Even at U.S. AI startups, Asians make up a majority of the research + technical staff. I think the incentive to pursue a PhD in the West has been gradually declining for years, while PhD programmes have been flooded with Chinese students. I think it's something the U.S. government needs to address if it wants to remain competitive in the longer term.
Manufacturing is something that the West actively gave up, migrating the skillset to Asia, notably China and Taiwan.
I think that while technically they differ mostly in the training material thrown into them, the outcome is that each model is good at some things and bad at others, just like human beings. Soon you'll need standardized tests and an HR department to evaluate individual LLM performance. :)
Sigh. But honestly, it just feels like people have gotten so used to China doing outrageous things like this that it's no longer worth a comment.
Note that in an actual intercept, this thing is in space and going like 10 km/s. Its job is to slam directly into an incoming ballistic missile re-entry vehicle before it can deploy nukes and decoys.
The giant nozzle pointed downwards is simulating zero gravity, and probably does not exist on the live-fired craft.
Those thrusters must be extremely precisely controlled too. At that flight profile, having a nozzle open an extra millisecond means missing the intercept by hundreds of feet.
I don’t know, man. I mean, Google indexes the open web to create a search engine and we all seem to be fine with it. I do draw the line here: if you want to build an LLM, there’s the internet for you. Train your own damn model and stop piggybacking on someone else’s, or in China’s case, at least don’t get caught.
Any large company that has majority ownership in China or primarily employs software developers based in China is a spying apparatus, whether voluntarily or not.
It's not even about intentions. You are subject to the whims of the CCP whether you want to be or not.
No company operates from China without such influence. Some are more blatant than others (Huawei is more obviously a spying apparatus than Zoom).
I wish you the best of luck and hope you are able to get out of HK if you wish.
Chinese citizens and organizations are required by China’s National Intelligence Law to support, assist, and cooperate with state intelligence work. The law also protects any individual or organization that aids it. That’s partly why Huawei is a national security risk. [0]
The CCP has also sent state officials to private companies. Private companies, including foreign entities, have to establish formal party organizations. [1]
But you do you. :)