spearman's comments | Hacker News

Pretty sure it's Apple building it, and they're using JAX in-house so I imagine it will get better over time. Though they do love to drop support for old things so maybe M1 will never work again...


I think they're likely using MLX in-house now, no? (Probably not everyone, of course, but it seems likely that many will just use the native array framework designed explicitly for Apple's M-series chips.)
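For anyone who hasn't looked at it, MLX exposes a NumPy-like, lazily evaluated array API. A tiny sketch (assuming only that the mlx package is installed on an Apple silicon Mac; this says nothing about what Apple actually uses internally):

    # Tiny sketch of MLX's NumPy-like API (assumes the mlx package is
    # installed on an Apple silicon Mac). Arrays are evaluated lazily.
    import mlx.core as mx

    a = mx.array([1.0, 2.0, 3.0])
    b = mx.ones((3,))
    c = a * b + 2.0   # builds a lazy computation graph
    mx.eval(c)        # forces evaluation (unified memory, GPU by default)
    print(c)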


From https://machinelearning.apple.com/research/introducing-apple...

> Our foundation models are trained on Apple's AXLearn framework, an open-source project we released in 2023. It builds on top of JAX


Wow, thanks for the ref!


I liked this (and I fixed a few bugs in it too): https://explained.ai/matrix-calculus/
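As a taste of the material, here's a quick JAX check (my own toy, not from the guide) of one of the standard identities it derives, namely that the gradient of x^T A x is (A + A^T) x:

    # Verify grad_x(x^T A x) = (A + A^T) x with JAX autodiff.
    import jax
    import jax.numpy as jnp

    A = jax.random.normal(jax.random.PRNGKey(0), (3, 3))
    x = jnp.array([1.0, 2.0, 3.0])

    def quadratic_form(x):
        return x @ A @ x  # scalar x^T A x

    auto_grad = jax.grad(quadratic_form)(x)  # gradient via autodiff
    hand_grad = (A + A.T) @ x                # gradient from the identity

    print(jnp.allclose(auto_grad, hand_grad, atol=1e-5))  # expected: True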



I skimmed the paper but couldn't find it: What API did they use to write their kernels? I would have guessed SYCL since that's what Intel is pushing for GPU programming but I couldn't find any reference to SYCL in the paper.


OK I found it. Looks like they use SYCL (which for some reason they've rebranded to DPC++): https://github.com/intel/intel-extension-for-pytorch/tree/v2...
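For anyone curious what that looks like from the Python side, here's a rough sketch of typical usage (my assumption of a recent intel-extension-for-pytorch; the SYCL/DPC++ kernels sit behind the "xpu" device):

    # Rough sketch: assumes intel-extension-for-pytorch is installed and an
    # Intel GPU is available; the SYCL/DPC++ kernels back the "xpu" device.
    import torch
    import intel_extension_for_pytorch as ipex

    model = torch.nn.Linear(128, 64).to("xpu")
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # ipex.optimize applies Intel-specific kernel/graph optimizations.
    model, optimizer = ipex.optimize(model, optimizer=optimizer)

    x = torch.randn(32, 128, device="xpu")
    loss = model(x).sum()
    loss.backward()
    optimizer.step()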


SYCL is a standard; DPC++ is a particular implementation of that standard.


Obelisk AGI Lab, Astera Institute | Machine Learning Research Engineer | Full-Time | On-site (visa sponsorship available) in Berkeley, CA

Obelisk is a general intelligence laboratory that draws on neuroscience and brain architecture to create new models of adaptive intelligence. The Astera Institute is a non-profit dedicated to developing high-leverage technologies that can lead to massive returns for humanity.

Visa: As a non-profit research institute, we are exempt from the H-1B visa cap, so we are willing and able to hire qualified applicants regardless of nationality.

Compensation: We pay less than, e.g., Meta. We pay more cash than the average early-stage VC-backed startup, but there's no equity, since we're a non-profit.

Senior Software Engineer - Reinforcement Learning Environments https://jobs.lever.co/astera/b923ebee-6db2-43cf-9631-72c4351...

Machine Learning Research Engineer https://jobs.lever.co/astera/3396ac29-0682-4f9c-bb79-0666693...

Our stack:

* Infrastructure: bare-metal servers running Kubernetes and Ray

* Models: written in PyTorch, but we're experimenting with JAX

* Environment: game engine written in C++


The training support is much less mature and much less widely used, but it does exist: https://onnxruntime.ai/docs/get-started/training-on-device.h... https://onnxruntime.ai/docs/get-started/training-pytorch.htm...
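The PyTorch integration path (the second link) looks roughly like this; a sketch assuming the torch-ort / onnxruntime-training packages are installed:

    # Sketch of ONNX Runtime's PyTorch training integration (assumes the
    # torch-ort / onnxruntime-training packages are installed).
    import torch
    from torch_ort import ORTModule

    model = torch.nn.Sequential(
        torch.nn.Linear(10, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 1),
    )
    # Wrapping with ORTModule offloads forward/backward to ONNX Runtime;
    # the training loop itself stays ordinary PyTorch.
    model = ORTModule(model)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(64, 10), torch.randn(64, 1)

    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()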


We wanted to use ONNX Runtime for a "model driver" for MD simulations, where any ML model can be used for molecular dynamics. The problem was that it was far too immature: for example, the ceiling function only works with single precision in ONNX. But the biggest issue was that we could not take derivatives in ONNX Runtime, so any complicated model that uses derivatives internally was a no-go. Does that limitation still exist? Do you know if it can take derivatives in training mode now?
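To illustrate the pattern (a hypothetical toy, not our actual model driver): an interatomic potential that predicts an energy and gets forces by differentiating that energy with respect to positions inside the forward pass. This is exactly the kind of "derivative inside the model" we couldn't push through ONNX Runtime at the time:

    # Hypothetical toy showing the "derivatives inside the model" pattern:
    # forces are the negative gradient of a predicted energy w.r.t. positions,
    # computed inside forward() with torch.autograd.grad.
    import torch

    class ToyPotential(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.energy_net = torch.nn.Sequential(
                torch.nn.Linear(3, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
            )

        def forward(self, positions):  # positions: (n_atoms, 3)
            positions.requires_grad_(True)
            energy = self.energy_net(positions).sum()
            # Forces = -dE/dpositions, taken inside the forward pass.
            forces = -torch.autograd.grad(energy, positions, create_graph=True)[0]
            return energy, forces

    model = ToyPotential()
    energy, forces = model(torch.randn(8, 3))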

Eventually we went with PyTorch-only support for the time being, while still exploring OpenXLA in place of ONNX as a universal adapter: https://github.com/ipcamit/colabfit-model-driver


> A carefully shaped money-backed lever over the market is absolutely part of reason you never see DisplayPort inputs on consumer TVs, where HDMI group reigns supreme

What does this actually mean?


HDMI requires a license; DisplayPort doesn't need one. I'm sure there's money changing hands in the TV sector of most display manufacturers.


I believe both Google and Dropbox had a lot of Python code powering their products that they wanted to make faster. I don't think Microsoft has many large first-party uses of Python; I think they're investing in it largely to gain developer mindshare. So for Google and Dropbox, "use another language" was an option; for Microsoft, it isn't.


This is cool, but following some of the links, it seems like there are a lot of immature parts of the ecosystem and things will not "just work". See, for example, this bug, which I found via the blog post: https://github.com/odsl-team/julia-ml-from-scratch/issues/2

Summarizing, they benchmark some machine learning code that uses KernelAbstractions.jl on different platforms and find:

* AMD GPU is slower than CPU

* Intel GPU doesn't finish / seems to leak memory

* Apple GPU doesn't finish / seems to leak memory

It would also be interesting to compare the benchmarks to hand-written CUDA kernels (both in Julia and C++) to quantify the cost of the KernelAbstractions layer.


Google tried Swift for ML (Swift for TensorFlow) and it didn't make a dent in Python.

