
>Python has amazing strengths as a glue layer, and low-level bindings to C and C++ allow building libraries in C, C++ and many other languages with better performance characteristics. This is what has enabled things like numpy, TensorFlow and PyTorch and a vast number of other libraries in the ecosystem. Unfortunately, while this approach is an effective way to building high performance Python libraries, its approach comes with a cost: building these hybrid libraries is very complicated, requiring low-level understanding of the internals of cpython, requires knowledge of C/C++/… programming...

But the cost has already been paid. We have NumPy, we have PyTorch and TensorFlow. So I don't see the value-add here. Maybe there's something I'm missing.



A clear complexity cost is the difficulty of extending these libraries. This separation forces libraries to become huge, complex monoliths. In Julia the equivalent is done with absolutely hilariously tiny libraries. In fact, they are so small that many Python developers exploring Julia decide not to explore further, thinking most of the Julia ML libraries aren't done or have barely started.

They are just not accustomed to seeing libraries that small. That is possible in Julia because it is all native Julia code, which means interfacing with other Julia code works seamlessly and lets you mix and match many small libraries very easily. You can reuse much more functionality, which means individual libraries can be kept very small.

For PyTorch and TensorFlow, activation functions, for example, have to be coded specifically into each library. In Julia they can simply be reused by any library; each ML library doesn't need to reimplement activation functions (see the sketch below).
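To make the duplication concrete, here is a Python sketch of my own (not from the comment or the Mojo docs): the same GELU formula gets re-expressed per framework because each one has its own tensor type and ops, whereas a single generic Julia definition written against AbstractArray would be reused by every library via multiple dispatch.

    import math
    import torch
    import tensorflow as tf

    def gelu_torch(x):   # only works on torch.Tensor
        return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x ** 3)))

    def gelu_tf(x):      # same formula, rewritten against tf ops for tf.Tensor
        return 0.5 * x * (1.0 + tf.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * x ** 3)))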

That is why you get these bloated monoliths: they have to reinvent the wheel over and over again. So yeah, there is a cost, and it is being paid constantly.

Every time you need to extend these libraries with some functionality, you pay a much higher price than you would doing the same in Julia.


I believe they are thinking about the cost of building the next NumPy/PyTorch/TensorFlow.


PyTorch already supports fusion and CPU/GPU codegen: https://towardsdatascience.com/how-pytorch-2-0-accelerates-d...
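For reference, a minimal sketch of that path, assuming the public torch.compile API from PyTorch 2.0:

    import torch

    def f(x):
        return torch.sin(x) ** 2 + torch.cos(x) ** 2

    compiled_f = torch.compile(f)   # TorchDynamo traces, TorchInductor fuses and generates code
    x = torch.randn(1_000_000)
    print(compiled_f(x).sum())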

Taichi already allows for amazing speedups and supports all the backends (including Metal): https://www.taichi-lang.org/. The fact that they didn't mention Taichi is a glaring omission.
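Roughly what a Taichi kernel looks like, assuming the standard ti.init / ti.field / ti.kernel API; the backend is picked by the arch argument:

    import taichi as ti

    ti.init(arch=ti.gpu)   # falls back to CPU if no GPU backend is available

    n = 1_000_000
    x = ti.field(dtype=ti.f32, shape=n)

    @ti.kernel
    def fill():
        for i in x:        # outermost loop is parallelized on the chosen backend
            x[i] = ti.sin(i * 0.01)

    fill()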

JAX is that next version of TensorFlow: https://jax.readthedocs.io/en/latest/notebooks/quickstart.ht...
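A tiny JAX sketch for comparison, assuming the usual jax.numpy + grad + jit combination (the toy loss here is just an illustration):

    import jax.numpy as jnp
    from jax import grad, jit

    def loss(w, x):
        return jnp.sum(jnp.tanh(x @ w) ** 2)

    grad_loss = jit(grad(loss))   # XLA-compiled gradient with respect to w
    w = jnp.ones((3, 2))
    x = jnp.ones((4, 3))
    print(grad_loss(w, x))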

This looks like a reboot of Numba with more resources devoted to it, or a juiced-up Shedskin: https://shedskin.github.io/
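The Numba style it resembles, assuming only the public @njit decorator (the escape-time function is just an illustration):

    from numba import njit

    @njit
    def escape_time(c_re, c_im, max_iter=200):
        z_re, z_im = 0.0, 0.0
        for i in range(max_iter):
            z_re, z_im = z_re * z_re - z_im * z_im + c_re, 2.0 * z_re * z_im + c_im
            if z_re * z_re + z_im * z_im > 4.0:
                return i
        return max_iter

    print(escape_time(-0.75, 0.1))   # first call triggers JIT compilation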

I think Mojo is a minor mistake, and I would caution against adoption: it is a superset fork of Python instead of being a performance subset. Subsets always decay back to the host language; supersets fork the base language. I would rather see Mojo integrated into Python than adopt the language and extend it.

I have a theory about why Mojo exists. The team was reading a lot of PyTorch and TensorFlow, getting frustrated with what a PITA working with C++ and MLIR is for their model-backend retargeting codegen, so they created their perfect C++/Python mashup rather than use TVM. They nerd-sniped themselves, and rather than just use Python 3.12 + mypy, they made a whole new language based on Python.



