
I am more interested in running llama.cpp on CPU-only VPSs/EC2 instances, although it is probably too slow.


The 13B versions of the models run fast enough on an 8-core CPU to hold a fluid conversation.
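
For anyone who wants to try that, here's a minimal sketch using the llama-cpp-python bindings (just one way to drive llama.cpp; the model path is a placeholder for whatever quantized GGUF file you have):

    from llama_cpp import Llama

    # Load a quantized 13B model entirely on the CPU.
    llm = Llama(
        model_path="./models/llama-13b.Q4_K_M.gguf",  # placeholder path
        n_ctx=2048,
        n_threads=8,  # match your physical core count
    )

    out = llm("Q: Why is the sky blue? A:", max_tokens=64, stop=["Q:"])
    print(out["choices"][0]["text"])

The same code runs on a CPU-only VPS; tokens per second scale roughly with core count and memory bandwidth.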


What can I run on an NVIDIA RTX 4060 Ti with 16GB of VRAM?


Best to try it yourself; llama.cpp is refreshingly easy to build.
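
If you want a quick starting point, here's a minimal sketch with llama-cpp-python, assuming it was built with CUDA enabled (the model path is again a placeholder; a 4-bit 13B GGUF is around 8GB, so it fits comfortably in 16GB of VRAM):

    from llama_cpp import Llama

    # Requires a llama-cpp-python build with CUDA support.
    llm = Llama(
        model_path="./models/llama-13b.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=-1,  # -1 offloads every layer to the GPU
        n_ctx=4096,
    )

    print(llm("Hello,", max_tokens=32)["choices"][0]["text"])

For bigger models that don't fit, you can set n_gpu_layers to a smaller number and keep the remaining layers on the CPU.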



