int_19h on March 8, 2024 | on: Fine tune a 70B language model at home
"Real-time" is a very vague descriptor. I get 7-8 tok/s for 70b model inference on my M1 Mac - that's pretty real-time to me. Even Professor-155b runs "good enough" (~3 tok/s) for what I'd consider real-time chat in English.