
> So using 2 NVLinked GPU's with inference is not supported?

To make better use of multiple GPUs, we suggest employing a dedicated backend for serving the model. Please refer to https://tabby.tabbyml.com/docs/references/models-http-api/vl... for an example.
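For instance, vLLM (the backend the linked page covers) can shard a model across both NVLinked GPUs with tensor parallelism and expose an OpenAI-compatible endpoint for Tabby to consume. A rough sketch, with the model name and port as placeholders:

    # Serve an OpenAI-compatible endpoint, splitting the model across 2 GPUs.
    # Model name and port are placeholders; adjust to your setup.
    vllm serve Qwen/Qwen2.5-Coder-7B \
      --tensor-parallel-size 2 \
      --port 8000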



I see. So either I can have Tabby be my LLM server with this limitation, or I can turn that feature off and point Tabby at my self-hosted LLM as I would any other OpenAI-compatible endpoint?


Yes - however, the FIM model requires careful configuration to properly set the prompt template.
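In case it helps, here is a rough sketch of what that looks like in Tabby's config.toml, based on my reading of the linked models-http-api reference. The model name, endpoint, and FIM template below are assumptions for a Qwen2.5-Coder model served by vLLM; the sentinel tokens in prompt_template must match whatever FIM format your model was actually trained with.

    # ~/.tabby/config.toml -- sketch only; field names per the models-http-api docs,
    # values are placeholders for a Qwen2.5-Coder model behind a vLLM server.
    [model.completion.http]
    kind = "vllm/completion"
    model_name = "Qwen/Qwen2.5-Coder-7B"
    api_endpoint = "http://localhost:8000/v1"
    api_key = ""
    # FIM sentinels are model-specific; these are the Qwen2.5-Coder ones.
    prompt_template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"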



