Hacker News
Generative AI and the big buzz about small language models (the-decoder.com)
13 points by milliondreams on March 1, 2024 | 5 comments
milliondreams on March 1, 2024
As we see these systems evolving, I have come to believe specialist small language models with an MoE framework are the future of the industry.
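For context, a minimal sketch of the top-k mixture-of-experts (MoE) routing pattern the comment alludes to. This is a generic illustration with made-up dimensions and linear "experts", not any specific framework's implementation: a gating network scores experts per input, and only the top-k experts actually run, so total parameter count can grow without growing per-token compute.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, gate_w, experts, k=2):
    """Route input x to the top-k experts by gating score."""
    scores = softmax(gate_w @ x)            # gating distribution over experts
    top = np.argsort(scores)[-k:]           # indices of the k highest-scoring experts
    weights = scores[top] / scores[top].sum()  # renormalize over the selected experts
    # Weighted combination of only the selected experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 experts, each a small linear map on a 4-dim input (illustrative only)
d, n_experts = 4, 8
gate_w = rng.standard_normal((n_experts, d))
expert_ws = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: W @ x for W in expert_ws]

y = moe_forward(rng.standard_normal(d), gate_w, experts, k=2)
```

Only 2 of the 8 experts execute for this input; a specialist-small-model setup would make each expert a distinct small model.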
swimwiththebeat on March 1, 2024
Does anyone know if this is using the Mamba architecture[1] instead of transformers? It looks like it uses a state space model (SSM) layer.
[1]:
https://arxiv.org/abs/2312.00752
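For readers unfamiliar with the term, here is a minimal sketch of the discretized linear state-space recurrence that SSM layers such as Mamba and Hyena build structured variants of. The matrices and toy inputs below are made up for illustration:

```python
import numpy as np

# Discretized linear state space model (SSM):
#   h[t] = A @ h[t-1] + B * x[t]
#   y[t] = C @ h[t]
# The state h carries sequence history forward, so cost is linear in length.

def ssm_scan(A, B, C, xs):
    """Run the SSM recurrence over a sequence of scalar inputs xs."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B * x    # state update
        ys.append(C @ h)     # readout
    return np.array(ys)

# Toy example: 2-dim state with decaying dynamics, impulse input
A = np.array([[0.9, 0.0],
              [0.1, 0.8]])
B = np.array([1.0, 0.0])
C = np.array([0.0, 1.0])
y = ssm_scan(A, B, C, [1.0, 0.0, 0.0, 0.0])
```

Real SSM layers make this selective or convolutional and run it per channel, but the recurrence above is the core idea.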
milliondreams on March 4, 2024
We covered state space models in a blog post here: https://blog.dragonscale.ai/state-space-models/
It gives an overview of Mamba and StripedHyena.
sal9000 on March 2, 2024
It came out earlier than Mamba. It uses Hyena hierarchy blocks, which are considered SSMs but are not the same as Mamba.
compressedgas on March 2, 2024
A piece with less detail than the source linked from the article:
https://www.together.ai/blog/stripedhyena-7b