Discussion about this post

bambam (2d, edited)

At the same time, the way Samba and Transformer architectures converge on similar intelligence when trained on the same dataset raises the question of whether our current focus on scaling up LLM parameters and training data will hit a dead end in capability. https://clusterrush.io
