This AI newsletter is all you need #50
What happened this week in AI by Louie

The open-source movement continued apace this week as Falcon, a new family of state-of-the-art language models, rose to the top of Hugging Face's Open LLM Leaderboard. Falcon-40B, developed by the Technology Innovation Institute in Abu Dhabi, is billed as the first "truly open" model with capabilities that rival many existing closed-source models.

The Falcon family consists of two base models: Falcon-40B and Falcon-7B. The 40B-parameter model currently leads the Open LLM Leaderboard, while the 7B model excels in its weight class. Falcon-7B and Falcon-40B were trained on 1.5 trillion and 1 trillion tokens, respectively, in line with modern models optimized for inference.

The quality of the Falcon models is attributed to their training data, more than 80% of which comes from RefinedWeb, a large web dataset built on CommonCrawl. Another noteworthy feature of the Falcon models is multi-query attention, in which all attention heads share a single key and value projection.
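
To make the multi-query idea concrete, here is a minimal sketch in plain Python. It is not Falcon's actual implementation: the function names, the tiny dimensions, and the random weights are all assumptions chosen for illustration, and details such as the output projection and positional embeddings are left out. The point it shows is that each head keeps its own query projection, while the key and value projections are computed once and reused by every head.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_query_attention(x, w_q_heads, w_k, w_v):
    """Causal multi-query attention (illustrative sketch, hypothetical names).

    x:         (seq_len, d_model) input activations
    w_q_heads: one (d_model, d_head) query projection per head
    w_k, w_v:  a single (d_model, d_head) key / value projection shared by all heads
    """
    k = matmul(x, w_k)  # keys are computed once and reused by every head
    v = matmul(x, w_v)  # values are computed once and reused by every head
    k_t = [list(col) for col in zip(*k)]
    scale = math.sqrt(len(w_k[0]))

    head_outputs = []
    for w_q in w_q_heads:  # only the queries differ between heads
        q = matmul(x, w_q)
        scores = matmul(q, k_t)
        weights = []
        for i, row in enumerate(scores):
            # Causal mask: position i may only attend to positions j <= i.
            masked = [s / scale if j <= i else float("-inf") for j, s in enumerate(row)]
            weights.append(softmax(masked))
        head_outputs.append(matmul(weights, v))

    # Concatenate the per-head outputs along the feature dimension.
    return [[value for head in head_outputs for value in head[t]] for t in range(len(x))]


def random_matrix(rows, cols):
    """Small random matrix standing in for learned weights."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model, d_head, n_heads = 4, 8, 2, 3
x = random_matrix(seq_len, d_model)
w_q_heads = [random_matrix(d_model, d_head) for _ in range(n_heads)]
w_k = random_matrix(d_model, d_head)
w_v = random_matrix(d_model, d_head)

out = multi_query_attention(x, w_q_heads, w_k, w_v)
print(len(out), len(out[0]))  # sequence length, then n_heads * d_head
```

In standard multi-head attention, `w_k` and `w_v` would each be a list with one projection per head, just like `w_q_heads`. Sharing them means that during generation only one set of keys and values per token needs to be cached, instead of one set per head, which is where the inference savings come from.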