This AI newsletter is all you need #62
What happened this week in AI by Louie
This week we have been watching developments in coding models at Meta as well as new fine-tuning capabilities at OpenAI. Meta has introduced Code Llama, a large language model that can both generate code from prompts and describe existing code. The company released three sizes (7B, 13B, and 34B parameters), each trained on an additional 500 billion code-related tokens. The models support widely used programming languages such as Python, C++, Java, PHP, and others. Two specialized variants are built on top of the base model: Code Llama - Instruct, which is tuned to follow instructions, and Code Llama - Python, which is tailored to the Python programming language. The models are available under a license that permits both research and commercial use. The open-source release enables rapid iteration, and we have already seen further models built on top of it, including WizardCoder, which outperformed the majority of existing language models, coming close to but not quite reaching the level of GPT-4.
In other big news, OpenAI has introduced fine-tuning for GPT-3.5-turbo as a service. Note that running inference on a fine-tuned model costs significantly more than on the base model. OpenAI has also unveiled ChatGPT Enterprise, which offers organizations unrestricted usage, enhanced speed, and an extended context window.
We were happy to see the release of Code Llama and think there is huge potential for models fine-tuned and optimized for coding to make significant improvements, both as co-pilot tools for developers and for opening up software development to non-developers. The GPT-3.5 Turbo fine-tuning release is also exciting, and we expect it to lead to some high-quality fine-tuned models, including ones for coding. We are particularly excited by the prospect of GPT-4 fine-tuning opening up later this year. We think a GPT-4 fine-tuned for coding has the potential to be incredibly powerful.
- Louie Peters — Towards AI Co-founder and CEO
Thanks for reading the Towards AI Newsletter! Subscribe for free to receive new posts and support my work.
OpenAI has introduced fine-tuning for GPT-3.5 Turbo, which delivers enhanced performance on specific tasks. A fine-tuned model can potentially match or even surpass the base GPT-4 model on certain narrow tasks. Early testers have managed to substantially reduce prompt length through fine-tuning. Training costs $0.008 per 1K tokens, and usage costs $0.012 per 1K input tokens and $0.016 per 1K output tokens.
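To make the pricing concrete, here is a minimal sketch of the arithmetic in Python. It uses only the three per-1K-token prices quoted above. The function names are illustrative, and the assumption that training is billed once per token per epoch should be checked against OpenAI's pricing page.

```python
# Per-1K-token prices quoted above, in US dollars.
TRAINING_PRICE_PER_1K = 0.008
INPUT_PRICE_PER_1K = 0.012
OUTPUT_PRICE_PER_1K = 0.016


def training_cost(tokens_in_file: int, epochs: int) -> float:
    """Estimated cost of a fine-tuning job.

    Assumption: every token in the training file is billed once per epoch.
    """
    return tokens_in_file / 1000 * TRAINING_PRICE_PER_1K * epochs


def usage_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request to a fine-tuned model."""
    return (input_tokens / 1000 * INPUT_PRICE_PER_1K
            + output_tokens / 1000 * OUTPUT_PRICE_PER_1K)


# Example: a 100,000-token training file trained for 3 epochs.
print(f"training: ${training_cost(100_000, 3):.2f}")
# Example: a request with 500 input tokens and 200 output tokens.
print(f"request:  ${usage_cost(500, 200):.4f}")
```

Under these assumptions, a shorter prompt directly lowers the input-token term, which is why the reduced prompt lengths reported by early testers matter for cost.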
Meta has developed a robust foundational model known as SeamlessM4T, which is capable of managing diverse text and speech tasks across 100 languages. It encompasses automatic speech recognition, speech-to-text translation, speech-to-speech translation, text-to-text translation, and text-to-speech translation, supporting a wide range of input and output languages.
OpenAI has launched ChatGPT Enterprise, providing security and privacy features suitable for enterprise use. This version offers unlimited access to GPT-4 at higher speeds, extended context windows (32k) for handling longer inputs, advanced data analysis capabilities, customization options, and additional features.
Alibaba Cloud has unveiled two open-source AI models: Qwen-VL and Qwen-VL-Chat. These models are trained using the company's Tongyi Qianwen (Qwen) LLM. They can interpret visual data, like text in images, and respond to location-based queries, like offering directions by interpreting images of signs.
Five 5-minute reads/videos to keep you learning
Hugging Face has introduced AutoGPTQ integration in Transformers, enabling streamlined 2, 3, 4, and 8-bit quantization with negligible accuracy reduction. This integration is compatible with Nvidia GPUs as well as ROCm-powered AMD GPUs.
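For intuition about what the bit width means, the sketch below quantizes a handful of made-up weights with plain round-to-nearest uniform quantization. This is not the GPTQ algorithm that AutoGPTQ implements, which additionally compensates for rounding error using calibration data. The weights and function names are hypothetical and serve only to show how fewer bits mean fewer representable levels.

```python
def quantize(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    Simple round-to-nearest quantization, used here as an illustration only.
    """
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1                      # index of the highest level
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]   # integers in [0, levels]
    dequantized = [lo + c * scale for c in codes]
    return codes, dequantized, scale


def max_error(weights, bits):
    """Largest absolute difference between a weight and its dequantized value."""
    _, dequantized, _ = quantize(weights, bits)
    return max(abs(w - d) for w, d in zip(weights, dequantized))


# Hypothetical weights, chosen only for illustration.
weights = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.47, 0.9]
for bits in (2, 3, 4, 8):
    print(f"{bits}-bit: max error {max_error(weights, bits):.4f}")
```

With round-to-nearest, the worst-case error is half the spacing between levels, so it shrinks quickly as bits are added. Methods such as GPTQ aim to keep model accuracy high even at the low-bit end of that range.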
This paper explores the effectiveness of teaching algorithmic reasoning to LLMs, focusing on overcoming challenges such as overfitting and spurious correlations. It proposes a four-step approach that includes formulating algorithms as skills, teaching multiple skills simultaneously, teaching skill composition, and teaching the use of skills as tools.
Code Llama is now available on Hugging Face, with code infilling supported by the 7B and 13B models. It is released under the same permissive community license as Llama 2 and is open for commercial use.
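Code infilling means the model generates the missing middle of a file given the code before and after the gap. The sketch below shows only the bookkeeping around that idea: splitting a snippet into prefix and suffix at a placeholder, then stitching a completion back in. The marker string and helper names are assumptions for illustration, and the completion is hard-coded rather than produced by a model. Check the model card for the exact prompt format.

```python
FILL_MARKER = "<FILL_ME>"  # assumed placeholder; verify against the model card


def split_for_infilling(source: str, marker: str = FILL_MARKER):
    """Split source code into the prefix and suffix around a single marker."""
    if source.count(marker) != 1:
        raise ValueError("expected exactly one fill marker")
    prefix, suffix = source.split(marker)
    return prefix, suffix


def insert_completion(source: str, completion: str, marker: str = FILL_MARKER) -> str:
    """Place a generated completion back between the prefix and suffix."""
    prefix, suffix = split_for_infilling(source, marker)
    return prefix + completion + suffix


snippet = 'def add(a, b):\n    """<FILL_ME>"""\n    return a + b\n'
prefix, suffix = split_for_infilling(snippet)

# A real system would send prefix and suffix to the model; this string is a stand-in.
print(insert_completion(snippet, "Return the sum of a and b."))
```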
Language-to-reward systems, fueled by LLMs, empower robots to learn directly from language. These systems translate natural language instructions into reward-specifying codes, compute rewards based on robot actions, and facilitate learning through reinforcement learning (RL).
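The pipeline described above can be sketched as a toy in Python: an instruction is translated into reward parameters, and a reward function then scores robot states. Everything here is hypothetical. The keyword lookup stands in for the LLM, and the reward terms, weights, and target heights are invented for illustration rather than taken from the original work.

```python
from dataclasses import dataclass


@dataclass
class RewardSpec:
    """Parameters of a simple reward function (all values hypothetical)."""
    target_height: float
    height_weight: float
    effort_weight: float


def translate_instruction(instruction: str) -> RewardSpec:
    """Stand-in for the LLM: map an instruction to reward parameters.

    A real system would prompt a language model; this keyword lookup only
    illustrates the shape of the output.
    """
    if "stand" in instruction.lower():
        return RewardSpec(target_height=0.6, height_weight=1.0, effort_weight=0.01)
    return RewardSpec(target_height=0.3, height_weight=1.0, effort_weight=0.01)


def reward(spec: RewardSpec, torso_height: float, joint_torques: list) -> float:
    """Score one robot state: stay near the target height, use little effort."""
    height_term = -spec.height_weight * (torso_height - spec.target_height) ** 2
    effort_term = -spec.effort_weight * sum(t * t for t in joint_torques)
    return height_term + effort_term


spec = translate_instruction("Stand up tall")
print(reward(spec, torso_height=0.6, joint_torques=[0.0, 0.0]))
print(reward(spec, torso_height=0.3, joint_torques=[0.0, 0.0]))
```

An RL algorithm or trajectory optimizer would then search for actions that maximize this reward, which is the learning step the summary refers to.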
MetaGPT represents a novel approach to enhancing collaborations among AI agents. This video reveals the inner workings of MetaGPT's innovative design, delves into the role of SOPs (Standard Operating Procedures), and explores how multiple AI agents collaborate seamlessly.
Papers & Repositories
This paper presents a straightforward algorithm for aligning LLMs with human preferences, drawing inspiration from growing batch reinforcement learning. Reinforced Self-Training (ReST), developed by DeepMind, offers a more economical alternative to RLHF. It employs a two-step process, Grow and Improve, to enhance the training dataset and fine-tune the LLM.
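The Grow and Improve loop can be illustrated with a toy in which the "policy" is just a probability table over four canned responses. This is a sketch of the idea, not DeepMind's implementation: the reward function, thresholds, and update rule are all hypothetical, and each round here pairs one Grow step with one Improve step for simplicity.

```python
import random


def grow(policy, n_samples, rng):
    """Grow step: sample a batch of outputs from the current policy."""
    outputs = list(policy)
    weights = [policy[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=n_samples)


def improve(policy, samples, reward_fn, threshold, lr=0.5):
    """Improve step: keep samples scoring at least `threshold`, then shift
    probability mass toward them (a stand-in for fine-tuning)."""
    kept = [s for s in samples if reward_fn(s) >= threshold]
    if not kept:
        return dict(policy)
    updated = dict(policy)
    for s in kept:
        updated[s] += lr / len(kept)
    total = sum(updated.values())
    return {o: w / total for o, w in updated.items()}


def toy_reward(output):
    """Hypothetical reward: longer responses score higher."""
    return len(output)


policy = {"ok": 0.25, "fine": 0.25, "thorough": 0.25, "very thorough": 0.25}
rng = random.Random(0)
for threshold in (4, 8):  # the filter gets stricter each round
    batch = grow(policy, 32, rng)
    policy = improve(policy, batch, toy_reward, threshold)

for output, prob in policy.items():
    print(f"{output!r}: {prob:.3f}")
```

Because the filtered samples are generated offline and can be reused, this style of training avoids the online sampling loop of RLHF, which is where the cost savings come from.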
Giraffe is a new series of models fine-tuned from LLaMA and LLaMA2, with variants offering context window sizes of 4k, 16k, and 32k tokens. The accompanying experiments explore expanding the context window through modifications to the positional encoding.
Platypus, the latest LLM featured on HuggingFace's Open LLM Leaderboard, leverages the Open-Platypus dataset to attain impressive performance in STEM and logic. It effectively addresses bias during training by utilizing LoRA modules and the PEFT library. However, its challenge with languages beyond English is attributed to its underlying model, LLaMa-2.
Graph of Thoughts (GoT) is a framework that extends the prompting capabilities of large language models (LLMs) beyond what paradigms like Chain-of-Thought or Tree of Thoughts (ToT) provide. GoT outperforms alternative methods, for example improving sorting quality by 62% over ToT while reducing costs by more than 31%.
This paper introduces quantization with incoherence processing (QuIP), a new approach that achieves 2-bit quantization of large language models using adaptive rounding. It is the first algorithm of its type to be accompanied by a theoretical analysis, and the authors show that this analysis also applies to other quantization methods, such as OPTQ.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Meme of the week!
Meme shared by neon8052
Featured Community post from the Discord
DrDub has initiated a remarkable project named "Tell-and-Show," which serves as an experiment in community-owned machine learning. The project builds recommendation profiles that are exclusively yours. It also provides tools and models available for adoption by other Free Software projects, to enhance the utility of these recommendation profiles. Check it out here and support a fellow community member! You can help this project by sharing your individual preferences over key items or joining as a volunteer. Share your questions and feedback in the thread here.
AI poll of the week!
TAI Curated section
Article of the week
Deploying large language models is undoubtedly one of the most challenging tasks, not because the deployment teams are incompetent, but simply due to the complexity of deploying these types of models. This is where vLLM comes in handy: it is an open-source library developed at UC Berkeley and released under the Apache license. The philosophy behind vLLM is to make serving and inference of large language models affordable for both industry and small research teams.
Our must-read articles
If you want to publish with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact firstname.lastname@example.org.
If you are preparing for your next machine learning interview, don’t hesitate to check out our leading interview preparation website, Confetti!