This AI newsletter is all you need #75
What happened this week in AI by Louie
This week saw the conclusion of the drama at OpenAI, with the return of Sam Altman and Greg Brockman and the appointment of two new directors to the board (together with one existing director). To some extent, we think this leaves OpenAI in a better position than where it started, with more checks and balances on Sam’s control (now he is off the board) and more urgency to find a long-term democratic board governance solution. However, there is likely to be lasting damage, with some enterprises fearing dependency on an organization with such a complex and potentially unstable governance structure. We expect this to support the pre-existing trend of building products with limited dependencies on a single LLM, where the LLM can be substituted for an alternative API or open-source model at short notice.
Away from OpenAI, we were excited to see a new video generation model from Stability AI this week, an improved Claude 2.1 model from Anthropic, and the Inflection-2 model from Inflection AI (soon accessible via its Pi digital companion interface). We are still early in text/image-to-video model capabilities; however, releasing a powerful open-source video generation foundation model can help catalyze progress in the field. We are also glad to see more competition in the LLM space as companies try to take advantage of OpenAI’s turbulence.
Why should you care?
We think the governance structure at OpenAI remains important, both for the short-term stability of the thousands of companies and individuals building on its models and, if OpenAI becomes increasingly powerful, for the economy and geopolitics in the long term. In our opinion, OpenAI governance is still far from being solved. In the short term, there is a need for a larger and more diverse board. In the long term, if OpenAI really does continue to deliver on its ambitions, there likely should be some form of democracy and decentralization to control what aims to be one of the most powerful organizations in the world. Given this, it is important that other companies and organizations continue to compete with new models and that the open-source AI movement continues developing models with less centralized dependencies.
- Louie Peters — Towards AI Co-founder and CEO
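The editorial's point about keeping the LLM substitutable at short notice can be made concrete with a small sketch. This is a minimal illustration under stated assumptions, not a recommended architecture: every name here (`NamedBackend`, `complete_with_fallback`, `hosted_api`, `local_model`) is hypothetical, and the two backends are stand-in functions rather than real provider SDK calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a backend is any function mapping a prompt to text.
Backend = Callable[[str], str]


@dataclass
class NamedBackend:
    name: str
    complete: Backend


class AllBackendsFailed(RuntimeError):
    """Raised when no configured backend could answer the prompt."""


def complete_with_fallback(
    prompt: str, backends: Sequence[NamedBackend]
) -> Tuple[str, str]:
    """Try each backend in order and return (backend_name, completion)."""
    errors: List[str] = []
    for backend in backends:
        try:
            return backend.name, backend.complete(prompt)
        except Exception as exc:  # a real application would catch narrower errors
            errors.append(f"{backend.name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends (assumptions for illustration only).
def hosted_api(prompt: str) -> str:
    # Simulates an outage at a hosted provider.
    raise ConnectionError("provider unavailable")


def local_model(prompt: str) -> str:
    # Simulates an open-source model running locally.
    return f"[local] {prompt}"


backends = [
    NamedBackend("hosted", hosted_api),
    NamedBackend("local", local_model),
]
```

Because application code only depends on the `Backend` signature, swapping providers means editing the `backends` list rather than the call sites.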
Anthropic has released version 2.1 of Claude, with several major upgrades, including a 200,000 token context window, reduced rates of hallucination, and new tool use capabilities. This upgrade allows users to send lengthy documents and improves accuracy, enhancing trust and reliability in the system. They are also updating their pricing to make it more accessible.
Ahead of OpenAI CEO Sam Altman’s four days in exile, several staff researchers wrote a letter to the board of directors warning of a powerful artificial intelligence discovery that they said could threaten humanity, two people familiar with the matter told Reuters. The sources cited the letter as one factor among a longer list of grievances by the board leading to Altman’s firing, among which were concerns over commercializing advances before understanding the consequences.
Stability AI has introduced Stable Video Diffusion, a powerful foundation model for generative video. This model has the potential to generate customizable frames at varying frame rates and is publicly accessible on GitHub and Hugging Face for research purposes.
AI startup Inflection AI just announced Inflection-2, a new 175 billion parameter language model trained on 5,000 NVIDIA H100 GPUs in fp8 mixed precision for ~10²⁵ FLOPs. It outscored LLaMA 2 and PaLM 2 on various NLP benchmarks and approached GPT-4 levels on specific tasks.
Intel released its fine-tune of Mistral 7B, which topped the Hugging Face leaderboard. The model is fine-tuned from Mistral-7B-v0.1 on the open-source dataset SlimOrca and aligned with the DPO algorithm. It was trained on Habana Labs’s 8x Gaudi2 mezzanine accelerators.
Has OpenAI’s turbulence led to more competition and faster model releases in the LLM space as companies try to take advantage? Share your thoughts in the comments!
Five 5-minute reads/videos to keep you learning
Andrej Karpathy has released an hour-long video titled “The Busy Person’s Intro to Large Language Models (LLMs),” which offers valuable insights, resources, and papers for ML experts and AI newcomers. This concise guide covers the video’s main topics and provides references to related papers.
Distil-Whisper is a speech recognition model with state-of-the-art results for transcribing any kind of audio. In this video, Louis Bouchard explores the model’s capabilities, how it was built, and how it works.
This article covers emerging tools and frameworks in AI, comparing their strengths, usability, and ideal use cases. It compares established foundations like TensorFlow and PyTorch, No-Code AI/ML Platforms, Cloud-based AI Services, Vision-focused Frameworks, and more.
Lookahead decoding is a new, exact, parallel decoding algorithm to accelerate LLM inference. Its authors report 1.5–2x speedups during LLM decoding by trading compute for latency: each step costs more FLOPs, but fewer sequential decoding steps are needed. This article introduces the new approach along with demos and experimental results.
This 5-minute video produced by Y Combinator features insights from 33 YC founders who specialize in AI. Given their current understanding of AI, they share their perspectives on the timeline for when AGI might become a reality.
Repositories & Tools
1. GPT4All is an ecosystem to run powerful and customized large language models that work locally on consumer-grade CPUs and any GPU. A GPT4All model is a 3GB to 8GB file that you can download and plug into the GPT4All open-source ecosystem software.
2. Llama Packs is a community-driven hub of prepackaged modules to be used with LlamaIndex and LangChain. The goal is to connect large language models to various knowledge sources easily. They have already launched 16+ templates.
3. Tuna is a no-code tool for quickly generating LLM fine-tuning datasets from scratch. It helps create high-quality training data for fine-tuning large language models such as the LLaMA family.
4. CodeSandbox offers code autocomplete powered by Codeium. It provides single- and multi-line code generation with multiple suggestions from which to choose.
Top Papers of The Week!
Orca 2, a new language model, enhances reasoning through advanced training signals and diverse strategies. It surpasses instruction-tuned models in benchmarks and outperforms similar-sized models in complex tasks, even rivaling larger models in zero-shot settings.
This paper introduces GAIA, a benchmark for General AI Assistants, with 466 questions and their answers. It proposes real-world questions that require a set of fundamental abilities such as reasoning, multi-modality handling, web browsing, and general tool-use proficiency. While conceptually simple for humans, the questions are challenging for most advanced AIs.
A new attention method called System 2 Attention (S2A) has been developed to address the issue of irrelevant or biased output in LLMs. Inspired by human cognitive processes, S2A filters out irrelevant context and promotes factuality and objectivity in LLM reasoning. In experiments, S2A outperforms standard attention-based LLMs on three tasks containing opinion or irrelevant information.
This paper derives a new general objective called ΨPO for learning from human preferences that are expressed in terms of pairwise preferences and, therefore, bypasses both approximations in RLHF. This allows an in-depth analysis of the behavior of RLHF and DPO and identifies their potential pitfalls.
This paper proposes Tied-LoRA, a simple paradigm that uses weight tying and selective training to increase the parameter efficiency of the Low-Rank Adaptation (LoRA) method. The experiments show that the Tied-LoRA configuration demonstrates comparable performance across several tasks while employing only 13% of the parameters used by the standard LoRA method.
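To get a feel for why weight tying shrinks the adapter so much, here is a rough parameter count. It is a simplified accounting under assumptions that are ours, not the paper's exact configuration: standard LoRA is taken to train a separate pair of low-rank matrices per layer, while the tied variant is assumed to share one pair across all layers and add small per-layer scaling vectors. The layer sizes are illustrative and are not meant to reproduce the 13% figure.

```python
def lora_params(d_in: int, d_out: int, rank: int, num_layers: int) -> int:
    """Standard LoRA: every layer owns A (rank x d_in) and B (d_out x rank)."""
    return num_layers * rank * (d_in + d_out)


def tied_lora_params(d_in: int, d_out: int, rank: int, num_layers: int) -> int:
    """Assumed tied variant: A and B shared across layers, plus per-layer
    scaling vectors of length `rank` and `d_out`."""
    shared = rank * (d_in + d_out)
    per_layer_scaling = num_layers * (rank + d_out)
    return shared + per_layer_scaling


# Illustrative sizes (assumptions): 32 layers, 4096-dim projections, rank 8.
standard = lora_params(4096, 4096, 8, 32)
tied = tied_lora_params(4096, 4096, 8, 32)
ratio = tied / standard
print(f"standard={standard:,} tied={tied:,} ratio={ratio:.1%}")
```

The actual savings depend on the rank, on which matrices are adapted, and on which components are trained or frozen, so the ratio from this sketch should be read only as an order-of-magnitude illustration.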
1. Shortly after screenshots showing xAI’s chatbot Grok appeared on X’s web app, X owner Elon Musk confirmed that Grok would be available to Premium+ subscribers sometime this week.
2. Google LLC released a new version of Bard that allows interaction with YouTube videos using natural language prompts. Bard will be able to access and process the content, providing a detailed and accurate response.
3. AI startup Artisan raised $2.3M to develop human-like digital workers. The workers are called Artisans and act as additions to the teams they join, rather than software tools, and can perform thousands of tasks with minimal human input.
4. Founders of a new community-powered product creation platform, Off/Script, announced the official launch of its mobile app, allowing anyone to conceptualize, share, and monetize product mock-ups.
Who’s Hiring in AI!
Interested in sharing a job opportunity here? Contact firstname.lastname@example.org.
If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!