This AI newsletter is all you need #56
What happened this week in AI by Louie
This week, we saw several new competitors in the world of LLMs, across both open-source and closed models. Despite its impressive capabilities, the first version of the LLaMA model carried a license that restricted its use to research, which was widely considered one of its drawbacks. Meta has now unveiled LLaMA 2, which comes with a commercial-use license. The new model was trained on 40% more data than its predecessor, amounting to 2 trillion tokens, and doubles the context length to 4,096 tokens. LLaMA 2 surpasses models such as MPT and Falcon on the chosen benchmarks. It is publicly available through Amazon AWS, Microsoft Azure, and Hugging Face in three sizes (7B, 13B, and 70B).
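For readers who want to try it, here is a minimal sketch of loading LLaMA 2 through Hugging Face Transformers. The prompt and generation settings are our own illustration, and access to the weights requires accepting Meta's license on the Hub first.

```python
# A minimal sketch of loading LLaMA 2 via Hugging Face Transformers.
# The checkpoint is gated: accept Meta's license on the Hub and authenticate
# (e.g., `huggingface-cli login`) before downloading. Requires `accelerate`
# for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # 13B and 70B variants also exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The key difference between LLaMA and LLaMA 2 is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```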
In other news, after a lengthy private test of their initial Claude model, Anthropic has made the Claude 2 language model publicly available. It is a ChatGPT competitor accessible at claude.ai. The model reportedly scored 76.5% on the multiple-choice section of the Bar exam (a 3.5-point improvement) and placed in the 90th percentile on the GRE reading and writing exams. Claude 2 is also accessible via API, enabling businesses and individuals to build their own applications and projects on top of it. Anthropic's evaluations show the model is twice as good at giving harmless responses, in line with the vision its founders set out when they departed OpenAI. As part of their ongoing exploration of training aligned models, they are also running an experiment in which they train a "Decepticon" variant of Claude, with the objective of identifying the specific source of deception within models.
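As a quick illustration of the API access mentioned above, here is a minimal sketch using Anthropic's Python SDK as it stood around this release; the prompt and key are placeholders, and the interface may evolve, so check Anthropic's current docs.

```python
# A minimal sketch of calling Claude 2 with Anthropic's Python SDK
# (the `anthropic` package). The API key below is a placeholder.
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

client = Anthropic(api_key="YOUR_API_KEY")
completion = client.completions.create(
    model="claude-2",
    max_tokens_to_sample=300,
    prompt=f"{HUMAN_PROMPT} Summarize this week's LLM news in two sentences.{AI_PROMPT}",
)
print(completion.completion)
```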
At the beginning of the week, Elon Musk announced the establishment of xAI, a new company competing directly with OpenAI. xAI's stated objective is to develop Artificial General Intelligence (AGI) in order to understand the true nature of the universe. The company has assembled an impressive team from renowned organizations such as DeepMind, OpenAI, Google, and Microsoft. Specific details about the company remain scarce, but we think its strong team, financial backing, and ambition make it likely to become an important contributor to the field.
- Louie Peters — Towards AI Co-founder and CEO
Hottest News
1. Anthropic Releases Claude 2
Anthropic has released Claude 2, an advanced AI model that outperforms Claude 1.3 across evaluations, achieving impressive scores on Codex HumanEval and GSM8k. Claude 2 excels at coding and math and scored higher on the Bar exam. It is also markedly better at generating harmless responses and can handle inputs of up to 100K tokens.
2. Elon Musk Launches AI Firm xAI As He Looks To Take On OpenAI
Elon Musk's new AI startup, xAI, is recruiting top engineers from tech giants like Google and Microsoft to develop a "maximally curious" AI. Although separate from X Corp, xAI will closely collaborate with companies like Twitter and Tesla, aiming to bring about breakthroughs and innovation in the field of AI through synergistic efforts.
3. Bard’s Latest Update: More Features, Languages and Countries
Google's Bard has expanded its availability worldwide and now supports multiple languages. New features include listening to Bard's responses, customizing the tone and style of its output, pinning and renaming past conversations, exporting Python code to Replit and Google Colab, and using images in prompts via Google Lens integration.
4. Programs To Detect AI Discriminate Against Non-Native English Speakers
Stanford researchers found that over 50% of essays written by non-native speakers were flagged as AI-generated, emphasizing the need to address the discrimination faced by non-native writers using AI detectors. This has implications for college and job applications as well as search engine algorithms, potentially harming academic careers and psychological well-being.
5. Shutterstock Expands Deal With OpenAI To Build Generative AI Tools
Shutterstock announced its plans to expand its existing deal with OpenAI to provide the startup with training data for its AI models. In turn, Shutterstock will gain "priority access" to OpenAI's latest tech and new editing capabilities. Additionally, Shutterstock is working towards becoming a leader in generative AI by collaborating with top AI companies and compensating artists for their contributions to training the AI.
Five 5-minute reads/videos to keep you learning
1. Train LLMs using QLoRA on Amazon SageMaker
This guide explains how to use QLoRA on Amazon SageMaker to fine-tune large language models. It highlights tools like Hugging Face Transformers, Accelerate, and the PEFT library for adapting pre-trained language models to new applications without fine-tuning all of their parameters. It also emphasizes QLoRA's main advantage: fine-tuning a frozen, 4-bit-quantized base model with small low-rank adapters dramatically cuts memory requirements.
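As a rough illustration of the kind of setup the guide covers, here is a minimal QLoRA sketch with Transformers, PEFT, and bitsandbytes; the model ID and hyperparameters are illustrative rather than taken from the guide.

```python
# A minimal QLoRA setup sketch: 4-bit quantized base model + LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize the frozen base model to 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4, as proposed in the QLoRA paper
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
    target_modules=["query_key_value"],     # attention projections in Falcon
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights train
```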
2. The ultimate guide to LLMs and NLP for content marketing
NLP plays a crucial role in content marketing: it automates content generation, optimizes content for search engines, gauges sentiment, segments audiences, powers chatbots and virtual assistants, conducts social listening, and aids in content curation. This article provides an overview of how to use NLP effectively to scale content marketing.
3. A developer’s guide to prompt engineering and LLMs
This article shares GitHub's efforts with LLMs to assist developers in harnessing the technology to its fullest potential. It provides a high-level overview of how LLMs operate and offers guidance on constructing LLM applications. It utilizes GitHub Copilot code completions as a prime illustration of an LLM-based application.
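To make one of the article's recurring themes concrete, here is a small, hypothetical helper for fitting the most relevant context into a limited prompt window; the function and its parameters are our own illustration, not GitHub's code.

```python
# An illustrative helper: fill a limited context budget with the most
# relevant snippets first. The character budget is a crude proxy for tokens.
def build_prompt(instruction: str, snippets: list[str], budget_chars: int = 6000) -> str:
    """Assemble a prompt from an instruction plus as much ranked context as fits.
    `snippets` are assumed to be pre-ranked by relevance (most relevant first)."""
    context_parts: list[str] = []
    used = len(instruction)
    for snippet in snippets:
        if used + len(snippet) > budget_chars:
            break  # stop before overflowing the budget
        context_parts.append(snippet)
        used += len(snippet)
    context = "\n---\n".join(context_parts)
    return f"{context}\n\nTask: {instruction}"

prompt = build_prompt("Explain what this function does.", ["def add(a, b): return a + b"])
print(prompt)
```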
4. This article explores the recent issuance of a patent to an AI system, prompting inquiries into the legal and ethical implications of granting intellectual property rights to non-human entities.
5. How to Use AI to Do Stuff: An Opinionated Guide
Increasingly powerful AI systems are being released at an accelerating rate. This article serves as an orientation to the current state of AI. It is an opinion piece that draws from the author's experience and focuses on selecting the appropriate tools for various tasks.
Papers & Repositories
1. Instruction Mining: High-Quality Instruction Data Selection for Large Language Models
This paper proposes InstructMining, a linear rule for evaluating the quality of instruction-following data using natural-language indicators such as naturalness, coherence, and understandability. Models fine-tuned on the selected data achieve a 42.5% improvement over models trained on unfiltered data, underscoring the importance of data quality when fine-tuning LLMs to interpret instructions effectively.
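As a loose illustration of what a linear quality rule might look like, here is a hypothetical sketch; the indicator values, weights, and threshold are placeholders, not the paper's fitted estimator.

```python
# A hypothetical linear data-quality rule in the spirit of InstructMining.
# Indicator names follow the summary above; weights and values are made up.
def quality_score(indicators: dict[str, float], weights: dict[str, float]) -> float:
    """Linear combination of natural-language quality indicators."""
    return sum(weights[name] * value for name, value in indicators.items())

weights = {"naturalness": 0.5, "coherence": 0.3, "understandability": 0.2}
example = {"naturalness": 0.9, "coherence": 0.8, "understandability": 0.7}

# Keep only examples above a chosen quality threshold for fine-tuning.
keep = quality_score(example, weights) > 0.75
print(keep)
```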
2. Generative Pretraining in Multimodality
The paper introduces Emu, a Transformer-based multimodal foundation model capable of seamlessly generating images and texts in a multimodal context. It can handle diverse types of data input, including images, text, and videos, and surpasses other large multimodal models in tasks such as image captioning, visual question answering, and text-to-image generation.
3. Becoming Self-Instruct: Introducing Early Stopping Criteria for Minimal Instruct Tuning
This paper introduces the Instruction Following Score (IFS), a metric that measures a language model's ability to follow instructions. It helps distinguish base models from instruct models, preventing unnecessary fine-tuning that could alter a model's semantics. The researchers also observe that significant semantic shifts occur around the point where the IFS plateaus, underscoring the relationship between instruction following and model semantics.
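For intuition, a score of this flavor can be sketched as the fraction of generations judged to answer an instruction rather than merely continue the text; the toy judge below is a crude stand-in, not the paper's criterion.

```python
# A crude, hypothetical sketch of an instruction-following-style score.
from typing import Callable

def instruction_following_score(outputs: list[str], is_answer: Callable[[str], bool]) -> float:
    """Fraction of generations judged to be answers rather than continuations."""
    return sum(is_answer(text) for text in outputs) / len(outputs)

outputs = [
    "Paris is the capital of France.",  # answers the instruction
    "and then he said that the",        # mere continuation
]
judge = lambda text: text.strip().endswith((".", "!", "?"))  # toy stand-in judge
print(instruction_following_score(outputs, judge))  # 0.5
```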
4. Provably Faster Gradient Descent via Long Steps
This work establishes provably faster convergence rates for gradient descent using a computer-assisted analysis technique. A convergence rate is the mathematical limit on how quickly an optimization method can approach the optimal solution. The key finding is that step-size schedules that occasionally take very long steps, reminiscent of cyclical learning rates, can outperform the textbook constant-step analysis.
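As a toy illustration, the sketch below runs gradient descent on a convex quadratic with one long step every few iterations; the step pattern is our own, and the paper derives specific certified schedules.

```python
import numpy as np

# Toy gradient descent with occasional long steps. The pattern below (one big
# step every `period` iterations) is illustrative; the paper proves rates for
# particular nonconstant step-size sequences.
def gd_long_steps(grad, x0, base_lr=0.1, long_lr=1.5, period=8, iters=200):
    x = np.asarray(x0, dtype=float)
    for t in range(1, iters + 1):
        lr = long_lr if t % period == 0 else base_lr
        x = x - lr * grad(x)
    return x

# Minimize the convex quadratic f(x) = 0.5 * x^T A x (optimum at the origin).
A = np.diag([1.0, 0.1])
print(gd_long_steps(lambda x: A @ x, x0=[5.0, 5.0]))  # near [0, 0]
```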
5. GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
GPT4RoI is an innovative model that enhances vision-language tasks by incorporating regions of interest. This incorporation enables precise alignment between visual features and language embeddings, thereby empowering users to interact with the model through language and spatial instructions.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Weekly AI Podcast
In this week's episode of the "What's AI" podcast, Louis Bouchard interviews Aleksa Gordić, a former research engineer at DeepMind who has embarked on his startup journey. They explore various aspects of his professional life, discussing topics such as his current priorities, his work at DeepMind, dropping out of his master's program, and securing a software engineer role in machine learning without a formal degree. Aleksa shares his experiences and provides valuable insights for anyone interested in AI. He highlights the importance of practical experience, participation in competitions, and self-learning. Furthermore, he discusses the diverse roles within companies like DeepMind and emphasizes the value of personal drive and a strong project portfolio. To gain a deeper understanding of Aleksa's journey and explore the world of AI, tune in on YouTube, Spotify, or Apple Podcasts.
Meme of the week!
Meme shared by dimkiriakos
Featured Community post from the Discord
Operand has released an open-source Python library for agent integration, aimed at complementing existing libraries such as HF Agent API and LangChain. This library serves as a framework for connecting agents, software systems, and human users. It achieves this by defining actions, callbacks, and access policies that facilitate the connection, monitoring, control, and interaction with agents. Check it out on GitHub and support a fellow community member. Share your thoughts on this project in the thread here.
AI poll of the week
Join the discussion on Discord.
TAI Curated section
Article of the week
Top Computer Vision Papers During Week From 3/7 To 9/7 by Youssef Hosni
This article offers a comprehensive overview of the most noteworthy papers published in the first week of July 2023, focusing on the latest research and advancements in computer vision. Whether you are a researcher, practitioner, or enthusiast, this article aims to provide valuable insights into state-of-the-art techniques and tools within the field of computer vision.
Our must-read articles
ChatGPT Code Interpreter Is Now Available for All Plus Users by Gencay I.
Stop Ignoring Julia! Learn It Now And Thank Your Younger Self in the Future by Bex T.
Markov Chain Attribution Modelling by Snehal Nair
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Job offers
Software Development Team Lead @Computronics Solutions (Sofia, Bulgaria)
Junior Software QA Engineer (Manual) @CoverGo (Remote)
Software Engineer @Toku (Santiago, Chile)
Java Developer (m/f/d) @UBench (Turnhout, Belgium)
Software Engineer (Java) @Compass Education (Hawthorn, Australia)
Senior Software Developer @TherapyNotes.com (Remote)
Software Developer @Servus Credit Union (Remote)
Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.
If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!
This AI newsletter is all you need #56 was originally published in Towards AI on Medium.