This AI newsletter is all you need #44
What happened this week in AI by Louie
This week in AI saw exciting developments in open-source language models, together with further discussion of the legal standing of LLM training data and AI-generated content.
Stability AI, the company behind the AI-powered Stable Diffusion image generator, has released a suite of open-source large language models (LLMs) called StableLM. These models are currently available in 3 billion and 7 billion parameter versions, with larger models arriving later. Similarly, Together has announced RedPajama, an open-source project run in collaboration with other AI organizations to create large language models. RedPajama has released a 1.2 trillion token dataset that replicates the LLaMA recipe, enabling organizations to pre-train models that can be permissively licensed. RedPajama has three key components: pre-training data, base models, and instruction tuning data and models. RedPajama and StableLM follow the recent release of Dolly 2.0, and together they should give individuals and groups much more flexibility to train or fine-tune their own custom models for use in research or commercial products.
However, the legal standing of LLM training data and AI-generated content was in focus again with Reddit's and Stack Overflow's announcements that they will start charging large-scale AI developers for access to their data. This follows Twitter's withdrawal of OpenAI's access to its data. These moves towards increased data protection raise questions about the legal standing of existing models trained on scraped data, and about whether AI progress could slow if data becomes harder to access legally. The ownership and copyright status of AI-generated content also remains hotly debated, as AI training and inspiration don't fit neatly into existing IP and copyright laws. While some creators are embracing the new potential of AI, other cases will be settled in court. Viral music using AI to imitate Drake's voice has been taken down by Universal Music Group, while the Canadian producer and singer Grimes has offered to split royalties 50/50 on any AI-generated song that successfully harnesses her voice. This 50/50 split is apparently the same deal she would make with any other artist collaboration, AI or not.
- Louie Peters — Towards AI Co-founder and CEO
Hottest News
1. DeepMind and Google Brain Merge Into Google DeepMind
DeepMind and Google Brain have now merged into a single organization, led by DeepMind's Demis Hassabis. The merger aims to consolidate Google's AI efforts in the face of external competition from groups like OpenAI. One interesting change is that the new group, called Google DeepMind, has a clear objective of developing "AI products," a departure from the groups' previous research focus.
2. Stack Overflow Will Charge AI Giants for Training Data
Stack Overflow has announced that it will start charging large-scale AI developers for access to its 50 million questions and answers, in a bid to improve the quality of data used to develop large language models and to expedite high-quality LLM development. The move is part of a broader generative AI strategy, and it comes after Reddit revealed this week that it, too, will start charging some AI developers to access its content from June.
3. Stability AI Launches the First of its StableLM Suite of Language Models
Stability AI released a new open-source language model, StableLM. The Alpha version is available in 3 billion and 7 billion parameter versions, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt the StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA 4.0 license.
4. RedPajama: An Open-Source Recipe to Reproduce LLaMA
RedPajama is working towards reproducing the LLaMA models, starting with a 7B parameter model trained on a filtered dataset of 1.2 trillion tokens, with open-source reproducibility as the goal. The full RedPajama 1.2 trillion token dataset, and a smaller random sample, are available for download through Hugging Face. The full dataset is approximately 5TB unzipped on disk and around 3TB compressed for download, while the smaller random sample is far more manageable.
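For readers who want to poke at the data, the sample can be loaded with the Hugging Face datasets library. The snippet below is a minimal sketch; the dataset ID shown is the one listed on the project's Hugging Face page at the time of writing, so check there if it has moved.

```python
# Minimal sketch: load a sample of the RedPajama data from Hugging Face.
# The dataset ID "togethercomputer/RedPajama-Data-1T-Sample" is assumed;
# see the RedPajama Hugging Face page for the current identifier.
from datasets import load_dataset

# The full 1.2T-token corpus is ~3TB compressed, so start with the sample.
sample = load_dataset("togethercomputer/RedPajama-Data-1T-Sample", split="train")

print(sample[0]["text"][:500])  # peek at the first document
```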
5. Microsoft Readies AI Chip as Machine Learning Costs Surge
Microsoft has developed a specialized AI chip, called Athena, to power large language models and reduce cost and training time. Other tech giants like Amazon, Google, and Facebook are also working on their own AI chips. Microsoft has been working on Athena since 2019. Currently, a limited number of Microsoft and OpenAI employees are testing the chip, and it is expected to be available to both companies next year.
Five 5-minute reads/videos to keep you learning
1. Intelligence Superabundance
Instead of being in a zero-sum competition with artificial intelligence bots, we might be on the verge of entering an era of intelligence superabundance. This article explores how AI can generate more demand for tasks that require intelligence as the overall supply of intelligence increases, and covers various industries that can benefit from this phenomenon.
2. AI agent basics: Let’s Think Step By Step
This article introduces the concepts behind AgentGPT, BabyAGI, LangChain, and the LLM-powered agent revolution. It covers topics such as core agent concepts, how LLMs make plans, chain of thought reasoning, and more.
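As a rough illustration of the core loop these agents share, here is a minimal plan-act-observe sketch. The call_llm and run_tool helpers are hypothetical stubs standing in for a real LLM API call and a tool executor; they are not part of AgentGPT, BabyAGI, or LangChain.

```python
# Toy sketch of the plan-act-observe loop behind LLM-powered agents.
def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: swap in a real LLM API call here.
    return "FINAL: (stubbed response)"

def run_tool(action: str) -> str:
    # Hypothetical placeholder: run a search, calculator, code, etc.
    return f"(stubbed observation for: {action})"

def agent_loop(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        # Ask the model to think step by step and choose the next action.
        plan = call_llm(f"Goal: {goal}\nHistory: {history}\n"
                        "Think step by step, then choose the next action.")
        if plan.startswith("FINAL:"):
            return plan.removeprefix("FINAL:").strip()
        observation = run_tool(plan)         # execute the chosen action
        history.append((plan, observation))  # feed the result back in
    return "Step limit reached without a final answer."

print(agent_loop("Plan a weekend trip to Lisbon"))
```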
3. Prompt Engineering vs Blind Prompting
This article explores the differences between prompt engineering and blind prompting, emphasizing the importance of knowing how to interact effectively with AI models like ChatGPT, along with the challenges and benefits of each approach for generating the desired outputs.
4. Weights & Biases Launches W&B Prompts
Weights & Biases (W&B) has announced the launch of W&B Prompts, a suite of tools for prompt engineers working with large language models (LLMs). The new tools include LangChain and OpenAI integrations for logging, W&B Launch integration with OpenAI Evals, and improved handling of text in W&B Tables, all accessible through a one-line command.
5. What is Fuzzy Logic, Robotics & Future of Artificial Intelligence?
Fuzzy logic is a problem-solving method that resembles human reasoning. This article covers the concepts of fuzzy logic and robotics in AI, along with future applications in self-driving cars, cybernetics, healthcare, education, and decision-making systems.
Papers & Repositories
1. MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
MiniGPT-4 is a model that aligns a frozen visual encoder with a frozen LLM, Vicuna, using only one projection layer. The study's results demonstrate that MiniGPT-4 has many capabilities similar to those exhibited by GPT-4, such as generating detailed image descriptions and creating websites from hand-written drafts.
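The core trick, a single trainable projection that maps frozen vision features into the frozen LLM's embedding space, can be sketched in a few lines of PyTorch. The dimensions below are illustrative placeholders, not MiniGPT-4's actual configuration.

```python
# Illustrative sketch of MiniGPT-4's alignment idea: one trainable linear
# projection bridges a frozen vision encoder and a frozen LLM.
# Dimensions are placeholders, not the model's actual sizes.
import torch
import torch.nn as nn

vision_dim, llm_dim = 1024, 4096          # hypothetical feature/embedding sizes
projection = nn.Linear(vision_dim, llm_dim)  # the only trainable piece

image_features = torch.randn(1, 32, vision_dim)  # stand-in for frozen ViT output
image_tokens = projection(image_features)        # now shaped like LLM embeddings
print(image_tokens.shape)                        # torch.Size([1, 32, 4096])
```

Because both the encoder and the LLM stay frozen, only this projection needs gradients, which is what makes the approach so cheap to train.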
2. The Embedding Archives: Millions of Wikipedia Article Embeddings in Many Languages
Cohere has released the Embedding Archives, a collection of free, multilingual vectors created from millions of Wikipedia articles, which are particularly useful for AI developers constructing search systems. The articles are divided into segments, and each segment is assigned an embedding vector. The Embedding Archives are accessible through Hugging Face Datasets.
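A typical use is a simple dot-product search over the vectors. The sketch below assumes the English archive's dataset ID and its text/emb field names; check Cohere's announcement for the exact schema.

```python
# Sketch: nearest-neighbor search over the Wikipedia embeddings.
# The dataset ID and the field names ("text", "emb") are assumptions;
# check Cohere's Hugging Face page for the exact schema.
import numpy as np
from datasets import load_dataset

docs = load_dataset("Cohere/wikipedia-22-12-en-embeddings",
                    split="train", streaming=True)
subset = list(docs.take(1000))               # small slice for the demo
matrix = np.array([d["emb"] for d in subset])

# In practice you would embed a user query with Cohere's multilingual
# model; here we reuse a stored vector just to show the scoring step.
query = matrix[0]
scores = matrix @ query                      # dot-product similarity
best = int(np.argmax(scores))
print(subset[best]["text"][:200])
```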
3. Suno-ai/bark: 🔊 Text-Prompted Generative Audio Model
Bark, created by Suno, is a transformer-based text-to-audio model capable of generating highly realistic, multilingual speech as well as other types of audio such as music, background noise, and simple sound effects. The model can also produce nonverbal communication such as laughing, sighing, and crying. Bark is licensed under the non-commercial CC BY-NC 4.0 license, while the Suno models can be used commercially.
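The repository exposes a small Python API. A minimal usage sketch, following the README at the time of writing:

```python
# Minimal Bark usage, following the suno-ai/bark README.
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # download and cache the model weights

text_prompt = "Hello! [laughs] Bark can add nonverbal sounds like laughter."
audio_array = generate_audio(text_prompt)          # numpy array of audio

write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)
```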
4. Evaluating Verifiability in Generative Search Engines
According to the paper, existing generative search engines frequently lack complete citation support: on average, only 51.5% of generated sentences are fully supported by citations, and only 74.5% of citations actually support their associated sentence. The proposed metrics aim to promote comprehensive and correct citation, highlighting the importance of trustworthy and informative generative search engines.
5. Learning to Compress Prompts with Gist Tokens
The paper proposes a new method called “gisting” that allows for the specialization of LMs without the need for prompt-specific finetuning or distillation. This approach involves training the LM to compress prompts into smaller sets of “gist” tokens, which can be reused for compute efficiency. The gisting model can be easily trained as part of instruction finetuning by using a restricted attention mask that encourages prompt compression, thus avoiding the trade-off between specialization and training time.
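Conceptually, the restricted mask forbids tokens after the gist tokens from attending to anything before them, so the prompt's information has to flow through the gist bottleneck. Here is a toy sketch of such a mask; the sequence layout and sizes are illustrative, not the paper's implementation.

```python
# Toy sketch of a gist-style attention mask. Layout: [prompt | gist | rest].
# Positions after the gist tokens may attend to the gist tokens and to each
# other, but not to the raw prompt, so the prompt must be compressed into
# the gist tokens. Sizes are illustrative, not from the paper.
import torch

n_prompt, n_gist, n_rest = 4, 2, 3
n = n_prompt + n_gist + n_rest

mask = torch.tril(torch.ones(n, n, dtype=torch.bool))  # standard causal mask
mask[n_prompt + n_gist:, :n_prompt] = False            # block rest -> prompt

print(mask.int())
```

At inference time, the prompt can then be discarded and only the cached gist-token activations reused, which is where the compute savings come from.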
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Weekly AI Podcast
In this week’s episode of the “What’s AI” podcast, Louis Bouchard interviews Brian Burns, founder of the AI Pub Twitter page and Ph.D. candidate at the University of Washington. If you are considering pursuing a Ph.D. or wondering how to get into machine learning, improve your resume and interview skills, or even grow on Twitter, this episode is for you! Brian shares tips on how to land your first job in AI. Tune into the podcast for insights on getting into AI, growing a Twitter page, hosting a podcast, acing interviews, building a better resume, and more. You can find the podcast on YouTube, Spotify, or Apple Podcasts.
Upcoming Community Events
The Learn AI Together Discord community hosts weekly AI seminars to help the community learn from industry experts, ask questions, and get a deeper insight into the latest research in AI. Join us for free, interactive video sessions hosted live on Discord weekly by attending our upcoming events.
AdriBen will be presenting his paper “A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables” at the Neural Network Architecture Seminar. The presentation will be streamed live from Asia, which may result in an unusual time for some viewers. The seminar will be recorded, so even if you can’t attend live, you can still access the content later. Join the seminar here!
Date & Time: 25th April, 11:00 am EST
2. OpenAI GPT-4 Hackathon
George Batalinski is hosting an OpenAI GPT-4 hackathon through our Meetup group. This interactive seminar will involve building apps in teams; if you do not have a partner, we will match you up. Check out our Meetup group here to join.
Date & Time: 26th April, 12:00 pm EST
3. ChatGPT and Google Maps Hands-on Workshop
@george.balatinski is hosting a workshop on harnessing ChatGPT to create real-world browser projects using JavaScript, CSS, and HTML. The interactive sessions focus on building a portfolio, guiding you to create a compelling showcase of your talents, projects, and achievements. Engage in pair programming exercises, work side-by-side with fellow developers, and enjoy hands-on learning and knowledge sharing, along with valuable networking opportunities across our community and others in web and AI. You will also discover how to convert code to popular frameworks like Angular and React and tap into the potential of AI-driven development. Don't miss this chance to elevate your web development expertise. Join us at our next meetup here and experience the future of coding with ChatGPT! You can get familiar with some of the additional content here.
Date & Time: 30th April, 6:00 pm EST
Add our Google calendar to see all our free AI events!
Meme of the week!
Meme shared by AgressiveDisco#4516
Featured Community post from the Discord
Remster#7324 has developed Wanderbot, an AI-powered trip-planning website. This AI travel companion helps with itinerary generation and trip planning, creating personalized travel plans based on user preferences. Wanderbot is built on the ChatGPT platform, offering an interactive map, easy sharing, and a passionate community. Check it out here to support a fellow community member, and share your feedback in the thread here.
AI poll of the week!
Join the discussion on Discord
TAI Curated section
Article of the week
Introduction to GANs with TensorFlow by Rokas Liuberskis
In this tutorial, the author introduces Generative Adversarial Networks in TensorFlow, taking a different approach by starting with DCGAN instead of a simple GAN. Clear diagrams and visuals make the concepts easy and fun to learn, and the easy-to-follow code flow eases the learning process.
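For a flavor of what the tutorial covers, here is a minimal DCGAN-style generator in Keras. This is a generic sketch, not the author's exact code.

```python
# Minimal DCGAN-style generator (28x28 grayscale), a generic sketch
# rather than the tutorial's exact architecture.
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(latent_dim: int = 100) -> tf.keras.Model:
    return tf.keras.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(7 * 7 * 128, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 128)),
        # Transposed convolutions upsample 7x7 -> 14x14 -> 28x28.
        layers.Conv2DTranspose(64, 5, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(1, 5, strides=2, padding="same", activation="tanh"),
    ])

generator = build_generator()
fake_images = generator(tf.random.normal([16, 100]))  # batch of 16 samples
print(fake_images.shape)  # (16, 28, 28, 1)
```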
Our must-read articles
Meet DeepSpeed-Chat: Microsoft’s New Framework to Create ChatGPT-Like Models Using RLHF Training by Jesus Rodriguez
Face Detection with Viola-Jones Method by Janik Tinz
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Job offers
Artificial Intelligence Engineer @Plain Concepts (Remote)
Senior Tech Lead, Machine Learning @Tubi (Remote)
Data Engineer @Uplift (Remote)
Data Scientist (3–5 years experience) @Datalab USA (Broomfield, USA)
Senior Python Backend Engineer @Chattermill (Remote)
Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.
If you are preparing your next machine learning interview, don't hesitate to check out our leading interview preparation website, Confetti!