This AI newsletter is all you need #44
What happened this week in AI by Louie
This week in AI saw exciting developments in open-source language models, alongside further debate over the legal standing of LLM training data and AI-generated content.
Stability AI, the company behind the Stable Diffusion image generator, has released StableLM, a suite of open-source large language models (LLMs). The models are currently available in 3 billion and 7 billion parameter versions, with larger models to follow. Similarly, Together has announced RedPajama, an open-source project run in collaboration with other AI organizations to create large language models. RedPajama has released a 1.2 trillion token dataset that replicates the LLaMA recipe, enabling organizations to pre-train permissively licensed models. The project has three key components: pre-training data, base models, and instruction-tuning data and models. RedPajama and StableLM follow the recent release of Dolly 2.0, and together they should give individuals and groups far more flexibility to train or fine-tune custom models for research or commercial products.
However, the legal standing of LLM training data and AI-generated content came into focus again with Reddit's and Stack Overflow's announcements that they will start charging large-scale AI developers for access to their data. This follows Twitter's withdrawal of OpenAI's access to its data. These moves toward increased data protection raise questions about the legal standing of existing models trained on scraped data, and about whether AI progress could slow if data becomes harder to access legally. The ownership and copyright status of AI-generated content also remains hotly debated, as AI training and inspiration do not fit neatly into existing IP and copyright laws. While some creators are embracing AI's new potential, other disputes will be settled in court. Viral music using AI to imitate Drake's voice has been taken down by Universal Music Group, but the Canadian producer and singer Grimes has offered a 50% royalty split on any AI-generated song that successfully harnesses her voice, apparently the same deal she would make with any other artist collaboration, AI or not.
- Louie Peters — Towards AI Co-founder and CEO
DeepMind and Google Brain have now merged into a single organization, Google DeepMind, led by DeepMind's Demis Hassabis. The merger aims to consolidate Google's internal AI efforts to compete with external pressure from groups like OpenAI. One interesting change is that the new group has a clear objective of developing "AI products," which represents a departure from DeepMind's previous research-oriented focus.
Stack Overflow has announced that it will start charging large-scale AI developers for access to its 50 million questions and answers, in a bid to improve the quality of data used to develop large language models and to expedite high-quality LLM development. The move is part of a broader generative AI strategy and comes after Reddit revealed this week that it, too, will start charging some AI developers to access its content from June.
Stability AI released a new open-source language model, StableLM. The Alpha version is available in 3 billion and 7 billion parameter models, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt the StableLM base models for commercial or research purposes under the terms of the CC BY-SA 4.0 license.
RedPajama is working toward reproducing the LLaMA models, starting with a 7B parameter model trained on a filtered 1.2 trillion token dataset, with the goal of open-source reproducibility. The full 1.2 trillion token dataset and a smaller random sample are available for download through Hugging Face. The full dataset is approximately 5TB unzipped on disk (around 3TB compressed for download), while the smaller random sample is more manageable.
Microsoft has developed a specialized AI chip, codenamed Athena, to power large language models and reduce costs and training time. Other tech giants such as Amazon, Google, and Facebook are also working on their own AI chips. Microsoft has been working on Athena since 2019; currently, a limited number of Microsoft and OpenAI employees are testing the chip, and it is expected to be available to both companies next year.
Three 5-minute reads/videos to keep you learning
Instead of being in a zero-sum competition with artificial intelligence bots, we might be on the verge of entering an era of intelligence superabundance. This article explores how AI can generate more demand for tasks that require intelligence, due to the increased overall supply of intelligence. The article covers various industries that can benefit from this phenomenon.
This article introduces the concepts behind AgentGPT, BabyAGI, LangChain, and the LLM-powered agent revolution. It covers topics such as core agent concepts, how LLMs make plans, chain of thought reasoning, and more.
In this article, the differences between prompt engineering and blind prompting are explored. The importance of knowing how to effectively interact with AI models like ChatGPT is emphasized, along with the challenges and benefits of each approach for generating desired outputs.
Weights & Biases (W&B) has announced the launch of W&B Prompts, a suite of tools for prompt engineers working with large language models (LLMs). The new tools include LangChain and OpenAI integrations for logging, W&B Launch integration with OpenAI Evals, and improved handling of text in W&B Tables, all accessible through a one-line command.
Fuzzy logic is an AI problem-solving method that resembles human reasoning. This article covers the concepts of fuzzy logic and robotics in AI, along with future applications in self-driving cars, cybernetics, healthcare, education, and decision-making systems.
Papers & Repositories
MiniGPT-4 is a model that aligns a frozen visual encoder with a frozen LLM, called Vicuna, using only one projection layer. The study’s results demonstrate that MiniGPT-4 has many capabilities similar to those exhibited by GPT-4, such as generating detailed image descriptions and creating websites from hand-written drafts.
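MiniGPT-4's single trainable component can be pictured as one linear map from the visual encoder's feature space into the LLM's embedding space. The sketch below illustrates the idea with NumPy; the dimensions and initialization are hypothetical stand-ins, not MiniGPT-4's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a frozen visual encoder emits 32 patch features
# of size 1408; the frozen LLM (Vicuna) consumes embeddings of size 4096.
visual_dim, llm_dim, n_patches = 1408, 4096, 32
patch_features = rng.normal(size=(n_patches, visual_dim))  # frozen encoder output

# The single trainable piece: one linear projection into the LLM's
# embedding space. Both the encoder and the LLM stay frozen.
W = rng.normal(size=(visual_dim, llm_dim)) * 0.01
b = np.zeros(llm_dim)

llm_inputs = patch_features @ W + b  # shape (32, 4096), fed to the LLM
```

Because only `W` and `b` receive gradients, alignment training is far cheaper than fine-tuning either frozen model.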
Cohere has released the Embedding Archives, a collection of free, multilingual embedding vectors created from millions of Wikipedia articles, which are particularly useful for AI developers building search systems. Each article is divided into segments, and each segment is assigned an embedding vector. The Embedding Archives are accessible through Hugging Face Datasets.
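A search system built on such per-segment vectors typically ranks segments by cosine similarity to an embedded query. Here is a minimal sketch with toy 3-dimensional vectors standing in for the real embeddings (which would be loaded from the Embedding Archives and have far higher dimensionality):

```python
import numpy as np

# Toy stand-ins for article-segment embeddings from the Embedding Archives.
segment_vectors = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.7, 0.7, 0.0],
])
query = np.array([1.0, 0.0, 0.0])  # embedding of the user's query

def cosine_similarities(q, vecs):
    """Cosine similarity between a query vector and each row of vecs."""
    return vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))

scores = cosine_similarities(query, segment_vectors)
best = int(np.argmax(scores))  # index of the most similar segment → 0
```

At Wikipedia scale, the brute-force `argmax` would normally be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.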
Bark, created by Suno, is a transformer-based text-to-audio model capable of generating highly realistic, multilingual speech as well as other types of audio such as music, background noise, and simple sound effects. The model can also produce nonverbal communication such as laughing, sighing, and crying. Bark is released under the non-commercial CC BY-NC 4.0 license, while Suno's own models can be used commercially.
According to the paper, generative search engines frequently lack complete citation support: on average, only 51.5% of generated sentences are fully supported by their citations, and only 74.5% of citations support their associated sentence. The proposed metrics aim to promote comprehensive citation use, highlighting the importance of trustworthy and informative generative search engines.
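The two figures correspond to two different denominators, which is easy to conflate. This toy calculation (a simplified reading of the metrics, with made-up data) shows the distinction: citation recall is computed over sentences, citation precision over individual citations.

```python
# Each generated sentence carries a list of booleans: does each of its
# citations actually support the sentence? (Hypothetical toy data.)
sentences = [
    {"citations": [True, True]},   # fully supported
    {"citations": [True, False]},  # one citation fails to support it
    {"citations": []},             # no citations at all
]

# Recall: fraction of sentences fully supported by their citations
# (here simplified to "has citations and all of them support it").
fully_supported = sum(1 for s in sentences
                      if s["citations"] and all(s["citations"]))
recall = fully_supported / len(sentences)          # 1/3

# Precision: fraction of all emitted citations that support their sentence.
all_citations = [c for s in sentences for c in s["citations"]]
precision = sum(all_citations) / len(all_citations)  # 3/4
```

A system can thus have high precision (most citations it emits are good) while still leaving many sentences uncited or only partially supported, which is the gap the paper measures.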
The paper proposes a new method called “gisting” that allows for the specialization of LMs without the need for prompt-specific finetuning or distillation. This approach involves training the LM to compress prompts into smaller sets of “gist” tokens, which can be reused for compute efficiency. The gisting model can be easily trained as part of instruction finetuning by using a restricted attention mask that encourages prompt compression, thus avoiding the trade-off between specialization and training time.
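The restricted attention mask is the core trick: with the sequence laid out as [prompt | gist tokens | input], tokens after the gist are blocked from attending to the raw prompt, so prompt information must flow through the gist tokens. A minimal NumPy sketch of such a mask (an illustration of the idea, not the paper's exact implementation):

```python
import numpy as np

def gist_attention_mask(prompt_len, gist_len, input_len):
    """Boolean attention mask (True = may attend) for a sequence laid out
    as [prompt | gist tokens | input]. Standard causal attention, except
    positions after the gist tokens cannot see the prompt directly."""
    n = prompt_len + gist_len + input_len
    mask = np.tril(np.ones((n, n), dtype=bool))  # causal mask
    # Block everything after the gist tokens from attending to the prompt.
    mask[prompt_len + gist_len:, :prompt_len] = False
    return mask

m = gist_attention_mask(prompt_len=4, gist_len=2, input_len=3)
# Gist tokens (rows 4-5) still see the prompt; input tokens (rows 6-8) do not.
```

Training the LM with this mask during instruction finetuning forces it to compress the prompt into the gist tokens, which can then be cached and reused in place of the full prompt.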
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Weekly AI Podcast
In this week’s episode of the “What’s AI” podcast, Louis Bouchard interviews Brian Burns, founder of the AI Pub Twitter page and Ph.D. candidate at the University of Washington. If you are considering pursuing a Ph.D. or wondering how to get into machine learning, improve your resume and interview skills, or even grow on Twitter, this episode is for you! Brian shares tips on how to land your first job in AI. Tune into the podcast for insights on getting into AI, growing a Twitter page, hosting a podcast, acing interviews, building a better resume, and more. You can find the podcast on YouTube, Spotify, or Apple Podcasts.
Upcoming Community Events
The Learn AI Together Discord community hosts weekly AI seminars to help the community learn from industry experts, ask questions, and get a deeper insight into the latest research in AI. Join us for free, interactive video sessions hosted live on Discord weekly by attending our upcoming events.
AdriBen will be presenting his paper “A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables” at the Neural Network Architecture Seminar. The presentation will be streamed live from Asia, which may result in an unusual time for some viewers. The seminar will be recorded, so even if you can’t attend live, you can still access the content later. Join the seminar here!
Date & Time: 25th April, 11:00 am EST
George Batalinski is hosting an OpenAI and GPT-4 hackathon on our meetup group. This interactive seminar will involve building apps in teams. If you do not have a partner, we will match you up. Check out our meetup group here to join.
Date & Time: 26th April, 12:00 pm EST
Date & Time: 30th April, 6:00 pm EST
Meme of the week!
Meme shared by AgressiveDisco#4516
Featured Community post from the Discord
Remster#7324 has developed Wanderbot, an AI Assistant trip planner website for AI-driven travel solutions. This AI-powered travel companion helps with itinerary generation and trip planning, creating personalized travel plans based on user preferences. Wanderbot is built on the ChatGPT platform, offering an interactive map, easy sharing, and a passionate community. Check it out here to support a fellow community member! Join the conversation in the thread to share your feedback here.
AI poll of the week!
TAI Curated section
Article of the week
In this tutorial, the author introduces Generative Adversarial Networks in TensorFlow, taking a different approach by starting with DCGAN rather than a simple GAN. Clear diagrams and visuals make the concepts easy and enjoyable to learn, and the easy-to-follow code further smooths the learning process.
Our must-read articles
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact email@example.com.
If you are preparing for your next machine learning interview, don't hesitate to check out our leading interview preparation website, Confetti!