This AI newsletter is all you need #43
What happened this week in AI by Louie
This week, the competition in the race for Generative AI and LLMs continued to intensify. Amazon announced its foray into the space by hosting models from Amazon, Stability AI, and AI21, and promoting the specifications of its in-house chips designed for inference and training. Elon Musk also entered the race by hiring top talent from DeepMind, including Igor Babuschkin and Manuel Kroiss, and purchasing 10,000 GPUs to build a “TruthGPT.” Microsoft’s Bing, which is based on GPT-4, increased pressure on Google to release LLM-powered search options (set to arrive under the project codename “Magi”), especially as Samsung is reportedly considering replacing Google with Bing as the default search engine on its mobile devices.
In the Open Source LLM movement, Databricks released Dolly 2, a 12-billion-parameter, instruction-tuned language model fine-tuned from EleutherAI’s Pythia-12b on approximately 15,000 instruction-tuning examples. This release is significant as it is fully licensed for research and commercial use, unlike many previous open-source instruction-tuned models with questionable legal standing (some built on the leaked Meta LLaMA model or by distilling information from OpenAI’s GPT-3.5, for example).
To cap off the week, the Auto-GPT agent AI project surpassed PyTorch in GitHub stars (now at 89k), highlighting the fast-paced and viral nature of AI products and innovations.
- Louie Peters — Towards AI Co-founder and CEO
Amazon is entering the competition for generative AI. However, instead of building AI models entirely on its own, it is partnering with third-party startups to host their models on AWS. It introduced Amazon Bedrock, which allows for the creation of AI-powered applications using pre-trained models from AWS and startups. Users can generate images, logos, and graphics through an API.
Alibaba, the Chinese technology giant, has announced its plan to launch an AI-powered chatbot similar to ChatGPT, called Tongyi Qianwen. This product will be integrated into Alibaba’s various businesses through its cloud computing unit, although the exact timeline for its release has not been specified yet.
Spain’s data protection authority, the AEPD, is conducting a preliminary investigation into OpenAI, the maker of ChatGPT, over potential violations of the General Data Protection Regulation (GDPR) of the European Union. This follows a similar move by Italy. However, there has been no order from the regulator to suspend processing by OpenAI.
Italy’s data protection regulator has given OpenAI a list of requirements to comply with GDPR and lift the ban on ChatGPT. These requirements include publishing an information notice, implementing age gating, clarifying legal basis, providing user data rights, allowing objections to data processing, and conducting an awareness campaign for Italian users.
Elon Musk is reportedly assembling a team of AI experts to launch an AI startup that would compete against OpenAI, the research organization he co-founded years ago, according to a report by The Financial Times.
Our 5-minute reads/videos to keep you learning
Cohere For AI’s community has released a selection of NLP research for March 2023, featuring cutting-edge language models, unparalleled text generation, and revolutionary summarization techniques. This post covers an array of topics, showcasing the latest advancements in large language models and more.
Transformers are a machine learning breakthrough that has gained significant attention in recent years. This blog post provides an overview of the transformer architecture, how it works, and its components, offering a conceptual introduction to the technology.
Meta AI developed a research demo of an AI system for animating artwork. They are releasing the animation code and dataset of 180k annotated amateur drawings to aid other AI researchers. The demo is browser-based and allows users to upload images, verify or correct a few annotations, and receive a short animation of their character.
This tutorial covers training and fine-tuning LLaMA, a large language model. Specifically, it focuses on Lit-LLaMA, a rewritten version that can perform inference on an 8 GB consumer GPU. The tutorial also explores how Lightning Fabric is used to speed up the PyTorch code.
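Weight quantization is one of the tricks that lets implementations like Lit-LLaMA run inference on an 8 GB consumer GPU. The following is a from-scratch toy sketch of symmetric int8 quantization to illustrate the idea; it is not Lit-LLaMA’s actual code, which uses optimized library routines.

```python
# Toy illustration of symmetric int8 weight quantization: each float32
# weight (4 bytes) is replaced by an int8 value (1 byte) plus one shared
# per-tensor scale, roughly a 4x memory saving at inference time.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

Real implementations quantize per-channel or per-block rather than per-tensor to keep the rounding error small, but the memory arithmetic is the same.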
In this Twitter thread, Amjad Masad explains the basics of building a dream MVP. He also addresses the question of whether AI will replace developers by encouraging people to engage in coding for 100 days. According to him, learning to code has become more valuable with the help of AI, with a predicted 10x ROI.
Papers & Repositories
OpenAssistant is a chatbot designed to understand tasks, interact with third-party systems, and retrieve information dynamically to accomplish them. It is a project aimed at providing everyone with access to a high-quality chat-based large language model.
Dolly 2 is a large language model developed by Databricks, trained on their Machine Learning Platform. It has 12 billion parameters and is a causal language model based on EleutherAI’s Pythia-12b. Dolly 2 was fine-tuned using a ~15K record instruction corpus that was created by Databricks employees and released under a permissive license.
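Instruction-tuning corpora like Dolly 2’s typically store each record as an instruction, optional context, and a reference response, which are rendered into a single training prompt. The sketch below shows one way such a record could be templated; the field names and `###` template here are illustrative assumptions, not Databricks’ exact format.

```python
# Minimal sketch: rendering an instruction record (instruction / optional
# context / response) into one training prompt string. The template is a
# hypothetical example, not Dolly 2's actual prompt format.

def build_prompt(record: dict) -> str:
    """Render an instruction-tuning record as a single prompt string."""
    parts = [f"### Instruction:\n{record['instruction']}"]
    if record.get("context"):  # context is optional in many records
        parts.append(f"### Context:\n{record['context']}")
    parts.append(f"### Response:\n{record['response']}")
    return "\n\n".join(parts)

example = {
    "instruction": "Summarize the release.",
    "context": "Dolly 2 is a 12B-parameter model fine-tuned from Pythia-12b.",
    "response": "Databricks released a permissively licensed 12B model.",
}
print(build_prompt(example))
```

The fine-tuning step then trains the base model to continue such prompts with the response section, which is what turns a plain causal language model into an instruction follower.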
This study introduces Self-Debugging, a method that teaches a large language model to debug its own predicted program via few-shot demonstrations. The model performs rubber duck debugging, identifying its mistakes by explaining the generated code in natural language, without receiving any feedback on code correctness or error messages.
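The Self-Debugging loop can be sketched as: generate code, have the model explain it in natural language, then revise based on that explanation, with no external correctness signal. Below is a minimal illustration of that control flow; the `llm()` function is a stub with canned replies standing in for a real model call, not the paper’s implementation.

```python
# Sketch of the Self-Debugging loop: generate -> explain ("rubber duck
# debugging") -> revise, repeated until the model stops changing its code.
# llm() is a hypothetical stub; a real system would call an actual LLM.

def llm(prompt: str) -> str:
    """Stub LLM returning canned replies per prompt type (illustrative only)."""
    if prompt.startswith("EXPLAIN"):
        return "The code returns n * 2, but the task asked for n squared."
    if prompt.startswith("REVISE"):
        return "def f(n): return n * n"
    return "def f(n): return n * 2"  # initial (buggy) generation

def self_debug(task: str, max_rounds: int = 2) -> str:
    code = llm(f"SOLVE: {task}")
    for _ in range(max_rounds):
        explanation = llm(f"EXPLAIN: {code}")
        revised = llm(f"REVISE: {code}\n{explanation}")
        if revised == code:  # model sees nothing left to fix; stop
            break
        code = revised
    return code
```

The key point is that the only feedback is the model’s own explanation of its code; no unit tests or error messages enter the loop.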
This paper introduces a new approach that uses GPT-4 to generate instruction-following data for LLM fine-tuning. Early experiments on instruction-tuned LLaMA models indicate that the 52K English and Chinese instruction-following examples produced by GPT-4 achieve better zero-shot performance on new tasks than data generated by previous models.
This paper proposes consistency models, a new family of generative models that achieve high sample quality without adversarial training. They allow for fast one-step generation, few-step sampling, and zero-shot data editing. Consistency models can be trained to distill pre-trained diffusion models or as standalone generative models, and they outperform existing distillation techniques for diffusion models in one- and few-step generation.
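The defining idea, as described in the paper, is self-consistency: the model $f_\theta$ maps any point on the same probability-flow ODE trajectory back to the trajectory’s origin, so its output is the same regardless of the noise level it starts from:

```latex
f_\theta(\mathbf{x}_t, t) = f_\theta(\mathbf{x}_{t'}, t')
\quad \text{for all } t, t' \in [\epsilon, T]
\quad \text{(points on the same trajectory)},
\qquad
f_\theta(\mathbf{x}_\epsilon, \epsilon) = \mathbf{x}_\epsilon .
```

This property is what enables one-step generation: a single evaluation of $f_\theta$ at a high noise level already lands on a clean sample.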
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Upcoming Community Events
The Learn AI Together Discord community hosts weekly AI seminars to help the community learn from industry experts, ask questions, and get a deeper insight into the latest research in AI. Join us for free, interactive video sessions hosted live on Discord weekly by attending our upcoming events.
AdriBen will be presenting his paper “A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables” at the Neural Network Architecture Seminar. The presentation will be streamed live from Asia, which may result in an unusual time for some viewers. The seminar will be recorded, so even if you can’t attend live, you can still access the content later. Join the seminar here!
Date & Time: 25th April, 1:00 pm EST
Meme of the week!
Meme shared by Rucha#8062
Featured Community Post from the Discord
Creativity will remain uniquely human, but as AI moves further into the creative realms, how will it change the way we write?
Rucha#8062 is conducting a workshop titled “Creative Writing in the Time of AI: Experiments with AI as your Intern”. The workshop will explore the aspects of the writing process that could potentially be delegated to AI, and those that should remain solely within the realm of human expression. Through this workshop, participants will gain greater clarity and understanding of how to find their own voice, while leveraging the tools and resources offered by AI. Check it out here and support a fellow community member! Share your thoughts on the topic by joining the discussion here.
AI poll of the week!
TAI Curated section
Article of the week
Generative Adversarial Networks (GANs) are a deep learning architecture that has gained popularity for its ability to generate realistic new data. However, building a GAN model is only the first step, as deploying it as a user-friendly web application presents a separate challenge. This article provides an in-depth discussion of the background and problem statement related to GANs. It also covers setting up the working environment, loading pre-trained GAN models and images, and building a Streamlit web application.
Our must-read articles
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact email@example.com.
If you are preparing for your next machine learning interview, don’t hesitate to check out our leading interview preparation website, Confetti!