This AI newsletter is all you need #71
What happened this week in AI by Louie
This week, President Joe Biden brought AI regulation back into the spotlight by signing an executive order to oversee artificial intelligence. This directive calls on various government agencies to establish new guidelines concerning AI safety, data privacy, and cybersecurity. A notable part of the order is allocating funds for research on preserving privacy. The government is evidently emphasizing regulation in areas such as privacy, equity, and civil rights. It’s crucial to note that without the backing of Congress, this executive order does not have the force of law. In the upcoming months, we may witness initiatives from the European Union, and we’ve already observed strict regulations from China.
The news ignited debates on X (formerly Twitter), where prominent AI researchers voiced their concerns about AI regulations and the potential detrimental effects on open-source projects. Andrew Ng commented, “There are definitely large tech companies that would rather not have to try to compete with open source.” Meanwhile, Yann LeCun, the Chief AI Scientist at Meta, wrote, “If […] fear-mongering campaigns succeed, they will *inevitably* result in […] a catastrophe: a small number of companies will control AI.” Striking the right balance between mitigating AI risks and avoiding barriers that favor only the largest companies is clearly challenging.
- Louie Peters — Towards AI Co-founder and CEO
U.S. President Joe Biden is seeking to reduce the risks of AI to consumers, workers, minority groups, and national security with a new executive order. While some startups welcomed the order, some CEOs expressed concerns over whether it could impede smaller companies and stifle innovation.
Jina AI introduced jina-embeddings-v2, an open-source embedding model that supports 8K context length. It matches OpenAI’s 8K model in critical areas like Classification Average, Reranking Average, Retrieval Average, and Summarization Average in the MTEB leaderboard.
Google has committed to a substantial investment in Anthropic, a competitor to OpenAI, earmarking up to $2 billion for the AI startup. Anthropic is the developer of Claude 2, a chatbot used by companies including Slack, Notion, and Quora.
Anthropic, Google, Microsoft, and OpenAI have invested over $10 million to create a new AI Safety Fund, aiming to advance research in the responsible and safe development of frontier AI models. This move signifies an industry-wide effort to raise safety standards and proactively address the challenges posed by advanced AI systems.
OpenAI began rolling out a new version of ChatGPT that combines all GPT-4 capabilities (Browsing, DALL-E 3, and Data Analysis) in the main interface, so users no longer need to switch between modes.
Five 5-minute reads/videos to keep you learning
This guide shares how to evaluate and benchmark Large Language Models (LLMs) effectively. Learn more about perplexity, other evaluation metrics, and curated benchmarks to compare LLM performance. It also includes practical tools for selecting a suitable model for your needs and tasks.
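As a rough illustration of the perplexity metric mentioned in the guide above, the sketch below computes perplexity from per-token log-probabilities. The function name and the sample values are hypothetical and are not taken from the guide; in practice the log-probabilities would come from the model being evaluated.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_log_prob = sum(token_log_probs) / len(token_log_probs)
    return math.exp(-mean_log_prob)


# Hypothetical example: a model that assigns probability 0.25 to each of
# four tokens is, on average, as uncertain as a uniform choice among 4 options.
sample_log_probs = [math.log(0.25)] * 4
print(f"perplexity = {perplexity(sample_log_probs):.2f}")
```

Lower perplexity means the model assigned higher probability to the text it was scored on, which is why it is often used to compare models on the same held-out data.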
This blog post looks at a few techniques for evaluating the outputs generated by an LLM. It also shares insight into techniques such as user feedback and human annotators, as well as replicating human evaluation methods using LLMs.
This article focuses on building the generative part of a chatbot and enhancing it with retrieval-augmented generation (RAG). RAG is a vital component of the Chat endpoint: it makes it possible to connect the API to external data for augmented generation.
This article focuses on how and why the reversal curse impacts large language models. Unlike humans, an LLM trained on a fact stated in one direction ("A is B") may struggle to answer the reversed question ("B is A"). Additionally, LLMs have been found to falter in areas where human proficiency is strong.
This post discusses the philosophical question of whether large language models (LLMs) like ChatGPT can carry out the kind of reasoning captured by Aristotle’s syllogisms. It explores the connection between AI and philosophy, specifically in logical reasoning.
Papers & Repositories
QMoE is a practical solution for compressing trillion-parameter models like the SwitchTransformer to <1 bit/parameter, greatly reducing memory demand. It achieves a 20x compression rate with minimal accuracy loss and can run efficiently on affordable hardware.
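To make the figures in the QMoE summary above concrete, here is a small back-of-the-envelope sketch. It assumes a 16-bit (bfloat16) uncompressed baseline and a hypothetical model of 1.6 trillion parameters; neither number is stated in the summary, so treat both as illustrative assumptions.

```python
def model_size_gb(num_params, bits_per_param):
    """Storage needed for the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


BASELINE_BITS = 16      # assumption: weights stored as 16-bit floats
COMPRESSION_RATE = 20   # the 20x rate quoted in the summary
NUM_PARAMS = 1.6e12     # assumption: a hypothetical 1.6-trillion-parameter model

# A 20x reduction from 16 bits leaves well under 1 bit per parameter.
bits_after = BASELINE_BITS / COMPRESSION_RATE

print(f"bits per parameter after compression: {bits_after}")
print(f"uncompressed size: {model_size_gb(NUM_PARAMS, BASELINE_BITS):.0f} GB")
print(f"compressed size:   {model_size_gb(NUM_PARAMS, bits_after):.0f} GB")
```

Under these assumptions, a 20x rate and the "<1 bit/parameter" claim are consistent with each other, and the weights shrink from a few terabytes to roughly 160 GB.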
JudgeLM is a method that improves the evaluation of Large Language Models by fine-tuning them as scalable judges. By compiling a dataset and using augmentation techniques, JudgeLM addresses biases and performs well on benchmarks. Its agreement with the teacher judge exceeds 90%, which surpasses human-to-human agreement, and it demonstrates versatility across different formats.
Contrastive Preference Learning (CPL) is a new approach to Reinforcement Learning from Human Feedback (RLHF) that avoids the need for traditional RL methods. By focusing on regret rather than reward, CPL simplifies the learning process and has the potential to be applied effectively in higher-dimensional RLHF scenarios.
Google DeepMind conducted a study comparing Convolutional Neural Networks (ConvNets) and Vision Transformers (ViTs) for large-scale image classification. In summary, ConvNets and ViTs perform similarly when given comparable resources.
HallusionBench is a newly curated benchmark designed to study language hallucination and visual illusion in Vision-Language Models like GPT-4V and LLaVA-1.5. This benchmark challenges the models’ ability to reason with image context and highlights potential weaknesses in their vision modules.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Meme of the week!
Meme shared by rucha8062
Featured Community post from the Discord
NYLee has launched LLMware, a unified, open, and extensible framework for LLM-based application patterns, including Retrieval Augmented Generation (RAG). This project provides a comprehensive set of tools that anyone can use, from beginners to the most sophisticated AI developers. It includes PDF and Office document parsing, text chunking, embedding vectors using Milvus, FAISS, or Pinecone, and hybrid searching. Check it out on GitHub and support a fellow community member. Share your feedback in the thread here!
AI poll of the week
TAI Curated section
Article of the week
This article is the first in a series of three blog posts explaining step-by-step how to build an AI assistant to summarize YouTube videos. The series starts with in-depth instructions for capturing the transcript of a YouTube video using OpenAI’s Whisper, then covers text summarization using LangChain and demonstrates a solution prototype using Gradio and Hugging Face Spaces.
Our must-read articles
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact email@example.com.
If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!