This AI newsletter is all you need #69
What happened this week in AI by Louie
Google joined Microsoft and Adobe in announcing that it will protect users of its AI services from potential lawsuits over intellectual property violations, provided they use the Google Cloud (Vertex AI) and Workspace (Duet AI) platforms. The press release, however, made no mention of the AI chatbot Bard, and it clarified that the protection does not extend to situations in which users deliberately attempt to “infringe the rights of others.” AI-related lawsuits have typically targeted the companies that built the technology rather than the end users of the platform, but the commitment still seems like a sensible reassurance for corporations considering building Google AI services into their products.
In other news, XLANG Lab has unveiled Lemur-70B and its chat variant, with the goal of developing a model that excels in text generation, coding, and agent tasks. This sets it apart from other open-source models, which typically specialize in one area or the other. The idea is that for an agent to excel at reasoning and executing real-world tasks, it must be highly accurate in both language comprehension and coding. They continued pre-training the LLaMA-2 model on a 90B-token dataset, which maintained a 1-to-1 ratio between text and coding samples, to create the Lemur model. They then applied supervised fine-tuning on a dataset of 300,000 examples, leading to the development of Lemur-Chat. The model can outperform other open-source models by a significant margin while achieving performance similar to that of GPT-3.5. They have also made their code and, most importantly, their dataset available, which will contribute to advancing the field.
- Louie Peters — Towards AI Co-founder and CEO
This issue is brought to you thanks to AI Infrastructure Alliance:
The AI Infrastructure Alliance just released a big technical guide to Agents, LLMs, and Smart Apps. It covers prompt engineering, major frameworks like LlamaIndex, LangChain, and Semantic Kernel, vector DBs, fine-tuning, open and closed source models, common app design patterns, LLM logic and reasoning, and more. GPT created the current generative AI craze, but it's the apps that do real work that will unleash the next wave of AI software, and this guide gets you started building those apps right.
Announcing Towards AI’s AI Tutor
Last week, we released our LLM course. We are excited to announce another related project: our own AI tutor!
Thanks to the thousands of articles published under the Towards AI publication, all the content from our LangChain and LLM course, and other great available sources like Hugging Face Transformers and Wikipedia, we were able to build a powerful retrieval augmented generation (RAG)-based chatbot that can answer any AI-related question and even credit its sources!
RAG is a powerful approach to reducing hallucination risks, and it provides a way to reference the knowledge shared by a chatbot so you can dive in and learn more. So, the only things required to build such an AI tutor are an excellent chatbot (GPT), a great knowledge base (Towards AI), and some hard work.
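To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt step behind a RAG chatbot. It is not the AI tutor's actual code: the three-entry knowledge base, the source labels, the bag-of-words similarity, and the function names are all assumptions chosen for illustration, and the final call to a chatbot model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; a real system would index full articles.
KNOWLEDGE_BASE = [
    {"source": "doc-1", "text": "Retrieval augmented generation adds retrieved documents to the prompt."},
    {"source": "doc-2", "text": "Fine-tuning updates model weights on a task-specific dataset."},
    {"source": "doc-3", "text": "A vector database stores embeddings for similarity search."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, k=2):
    """Return up to k documents that share at least one token with the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(d["text"]))), d) for d in KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(question, k=2):
    """Build a prompt that includes the retrieved context and asks for citations."""
    docs = retrieve(question, k)
    context = "\n".join(f"[{d['source']}] {d['text']}" for d in docs)
    prompt = (
        "Answer using only the context below and cite sources in brackets.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
    return prompt, [d["source"] for d in docs]


prompt, sources = build_prompt("What is retrieval augmented generation?")
print(prompt)
print("Sources to credit:", sources)
```

The prompt would then be sent to the chatbot, and the returned source labels are what let the answer point back to where the knowledge came from. Production systems typically replace the word-count similarity with learned embeddings.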
We are excited to announce it is live and free! Please try it and give us feedback on how to improve it! Are the responses too short or too detailed? Are the sources wrong? Help us make it better for you! We plan to keep improving it and keep it free for all our students!
A 21-year-old student from the University of Nebraska-Lincoln used AI to decipher Greek letters on an unopened scroll that was carbonized in the eruption of Mount Vesuvius in 79 AD. Using a machine learning algorithm, the student successfully identified more than 10 Greek letters, including the word “porphyras,” meaning purple, and won a prize in the Vesuvius Challenge.
Google has joined other tech giants in defending users of generative artificial intelligence systems in its Google Cloud and Workspace platforms if they are accused of intellectual property violations. While they offer indemnity for software such as Vertex AI and Duet AI, intentional content manipulation for copyright infringement is not covered.
XLANG Lab introduced Lemur-70B, a new open-source LLM. It surpasses other models in agent benchmarks and excels in language and coding tasks. It achieves performance levels similar to GPT-3.5-turbo on code tasks, narrowing the performance gap with commercial models in agent abilities.
In September, ChatGPT’s mobile app experienced record downloads and revenue globally, with 15.6 million downloads and $4.6 million in revenue. However, the growth rate has decreased from 30% to 20%, suggesting possible market saturation for the $19.99 ChatGPT+ subscription.
OpenAI quietly revised all of the “Core Values” listed on its website in recent weeks, emphasizing the development of AGI. It described AGI as “highly autonomous systems that outperform humans at most economically valuable work.” The previous values resemble those of a research lab, while the new ones are more similar to startup jargon.
Five 5-minute reads/videos to keep you learning
This free course on training & fine-tuning LLMs for production provides insights into topics such as the introduction to LLMs, understanding Transformers and GPT architectures, training and fine-tuning LLMs, improving LLMs with RLHF and deploying LLMs.
The ability to reason and solve problems remains a topic of interest in LLMs like GPT. This article presents how GPT-4 can simulate human-like reasoning when prompted with riddles and logic puzzles. Still, biases and hallucinations can impact its judgment.
Researchers have found that by using a synthetic dataset created with GPT-4 and the Chain of Density (CoD) prompting technique, a fine-tuned GPT-3.5 can outperform GPT-4 in news article summarization. This fine-tuned version of GPT-3.5 is not only 11 times faster but also 63% more cost-effective than zero-shot GPT-4, while achieving performance similar to GPT-4 with CoD prompting.
This article dives deeper into the strategies to mitigate hallucinations and biases in large language models. It discusses tips to combat these tendencies of LLMs, such as inference parameter tweaking, prompt engineering, and more advanced techniques to enhance the reliability and accuracy of your LLMs.
Researchers have discovered linear structures in Large Language Models that separate true and false examples, indicating the presence of an internal “truth axis.” Their work provides several strands of evidence that LLM representations may contain a specific “truth direction” denoting factual truth values.
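As a rough illustration of what a “truth direction” means, the sketch below finds a separating direction as the difference between the mean vectors of true and false examples. This is not the researchers' code or necessarily their method: the 8-dimensional vectors are synthetic stand-ins for model activations, and the hidden axis, noise level, and function names are all assumptions.

```python
import random

random.seed(0)
DIM = 8


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


# Synthetic stand-ins for hidden activations (an assumption, not real model data):
# "true" statements are shifted one way along a hidden axis, "false" ones the other way.
hidden_axis = [1.0] + [0.0] * (DIM - 1)


def fake_activation(is_true):
    shift = 1.0 if is_true else -1.0
    return [shift * h + random.gauss(0.0, 0.5) for h in hidden_axis]


true_acts = [fake_activation(True) for _ in range(100)]
false_acts = [fake_activation(False) for _ in range(100)]

# Candidate truth direction: difference of the two class means.
mu_true, mu_false = mean(true_acts), mean(false_acts)
direction = [t - f for t, f in zip(mu_true, mu_false)]

# Decision threshold: projection of the midpoint between the class means.
midpoint = [(t + f) / 2 for t, f in zip(mu_true, mu_false)]
threshold = dot(direction, midpoint)


def predict_true(act):
    """Classify an activation by which side of the threshold its projection falls on."""
    return dot(direction, act) > threshold


correct = sum(predict_true(a) for a in true_acts) + sum(not predict_true(a) for a in false_acts)
accuracy = correct / (len(true_acts) + len(false_acts))
print(f"accuracy on the synthetic examples: {accuracy:.2f}")
```

If real activations were linearly separable in this way, a single projection like `dot(direction, act)` would be enough to tell true statements from false ones, which is the sense in which the representation contains a “truth axis.”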
Papers & Repositories
Prometheus, an open-source LLM, offers a cost-effective alternative to proprietary LLMs like GPT-4 for large-scale task evaluation. Using score rubrics and user-defined instructions, Prometheus demonstrates comparable performance to GPT-4 and outperforms models like ChatGPT, as indicated by experimental results.
Fine-tuning LLMs can compromise their safety alignment and lead to potential risks. Even a few adversarial training examples can jailbreak the safety guardrails of models like GPT-3.5 Turbo. Fine-tuning with harmful and benign datasets can inadvertently degrade the safety alignment of language models.
A study on GPT-4V reveals that we are at the beginning of Large Multimodal Models (LMMs), showcasing its potential in various tasks such as image descriptions, object localization, multimodal knowledge, coding with vision, emotional quotient tests, and applications in industries like medical and auto insurance.
Large language models still have a long way to go in resolving real-world issues on GitHub, according to a recent study. Proprietary models such as Claude 2 and GPT-4 were able to solve only a small percentage of cases in an evaluation framework called SWE-bench.
A study comparing retrieval-augmentation and extended context window approaches in downstream tasks found that using a 4K context window with simple retrieval techniques can achieve similar performance to a 16K window. The best-performing model, retrieval-augmented LLaMA2-70B, with a 32K window, outperformed GPT-3.5-turbo-16k in question-answering and summarization tasks.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Meme of the week!
Meme shared by rucha8062
Featured Community post from the Discord
Hassan_0707 created Oria Windows, allowing users to control their PC with OpenAI. It integrates Open Interpreter and ChatGPT into the Windows environment to offer the power of automation, data manipulation, web scraping, and natural language understanding. Check it out here and support a fellow community member! Share your feedback and questions in the thread.
AI poll of the week!
TAI Curated section
Article of the week
This article summarizes some of the most essential LLM papers published during the first week of October. The papers cover various topics shaping the next generation of language models, from model optimization and scaling to reasoning, benchmarking, and enhancing performance. The final sections discuss papers on training LLMs safely and ensuring their behavior remains beneficial.
Our must-read articles
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact email@example.com.
If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!