This AI newsletter is all you need #73
What happened this week in AI by Louie
Conversation this week was again dominated by the aftermath of OpenAI’s DevDay, new product releases, and speculation on the future potential of the GPT Store, with over 10,000 GPTs created already. But we have also been interested to see several new studies released this week on the state of AI and its adoption in the economy.
A recent study examined the impact of the ChatGPT launch on job numbers and earnings in freelance fields such as copywriting and graphic design. It found that ChatGPT not only significantly reduced the number of jobs available but also depressed earnings for the work that remained. A separate study by the Boston Consulting Group (BCG) indicates that employees with access to GPT-4 completed 12% more tasks, 25% faster, with a 40% improvement in quality. The study highlighted the largest performance boost among junior team members, who could leverage GPT-4’s knowledge to complement their skills.
Additionally, the “State of AI 2023” survey from Retool.com, with close to 1,600 participants, offers interesting data points on sentiment toward AI, its adoption, and the current market leaders. (We’ve highlighted some intriguing findings in this summary and strongly encourage you to read the full report.) Most respondents shared a common belief regarding the anticipated impact of AI on their careers within the next five years. There is also a noticeable preference for hiring engineers proficient with AI tools such as ChatGPT and Copilot. Interestingly, 80% of participants use some iteration of ChatGPT (including GPT-3, 3.5, and 4). Meanwhile, the primary concerns revolve around model accuracy and hallucination, with 67% of respondents expressing worry. Among developer tools, Hugging Face, LangChain, and LlamaIndex currently dominate the market. Finally, respondents rated GitHub Copilot, ChatGPT, and Google Bard as the most valuable tools.
Why should you care?
It is easy to get caught up in the incredible pace of new AI model releases and capability improvements. Still, it is sometimes difficult to judge how these tools are being adopted more broadly and whether they are beginning to impact the economy. So, we think it is essential to see detailed studies on AI adoption across industries so we can start to plan for both the positive and negative impacts of this technology. Clearly, in some areas, LLM adoption is already significantly impacting employees, both negatively (wage reduction) and positively (productivity and quality improvement). But in other ways, adoption is still very early, and companies are only beginning to adapt to the new capabilities. Perhaps OpenAI’s latest GPT product and better UI for sharing prompts and ideas for LLM use cases will further accelerate these trends.
- Louie Peters — Towards AI Co-founder and CEO
NVIDIA has significantly enhanced the Pandas library, achieving up to 150 times faster performance by capitalizing on GPUs. With the new cudf.pandas module, operations are seamlessly executed on the GPU or CPU, providing automatic synchronization and efficient switching between the two.
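The appeal of cudf.pandas is that existing pandas code does not change: you enable the accelerator at startup (NVIDIA documents `%load_ext cudf.pandas` in Jupyter, or `python -m cudf.pandas script.py` from the command line), and supported operations run on the GPU while unsupported ones fall back to the CPU. A minimal sketch — the script body below is ordinary pandas, with the GPU/CPU dispatch happening transparently when the extension is loaded:

```python
# Ordinary pandas code. When run via `python -m cudf.pandas script.py`
# (or after `%load_ext cudf.pandas` in Jupyter), supported operations
# execute on the GPU, with automatic fallback to the CPU otherwise.
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
out = df.groupby("key")["val"].sum()
print(out)
```

Run without the extension, this is just pandas on the CPU; the zero-code-change design is what makes the claimed speedups easy to try on an existing workload.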
In 2023, AI research and industry focused on improving existing technologies like GPT and DALL-E rather than making radical innovations. Companies became more protective of their proprietary information, resulting in less public disclosure of research papers. However, there were productive advancements in open source, with releases like Fuyu-8B pushing toward smaller and more efficient models. AI has proved helpful in various fields, but ethical concerns and pitfalls must be addressed in the future.
GitHub is implementing AI technology through Copilot and Copilot Chat, aiming to revolutionize software development by providing code understanding, suggestions, security fixes, and an enhanced developer experience. Copilot Chat will be powered by OpenAI’s GPT-4 model and is set to be available starting from December 2023.
OpenAI has announced an effort called Data Partnerships, aiming to collaborate with third-party organizations to construct public and private datasets for AI model training. This effort intends to make models more practical for various organizations. The program’s primary focus is to gather extensive data not readily available on the internet, particularly emphasizing data reflecting human intention in multiple languages, topics, and formats.
Adept is opening access to Adept Experiments, an AI-powered workflow builder that enables users to automate complex or tedious tasks across various software platforms with simple language commands. It helps delegate repetitive knowledge tasks, turn unstructured data into structured data, and even order dinner.
This week in AI focused on refining existing tech, data, and compute. What’s your view: Build new or refine what’s here? Share it in the comments!
Five 5-minute reads/videos to keep you learning
This essay is an excellent exploration of the artificial intelligence landscape and its trajectory. The article maps key events that have shaped AI into its current state, with influential companies like Google, IBM, and OpenAI playing pivotal roles in accelerating innovation. It also touches on the future of responsible AI, consumer AI, and more.
The latest version of GPT, GPT-4, has introduced image analysis capabilities, including chart images. While it can provide a general analysis of chart images, there is room for significant improvement, particularly in accurately quantifying the data. This article uses a few chart types to determine how good (or bad) GPT-4 is at chart image analysis.
The Hallucination Evaluation Model (HEM) is an open-source tool developed to measure the frequency of hallucinations in Retrieval Augmented Generation (RAG) systems. It assesses AI’s dependability by evaluating generative LLMs’ ability to accurately summarize results without producing unrelated or biased outputs.
DALL·E-3 is an upgraded version of DALL-E text-to-image models, showcasing superior image quality across various domains. Some notable features include prompt rewriting using GPT-4 for enhanced results, adjustable image quality parameters, and flexible image sizes. This article focuses on the new features and capabilities of DALL·E-3 with some examples of what new products can be built with the API.
This guide covers link-time optimization-related features focusing on common toolchains for languages like C and Rust, which typically use ahead-of-time (AOT) compilation and linking.
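As one illustration of the toolchain knobs such a guide covers, Rust’s Cargo exposes link-time optimization directly in `Cargo.toml` (a config sketch using Cargo’s documented profile options):

```toml
# Cargo.toml — enable cross-crate LTO for release builds.
[profile.release]
lto = "thin"      # "thin" LTO links faster; use true (i.e. "fat") for full LTO
codegen-units = 1 # fewer codegen units gives the optimizer more to work with
```

In C toolchains the equivalent is passing `-flto` to both the compile and link steps; the guide goes into when each mode is worth the extra link time.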
Repositories & Tools
1. XTTS v2
XTTS is a text-to-speech model that lets users clone voices into different languages. It supports 16 languages and is the same model that powers Coqui’s creator application, Coqui Studio, and the Coqui API.
2. Giskard

Giskard is a Python library that automatically detects vulnerabilities in AI models, from tabular models to LLMs, including performance biases, data leakage, spurious correlations, hallucination, and more.
3. Monaspace

The Monaspace type system is a monospaced type superfamily of fonts for code. It consists of five variable-axis typefaces; each one has a distinct voice, but they are all metrics-compatible with one another, allowing users to mix and match them for a more expressive typographical palette.
4. MindStudio

MindStudio allows users to build custom no-code AI apps using any model and prompting. It allows users to train the AI on external data and deploy their AI apps publicly or privately.
5. Graphlit

Graphlit is an API-first developer platform for building applications with LLMs. Using the RAG pattern, Graphlit leverages the power of LLMs like OpenAI’s GPT-3.5 and GPT-4 to transform complex data into a searchable, conversational knowledge graph.
Top Papers of The Week!
This paper presents OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs. It can accept images in their native resolution, making it possible for the model to pick up on minute details. On MagnifierBench, OtterHD-8B outperforms other LMMs of similar parameter size, such as InstructBLIP, LLaVA, and Qwen-VL.
This paper compared pretrained models for computer vision tasks and found ConvNeXT, a ConvNet inspired by Vision Transformers, performs best across different tasks. While vision transformers and self-supervised learning are popular, supervised pretrained convolutional neural networks still offer superior performance in most cases.
This work proposes TEAL (Tokenize and Embed ALL), a system that simplifies modeling interactions among multi-modal inputs and generates non-textual modalities. It treats input from any modality as a token sequence and learns a joint embedding space for all modalities. This allows multi-modal large language models to predict multi-modal tokens more effectively, enabling tasks with non-textual modalities like images and audio.
DeepMind has introduced a “Levels of AGI” framework that categorizes artificial intelligence into ‘narrow’ and ‘general’ intelligence. The framework outlines five levels of AI performance, from emerging to superhuman, based on their ability to learn, reason, and apply knowledge. This framework could be helpful in an analogous way to the levels of autonomous driving by providing a common language to compare models, assess risks, and measure progress along the path to AGI.
This paper introduces JARVIS-1, an open-world agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control, all within the popular yet challenging open-world Minecraft universe. In experiments, JARVIS-1 exhibits nearly perfect performance across over 200 tasks ranging from entry-level to intermediate.
Google is in talks to invest hundreds of millions of dollars in Character.AI as the fast-growing artificial intelligence chatbot startup seeks capital to train models and keep up with user demand.
IBM announced it is launching a $500 million venture fund to invest in a range of AI companies — from early-stage to hyper-growth startups — focused on accelerating generative AI technology and research for the enterprise.
Two former Coca-Cola vice presidents have joined forces to bring to market an artificial intelligence (AI) system that is already helping several high-profile Fortune 500 companies hone their sustainability strategies.
OpenAI’s New Weapon in Talent War With Google: $10 Million Pay Packages for Researchers.
The California-headquartered company Iterate launched AppCoder LLM — a fine-tuned model that can instantly generate working and updated code for production-ready AI applications using natural language prompts.
Who’s Hiring in AI!
Interested in sharing a job opportunity here? Contact email@example.com.
If you are preparing for your next machine learning interview, don’t hesitate to check out our leading interview preparation website, Confetti!