This AI newsletter is all you need #83
What happened this week in AI by Louie
This week, Meta’s AI strategy was in focus, with Mark Zuckerberg boasting of Meta’s GPU hoard and outlining his open-source-focused AI vision.
Meta CEO Mark Zuckerberg is entering the race to build AGI. As he explains, Meta’s broader focus on AGI grew out of last year’s release of Llama 2, its latest large language model. While he offers no timeline for when AGI will be reached, or even an exact definition of it, Zuckerberg argues that building the products Meta wants to build requires building for general intelligence; it also helps attract the AI talent that wants to work on the most ambitious problems. In his vision, he also highlights the need to lean towards open-source and more transparent models for as long as doing so makes sense and is the safe and responsible thing to do.
At the same time, Meta is shaking things up by moving its AI research group, FAIR, into the same part of the company as the team building generative AI products across Meta’s apps. It has also been building out its infrastructure to support that push and plans to have about 600,000 H100-equivalent GPUs from chip designer Nvidia by the end of 2024.
In nearer-term Meta news, we were also interested in two new research releases: MAGNET, a text-to-audio method, and Mosaic-SDF, a representation for 3D generative models. MAGNET can quickly generate music, sound effects, and noises from text prompts, with sound quality on par with other SOTA models but faster generation. Mosaic-SDF (M-SDF), meanwhile, is a simple, novel 3D shape representation for generative models: it is fast to compute, parameter-efficient, and compatible with Transformer-based architectures.
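For readers curious what a representation like Mosaic-SDF looks like in practice, here is a toy sketch of the core idea: a shape encoded as a set of small local grids of signed-distance values anchored near the surface. Everything below (the sphere SDF, grid size, anchor points) is an illustrative stand-in, not Meta’s implementation.

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points, axis=-1) - radius

def mosaic_sdf(sdf, anchor_points, grid_size=7, scale=0.25):
    """Encode a shape as a set of small local SDF grids ('mosaic tiles').

    Each tile stores (center, scale, grid of SDF samples), mirroring the
    paper's set-of-local-grids idea at toy scale.
    """
    axis = np.linspace(-0.5, 0.5, grid_size)
    offsets = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    tiles = []
    for center in anchor_points:
        values = sdf(center + scale * offsets)  # (g, g, g) SDF samples
        tiles.append({"center": center, "scale": scale, "values": values})
    return tiles

# Anchor a few tiles on the sphere's surface.
anchors = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]])
shape = mosaic_sdf(sphere_sdf, anchors)
print(len(shape), shape[0]["values"].shape)  # 3 tiles of 7x7x7 samples
```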
Why should you care?
In the broader context, it has been only two years since Mark Zuckerberg renamed the company to focus on the metaverse. Meta’s latest smart glasses are showing early traction, but mass adoption of full-fledged AR glasses still feels further out. Zuckerberg has pushed back on the suggestion that he is now pivoting to AI, stating that generative AI will play a critical role in Reality Labs and the metaverse efforts moving forward. Whether Meta can utilize its heavy GPU investment efficiently and truly compete with GPT-4 at the cutting edge of AI and LLMs remains to be seen. But we think it is important to see diversity in AI strategy, and that one of the large tech companies still claims to embrace open source.
- Louie Peters — Towards AI Co-founder and CEO
Hottest News
1. AlphaGeometry: An Olympiad-Level AI System for Geometry
AlphaGeometry, an AI system developed by DeepMind, has demonstrated Olympiad-level proficiency in geometry by solving 25 out of 30 problems within competition time limits. Its hybrid approach combines pattern recognition with formal logic, emulating human problem-solving by pairing intuitive and analytical thinking.
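For intuition, here is a hedged sketch of the neuro-symbolic loop DeepMind describes: a symbolic engine deduces everything it can from the current facts, and when the goal is still out of reach, a language model proposes an auxiliary construction (e.g., a new point) to unlock further deductions. Both helpers below are hypothetical placeholders, not DeepMind’s code.

```python
def solve_geometry(premises, goal, lm_propose, symbolic_deduce, max_rounds=8):
    """Alternate symbolic deduction with LM-proposed constructions.

    symbolic_deduce(facts) -> closure of facts derivable by formal rules
    lm_propose(facts)      -> one candidate auxiliary construction
    Both are hypothetical stand-ins for AlphaGeometry's components.
    """
    facts = set(premises)
    for _ in range(max_rounds):
        facts = symbolic_deduce(facts)   # analytical step: formal logic
        if goal in facts:
            return facts                 # proof found
        facts.add(lm_propose(facts))     # intuitive step: new construction
    return None                          # gave up within the round budget
```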
2. Mark Zuckerberg’s New Goal Is Creating AGI
Mark Zuckerberg, Meta’s CEO, is entering the race to develop artificial general intelligence (AGI), aiming to enhance Meta’s apps and user experience. He emphasizes the need for AI talent and computing power and is considering an open-source approach to AI development, contrasting with the more closed methods of other companies.
3. Altman Seeks to Raise Billions for Network of AI Chip Factories
OpenAI CEO Sam Altman has been working to raise billions of dollars from global investors for a chip venture, aiming to use the funds to set up a network of semiconductor fabrication plants. Altman wants to partner with top chipmakers to set up manufacturing plants worldwide and help meet surging demand for computing power.
4. Codium AI Proposes AlphaCodium: A New Advanced Approach To Code Generation
Researchers from CodiumAI have released AlphaCodium, a new open-source AI code-generation tool. It implements a code-oriented flow that works with any LLM pre-trained for coding tasks, using an iterative process that repeatedly runs the generated code against test data and repairs it based on the failures.
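In outline, that flow reduces to a generate-test-repair loop. Below is a minimal sketch under that reading; `llm` is a hypothetical completion function, and AlphaCodium’s real flow adds further stages (problem reflection, test generation, solution ranking) around this core.

```python
import subprocess, sys, tempfile

def generate_with_repair(llm, problem, tests, max_iters=5):
    """Run candidate code against public tests and feed failures back.

    `llm(prompt) -> str` is a hypothetical completion function standing in
    for any code-pretrained model; `tests` is a list of
    (stdin, expected_stdout) pairs, as in competitive programming.
    """
    code = llm(f"Write a Python solution reading stdin.\n\nProblem:\n{problem}")
    for _ in range(max_iters):
        failures = []
        for stdin, expected in tests:
            with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
                f.write(code)
                path = f.name
            run = subprocess.run([sys.executable, path], input=stdin,
                                 capture_output=True, text=True, timeout=10)
            if run.stdout.strip() != expected.strip():
                failures.append((stdin, expected, run.stdout, run.stderr))
        if not failures:
            return code  # all public tests pass
        code = llm(f"Fix this code.\n\nCode:\n{code}\n\nFailed tests:\n{failures}")
    return code
```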
5. Lazy Use of AI Leads to Amazon Products Called “I Cannot Fulfill That Request.”
E-commerce platforms, including Amazon, are experiencing issues with AI-generated content, leading to product listings with erroneous titles like “I cannot fulfill that request.” The AI’s mistakes in product description generation indicate broader challenges in online listing management.
Five 5-minute reads/videos to keep you learning
1. RAG vs Finetuning — Which Is the Best Tool to Boost Your LLM Application?
RAG (Retrieval-Augmented Generation) and fine-tuning are two ways to adapt LLMs to task-specific requirements. This article provides a comparative analysis of the two approaches, suggests key considerations for choosing between them, and shares use cases where each method is the better fit.
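As a quick reminder of what separates the two: fine-tuning updates the model’s weights on task data, while RAG leaves the weights untouched and injects retrieved context into the prompt at inference time. A minimal RAG sketch, with hypothetical `embed` and `llm` functions standing in for whichever embedding model and LLM you use:

```python
import numpy as np

def retrieve(query, docs, embed, k=3):
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    doc_vecs = [embed(d) for d in docs]
    scores = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in doc_vecs]
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

def rag_answer(query, docs, embed, llm):
    """RAG: augment the prompt with retrieved context; weights stay frozen."""
    context = "\n\n".join(retrieve(query, docs, embed))
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
```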
2. Preference Tuning LLMs with Direct Preference Optimization Methods
This post evaluates three promising LLM alignment algorithms: Direct Preference Optimization (DPO), Identity Preference Optimisation (IPO), and Kahneman-Tversky Optimisation (KTO). The experiments show that one algorithm outshines the others and that key hyperparameters must be carefully tuned to achieve the best results.
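For reference, here is the DPO objective from the original paper as a PyTorch function, computed from summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model; IPO and KTO swap in their own objectives:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO: -log sigmoid(beta * (policy log-ratio - reference log-ratio)).

    Each argument is the summed log-probability of a full response; beta
    controls how far the policy may drift from the reference model.
    """
    policy_logratio = policy_chosen_logps - policy_rejected_logps
    ref_logratio = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_logratio - ref_logratio)).mean()
```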
3. Zeroing In on the Origins of Bias in Large Language Models
In a recent paper, Deciphering Stereotypes in Pre-Trained Language Models, co-authors Weicheng Ma and Soroush Vosoughi look at how stereotypes are encoded in pretrained large language models. This article is a concise introduction to the paper.
4. Assessing AI Honesty With “Honesty Vectors”
AI reliability is a concern, particularly regarding accuracy and potential dishonesty in responses. This article explains a recent paper that introduces “honesty vectors” to assess and improve AI transparency, addressing the challenge of securing long-term AI safety and dependability.
5. Evaluations Are All We Need
The article explores the challenges of evaluating human and AI capabilities, particularly in recruitment and in using LLMs. It addresses the limited effectiveness of current assessment methods for humans and highlights how immature and challenging intelligence evaluation remains in AI.
Repositories & Tools
1. Artificial Analysis analyzes and ranks the best AI models and hosting providers.
2. Open Interpreter is a natural language interface for computers.
3. Friends Don’t Let Friends is a repository of opinionated essays suggesting good and bad practices in data visualization.
4. Lume automates data mappings to help you create data pipelines faster.
5. Camp 2.0 allows you to understand and organize your screenshots.
Top Papers of The Week!
1. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Vision Mamba (Vim) is a new vision backbone that replaces self-attention with bidirectional Mamba (state space) blocks and uses position embeddings to retain spatial information. Vim has demonstrated superior performance on standard benchmarks like ImageNet, COCO, and ADE20K, surpassing existing models such as Vision Transformers (DeiT).
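As rough intuition for the bidirectional design (and not Vim’s actual selective-scan kernels), here is a toy linear state-space recurrence run over a patch sequence in both directions, with the two outputs summed:

```python
import torch

def ssm_scan(x, A, B, C):
    """Toy linear SSM: h_t = A h_{t-1} + B x_t, y_t = C h_t."""
    h = torch.zeros(A.shape[0])
    ys = []
    for x_t in x:                       # x: (seq_len, d_in)
        h = A @ h + B @ x_t
        ys.append(C @ h)
    return torch.stack(ys)

def bidirectional_ssm(x, A, B, C):
    """Vim-style intuition: scan the token sequence forward and backward."""
    fwd = ssm_scan(x, A, B, C)
    bwd = ssm_scan(torch.flip(x, dims=[0]), A, B, C)
    return fwd + torch.flip(bwd, dims=[0])

x = torch.randn(16, 4)                  # 16 patch tokens, 4-dim each
A, B, C = torch.eye(8) * 0.9, torch.randn(8, 4), torch.randn(4, 8)
print(bidirectional_ssm(x, A, B, C).shape)  # torch.Size([16, 4])
```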
2. Self-Rewarding Language Models
Researchers have explored the concept of Self-Rewarding Language Models, where language models generate their own rewards during training. This concept posits that surpassing human-level performance necessitates training signals derived from superhuman feedback. The approach led to significant improvements in instruction-following and self-rewarding capabilities.
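In outline, the model acts as both generator and judge, and the resulting preference pairs feed an iterative DPO step. A hedged sketch with hypothetical `generate` and `judge_score` helpers:

```python
def build_preference_pairs(model, prompts, generate, judge_score, n_candidates=4):
    """Self-rewarding data construction: the model scores its own samples.

    generate(model, prompt) -> str and judge_score(model, prompt, response)
    -> float are hypothetical stand-ins; the paper uses an LLM-as-a-judge
    prompt asking the model to rate responses against a scoring rubric.
    """
    pairs = []
    for prompt in prompts:
        candidates = [generate(model, prompt) for _ in range(n_candidates)]
        ranked = sorted(candidates, key=lambda r: judge_score(model, prompt, r))
        pairs.append({"prompt": prompt,
                      "chosen": ranked[-1],    # highest self-assigned reward
                      "rejected": ranked[0]})  # lowest self-assigned reward
    return pairs  # feed into a DPO step, then repeat with the updated model
```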
3. Transformers are Multi-State RNNs
New research draws a conceptual bridge between Transformers and RNNs: decoder-only Transformers can be viewed as multi-state RNNs with an unbounded number of hidden states, and capping the key-value cache turns them into finite multi-state RNNs with a fixed state budget.
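The framing is concrete: during decoding, the growing key-value cache is the Transformer’s “multi-state,” and capping it at a fixed budget yields a finite multi-state RNN (the paper proposes an attention-based eviction policy called TOVA). A toy illustration that simply drops the oldest states instead:

```python
import torch

def decode_step(q, k_cache, v_cache, k_new, v_new, max_states=None):
    """One attention step where the KV cache plays the role of RNN state."""
    k_cache = torch.cat([k_cache, k_new], dim=0)
    v_cache = torch.cat([v_cache, v_new], dim=0)
    if max_states is not None and k_cache.shape[0] > max_states:
        # Finite multi-state RNN: evict down to a fixed budget. The paper's
        # TOVA drops the least-attended state; here we drop the oldest.
        k_cache, v_cache = k_cache[-max_states:], v_cache[-max_states:]
    attn = torch.softmax(q @ k_cache.T / k_cache.shape[-1] ** 0.5, dim=-1)
    return attn @ v_cache, k_cache, v_cache
```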
4. GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
GPT-4V offers an innovative evaluation methodology for text-to-3D generative models by automating benchmarks that align with human judgment, thereby addressing the need for robust evaluation metrics in the field. This system simulates detailed user assessments through tailored prompts, allowing cost-effective and scalable comparison of 3D assets against diverse and user-specific standards.
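A minimal sketch of the pairwise-comparison idea using the OpenAI Python client; the model name and prompt wording below are illustrative, and the paper’s prompts are considerably more elaborate, covering several quality criteria:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def compare_3d_renders(prompt, render_url_a, render_url_b):
    """Ask GPT-4V which rendered 3D asset better matches the text prompt."""
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text":
                 f'Which render better matches "{prompt}"? Answer A or B.'},
                {"type": "image_url", "image_url": {"url": render_url_a}},
                {"type": "image_url", "image_url": {"url": render_url_b}},
            ],
        }],
        max_tokens=5,
    )
    return response.choices[0].message.content  # aggregate votes into rankings
```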
5. Scalable Pre-training of Large Autoregressive Image Models
Apple has released a research paper proposing AIM, a collection of vision models pre-trained with an autoregressive objective. These models have demonstrated that their performance improves with increased model size and data volume.
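The objective is the image analogue of next-token prediction: regress each patch from the ones before it. A toy version of the loss (AIM’s actual recipe adds details such as prefix attention and patch normalization, which we gloss over here):

```python
import torch.nn as nn

def next_patch_loss(model, patches):
    """Autoregressive image modeling: regress patch t+1 from patches <= t.

    patches: (batch, seq_len, patch_dim); `model` is any causal sequence
    model mapping that shape to same-shape predictions.
    """
    inputs, targets = patches[:, :-1], patches[:, 1:]
    preds = model(inputs)
    return nn.functional.mse_loss(preds, targets)
```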
Quick Links
1. OpenAI partners with Arizona State University to use ChatGPT in classrooms. The focus will be on enhancing student success, forging new avenues for research, and streamlining organizational processes.
2. Google DeepMind scientists are reportedly in talks to leave and form an AI startup in Paris. The report added that the startup may focus on building a new AI model.
3. Microsoft has unveiled Copilot Pro, a premium productivity-enhancing tool for Microsoft 365 apps, priced at $20 per user/month. It offers priority access to advanced AI, including GPT-4 Turbo, for expedited responses.
4. TikTok is testing an AI song-generation feature. AI Song generates songs from text prompts with help from the large language model Bloom.
Who’s Hiring in AI!
AI Technical Writer and Developer for Large Language Models @Towards AI Inc (Remote)
Applied Scientist, AWS Responsible AI @Amazon (Seattle, WA, USA)
Big Data Engineer @Plain Concepts (Remote)
Senior Technical Sourcer, Machine Learning @Hugging Face (US/Remote)
ML Engineering Team Lead @NeuReality (Remote)
Data Scientist (LLM) @CloudWalk (Remote)
Software Engineer + Product Manager, AI @Hightouch (Remote)
Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.
If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, Confetti!