TAI 134: The US Reveals Its New Regulations for the Diffusion of Advanced AI
Also, Phi-4 model weights, Mistral Codestral 25.01, Cohere North, rStar-Math, and more!
What happened this week in AI by Louie
The Biden administration’s “Framework for Artificial Intelligence Diffusion” has been unveiled, setting out sweeping rules for managing how advanced AI technologies are exported and utilized globally. The framework creates a three-tier system for AI chip exports and model weights, aiming to balance U.S. technological leadership with international cooperation. The rules triggered immediate pushback from Nvidia and other tech companies while receiving cautious support from some national security experts. The framework was introduced in the final days of the Biden administration, leading to speculation about its longevity and the incoming Trump administration’s potential to alter or repeal it. The 120-day comment period might also allow for significant changes.
At its core, the framework establishes different levels of access to U.S. AI technology. The top tier of 18 close allies maintains essentially unrestricted access. The middle tier, covering most countries, faces caps on computing power imports unless they meet specific security requirements. This group is effectively capped at around 50,000 advanced AI chips through 2027, absent exceptions and agreements. The bottom tier of restricted countries remains largely blocked from access to chips that meet specific criteria.
The rules also control exports of AI model weights above certain thresholds (initially a rather arbitrary threshold of 10²⁶ computational operations or more during training). However, open-weight models remain uncontrolled. The framework requires security standards for hosting powerful AI systems’ weights internationally.
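For a sense of what the 10²⁶-operation threshold means in practice, training compute is commonly approximated with the heuristic of roughly 6 × parameters × training tokens. A minimal sketch under that assumption (the model sizes and token counts below are hypothetical illustrations, not figures from the rule):

```python
# Rough training-compute estimate using the common ~6 * N * D heuristic,
# where N = parameter count and D = training tokens.
# The example configurations below are hypothetical, chosen only to
# illustrate where the framework's 1e26 control line falls.
THRESHOLD_OPS = 1e26  # the framework's initial control threshold

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training operations (forward + backward passes)."""
    return 6 * params * tokens

models = {
    "70B params, 15T tokens": training_flops(70e9, 15e12),
    "500B params, 40T tokens": training_flops(500e9, 40e12),
}

for name, flops in models.items():
    status = "above" if flops >= THRESHOLD_OPS else "below"
    print(f"{name}: {flops:.1e} ops -> {status} the 1e26 threshold")
```

Under this heuristic, most of today's released open-weight models sit below the line, which is consistent with the framework initially targeting only the largest frontier training runs.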
Nvidia responded forcefully against the framework, calling it “unprecedented and misguided” while arguing it would “derail innovation and economic growth worldwide.” The company suggests the rules will push countries toward Chinese alternatives rather than achieve their intended security goals. Other critics warn about diplomatic fallout, noting that many allies find themselves in tier two. However, the administration counters that China lacks the capacity to “backfill” restricted chip exports in the near term. They argue this creates leverage to encourage the adoption of U.S. security standards in exchange for computing access.
Why should you care?
This framework could significantly reshape global AI development, but whether it will do so effectively remains to be seen. The broader question is whether it will achieve its stated goals or instead disrupt innovation, undermine current U.S. tech leadership, and exacerbate fragmentation in the global AI landscape. Will tier-two countries align with U.S. standards, or will they seek alternatives in Chinese or open-source technologies? Can the U.S. bureaucracy implement such a complex system without causing delays and inefficiencies? Will these rules actually enhance national security, or are they merely symbolic gestures that fail to address the real risks of AI? In the near term, could this create a new bottleneck for GPU and LLM token access even within the U.S., as planned global data center capacity in tier-two countries gets delayed?
Given our belief in the capability, utility, and continued rapid pace of growth of these models, the answers to these questions will shape not just the trajectory of U.S. AI leadership but also the global dynamics of technology and power in the years to come.
— Louie Peters — Towards AI Co-founder and CEO
Hottest News
1. Mistral AI Announced Codestral 25.01, an Updated Version of the Codestral Model
Codestral 25.01 features a more efficient architecture and an improved tokenizer compared to the original, generating and completing code roughly twice as fast. The model now leads the coding benchmarks in its weight class and is state-of-the-art for fill-in-the-middle (FIM) use cases across the board.
2. Microsoft Makes Powerful Phi-4 Model Fully Open-Source on Hugging Face
Microsoft has released its Phi-4 model as a fully open-source project with downloadable weights on Hugging Face. Although Phi-4 was actually revealed by Microsoft last month, its usage was initially restricted to Microsoft’s new Azure AI Foundry development platform. Now, Phi-4 is available outside that proprietary service and comes with a permissive MIT License, allowing it to be used for commercial applications. The model has received 60,000 downloads on Hugging Face so far.
3. Nvidia CEO Says His AI Chips Are Improving Faster Than Moore’s Law
Nvidia CEO Jensen Huang claims that the company's AI chips are improving faster than Moore's Law, with its latest chips delivering roughly 30x the inference performance of the previous generation. This pace of improvement could lower AI inference costs as newer chips enhance computing capability.
4. Mark Zuckerberg Gave Meta’s Llama Team the OK To Train on Copyrighted Works, Filing Claims
The plaintiffs in a copyright lawsuit allege that Meta CEO Mark Zuckerberg approved using pirated content from LibGen to train Llama models despite internal concerns. They accuse Meta of stripping copyright data to conceal infringement and illegally torrenting LibGen, aiming to bypass legal methods.
5. NVIDIA Announces Nemotron Model Families To Advance Agentic AI
As mentioned briefly last week, at CES 2025, NVIDIA CEO Jensen Huang launched new Nemotron models, including the Llama Nemotron large language models (LLMs) and Cosmos Nemotron vision language models (VLMs), to improve agentic AI and boost enterprise productivity. The Llama Nemotron models, built on Llama foundation models, allow developers to create AI agents for applications like customer support, fraud detection, and supply chain optimization.
6. Cohere Introduces North: A Secure AI Workspace To Get More Done
Cohere launched the early access program for North, an all-in-one secure AI workspace platform for improving the quality and speed of employees' work. North helps employees across teams and industries offload routine tasks and focus their time where they can add the most value to their company. It combines LLMs, search, and agents into an intuitive platform that integrates AI into daily work.
Five 5-minute reads/videos to keep you learning
1. Building Large Action Models: Insights from Microsoft
Microsoft’s framework for building LAMs represents a significant advancement in AI, enabling a shift from passive language understanding to active real-world engagement. This article explores the framework’s comprehensive approach, which encompasses data collection, model training, agent integration, and rigorous evaluation, and provides a robust foundation for building LAMs.
2. Visualize and Understand GPU Memory in PyTorch
The article explains GPU memory usage in PyTorch, offering tools to visualize memory and optimize usage. It details estimating memory for model parameters, optimizer states, activations, and gradients, considering factors like batch size.
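The back-of-the-envelope accounting the article describes can be sketched in plain Python. This is a simplified estimate assuming fp16 weights and gradients with fp32 Adam optimizer states; real usage also includes activations, which vary with batch size and sequence length:

```python
def training_memory_gib(num_params: float,
                        weight_bytes: int = 2,   # fp16 weights
                        grad_bytes: int = 2,     # fp16 gradients
                        optim_bytes: int = 8):   # Adam: two fp32 states per param
    """Rough lower bound on training memory, excluding activations."""
    total_bytes = num_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 2**30  # convert bytes to GiB

# A hypothetical 7B-parameter model as an illustration:
print(f"~{training_memory_gib(7e9):.0f} GiB before activations")
```

Even before activations, a 7B model trained this way needs on the order of 78 GiB, which is why the memory visualization and optimization techniques the article covers (gradient checkpointing, mixed precision, optimizer sharding) matter in practice.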
3. Best Practices for Building and Deploying Scalable APIs in 2025
Building, deploying, and managing an API is a crucial step for scaling products, connecting systems, and ensuring everything communicates seamlessly. This article explores the world of APIs — what they are, why you might need one, and what deployment options are available.
4. GPT-4o Python Charting Insanity: Prompting For Instant Data Visuals
GPT-4o greatly simplifies the process of creating working Python data visualization code and can display the resulting visualization directly in the chat window. This post shares a step-by-step tutorial on generating visuals from datasets through prompting alone, without writing the code yourself.
5. My 6 Secret Tips for Getting an ML Job in 2025
There are many different ways to get a job in ML. This blog post shares six tips on landing one. It emphasizes questions you should ask yourself, such as how to demonstrate your skill and how to show that you can actually add value to a project.
Repositories & Tools
1. FACTS Leaderboard is a benchmark that measures LLMs’ factual accuracy, addressing hallucinations.
2. Trending Open LLM Models is a directory of 5000+ open-source models.
Top Papers of The Week
1. rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
rStar-Math demonstrates that small language models can surpass OpenAI o1 in math reasoning without distillation from larger models. Implementing deep thinking via Monte Carlo Tree Search and self-evolution, rStar-Math raises Qwen2.5-Math-7B to 90% accuracy on the MATH benchmark. It solves 53.3% of USA Math Olympiad (AIME) problems, placing in the top 20% of students.
2. REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
This paper presents REINFORCE++, an enhanced variant of the REINFORCE algorithm that incorporates key optimization techniques from PPO while eliminating the need for a critic network. This approach offers simplicity, enhanced training stability, and reduced computational overhead.
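The core idea of dropping the critic can be sketched as a plain policy-gradient loss with a batch-normalized reward baseline standing in for a learned value network. This is a minimal illustration under that assumption, not the paper's full method, which also adds PPO-style clipping and token-level KL penalties:

```python
import math

def reinforce_loss(logps, rewards):
    """REINFORCE with a whitened-baseline advantage: no critic network.

    logps[i]   -- summed log-probability of sampled response i
    rewards[i] -- scalar reward for response i
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(var) + 1e-8  # epsilon avoids division by zero
    # Normalizing rewards within the batch replaces the critic's baseline.
    advantages = [(r - mean) / std for r in rewards]
    # Maximizing expected reward = minimizing -logp * advantage.
    return -sum(lp * a for lp, a in zip(logps, advantages)) / n

loss = reinforce_loss(logps=[-3.2, -1.5, -2.8], rewards=[0.0, 1.0, 0.5])
```

Because the baseline is computed from the batch itself, there is no second network to train, which is where the reduced computational overhead comes from.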
3. VideoRAG: Retrieval-Augmented Generation over Video Corpus
VideoRAG dynamically retrieves relevant videos based on query relevance and utilizes visual and textual information in output generation. It leverages large video language models to process video content for retrieval and integrates these videos with queries.
4. Dive into Time-Series Anomaly Detection: A Decade Review
In recent years, advances in machine learning have led to diverse time-series anomaly detection methods vital for fields such as cybersecurity and healthcare. This survey categorizes these methods under a process-centric taxonomy. It performs a meta-analysis to identify trends in research, highlighting the shift from statistical measures to machine learning algorithms for detecting anomalies.
5. TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
This paper presents TabPFN, a trained Transformer that can perform supervised classification for small tabular datasets in less than a second. It does not need hyperparameter tuning and is competitive with state-of-the-art classification methods.
Quick Links
1. LlamaIndex introduces Agentic Document Workflows, an architecture that combines document processing, retrieval, structured outputs, and agentic orchestration to automate end-to-end knowledge work.
2. New OpenAI job listings reveal the company’s robotics plans. In a post on X, Caitlin Kalinowski said that OpenAI’s robotics team will focus on “general-purpose,” “adaptive,” and “versatile” robots that can operate with human-like intelligence in “dynamic,” “real-world” settings.
3. Researchers open-source Sky-T1, a ‘reasoning’ AI model that can be trained for less than $450. Sky-T1 appears to be the first truly open-source reasoning model because it can be replicated from scratch; the team released the dataset they used to train it and the necessary training code.
Who’s Hiring in AI
2025 Summer Internship, Tech Research Scientist — PhD @Spotify (London, UK)
AI / Large Language Model Architect @Accenture (Multiple US Locations)
Python Developer @InvestorFlow (Remote)
Data Visualization Analyst @RYZ Labs (Remote within Argentina & Uruguay)
AI-Driven Quality Engineering Specialist (PCD-PWD) @CloudWalk (Remote)
Software Engineer III — Backend @Deputy (Melbourne, FL, USA)
Junior Backend Engineer — Search @BlaBlaCar (Paris or Remote from France)
Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.
Think a friend would enjoy this too? Share the newsletter and let them join the conversation.