Nemotron 3 Ultra and Odysseus: The Biggest Week for Open-Source AI

NVIDIA launched Nemotron 3 Ultra, an open model with 550B parameters optimized for autonomous agents. PewDiePie launched Odysseus, a self-hosted AI workspace that exploded on GitHub. Two sides of the same movement toward open and local AI.

The first week of June 2026 brought two announcements that, seen together, tell a clear story: open and locally runnable AI is no longer a promise — it’s a tangible reality.

On June 4, NVIDIA unveiled Nemotron 3 Ultra at GTC San Jose 2026. It is their largest open model to date: 550 billion total parameters, 55 billion active per token, a hybrid Mamba-Attention MoE architecture, and up to 1 million tokens of context. And it’s not a generic chat model — NVIDIA explicitly designed it for long-running autonomous agents.

Three days earlier, on May 31, Felix Kjellberg (PewDiePie) launched Odysseus, a self-hosted AI workspace under the MIT license. Within days it amassed over 52,000 stars on GitHub and 6,100 forks. It’s a complete platform: multimodal chat, agents with tools, deep research, document editor, calendar, email, and more — all running on the user’s hardware via Docker or native installation.

They are two different worlds — an enterprise research lab and an individual content creator — converging in the same direction.

Nemotron 3 Ultra: A Model Built for Agents

Nemotron 3 Ultra is not simply another large open-source model. Its LatentMoE architecture combines Mamba-2 layers (state space models) with traditional attention and mixture of experts, achieving 90% sparse active — only 55B of the 550B parameters are activated per token. This gives it 5.9x higher throughput than GLM-5.1-754B-A40B and 1.6x higher than Qwen-3.5-397B-17B on long-context benchmarks.

The model is available on HuggingFace under the OpenMDW-1.1 license, with several checkpoints: BF16 (post-trained), NVFP4 (quantized), Base BF16, and GenRM (reward model for response judgment). It can be run via NVIDIA NIM, vLLM (day-1 support), SGLang, Ollama, Together AI, and AWS SageMaker JumpStart.

The key takeaway is right in NVIDIA’s official blog title: “Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents.” This is not a model for quick chat responses. It is a model for agents that need to maintain context, plan, execute tools, and reason over minutes or hours.

The hardware requirements are steep: minimum 8 GB200/B200/GB300/B300 GPUs or 16 H100/8 H200. This is not a laptop model. But support in Ollama and vLLM suggests smaller or quantized versions will soon reach the local ecosystem.

Odysseus: The AI Workspace That Exploded on GitHub

Odysseus is a project that defies easy categorization. YouTube videos call it “an AI agent,” but it’s more accurately described as a self-hosted AI workspace that includes an agent system among many other features.

The interface offers multimodal chat (vLLM, llama.cpp, Ollama, OpenRouter, OpenAI, GitHub Copilot), an agent system with tool-calling (MCP, web, files, shell, skills, memory via ChromaDB), a “cookbook” that scans hardware and recommends models based on available VRAM, deep research with web sources, blind model comparison, document editor, email triage, notes and tasks, CalDAV-synchronized calendar, and PWA support for mobile.

Everything runs locally. Docker is the recommended method, but it also works with native installation on Linux and macOS (including Apple Silicon).

Adoption was immediate: 52,000+ stars in five days makes it one of the fastest open-source launches of 2026. The repository lives under the pewdiepie-archdaemon organization on GitHub, though the degree of Kjellberg’s personal contribution versus the community’s is not fully documented. The project builds on existing open-source code (opencode, llmfit, Tongyi DeepResearch), which is consistent with the project’s philosophy.

What It All Means

The pattern is unmistakable. In the same five-day span:

A research lab with decades of history releases its most powerful model as open-weight, optimized for autonomous agents.
A content creator with 120 million subscribers launches a local AI workspace that functionally competes with ChatGPT and Claude.

Both are betting on the same thing: open models, local execution, agents with tools, and an ecosystem where the user has control over their data and infrastructure.

For developers, the message is practical: the pieces to build your own self-hosted AI environment already exist. Nemotron 3 Ultra gives you the model. Odysseus gives you the interface and agents. And it’s all open-source, runnable without relying on external APIs.

The next question is not “whether” local AI is viable, but “how far can it go.”

Primary sources: NVIDIA Nemotron 3 Ultra · Odysseus GitHub · NVIDIA Developer Blog

Nemotron 3 Ultra and Odysseus: The Biggest Week for Open-Source AI

Nemotron 3 Ultra: A Model Built for Agents

Odysseus: The AI Workspace That Exploded on GitHub

What It All Means

More in this category