Microsoft ASI references

Section I: Agentic Logic & Orchestration

SUTRADHARA: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference (ASPLOS 2026)
- Note: Co-designs the agent orchestrator and the LLM serving engine, which otherwise treat each other as black boxes, to reduce latency in multi-step, tool-based agent tasks.
Sequential Diagnosis with Language Models (arXiv 2025)
- Note: Introduces MAI-DxO, a model-agnostic system that simulates a panel of physician-agents to iteratively solve complex medical cases.
LLM-42: Enabling Determinism in LLM Inference with Verified Speculation (Preprint 2026)
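- Note: Makes LLM inference deterministic (the same input yields the same output) by verifying speculatively generated tokens. This targets reproducibility of agent runs, which is a separate problem from hallucination.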
Tell Me When: Building Agents that can Wait, Monitor, and Act (MSR Blog 2025)
- Note: Develops behavioral protocols for agents to function as long-running monitors rather than just reactive chat tools.
QoServe: Breaking the Silos of LLM Inference Serving (2026)
- Note: A QoS-driven serving system that co-schedules workloads with different latency requirements on shared infrastructure instead of separate, siloed deployments.
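
Sketch for the "Tell Me When" entry above: a minimal illustration of an agent that waits, monitors, and then acts. It is a generic polling loop written for this list, not the protocol from the blog post; the names monitor, check, act, max_polls, and interval_s are hypothetical.

```python
import time
from typing import Callable, Optional


def monitor(
    check: Callable[[], bool],
    act: Callable[[], str],
    max_polls: int,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll `check` until it returns True, then run `act` once.

    Returns the result of `act`, or None if the condition never held
    within `max_polls` polls. All names here are illustrative assumptions.
    """
    for attempt in range(max_polls):
        if check():
            return act()
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next poll
    return None
```

The sleep function is injectable so that the loop can be exercised without real delays. A production monitor would likely also adapt the polling interval and persist state between polls.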

Section II: Science & Medicine

EvoDiff: Controllable Protein Generation in Sequence Space (MSR 2025)
- Note: A general-purpose diffusion framework that generates proteins directly in sequence space, without requiring 3D structural data.
Self-adaptive Reasoning for Science (MSR Blog 2025)
- Note: Details the "AI Co-Scientist" framework where agents generate and self-correct hypotheses in physics and chemistry.
MatterGen: A Generative Model for Materials Discovery (2025)
- Note: Accelerates the search for new clean-energy materials by generating stable crystal structures with desired properties.
AURAD: Anatomy–Pathology Unified Radiology Synthesis (2026)
- Note: A framework for synthesizing high-fidelity radiology data to train medical agents in data-scarce clinical environments.
MIRA: A Medical Time-Series Foundation Model (2026)
- Note: Decodes complex temporal dynamics in patient data (EHR, sensors) to predict disease mechanisms.

Section III: Inference Infrastructure

vAttention: Dynamic Memory Management for Serving LLMs (ASPLOS 2025)
- Note: Keeps the KV cache contiguous in virtual memory while allocating physical memory on demand, which avoids fragmentation and supports the long context windows that agentic memory requires.
ModServe: Modality- and Stage-Aware Resource Disaggregation (2025)
- Note: Disaggregates hardware resources by modality and inference stage for multimodal models (vision + text) to prevent compute bottlenecks.
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval (NeurIPS 2025)
- Note: Uses vector search to attend to only the 1–3% of cached data relevant to a given query, enabling 128K-token contexts on consumer GPUs.
The Analog Optical Computer: Scaling AI at the Speed of Light (MSR 2026)
- Note: Uses light-based chips to handle the matrix multiplication bottlenecks of ASI-scale systems with minimal energy.
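
Sketch for the RetrievalAttention entry above: a minimal illustration of attention restricted to the keys most relevant to a query. It assumes a brute-force top-k search as a stand-in for a real vector index and is not the paper's implementation; the function names and the top_k parameter are hypothetical.

```python
import heapq
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Full scaled dot-product attention over every cached key."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def retrieval_attention(query, keys, values, top_k):
    """Attention over only the top_k keys with the largest dot product.

    Assumption: brute-force search stands in for a vector index here.
    """
    idx = heapq.nlargest(top_k, range(len(keys)), key=lambda i: dot(query, keys[i]))
    return attention(query, [keys[i] for i in idx], [values[i] for i in idx])
```

With top_k equal to the number of keys, the result matches full attention up to floating-point rounding. Smaller values of top_k trade accuracy for less work per query.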

Section IV: Spatial & Embodied Intelligence

Merge2Depth: Outdoor Dynamic Scene Depth Estimation (NeurIPS 2025)
- Note: A core world model that allows agents to perceive 3D depth and dynamic movement in unmapped outdoor environments.
MindJourney: Enabling AI to Explore Simulated 3D Worlds (2025)
- Note: A project where agents "dream" and explore virtual spaces to improve their real-world spatial reasoning.
Crafting Spatial and Embodied Foundation Models for AI (MSR Asia 2026)
- Note: Defines the shift toward Vision-Language-Action (VLA) models for generalist robotics.

Serial # (position in the list above)	Entry	Focus	Official/Reference Repo
1	SUTRADHARA	Orchestration	microsoft/SUTRADHARA
6	EvoDiff	Biology	microsoft/evodiff
11	vAttention	Infrastructure	microsoft/vAttention
13	RetrievalAttention	Logic	microsoft/RetrievalAttention
All	(all)	Multi-Agent	microsoft/agent-framework