MedGRPO Demo — Medical Video Understanding
This demo runs uAI-NEXUS-MedVLM-1.0c-4B-SFT (base: Qwen3.5-VL-4B), a member of the uAI-NEXUS-MedVLM 1.0 family trained with SFT + MedGRPO on MedVidBench. It answers questions about medical videos across 8 tasks spanning temporal reasoning, spatial grounding, captioning, and clinical assessment. The publicly released family member is uAI-NEXUS-MedVLM-1.0a-7B-RL.
📄 Paper 🌐 Project Page 💾 Dataset 🤖 Model 💻 GitHub 📊 Leaderboard
Browse pre-computed predictions from the test set (no GPU needed).
Select Task
⏱️ Temporal Action Localization (TAL)
Identify when specific surgical actions occur in the video (start–end times).
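To make the start–end output concrete, here is a minimal sketch of how a predicted time span could be compared with a reference span using temporal intersection-over-union (tIoU). This is a common way to score localization, but the page does not say which metric MedVidBench uses, so treat the metric choice as an assumption. The function name `temporal_iou` and the example timestamps are hypothetical.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two (start, end) intervals, in seconds.

    Assumption: tIoU is used here for illustration only; the demo page
    does not state the benchmark's actual scoring metric.
    """
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the intervals do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    # Guard against degenerate zero-length intervals.
    return intersection / union if union > 0 else 0.0


# Hypothetical spans: the model predicts 12–20 s, the reference is 10–18 s.
predicted = (12.0, 20.0)
reference = (10.0, 18.0)
print(temporal_iou(predicted, reference))  # 0.6
```

A prediction is typically counted as correct when its tIoU with the reference exceeds a chosen threshold, such as 0.5; the threshold is also an assumption here.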
Choose Example
Upload a medical video or a set of frames and ask a question, or try a pre-loaded example. The model runs on ZeroGPU, so the first request may take 30–60 s while the model loads.
Try a Pre-loaded Example (click a card below):