The right baseline for diagnostic AI is the decisions people would have made without it, not physician performance. Most published research ignores that counterfactual, so we don't know whether AI actually improves care or merely matches clinical benchmarks under lab conditions.
The thread mixes admiration and frustration. People acknowledge Google’s best-in-class image, music, and video generation components and a smooth Studio playground, but many feel there's an odd gap between the full Gemini Pro model and the stripped-down app and website experience. That gap fuels a safety-versus-access argument: pragmatic support for "site-only" or sandboxed releases of risky 'Mythos'-class models (which limits misuse) versus criticism that surface-only access either hides capability from researchers or is a performative safety measure.
The community is excited, a bit giddy: a robot finishing a half-marathon faster than the human world record feels like a milestone and fuels talk of a coming 'Robolympics.' The reaction mixes awe with skeptical follow-ups about fairness (controlled conditions, tethers, assistance), plus playful speculation about what sports robots will dominate next.
DAIR's 'Top AI Papers of the Week' posts generate the usual combo of FOMO and gratitude: people appreciate curated lists to triage the flood, but there's also weary skepticism about incremental-sounding titles and claim inflation. The thread is serving both as discovery and as an informal gatekeeping signal about what's worth reading this week.