OpenAI has released HealthBench, a benchmark for evaluating how AI systems perform in healthcare settings. It measures how well large language models (LLMs) can support patients and clinicians in trustworthy, meaningful health conversations. HealthBench covers seven key areas, including emergency care, managing uncertainty, and global health. According to OpenAI, recent models have improved quickly on the benchmark, and in some cases its models' responses scored higher than those written by human doctors.