ChatGPT now lets some users connect their Apple Health data to a new Health section inside the app. The idea sounds simple. You give the chatbot your fitness and medical data, and it gives you health insights in return. But when one journalist tried it with a decade of Apple Watch data, the results caused concern instead of clarity.
The test showed how risky this kind of feature becomes when the system gives confident answers based on a weak understanding of the data. Instead of helping, the tool ended up creating confusion and fear about real medical issues.
What the reporter did and what happened
A reporter for The Washington Post, Geoffrey Fowler, gave ChatGPT access to his Apple Health data. That included 29 million steps and 6 million heartbeat readings collected over ten years. He then asked ChatGPT Health to rate his heart health.
The chatbot gave him an F.
The result shocked him, so he took it to his own doctor. The doctor rejected the result right away and said Fowler had such a low risk of heart disease that insurance would likely not even cover extra tests. That made the gap between the AI’s answer and real medical advice very clear.
Experts push back on the AI’s claims
Cardiologist Eric Topol from the Scripps Research Institute also reviewed what happened. He called the chatbot’s analysis “baseless” and warned people to ignore medical advice from tools that are not ready.
He also explained why this is dangerous. As he said, “People that do this are going to get really spooked about their health.” He added that it can also create the opposite problem by giving unhealthy people a false sense that everything is fine.
The bigger problem is inconsistency
The most worrying part was not just the bad grade. It was how much the grade changed. When Fowler asked the same question again, ChatGPT shifted between an F and a B. At the same time, it forgot basic details like his age and gender, even though it had full access to his data.
That shows the system does not interpret the data in a stable way. It gives different answers to the same question about the same data, which makes its health advice unreliable.
Claude did not do much better
Fowler also tested Anthropic’s Claude chatbot with the same data. Claude gave him a C instead of an F, which looked better at first. But it still failed to deal with the limits of Apple Watch data, which is not meant to replace medical tests.
So while the score changed, the core problem stayed the same.
What these health tools claim
Both OpenAI and Anthropic say their tools do not replace doctors or give diagnoses. They argue that they only provide information. Yet the tools still hand out grades and judgments that sound like medical conclusions.
That creates a gray area. The U.S. Food and Drug Administration recently said its role is to “get out of the way as a regulator” to support innovation. At the same time, the FDA commissioner warned against AI making medical or clinical claims without review. OpenAI and Anthropic say their chatbots do not cross that line, even when those chatbots rate someone’s health.
Where ChatGPT Health stands now
ChatGPT’s Apple Health connection is only available to a limited group of beta users. After the report, OpenAI said it plans to improve the system. As OpenAI VP Ashley Alexander said, “Launching ChatGPT Health with waitlisted access allows us to learn and improve the experience before making it widely available.”
For now, this test shows one clear thing. Giving an AI access to health data does not mean it understands that data. Until that changes, these tools risk doing more harm than good.
