AI in Healthcare: How Collaboration Can Improve Data Use in Clinical Trials

Leverage the potential of a mass data-sharing network

Why Share Data?

Data is unquestionably the fuel for AI. When developing an AI, you need clean, diverse data to train your system, no matter how elaborate or advanced your solution. Attaining the massive volumes of high-quality data required to train machines to be smarter is no easy task. It’s even more difficult in the medical space.

Patients understandably hesitate to share their information (i.e., their data) with anyone other than their doctors. Doctors, in turn, are bound by rules of law and ethics to maintain that privacy. These limitations make medical data sets that do exist even more valuable. What, then, would drive a company or institution to share proprietary patient information and data sets?

The benefits to society are substantial. Everyone in the clinical trial world, too, stands to directly benefit from the improved productivity, faster result reproduction, and duplication reduction that data sharing can provide.

Productivity Increase

Again and again, studies show that collaboration improves the work of everyone involved. Sharing proprietary data with stakeholders (patients, physicians, drug developers, researchers, and others) could mean that success will come sooner and more cost-effectively for data owners. In a report, Elsevier noted that sharing research data “can accelerate the pace of research.” The report illustrated the importance of collaboration with a meta-analysis, which found that all antidepressants are more effective than placebo for short-term treatment of acute depression in adults. The analysis (which Elsevier calls “the most comprehensive assessment of antidepressants to date”) drew on a combination of published studies and previously unpublished data sets. Researchers, regulatory agencies, and pharma companies that shared their work upon request made this meta-analysis possible, and the authors published the full data sets to allow for replication attempts and future uses. That means the original collaboration could have a positive ripple effect on future research.

Duplication and Risk Mitigation

You already know the exorbitant costs of developing a new drug. In about seven of eight cases, drugs that make it to testing stages never reach approval. Imagine the benefit of knowing in advance whether another company had already pursued the same path, only to be derailed. With shared information about successes and failures, you could reallocate your resources to a different pathway, molecule, or disease. You could even skip straight to improving what might soon be available. Seeing those errors early and increasing transparency can also save patients from harm. More than once, keeping negative or even inconclusive results under wraps has led to serious harm (even deaths) after continued use of pharmaceuticals.

Proof of Efficacy

The more transparent the research and trial practices, the easier it is for others to support (or disprove) your work. Reproducibility is key to scientific advancement, and that reproducibility is simplified when data is made public. In the realm of AI, sharing testing sets can help others attempt to improve upon existing AI systems. The earlier researchers find areas for improvement, the faster they can iterate.

What Data to Share

Despite clear benefits, a central challenge remains: patients balk at the idea of sharing their medical data. How can researchers and trial sponsors overcome this problem? One key to sharing data, and thus to driving AI creation and application in healthcare, is anonymizing it, which can assuage some of patients’ concerns about privacy. Organizing small groups of researchers who offer to share anonymized data on a quid pro quo basis can be an effective way to encourage information-sharing. Some governments and journals already require certain levels of data sharing before they will fund or publish scientific work. This, in turn, prompts researchers to share data.
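
To make the idea of anonymizing data a little more concrete, here is a minimal Python sketch. It assumes a patient record is a simple dictionary; the field names (name, address, phone, patient_id, age) and the pseudonymize function are hypothetical and chosen only for illustration. It drops direct identifiers, swaps the patient ID for a keyed hash, and coarsens exact age into a ten-year band. This is an illustration of the general idea, not a complete or regulation-grade de-identification procedure.

```python
import hashlib
import hmac

# Hypothetical field names; a real data set will differ.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def pseudonymize(record, secret_key):
    """Return a copy of `record` that is safer to share (illustrative only)."""
    # Drop fields that identify the patient directly.
    shared = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Replace the patient ID with a keyed hash so that records from the
    # same patient can still be linked without exposing the original ID.
    digest = hmac.new(
        secret_key, str(record["patient_id"]).encode("utf-8"), hashlib.sha256
    )
    shared["patient_id"] = digest.hexdigest()[:16]

    # Generalize exact age into a 10-year band.
    decade = (record["age"] // 10) * 10
    shared["age"] = f"{decade}-{decade + 9}"
    return shared


record = {
    "patient_id": "P-001",
    "name": "Jane Doe",
    "address": "1 Example Street",
    "phone": "555-0100",
    "age": 47,
    "diagnosis": "hypertension",
}
shared_record = pseudonymize(record, secret_key=b"keep-this-key-private")
```

The secret key stays with the data owner. Without it, a recipient cannot recompute the hash from a known patient ID, yet the same patient still maps to the same pseudonym across records.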

So, what exactly is up for grabs?

Data sets

Sharing training sets, the blueprint of an AI system, could endanger IP. But why not at least share the testing sets you use to prove your system works? If you truly believe in the AI system you're using, someone else should be able to test their own system with your test data to see if they can beat your performance.
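
As a rough sketch of what testing someone else's system against your test data could look like, the Python below scores two models on one shared test set. Everything here is hypothetical: the four-row data set, the single "biomarker" feature, and the two threshold "models" are toy stand-ins for real systems, and accuracy is only one of many possible metrics.

```python
def accuracy(predict, test_set):
    """Fraction of test examples for which `predict` returns the true label."""
    correct = sum(1 for features, label in test_set if predict(features) == label)
    return correct / len(test_set)


# Hypothetical shared test set: (features, true_label) pairs.
shared_test_set = [
    ({"biomarker": 0.9}, 1),
    ({"biomarker": 0.2}, 0),
    ({"biomarker": 0.6}, 1),
    ({"biomarker": 0.4}, 0),
]


def published_model(features):
    # Toy stand-in for the data owner's system.
    return 1 if features["biomarker"] > 0.7 else 0


def challenger_model(features):
    # Toy stand-in for another team's system.
    return 1 if features["biomarker"] > 0.5 else 0


published_score = accuracy(published_model, shared_test_set)
challenger_score = accuracy(challenger_model, shared_test_set)
print(f"published: {published_score:.2f}, challenger: {challenger_score:.2f}")
```

Because both models are scored on the same examples with the same metric, the comparison is like for like, and neither team has to reveal its training data.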


Everyone is proud to share their successes, but failures can be just as informative. We need to build a culture in which people are open to sharing what didn't work as well as what did. Peers can learn from your mistakes as much as you can, and perhaps they can deduce why you made them. And, as previously noted, sharing the bad news can even save lives.

Patient pools

One of the most powerful applications of AI in healthcare is patient recruitment and matching for clinical trials. As AI helps identify the perfect candidates for a given trial, the pressure to protect potential participants from “poaching” lowers. With the help of AI, you may find that you have patients in your clinical trial’s pool who are great matches for a competitor’s trial—and vice versa.
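
The matching idea can be sketched in a few lines of Python. Note that this is a simple rule-based eligibility filter, not an AI model, and the Trial fields, patient records, and trial names are all hypothetical. It only illustrates how one patient pool can be checked against several trials at once, including a competitor's.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    name: str
    min_age: int
    max_age: int
    condition: str


def eligible(patient, trial):
    """Rule-based check: age within range and the target condition present."""
    in_age_range = trial.min_age <= patient["age"] <= trial.max_age
    return in_age_range and trial.condition in patient["conditions"]


def match_patients(patients, trials):
    """Map each trial name to the IDs of patients who meet its criteria."""
    return {
        trial.name: [p["id"] for p in patients if eligible(p, trial)]
        for trial in trials
    }


# Hypothetical patient pool and trials.
patients = [
    {"id": "A", "age": 34, "conditions": {"asthma"}},
    {"id": "B", "age": 61, "conditions": {"diabetes"}},
    {"id": "C", "age": 58, "conditions": {"diabetes", "asthma"}},
]
trials = [
    Trial("our_asthma_trial", 18, 60, "asthma"),
    Trial("competitor_diabetes_trial", 50, 75, "diabetes"),
]
matches = match_patients(patients, trials)
```

In this toy pool, patient B fits only the competitor's trial, which is the kind of match that would go unused if pools were never shared.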

At the end of the day, using AI in health fields has the potential to make a better world for patients. A little more sharing in the clinical trial world has the power to speed the innovations that could truly change lives.


Mark Aiello