Team members at Mass General’s Neurological Clinical Research Institute (NCRI) are confident that big data can lead to big results and huge gains in clinical research.

The center, which designs, supports, and manages clinical research in neurological disorders, is developing novel ways to harness and interpret data sets generated from patient information that are so large and complex that traditional approaches simply are not feasible. The end goal of the data-specific programs: to better understand rare conditions, better diagnose these disorders, and ultimately find new or better treatments.

The data come from many types of sources. Take, for example, EEG recordings, which produce continuous streams of electronic information. Then there are vital signs, disease-specific measures, and medical images, not to mention tangible data such as vials of biofluids and DNA samples. Beyond that, patients now record their own so-called patient-reported outcomes and health statistics via mobile apps and websites. Any source on its own may not suffice, but many, taken together, could be the key to a cure.

Making Order Out of Chaos

Applying the oft-used business concept of big data to medicine means that information from all these avenues has a purpose beyond an individual’s clinical care at a point in time.

To the NCRI, big data mean the opportunity to learn more about rare diseases that traditionally don’t attract the attention of pharmaceutical and biotech companies. By organizing disease-specific consortia, bringing together stakeholders such as academic research centers, clinicians, patients, and foundations, and facilitating clinical research studies, the NCRI can help make a rare disease ready to attract industry interest in developing much-needed drugs or improving existing medications.

For a biotech company to take on drug development, especially for a rare disease, clear biomarkers and outcome measures must be in place. Determining and validating such biomarkers requires lots of information: clinical and phenotypical data, DNA, disease natural histories, -omics, and images.

Of course, just having the data is not enough. Acquiring, curating, and handling the data in compliance with national and international laws and regulations, and with recommendations from governing bodies, without compromising patient privacy, is paramount.

“We had to ask how do we merge this information knowing that it all comes from the same person without revealing who this person is,” explains Alex Sherman, director of the NCRI’s Center for Innovation and Biomedical Informatics.

His answer: Neurological Global Unique Identifiers (NeuroGUIDs) that uniquely and securely identify patients in a research continuum. Because software that could handle patient data from numerous sources while concealing the patient’s identity did not exist, Sherman had to create it. The new program “allows anyone, including patients, who is interested in aggregating patient-centric information, to generate NeuroGUID and save the data with this unique identifier,” Sherman shares.

Say a patient participates in one study focused on genetics and another on imaging. Thanks to NeuroGUID technology, data from both studies can be combined, providing a more complete picture of the patient with their privacy intact.
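The article does not describe how NeuroGUIDs are actually generated, so the following is only a minimal sketch of the general idea: derive the same opaque identifier from the same person’s details, then use it to join records from separate studies. It assumes a keyed hash (HMAC-SHA256) as the technique. This is not NeuroGUID’s algorithm, and every name and value in it (`SECRET_KEY`, `make_pseudonym`, `merge_by_pseudonym`, the sample patient) is hypothetical.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical secret held by whoever issues identifiers (an assumption,
# not something the article describes). Without it, IDs cannot be recomputed.
SECRET_KEY = b"replace-with-a-securely-stored-secret"


def normalize(value: str) -> str:
    """Drop accents, spacing, and punctuation and lowercase the rest,
    so small typing differences do not change the resulting ID."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).lower()


def make_pseudonym(first_name: str, last_name: str,
                   birth_date: str, birth_city: str) -> str:
    """Return an opaque 64-character ID derived from identifying fields."""
    fields = [first_name, last_name, birth_date, birth_city]
    message = "|".join(normalize(f) for f in fields).encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def merge_by_pseudonym(*studies: dict) -> dict:
    """Combine per-study records that share the same pseudonymous ID."""
    combined: dict = {}
    for study in studies:
        for pid, record in study.items():
            combined.setdefault(pid, {}).update(record)
    return combined


# Made-up example: the same person enrolled in two studies, with their
# details typed slightly differently at each site.
pid_a = make_pseudonym("Maria", "Lopez", "1970-01-02", "Boston")
pid_b = make_pseudonym(" maria ", "LOPEZ", "1970-01-02", "boston")

genetics_study = {pid_a: {"gene_panel": "variant_x"}}
imaging_study = {pid_b: {"mri_scan": "scan_001"}}

combined = merge_by_pseudonym(genetics_study, imaging_study)
```

In this sketch, neither study stores the patient’s name, yet both records land under one identifier because the inputs normalize to the same string. A production system would also have to handle misspellings, changed names, and key management, none of which is attempted here.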

“Think about data or information as flows—rivers—coming from multiple sources,” Sherman says. “We’re building infrastructure, highways … We’re bringing multiple suppliers to the same assembly line.”

He continues, “We’re building a research environment with tools and methods and procedures that can be shared and multiplied and made scalable enough to have a critical amount of information that could be interpreted for the benefit of patients.”

NeuroBANK™: Big Data Takes Aim at ALS

Another example of Sherman’s intricate plan in action is NeuroBANK™, an online patient-centered platform launched in 2013 to help researchers push through the hurdles typically faced in investigating rare diseases such as ALS. By enabling the sharing and aggregation of patient data between ALS researchers and across studies, the software makes collaborations possible—and, broadly, makes neurological disease research easier, faster, and more comprehensive.

Using NeuroGUIDs, NeuroBANK links data from an individual patient across multiple research projects to a variety of sources, from medical images to genetic data to tissue repositories. Researchers designing neurological research studies can choose to use NeuroBANK and its standard libraries of disease-specific forms instead of creating their own databases, thus speeding the development of their study. “We look at it as an accelerated research environment,” says Sherman, who views one of his department’s main objectives as establishing clinical trial readiness for rare neurological conditions.

But the data do not stop working once a NeuroBANK study is completed. After data are analyzed and the results are published, the entire study data set is de-identified and released into a central pool of disease-specific information available to anyone studying ALS. This, Sherman notes, is NeuroBANK’s true success.

A Culture Shift

“The breakthrough of NeuroBANK,” he says, “is that it changes the clinical research culture.” It fosters an understanding across the medical industry, he posits, that big data are not a scary thing, and that openly sharing patient data and information, without compromising patients’ privacy, can lead to potentially life-saving advances.

He references PRO-ACT, a shared database he created in 2011 in partnership with Prize4Life, an Israeli non-profit, and the NEALS ALS Consortium. Numerous biotech and pharmaceutical companies and academic institutions contributed to the initiative, resulting in nearly 11,000 de-identified subject records from past clinical trials—data that would otherwise normally sit with a company and not be accessible to researchers. It was then and remains the largest clinical ALS data set in the world.

After sharing these data across the globe, analysts created algorithms that can successfully predict ALS disease progression. In fact, one predictive model proved so accurate it is under consideration for use in clinical trials to reduce sample size and speed up the drug-approval process.

Sherman is proud of those results and of the fact that nearly all companies interested in ALS research download this data set and compare the subject population in their trials to it. He is equally if not more thrilled that companies that initially just used the data have committed to contributing their own data sets to PRO-ACT once a new trial is completed. That means more data will be available for future analyses that may benefit patients.

That’s the sea change he wants to support, the true power of big data.

“My team of extremely talented and dedicated individuals,” he says, “comes up with new ways to bring collaborators together and new tools to support such collaborations.”

In this way, the NCRI and Sherman are master organizers, gathering stakeholders to form disease-specific research consortia; wrangling data by building novel systems that can compile and interpret unprecedented amounts of patient information; and then, in strategic fashion, disseminating that information to the medical community at large. And not just to the medical community, but to anyone who has the skills and a grudge against the disease.

Nine research networks that investigate various diseases, from vascular cognitive impairment disease to such rare conditions as X-linked Adrenoleukodystrophy, Canavan, and Late-Onset Tay-Sachs, now use or plan to use the same ALS approach and NCRI’s platforms.

The NCRI is in the business of creating disease consortia, aggregating and analyzing data, and serving as an Academic Contract Research Organization that conducts academia-, foundation-, and industry-sponsored clinical trials. It will gladly share its ideas, approaches, policies, and platforms within the department, Mass General, and beyond.

To learn more about the NCRI and Sherman’s work, contact Kristin Drake, NCRI Senior Director of Business Development, Strategic Initiatives and Operations, at 617-724-7076.