Explore This Laboratory


Our goal is to use human and artificial intelligence to collate, annotate and organize the world's literature on hereditary cancer so that it is easily accessible, understandable and actionable for clinicians and helps them maximize the quality of care for their patients. Our current efforts include:

  • Identifying cancer susceptibility genes and their disease spectrum
  • Developing machine-learning algorithms to accelerate literature review
  • Building and optimizing clinical decision support tools for cancer risk prediction
  • Designing and developing calculators

The lab previously developed cancer risk assessment software (CRA Health, formerly called Hughes riskApps™) that focuses on increasing the quality and efficiency of care while decreasing cost.

Research Projects

Identifying Cancer Susceptibility Genes and Their Disease Spectrum

The lab developed a framework for verifying gene-disease associations based on six prominent genetic resources and natural language processing (NLP)-based literature review. To date, we have reviewed 77 genes, including 31 genes listed on the ASK2ME website. These numbers change as we update our database.

Developing Machine-learning Algorithms

With the medical literature on genetics and cancer growing exponentially, we need a way to automatically collate and prioritize it. In collaboration with the Dana-Farber Cancer Institute and Massachusetts Institute of Technology (MIT), we have developed an NLP system, with a web-based annotator interface (in testing), that helps us review and classify medical literature on cancer susceptibility genes. The NLP converts free-text abstracts into structured data, allowing us to extract relevant penetrance studies. Human-annotated data are used to train the algorithm. In collaboration with Northeastern University (NEU), we developed machine-learning models that can extract ascertainment mechanisms and risk estimates from full-text penetrance papers.
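As a rough illustration of what "converting free text into structured data" can mean, the sketch below pulls risk estimates with 95% confidence intervals out of a sentence. It is not the lab's method: the models described above are trained on human-annotated data, whereas this example swaps in a simple rule-based regular expression. The names RISK_PATTERN, RiskEstimate and extract_risk_estimates, the placeholder gene GENE1, and the sample sentence are all hypothetical and made up for this example.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical pattern (an assumption, not the lab's approach): matches
# phrases such as "OR 2.5 (95% CI 1.8-3.4)" or
# "RR = 1.40 (95% CI, 1.10 to 1.78)".
RISK_PATTERN = re.compile(
    r"\b(?P<measure>OR|RR|HR|SIR)\b\s*(?:=|of|was)?\s*"
    r"(?P<estimate>\d+(?:\.\d+)?)\s*"
    r"\(\s*95%\s*CI[,:]?\s*"
    r"(?P<low>\d+(?:\.\d+)?)\s*(?:-|–|to)\s*"
    r"(?P<high>\d+(?:\.\d+)?)\s*\)"
)


@dataclass
class RiskEstimate:
    """One structured record extracted from free text."""
    measure: str      # e.g. "OR" (odds ratio) or "RR" (relative risk)
    estimate: float   # point estimate
    ci_low: float     # lower bound of the 95% confidence interval
    ci_high: float    # upper bound of the 95% confidence interval


def extract_risk_estimates(text: str) -> List[RiskEstimate]:
    """Return every risk estimate in `text` that the pattern recognizes."""
    return [
        RiskEstimate(
            measure=m.group("measure"),
            estimate=float(m.group("estimate")),
            ci_low=float(m.group("low")),
            ci_high=float(m.group("high")),
        )
        for m in RISK_PATTERN.finditer(text)
    ]


# Made-up sentence for illustration; the numbers are not from any study.
abstract = (
    "GENE1 carriers had an increased risk of breast cancer, "
    "OR 2.5 (95% CI 1.8-3.4). For ovarian cancer, "
    "RR = 1.40 (95% CI, 1.10 to 1.78)."
)

for record in extract_risk_estimates(abstract):
    print(record)
```

A fixed pattern like this misses most real-world phrasings, which is one reason a model trained on annotated examples is the more practical choice for full-text papers.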

In collaboration with NEU, we have also developed VarHarmonizer, a variant name harmonization and mapping tool.

Publications

  • Yin K, Singh P, Drohan B, Hughes KS. Breast imaging, breast surgery, and cancer genetics in the age of COVID-19. Cancer. 2020 Oct 15;126(20):4466-4472. doi: 10.1002/cncr.33113. Epub 2020 Aug 4. PMID: 32749697; PMCID: PMC7436610
  • Yin K, Liu Y, Lamichhane B, Sandbach JF, Patel G, Compagnoni G, Kanak RH, Rosen B, Ondrula DP, Smith L, Brown E, Gold L, Whitworth P, App C, Euhus D, Semine A, Dwight Lyons S, Lazarte MAC, Parmigiani G, Braun D, Hughes KS. Legacy Genetic Testing Results for Cancer Susceptibility: How Common are Conflicting Classifications in a Large Variant Dataset from Multiple Practices? Ann Surg Oncol. 2020 Jul;27(7):2212-2220. doi: 10.1245/s10434-020-08492-9. Epub 2020 Apr 27. PMID: 32342295
  • McCarthy AM, Guan Z, Welch M, Griffin ME, Sippo DA, Deng Z, Coopey SB, Acar A, Semine A, Parmigiani G, Braun D, Hughes KS. Performance of Breast Cancer Risk-Assessment Models in a Large Mammography Cohort. J Natl Cancer Inst. 2020 May 1;112(5):489-497. doi: 10.1093/jnci/djz177. PMID: 31556450; PMCID: PMC7225681
  • Wang C, Wang Y, Hughes KS, Parmigiani G, Braun D. Penetrance of Colorectal Cancer Among Mismatch Repair Gene Mutation Carriers: A Meta-Analysis. JNCI Cancer Spectr. 2020 Apr 23;4(5):pkaa027. doi: 10.1093/jncics/pkaa027. PMID: 32923933; PMCID: PMC7476651
  • Hughes KS, Zhou J, Bao Y, Singh P, Wang J, Yin K. Natural language processing to facilitate breast cancer research and management. Breast J. 2020 Jan;26(1):92-99. doi: 10.1111/tbj.13718. Epub 2019 Dec 18. PMID: 31854067
  • Bao Y, Deng Z, Wang Y, Kim H, Armengol VD, Acevedo F, Ouardaoui N, Wang C, Parmigiani G, Barzilay R, Braun D, Hughes KS. Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes. JCO Clin Cancer Inform. 2019 Sep;3:1-9. doi: 10.1200/CCI.19.00042. PMID: 31545655; PMCID: PMC6873946
  • Deng Z, Yin K, Bao Y, Armengol VD, Wang C, Tiwari A, Barzilay R, Parmigiani G, Braun D, Hughes KS. Validation of a Semiautomated Natural Language Processing-Based Procedure for Meta-Analysis of Cancer Susceptibility Gene Penetrance. JCO Clin Cancer Inform. 2019 Aug;3:1-9. doi: 10.1200/CCI.19.00043. PMID: 31419182; PMCID: PMC6873944
  • Braun D, Yang J, Griffin M, Parmigiani G, Hughes KS. A Clinical Decision Support Tool to Predict Cancer Risk for Commonly Tested Cancer-Related Germline Mutations. J Genet Couns. 2018 Sep;27(5):1187-1199. doi: 10.1007/s10897-018-0238-4. Epub 2018 Mar 2. PMID: 29500626; PMCID: PMC6240422
  • Forsyth AW, Barzilay R, Hughes KS, Lui D, Lorenz KA, Enzinger A, Tulsky JA, Lindvall C. Machine Learning Methods to Extract Documentation of Breast Cancer Symptoms From Electronic Health Records. J Pain Symptom Manage. 2018 Jun;55(6):1492-1499. doi: 10.1016/j.jpainsymman.2018.02.016. Epub 2018 Feb 27. PMID: 29496537

Group Members

Kevin S. Hughes, MD

Co-director of the Avon Comprehensive Breast Evaluation Center
Surgical Director, Breast Screening Program
Massachusetts General Hospital

Associate Professor of Surgery
Harvard Medical School

Dr. Hughes is a member of the Division of Gastrointestinal and Oncologic Surgery at Mass General where he is the surgical director of the Breast Screening Program and co-director of the Avon Comprehensive Breast Evaluation Center.

He is a graduate of Dartmouth College and Dartmouth Medical School, and trained in general surgery at Mercy Hospital of Pittsburgh. He completed a fellowship in surgical oncology at the National Cancer Institute. Dr. Hughes is an associate professor of surgery at Harvard Medical School and was formerly on the faculty of Tufts University; the University of California, Davis; and Brown University. Dr. Hughes has served on numerous national and regional committees and is actively involved in research regarding the genetics, screening, diagnosis and treatment of breast cancer.

He has authored numerous papers and book chapters on breast cancer screening, diagnosis, treatment and risk assessment.

Sherwood S. Hughes

Director of Web Development, Developer and Consultant
Massachusetts General Physicians Organization

Sherwood Hughes is a member of the Massachusetts General Physicians Organization at Mass General, where he develops web-based applications and reporting systems. He was educated at Boston University and Harvard University in IT and Economics. Mr. Hughes is also an active community member in Boston's South End and a past president of the Blackstone/Franklin Square Neighborhood Association.

Regina Barzilay, PhD

Delta Electronics Professor, Department of Electrical Engineering and Computer Science, MIT
Faculty Co-Lead, J-Clinic
MacArthur Fellow

Senior Leadership
Artificial Intelligence

Regina Barzilay is a Delta Electronics professor in the Department of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory at MIT. Her research interests are in natural language processing and applications of deep learning to chemistry and oncology. She is a recipient of various awards, including the NSF CAREER Award, the MIT Technology Review TR-35 Award, a Microsoft Faculty Fellowship and several Best Paper Awards at NAACL and ACL. In 2017, she received a MacArthur fellowship, an ACL fellowship and an AAAI fellowship. She received her PhD in Computer Science from Columbia University and spent a year as a postdoc at Cornell University.

Lab Members

  • Preeti Singh, MD
  • Kanhua Yin, MD, MPH