Colloquium on Digital Transformation Science
October 15, 3 pm CT
COVIDScholar: Applying Natural Language Processing at Scale to Accelerate COVID-19 Research
Gerbrand Ceder, Chancellor’s Professor, Department of Materials Science and Engineering, University of California, Berkeley
Amalie Trewartha, Postdoctoral Scholar, Division of Materials Science, Lawrence Berkeley National Laboratory
There is a critical need for tools that can help COVID-19 researchers stay on top of the emerging literature and identify critical connections between ideas and observations that could lead to effective vaccines and therapies for COVID-19. To this end, our team at UC Berkeley and Lawrence Berkeley National Laboratory is building covidscholar.org, a knowledge portal tailored specifically for COVID-19 research that leverages natural language processing (NLP) techniques to synthesize the information spread across more than 140,000 emergent research articles, patents, and clinical trials into actionable insights and new knowledge. Originating in our text-processing work in materials science, COVIDScholar is powered by an automated system that scrapes research documents from dozens of sources across the internet, cleans and repairs metadata as necessary, and analyzes the text with a number of NLP models for classification, information extraction, and scientific language modeling. We then integrate this information with specialized knowledge graphs, which has the potential to give users unparalleled insight into the complex interactions that govern the transmission of COVID-19, the disease’s progression, and potential therapeutic strategies. This approach to combining textual information, such as word embeddings, with ontological knowledge graphs has the potential to improve the performance of machine learning models that operate on these data structures and to enable new ways of exploring literature on emerging subjects by leveraging past knowledge more efficiently.
Gerbrand Ceder is the Chancellor’s Professor of Materials Science and Engineering at the University of California, Berkeley. His research is in computational and experimental materials design for clean energy technology and in Materials Genome approaches to materials design and synthesis. He has published over 400 scientific papers and holds more than 20 U.S. and foreign patents. He is a member of the U.S. National Academy of Engineering and the Royal Flemish Academy of Belgium for Science and the Arts, a Fellow of the Materials Research Society and the Minerals, Metals & Materials Society, and has received awards from the Electrochemical Society, the Materials Research Society, the Minerals, Metals & Materials Society, and the International Battery Association. He is Co-Lead Scientist for new battery technologies at the U.S. Department of Energy’s Joint Center for Energy Storage Research (JCESR) and Chief Scientist of the Energy Frontier Research Center at the National Renewable Energy Laboratory (NREL).
Amalie Trewartha is a postdoctoral scholar in Gerbrand Ceder’s group at Lawrence Berkeley National Laboratory. She began her career as a nuclear physicist before moving into materials science in 2019, with a focus on machine learning. Her research interests include the application of natural language processing (NLP) techniques to scientific literature and building thermodynamically motivated machine learning models for materials property prediction.
Have Questions? Please contact one of us:
- Jay Roloff, firstname.lastname@example.org (Executive Director, c3.ai.DTI)
- R. Srikant, email@example.com (Co-Director, c3.ai.DTI)
- Tandy Warnow, firstname.lastname@example.org (Co-Chief Scientist, c3.ai.DTI)