Announcements

  • Colloquium on Digital Transformation Science

  • October 15 and October 22, 3 pm CT

    COVIDScholar: Applying Natural Language Processing at Scale to Accelerate COVID-19 Research

    Gerbrand Ceder, Chancellor’s Professor, Department of Materials Science and Engineering, University of California, Berkeley

    Amalie Trewartha, Postdoctoral Scholar, Division of Materials Science, Lawrence Berkeley National Laboratory

    Machine Learning-based Design of Proteins, Small Molecules, and Beyond

    Jennifer Listgarten, Professor of Electrical Engineering and Computer Sciences, University of California, Berkeley

    REGISTER FOR ZOOM WEBINAR

    There is a critical need for tools that help COVID-19 researchers stay on top of the emerging literature and identify connections between ideas and observations that could lead to effective vaccines and therapies for COVID-19. To this end, our team at UC Berkeley and Lawrence Berkeley National Laboratory is building covidscholar.org, a knowledge portal tailored specifically for COVID-19 research that leverages natural language processing (NLP) techniques to synthesize the information spread across more than 140,000 emergent research articles, patents, and clinical trials into actionable insights and new knowledge. With origins in our text-processing work in materials science, COVIDScholar is powered by an automated system that scrapes research documents from dozens of sources across the internet, cleans and repairs metadata as necessary, and analyzes the text with a number of NLP models for classification, information extraction, and scientific language modeling. We then integrate this information with specialized knowledge graphs, which has the potential to give users unparalleled insight into the complex interactions that govern the transmission of COVID-19, the disease’s progression, and potential therapeutic strategies. This approach of combining textual information, such as word embeddings, with ontological knowledge graphs has the potential to improve the performance of machine learning models that operate on these data structures and to enable new ways of exploring literature on emerging subjects by leveraging past knowledge more efficiently.

    Data-driven design is making headway into a number of application areas, including protein, small-molecule, and materials engineering. The design goal is to construct an object with desired properties, such as a protein therapeutic that binds tightly to its target. To that end, costly experimental measurements are being replaced with calls to a high-capacity regression model trained on labeled data, which can be leveraged in an in silico search for promising design candidates. The aim then is to discover designs that are better than the best design in the observed data. This goal puts machine learning-based design in a much more difficult spot than traditional applications of predictive modelling, since successful design requires, by definition, some degree of extrapolation: pushing the predictive model to its unknown limits, in parts of the design space that are a priori unknown. This talk will anchor the overall problem in protein engineering and discuss emerging approaches to tackle it.

    Gerbrand Ceder is the Chancellor’s Professor of Materials Science and Engineering at the University of California, Berkeley. His research is in computational and experimental materials design for clean energy technology and in Materials Genome approaches to materials design and synthesis. He has published over 400 scientific papers and holds more than 20 U.S. and foreign patents. He is a member of the U.S. National Academy of Engineering and the Royal Flemish Academy of Belgium for Science and the Arts, a Fellow of the Materials Research Society and the Minerals, Metals & Materials Society, and has received awards from the Electrochemical Society, the Materials Research Society, the Minerals, Metals & Materials Society, and the International Battery Association. He is Co-Lead Scientist for new battery technologies at the U.S. Department of Energy’s Joint Center for Energy Storage Research (JCESR) and Chief Scientist of the Energy Frontier Research Center at the National Renewable Energy Laboratory (NREL).

    Amalie Trewartha is a postdoctoral scholar in Gerbrand Ceder’s group at Lawrence Berkeley National Laboratory. She began her career as a nuclear physicist, before moving into materials science in 2019, with a focus on machine learning. Her research interests include the application of natural language processing (NLP) techniques to scientific literature, and building thermodynamically-motivated machine learning models for materials property prediction.

    Jennifer Listgarten is a Professor in the Department of Electrical Engineering and Computer Sciences and the Center for Computational Biology at the University of California, Berkeley. She is also a member of the steering committee for the Berkeley AI Research (BAIR) Lab and a Chan Zuckerberg investigator. From 2007 to 2017, she was at Microsoft Research in Cambridge, MA (2014-2017), Los Angeles (2008-2014), and Redmond, WA (2007-2008). She completed her Ph.D. in the machine learning group in the Department of Computer Science at the University of Toronto, located in her hometown. She has two undergraduate degrees, one in Physics and one in Computer Science, from Queen’s University in Kingston, Ontario. Jennifer’s research interests are broadly at the intersection of machine learning, applied statistics, molecular biology, and science.


Quick Links:

C3.ai DTI Webpage

Events

Information on Call for Proposals

Proposal Matchmaking

Training Materials (password protected)

C3 Administration (password protected)


Have Questions? Please contact one of us:


