Thursday, 2:25 - 3:00pm
Research Data Alliance: building an international data sharing community
Mark A. Parsons, Rensselaer Polytechnic Institute
Project: Research Data Alliance
This talk will give a brief overview of RDA, how it works, what it has done, and how it relates to a potential NDS.
SHARE – Shared Access Research Ecosystem
Richard Luce, University of Oklahoma
Individual Development Through Data
Matthew Turk, NCSA
The most powerful phase transition in a research career occurs when an individual begins to drive their own scientific inquiry. In this talk, I will describe how, with the yt project (yt-project.org), our community has attempted to develop an enabling technology for individuals to ask questions of data, and how this philosophy can be applied elsewhere.
Big Data Comes to School: Challenges for the Learning Sciences
William Cope, University of Illinois Urbana-Champaign
Project: UIUC College of Education
This brief presentation will provide an overview of emerging “big data” challenges in education, including the range of data sources, data types and data collection technologies, and the problem for meta-analysis presented by variant data models. Successfully addressing these challenges could reap significant dividends for education, not only in transforming the methodologies and processes of educational research, but also in developing a new generation of student assessments. I will conclude by proposing in outline form a ‘National Data Dictionary’ that might support the federation of datasets that are at this point semantically incommensurable.
Friday, 11:30am - 12:00pm
The Role of Publishers in Data Access
Jennifer Lin, PLOS
The role of publishers in data access is quickly evolving as attention to its importance grows. In this talk, I will present PLOS's perspective in light of its new data availability policy and the engaging discussions we have had with the scientific community from the outset of defining the policy through to its implementation.
Survey of IEEE Research Authors
Ken Moore, IEEE
Project: Research to Assess Data Needs of IEEE Communities
In the first quarter of 2014, IEEE surveyed approximately 1,000 authors in its fields of interest (electrical and electronics engineering, computer science and engineering) to determine their needs for and interest in data services. This presentation will share highlights of the results.
Publishing Findings in the Life Sciences
Dan Hall, National Institute of Mental Health (NIMH)
Project: National Database for Autism Research (NDAR) Informatics Platform
The autism community set the goal of sharing 90% of all human subjects research data. Through a Subject Identifier, harmonized data definition, data federation with other public and private funders, and progressive data sharing policies/techniques, research data on 77,500 subjects related to autism are now shared. Sharing of analyzed results, specifically those associated with outcome measures and with research subjects defined by cohort, is also supported. In this lightning talk, I will present the NDAR model for data sharing, focusing on the scientific and computational results (see data from papers), which is likely relevant to the creation of a National Data Service.
Making Science Data Infrastructure a First Class Citizen
Erin Robinson, Foundation for Earth Science
Project: Federation of Earth Science Information Partners (ESIP), EarthCube
We live in a world rich with data, where use and reuse would not only benefit science but also serve national security and society at large. However, our scientific data enterprise is evolving and maturing in an unmanaged fashion, and because of insufficient coordination across planning, management, and resources, the potential benefits of all these data are not realized. Reliable, long-term funding as well as cultural changes, including financial incentives and rewards, are needed to turn Science Data Infrastructure into a first-class citizen equal to Science. The National Data Service has the opportunity to provide this overarching national coordination.
Is All Big Data ‘Messy’? What Questions Must Researchers Ask Before, During, and After Crunching the Numbers?
Cathy N. Davidson, Duke University
Project: Humanities, Arts, Science, and Technology Alliance and Collaboratory (HASTAC)
This talk brings the interdisciplinary perspective of the social sciences, humanities, and digital humanities to data science and is a follow-up to our May 28 “Big (and Messy) Data” workshop as part of a two-year NSF EAGER grant on data and cross-disciplinary collaboration and mentoring. A key concern from this workshop that needs to be applied to our National Data Service is what my colleague and collaborator Richard Marciano has termed the “forensics” of understanding and interpreting big data. If we are going to provide a national data service for researchers, we must include in that service useful questions that any researcher, in any field, must pose in order to fully understand the biases, histories, and ambiguities of data, including the way that the inputs can distort the outputs and that all data requires interpretation and context.