4:00 - 4:05pm: Announcements and Approval of 7/28/22 and 10/27/22 Minutes (5 minutes)
Dorothy Coletta is rotating off stenographer duty after two and a half years; Matt Mielke will be the new stenographer.
The minutes from 7/28/22 and 10/27/22 are approved.
Tracy Futhey reports on the LMS transition from Sakai to Canvas on behalf of Michael Greene. The transition has been vetted across campus and feedback has been positive. Anyone who wants to be an early adopter of Canvas should let us know. We are looking forward to a smooth transition.
4:05 - 4:15pm: Research IT Needs Assessment Update – led by Tracy Futhey (10 minutes)
What it is: Over the last several months, a draft report was created that compiled and explained process recommendations for improving IT support for research at Duke. This information came from multiple working sessions with faculty culminating in a final session in August where ITAC members voted on prioritization of the IT needs. The draft report was presented to the University Priorities Committee and the Academic Council, discussed at the Deans’ Cabinet, and circulated to ITAC and other leadership teams for multiple rounds of review.
Why it’s relevant: On December 1, a link to the final report was added to the ITAC website and shared with the Deans and the Academic Council. Today we’ll hear about progress toward Phase 2 in which solutions will be proposed.
Tracy Futhey provides an update on the Research IT Support Needs assessment. Phase 1 has been completed with the release of the public report. Now, Phase 2 is underway and is mostly service partner driven. Phases include:
- Phase 1 – assess needs Duke-wide (faculty-driven)
- Phase 2 – propose solutions (service partner driven)
- Phase 3 – determine structures (institutionally determined)
It is important to note that feedback loops are present throughout the process.
Summary findings and recommendations have been categorized according to People, Process, and Technology. The six findings listed by category are:
- People: Expand and Improve IT Support
A. Duke lacks sufficient personnel to support domain-specific research
- Build and support new teams of domain-specific technical personnel
- Develop and catalog training resources as an ongoing education program
- Process: Reduce Structural IT Barriers
B. Separate research infrastructures hinder research and collaboration
- Objectively assess costs/benefits of dual and decentralized research IT infrastructures, which confuse and frustrate faculty.
C. Current security/compliance approaches seem “one size fits all”
- Evaluate current policy, security, and compliance IT-related requirements and processes and move toward a holistic risk-based institutional approach
- Technology: Enhance and Simplify IT Offerings
D. OIT Services are valuable but not as expansive as faculty require
- Evaluate approaches to extend OIT’s computational services (HPC, GPU)
- With faculty input, tune services to better support faculty need (data, ML)
E. A plethora of technical solutions and use constraints create confusion
- Clarify/simplify technical solutions for particular research uses, stressing common services available to both campus and SoM (both cloud and local)
F. Current storage services don’t span research lifecycle or university
- Implement long-term storage options spanning campus and SoM
- Automate data migration over lifecycle
- License datasets as we do software
The final report is available at https://duke.is/72sjn
Tracy underscores that People was the area of greatest need as expressed by every group. There aren’t enough personnel with IT expertise to support research. Important questions are:
- How to solve this issue in a tractable way.
- How to ensure solutions are scalable and sustainable.
As for Process, separate IT infrastructures and “one size fits all” approaches to security and compliance cause the most pain for most groups. Important questions to ask here are:
- How to keep SoM/SoN whole.
- How to balance risks of the current security and compliance requirements.
Technology findings encompass the need to expand and tune services, devise new storage solutions, and clarify solutions. Questions are:
- How to bridge different OIT and DHTS offerings.
- How to adapt OIT services to meet needs.
It is important to note that findings and recommendations are interdependent.
Tracy continues, saying that Phase 2 is underway. The goal in December is to identify sponsors and 6 working groups (1 working group per finding). In January, the working groups will be formally launched. The working groups will meet to develop solutions from February through March or April. This stage will include faculty and sponsor check-ins to validate solutions.
Tracy then addresses what is next in this process for ITAC:
- Help disseminate the report and raise awareness
- Identify 1 ITAC faculty member for each working group as a liaison
- Suggest others who should be part of Phase 2
Q. Steffen Bass – In my last 20 years as faculty, not a month has gone by when I didn’t talk to IT, but I rarely speak to domain-specific librarians. The need for resources has shifted; the role of the library is not as important now that most subscriptions are electronic.
A. Tracy – I hear this as 2 things. We must address what the relevant IT resources in the library are, and be sure those are well communicated. The new librarian is very responsive. Joel Herndon, whose group (Duke CDVS) will be presenting today, is a fantastic resource. Some of our work here may be to knit things together as opposed to creating a whole new set of services.
A. Sunshine Hillygus, of the Social Science Research Institute (SSRI), says the department works with the library but not everything is covered; there are gaps.
A. Colin Rundel – has used Joel’s group but discovered it in an ad-hoc way. There are different efforts in different places and Colin doesn’t know where to tell people to go because there is no centralized place to inquire.
A. Tracy – Yes, communication is needed. A faculty liaison would be helpful to connect people to resources.
A. JoAnne Van Tuyl – With subject librarians, I always know where to go.
Q. Sunshine – Is someone collecting a list of what already exists?
A. Tracy – Yes.
4:15 – 4:45pm: Research Data Management and Curation at Duke – Joel Herndon (15-minute presentation, 15-minute discussion)
What it is: Duke Libraries Center for Data and Visualization Sciences partners with the Duke research community to provide support for data research, tools, and methods. Through recommendations by a faculty working group and funded by the Provost, the Libraries have significantly expanded support for data management and curation through new staffing, a public access data repository, and new campus partnerships.
Why it’s relevant: Today’s presentation focuses on Duke's data repository and curation efforts with an eye toward the:
- NIH Data Management and Sharing Policy (January 25, 2023)
- OSTP 2022 "Nelson" Memo (deadlines varying by agency through 2025)
- Journal data sharing policies for many academic publishers
We are interested in hearing your impressions of the data sharing landscape in your disciplines and the opportunities and challenges of publishing/sharing research data at Duke.
Joel Herndon is the director of the Duke Libraries Center for Data and Visualization Sciences (CDVS). Joel introduces the Research Data Management consultants, Sophia Lafferty-Hess and Jen Darragh. The CDVS provides multiple programs supporting data-driven research including:
- Data Science
- GIS and Mapping
- Data Visualization
- Data Management
Joel’s team has 8 staff members who partner with other data departments across campus and Duke Health. Joel’s department has seen a large increase in questions from the Duke Health side.
Joel reviews The History of Data Sharing Policies at Duke:
- 2011 - NSF DMP requirement
- 2013 - OSTP Memo on Public Access to Research
- 2014 - PLOS data sharing policy
- 2018 - ICMJE data sharing requirement
- 2023 - NIH data management & sharing policy
- 2025 - OSTP memo on Free, Immediate, & Equitable Access to Research
Sophia Lafferty-Hess speaks to the data-sharing landscape becoming more complex with many repository options that fit different researcher needs and funder requirements. The goal is to help researchers find the best repository for their needs.
The Duke Research Data Repository (DRDR) was launched with the mission to curate, publish, and archive Duke digital research data from any discipline. The Duke RDR provides long-term public access to support research transparency and reproducibility, and aims to foster new discoveries.
Sophia outlines the requirements for Duke RDR data:
- Data containing NO sensitive or restricted information
- Data you have the rights to disseminate
- Data in a final-publication-ready form
- Datasets up to 300 GB
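The four requirements above can be read as a simple pre-deposit checklist. The following is a minimal sketch of that checklist, not a tool the RDR provides: the `Dataset` fields and the `rdr_eligibility_issues` function are hypothetical names introduced for illustration, and treating 300 GB as 300 × 10^9 bytes is an assumption.

```python
from dataclasses import dataclass

# 300 GB limit as stated in the minutes; decimal gigabytes is an assumption.
MAX_DATASET_BYTES = 300 * 10**9


@dataclass
class Dataset:
    """Hypothetical description of a dataset a researcher wants to deposit."""
    name: str
    size_bytes: int
    has_sensitive_data: bool
    depositor_has_rights: bool
    is_publication_ready: bool


def rdr_eligibility_issues(ds: Dataset) -> list[str]:
    """Return the listed requirements the dataset fails (empty list means all are met)."""
    issues = []
    if ds.has_sensitive_data:
        issues.append("contains sensitive or restricted information")
    if not ds.depositor_has_rights:
        issues.append("depositor lacks the rights to disseminate the data")
    if not ds.is_publication_ready:
        issues.append("data is not in a final, publication-ready form")
    if ds.size_bytes > MAX_DATASET_BYTES:
        issues.append("dataset exceeds the 300 GB size limit")
    return issues


# Illustrative values only; not real datasets.
ok = Dataset("survey-2022", 5 * 10**9, False, True, True)
too_big = Dataset("imaging-raw", 450 * 10**9, False, True, True)
```

Calling `rdr_eligibility_issues(ok)` returns an empty list, while `rdr_eligibility_issues(too_big)` returns a single entry about the size limit. In practice, the sensitivity and rights questions are judgment calls made with the curators rather than boolean flags.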
Sophia explains the process the researcher would take to use RDR:
- The researcher has data on local storage
- The researcher makes a publication request through the RDR website
- A shared Box folder is automatically created for data upload
- Data is uploaded to permanent storage
- Discovery and access are through the RDR website and direct download is provided to the researcher
Data curation involves:
- Ensuring files are open
- Checking documentation for completeness
- Assessing file formats
- Performing disclosure risk review
- Identifying other data and code enhancements
Sophia continues by saying that Joel’s team is a member of the Data Curation Network, a trusted, community-led network of curators advancing open research by making data ethical, reusable, and better. Making data curation FAIR (Findable, Accessible, Interoperable, Reusable) is the goal.
Q. Colin Rundel – A major concern is making research reproducible. Along these lines, a fundamental piece seems to be missing: a place for raw research data. Researchers are being asked to keep raw data along with the final published paper. Integrating this into the process would be helpful.
A. Sophia – We would definitely take the data as long as there are no identifiers. Sometimes we are asked to save code files.
Q. David MacAlpine – It would be great if someone could help fit data into the data forms, say the top 10 data forms.
A. Sophia – We can help with metadata. We can help you think through how to process data.
A. Tim McGeary – The library also has a metadata architect.
Q. Robert Wolpert – Maybe 20 years from now, there will be a need to run an MS-DOS script. Is there a way to use containers for needed software?
A. Sophia – We are looking into archiving containers as well.
Jen Darragh continues the presentation by outlining the desirable characteristics of data repositories:
- Unique persistent identifiers
- Long-term sustainability
- Curation and quality assurance
- Free and easy access
- Broad and measured reuse
- Clear use guidance
- Security and integrity
- Common format
- Retention period
Jen also reviews the RDR data management plan agreement, which specifies that data will be stored with Dublin Core metadata and Digital Object Identifiers (DOIs) for long-term preservation. The RDR was launched in 2017 and its use has been growing exponentially. The RDR now hosts 218 datasets. Workshops on data sharing are also offered; 2,667 attendees have participated in these workshops so far. With Covid and the use of Zoom, workshop participation has greatly expanded.
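To give a sense of what a Dublin Core description with a DOI looks like, the sketch below builds a small record using element names from the Dublin Core Metadata Element Set (title, creator, identifier, date, rights). It is an illustration only: the `dublin_core_record` helper, the `record` wrapper element, and every field value (including the DOI) are placeholders, and the RDR's actual metadata schema may differ.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialize a mapping of Dublin Core element names to values as XML.

    The <record> wrapper element is an assumption made for this sketch.
    """
    root = ET.Element("record")
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; the DOI below is not a real identifier.
example = dublin_core_record({
    "title": "Example dataset (placeholder)",
    "creator": "Researcher, Example",
    "identifier": "https://doi.org/10.0000/placeholder",
    "date": "2022",
    "rights": "CC0 1.0",
})
print(example)
```

The persistent identifier travels with the descriptive metadata, so a citation to the DOI continues to resolve even if the dataset's storage location changes.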
Jen says integrations have been created to make stored data visible and to make data storage scalable (with the use of Globus). Partnerships include:
- Duke University Research Computing
- Duke Office of Scientific Integrity
- The Data Curation Network
Additional services provided by CDVS are:
- Identifying external repository options
- Reviewing data management and sharing plans
- Reviewing consent forms for data-sharing language
- Consulting on data management practices and resources during a project
- Training groups/labs on data management, sharing, and curation
Contact information for CDVS is:
Q. Sunshine Hillygus – The data management support and the resources provided are terrific. The pain points are the additional layers of oversight. How does this fit in with all the other groups making recommendations? Researchers need help navigating the data verse and are struggling with the many groups involved in overseeing data management.
A. Jen Darragh – All of these various groups have been meeting and we are looking into having one place that will point researchers to each place they need to go. So this is being discussed.
A. Tim McGeary – Consent can be written one way but might not be enough for publication, which must then be communicated back up the chain. We are having conversations and trying to bring transparency to the process.
A. Joel Herndon – We are looking at the realities of academic file sharing. For example, what is the cost to researchers and to others involved with the oversight process and what is the benefit?
Q. Sunshine – How do we define sensitive? This is an issue.
A. Jen – Things can be deidentified but still might not meet oversight needs.
Q. Sunshine – There are so many layers and then, sometimes there are disagreements over what is needed.
Q. Steffen Bass – Every item in this presentation, I associate with Research Computing. It would not dawn on me to look to the library for this.
Q. John Board – How scalable is what you are offering? How much more can you take on?
A. Sophia – We had 150 unique depositors this year and took in 50 datasets, half of which were new. We are seeing a lot of new people, but we have the capacity for more and so we are doing outreach.
A. Tim McGeary – Right now, we have a 300 GB size limit; we are talking about increasing this limit. This is a fundraising campaign goal, and the libraries are leading in this area.
Q. Sunshine – Many grad students and faculty leave the institution after a time, so what is the advantage? Will Duke host for people who leave?
A. Charley Kneifel – Even if students and faculty leave, their data will remain and continue to be accessible.
A. Sophia – Sometimes repositories stop working, so people like having their data stewarded by Duke University. Libraries have always stewarded and preserved data.
4:45 – 5:15 pm: End of year celebration!