ITAC Agenda: October 12, 2023

 

Rubenstein 153

 

4:00 - 4:05pm: Announcements and Approval of 09/14/2023 minutes (5 minutes)

 

Paul Jaskot: Welcome. Approval of 9/14/23 minutes.
Announcements:  AI Exhibit at Nasher.

 

4:05 – 4:45pm: DukeHub Updates – Chris Derickson (30-minute presentation, 10 minutes Q&A)

 

What it is: DukeHub is the user interface to Duke’s Student Information System, which is the system of record for students’ academic progress (courses, grades, degrees, etc).


Why it’s relevant: DukeHub continues to evolve and improve, especially after the 2.0 upgrade. ITAC has heard in the past about renovations first to the student experience and later to the faculty experience in DukeHub; it is now the turn of the Department Center to get a refresh and be modernized to fit the new DukeHub look and feel.  We will hear about plans for these and other ongoing improvements to this critical system.

 

Chris Derickson:

 

Slide presentation

 

Three years ago the DukeHub user experience was updated, first focused on the student experience, which we’ve continued to improve upon. The advisor experience has been improved upon, and we’ve also made some tweaks to the faculty experience and Department Center.

 

Faculty experience improvements included the addition of section numbers to the view, and class and grade rosters have been improved. There is still work to be done with some of the gaps for grade rosters, such as midterm grades.


The advising experience required a lot of improvement. Notably, the “Act as User” feature was inefficient, requiring multiple clicks, and drew comments about it being “creepy.”  The information presented wasn’t what advisors needed to do their jobs.


We initially submitted vendor enhancement requests; however, Duke has a highly customized experience, and some of the things we needed were not going to be implemented for a base product. In order to make improvements, we built our own pages, with pilot groups and programmers at the same table, and with direct input from advisors.

 

We had multiple rounds of updates during the beta testing. The OIT-SISS partnership has been great and our approach has established a model for future success.

 

Summary of ongoing DukeHub-related projects

  • Incentivizing student course evaluation response rates
  • Grad tracking tool: looking at ways to replicate or replace T3.  We are working directly with the Graduate School on this now.
  • Canvas: integration with DukeHub for grading to replace Sakai Connect.
  • Electronic Grade Change Process: a straightforward customization should be in place by the end of the semester. Note: we cannot change vendor pages, so we have been thinking about creating our own Faculty Hub. For grade roster changes, we’ve submitted a ticket with the vendor, but we might end up building that ourselves as well.
  • Student-specific permissions to replace permission numbers: we need a way for departments to be able to handle this process.
  • Slate’s Student Success tool replacing an older solution for Bass Connections.
  • Stellic, a degree-auditing tool, is rolling out to undergraduates; 7 of the 10 schools will be using it by the end of the year.  It will complement the grad-tracking tool mentioned earlier.

 

Department Center: We would like to improve the campus experience and do what we did with Advising Center, which is to pull together a group of people to participate in focus groups and talk about enhancement requests.

 

We would like to use the same process with the Faculty Hub.  Please e-mail Chris Derickson if you know others who might be interested. These will be challenging but fun projects.

 

Department Center - sampling of the recent maintenance completed

  • Display only active students (no longer show LOA)
  • Mass update the “Grant advisor ability to release advising hold”
  • Changed the “authorized to release advising hold” symbol from check to YES/NO
  • New reports include:

             - Expected grads multiple plans
             - Expected grads by plan; added the degree/diploma name
             - Exam schedule by term
             - Graduates by academic year
             - All active academic plan codes
             - Unavailable catalog numbers

 

To give you an idea of what the SISS department is working on, here is a Smartsheet dashboard view of our SISS Student Projects roadmap (shown on slide)

 

 

 

Finally, we want to form a more active and participatory advisory committee for SISS with different campus representatives. We’d like to feed that into a roadmap of how we plan our activities.  We hope to roll that out soon. 

 

Any questions?

 

John Board:  The Advising Hub update will make a big difference for me, especially with the new challenges of teaching a large group of students this year.

 

Mark Palmeri: The feedback button has been great and the response time has been stellar.

The information the registrar sends out regarding exams included an e-mail with 5 steps to follow and screenshots, which is not intuitive.  The screenshots they showed are not what I saw.  That was the frustration point. I’m not sure why I have to keep checking for exams every semester, and the rooms they are assigned to. It took way more time than it probably should have.


Chris Derickson: Agreed.

 

Arnov Jindal: Can you increase timeout for inactivity in DukeHub?

 

Chris Derickson: We did increase it not too long ago, but I’ll confirm.

 

Andrew Li:  When we log in to DukeHub or Sakai and then press the back button, it makes us log in again.  This happens on both the Sakai site and DukeHub, but primarily Sakai.

 

Charley Kneifel: Recently, like in the last month?

 

Andrew Li: I don’t think so.

 

Charley Kneifel: There was a lot of work done over the summer for everything that uses Shibboleth single sign-on. If you are still experiencing this, please let us know.

 

Colin Rundel: Is there a time frame for integrating Canvas Gradebook?

 

Chris Derickson: The goal is this semester.  OIT has 2 of their best programmers on it. I’m confident. 

 

If anybody has any questions please e-mail Chris Derickson.


Paul Jaskot: Are there ongoing conversations about fixing the sidebar option?

Chris Derickson:  Yes.

 

4:45 – 5:05pm: FAST Storage – Charley Kneifel, John Board and Prof. Mark Palmeri (10-minute presentation, 10 minutes Q&A)

 

What it is: Charley Kneifel, Duke’s CTO, is Principal Investigator of an NSF-funded project entitled Flexible Affordable Scalable Technology for Research Storage (FAST-Research Storage).  This project exploits a modern, open source file system and commodity hardware to deliver an integrated system of storage solutions that allow research data to migrate from “hot fast” to “cold slow” storage with variable steps in between as a research project proceeds through its lifecycle.  Cost estimations from early test cases are very promising.

 

Why it’s relevant: Our recent Research Needs assessment identified challenges with storage for research data as one of our faculty’s greatest pain points; FAST is a promising avenue for addressing some of those challenges in performant and cost-effective ways.  The first “real users” are trying out the system now, and so an update to ITAC is timely.


Charley Kneifel:

 

Slide presentation

 

FAST Research Storage:

  • Flexible Affordable Scalable Technology for Research Storage
  • NSF funded
  • About 5 PB raw capacity
  • Runs Ceph, an open-source storage system (ceph.io)
  • Use of commodity hardware
  • 25G ethernet
  • https://sites.duke.edu/FASTResearchStorage

 

It uses object storage, so data can be pulled down from anywhere.  CephFS is the client file system, with performance better than NFS. (It’s only been problematic when running on machines that aren’t managed by central IT.) The use of commodity hardware lets us easily change the mix of hardware.  We can also tune the workloads.

 

We are looking into a cost model in which a researcher could pay for one of the nodes and we would manage the node like we do with the Duke Compute Cluster (DCC).  Your “capital purchase” would turn into 5-7 years of storage.  I like to think of this as a “condo model” with a good HOA.

 

Tools to support data lifecycle management include the creation of initial data set tags (who/what/when/where/why) and migration of data between states. You can add metadata as you go along, and you can find data sets with searches.  We can do this for both local and cloud-based storage.
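
As a rough illustration of the tagging and search idea described above, here is a minimal Python sketch. It is not the FAST project's tooling; the `DatasetTags` class, the `find_datasets` helper, and the sample catalog entries are all hypothetical names and values chosen only to mirror the who/what/when/where/why tags.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetTags:
    """Hypothetical who/what/when/where/why tags for one data set."""
    who: str
    what: str
    when: str   # ISO date string, e.g. "2023-10-12"
    where: str
    why: str
    extra: dict = field(default_factory=dict)  # metadata added as you go along


def find_datasets(catalog, **criteria):
    """Return the sorted names of data sets whose tags match every criterion."""
    matches = []
    for name, tags in catalog.items():
        values = {
            "who": tags.who,
            "what": tags.what,
            "when": tags.when,
            "where": tags.where,
            "why": tags.why,
        }
        values.update(tags.extra)
        if all(values.get(key) == wanted for key, wanted in criteria.items()):
            matches.append(name)
    return sorted(matches)


# Invented sample entries, for illustration only.
catalog = {
    "scan-001": DatasetTags("lab-a", "scans", "2023-09-01", "local", "pilot study"),
    "scan-002": DatasetTags("lab-a", "scans", "2023-10-01", "cloud", "follow-up"),
}
catalog["scan-002"].extra["instrument"] = "probe-7"  # tag added later
```

With this catalog, `find_datasets(catalog, who="lab-a")` would return both data sets, while `find_datasets(catalog, where="cloud")` would return only the second.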

 

You’ll have a configuration for high-performance, designed to be shared. You can move to short-term NAS, to long-term object storage. It’s easy to share with people outside, because you aren’t sharing an smb or nfs mount.
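
The progression from high-performance storage to short-term NAS to long-term object storage could be sketched, very roughly, as a rule based on how long data has sat idle. This is only an illustration under stated assumptions: the tier names follow the description above, but the `suggest_tier` helper and its 30-day and 180-day thresholds are hypothetical and are not taken from the FAST project.

```python
from datetime import date

# Tiers ordered from hot to cold, following the progression described above.
TIERS = ["high-performance", "short-term NAS", "long-term object storage"]


def suggest_tier(last_accessed, today, nas_after_days=30, object_after_days=180):
    """Suggest a tier from the number of days since the data was last accessed.

    The thresholds are hypothetical defaults, not project policy.
    """
    idle_days = (today - last_accessed).days
    if idle_days >= object_after_days:
        return TIERS[2]
    if idle_days >= nas_after_days:
        return TIERS[1]
    return TIERS[0]
```

A real policy would likely weigh more than idle time (project status, funding period, sharing needs), which is part of why the workflow discussion below matters.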

 

John Board: Faculty tend to overestimate the “hotness” of their data; for example, feeling that you need to run an intensive analysis of your data right now even though it was gathered three weeks ago and you are still waiting for the grad student to finish the project--

 

Charley Kneifel: Our data is very cold. Lots of data, not often accessed.

 

John Board: We all collectively can be saving a lot of money, automating to the greatest yield possible. It’s cost-saving.

 

Charley Kneifel: One of the outcomes of the IT Needs Assessment was storage beyond the end of the grant; we are in a much better position for storage beyond the end of grant than we would be without these tools.

 

Colin Rundel: How do you account for that? In this model, a faculty member buys a node, but the hardware is heterogeneous.

 

Charley Kneifel: In this case it’s all the same. We have a base config, for example, “this is what you would get for the middle tier,” and we provide that for the lifecycle.  If we look at data beyond the end of the funding, we might push a copy into the cloud and then pay for it, or push it into an onsite cloud if other people want to use it.

 

Mark Palmeri: This may require a rethinking on the researcher end on how they interact with the storage.  Our workflow is “we need to work with it locally;” the interactive back-and-forth is really important.  We have lots of local SSDs and scratch space. We might end up having a hybrid model that includes locally mounted file systems along with the big group-wide storage as things go colder.

 

Charley Kneifel: Great point. We want to offer flexibility.

 

SuitcaseCTL is a toolset we created that packages up your files and creates metadata.

https://sites.duke.edu/fastresearchstorage/suitcasectl

 

John Board: Metadata can be automated; that gives me hope that there is a taxonomy that someone can decode in the future. Many of our sub-disciplines don’t have an agreed-upon taxonomy within their field.  Data can be discoverable.

 

Mark Palmeri: The beautiful thing here is that it uses open-source protocols, so you aren’t tied to any proprietary, vendor-based archiving format. To John’s point about metadata, I think the “pie in the sky” use case for this would be the Google inbox approach, where you can tag and search with a flat inbox structure. But use of this tool still requires a thoughtful partitioning of projects. As more people use this, I would love to see the schemes people use to make it more efficient.

 

Charley Kneifel: When we push data into the Cloud, we keep a copy of metadata on prem, so we can do local searching, and it includes the file manifest.
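
To illustrate why a locally kept file manifest makes searching possible without retrieving the data itself, here is a minimal Python sketch. It does not describe how SuitcaseCTL or the FAST tooling builds its manifests; the `build_manifest` and `search_manifest` helpers and the path/size/sha256 fields are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Return a list of {path, size, sha256} entries for every file under root."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file whole for brevity; very large files would be
            # better hashed in chunks.
            data = path.read_bytes()
            entries.append({
                "path": path.relative_to(root).as_posix(),
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    return entries


def search_manifest(manifest, text):
    """Find entries whose path contains the given text, using only the manifest."""
    return [entry for entry in manifest if text in entry["path"]]
```

The manifest is small compared with the data it describes, so it can stay on local storage and be searched even after the files themselves have moved elsewhere. The checksums also allow a retrieved copy to be verified later.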

 

David MacAlpine: The School of Medicine tried this approach, with object storage, but it turns out it was missing features, for example, with indexing.  Are we still limited?

 

Charley Kneifel: Part of it goes back to the idea that you need to think about how you structure your workflow.  I don’t want this to become a “data motel,” where it checks in but doesn’t check out.  We’ve seen schools scrambling because they had Google Drive, and now they are charging for Google Drive.

 

John Board: The early financial numbers from this approach vs. the standard storage commodity approach from big-name vendors are spectacular. 

 

Charley Kneifel:  60 percent cheaper.

 

John Board: We started this before the outcome of any of our IT research needs assessments.

 

Charley Kneifel: We started this because NSF had a solicitation for storage, and we said “this is what we need.”

 

John Board: We started this independently, but it is smack in the middle of the challenges faculty noted.

 

Steffan Bass: Am I correct in assuming that you can scale to infinity, in both size and speed?


Charley Kneifel: You can tune it up to do different things. Ultimately the limit is whatever drive or SSD path is absorbing it.

 

John Board: It has a lot of knobs! 

 

Colin Rundel:  How complicated is that tuning, then?  If I came with a certain workflow, or a certain need?  How many man hours?

 

Charley Kneifel: In the beginning it takes a little longer. It’s an investment. We can localize it to particular directories. There’s a lot of flexibility.

 

Colin Rundel: Some effort to the average user. It’s a slightly different thought process than just a drive you mount.

 

Charley Kneifel:  It can just be a drive you mount, but I don’t think you get the full benefit of the lifecycle.


Charley Kneifel:  Maybe a basic setup so onboarding can be a little more feasible.


John Board:  And that’s one of the outcomes of the grant. We presume that 80 to 90 percent will fit in a small number of buckets.

 

David MacAlpine:  Will there be common workflows, for Genomics, and sequencing?

 

Charley Kneifel: I think there will be common workflows that will help with that. But part of that will be very dependent on what you do with the data after it’s generated.

 

Mark Palmeri:  The jobs are the same, but the thought process on how you deploy things to scale…some people would just rather have that one machine sit there and run for a week. With a lab, doing it on a group level, it almost becomes a social engineering task.

 

Charley Kneifel: It’s the “lab lore.” Who wrote something in a notebook that you use because it works? If I think about an object store, there are models that allow you to grow and scale beyond what you can do on campus.

 

Steffan Bass:  Will we have interfaces or conduits to these remote computing facilities (like Open Science Grid) for data exchange?

 

Charley Kneifel: We should already have that in reasonable ways in the services we provide there.

 

5:05 – 5:15pm: Updates from the Latest CSG meeting – John Board & Mark McCahill (10-minute presentation)

 

What it is: The Common Solutions Group (CSG) is comprised of a small set of research universities that participate regularly in meetings and project work. These universities are the CSG members; they are characterized by strategic technical vision, strong leadership, and the ability and willingness to adopt common solutions on their campuses.

 

Why it’s relevant: At CSG meetings, the members engage in detailed, interactive discussions of strategic technical and policy issues affecting research-university IT across time. There are updates from the most recent meeting to share with ITAC.

 

John Board:  Common Solutions Group (CSG) is a group of 30-some research universities.  We meet to talk about our common problems. The agenda had a few topics.

Mark McCahill:

  • Classroom use of AI/ML was one of the topics.  The question came up about using ChatGPT for coding and whether coding from scratch was that important, or whether something else is important.  Beginner programmers actually did better with ChatGPT because it gave them more confidence.
  • The Copilot tools have some risks. If you have GitHub Copilot assisting with code, there is potential for sensitive data disclosure.
  • University of Delaware is putting together a walled-garden internal AI model for some of their courses for 20 years of captured content.
  • Harvard started putting together a tool that would give access to multiple different models.
  • Faculty panel: There was a discussion about companies like Google and the resources available to their employees.

John Board: At Google, if you’re an AI scientist, they give you unlimited resources.  If you’re an AI researcher at a university, how can you make substantive contributions to the field without access to those resources? Typically, this involves partnership with industry, which raises its own set of questions.

There were also discussions of how IT folks come to understand the business of the units they support.

On Thursday we discussed Shibboleth, which is the core of our identity management system here. Shibboleth is long in the tooth and the number of people who really understand its guts might be countable on two hands at this point. This has created both opportunity and angst in the identity management world about what is the future of managed identity. 

Mark McCahill: There is a drive to use cloud providers for identity management instead of running identity management infrastructure on prem.  The “upgrade fatigue” associated with on-prem management has pushed some people towards the cloud, but there is concern that using the cloud may create a “take-it-or-leave-it” scenario with a lack of flexibility.

John Board: One discussion was about how universities are having to comply with government requirements for managing secure research enclaves.

The very last discussion was called “cover your assets” and was about challenges universities have in tracking assets. One of our peers had a large-scale laptop theft, which cost millions of dollars. At Duke we have in-house tools, like Planisphere, that give us a huge leg up.

Paul Jaskot: Thank you, everyone.