Duke ITAC - March 10, 2022 Minutes

Agenda – March 10th, 2022

 

4:00 - 4:05 p.m. - Announcements (5 minutes)

 

Robert Wolpert is filling in for David MacAlpine. It is spring break.

 

4:05 - 4:35 p.m. - DCRI - Crucible App Platform (CAP), David Burdick (20 minute presentation, 10 minute discussion)

 

What it is: The Crucible App Platform is being introduced as a new and streamlined method to deploy and manage stateless applications.

 

Why it’s relevant: This new platform will be secure by default, cloud native, utilize infrastructure as code, and ultimately serve as a much-improved avenue for the deployment and management of stateless applications.

 

David Burdick introduces himself as Chief Architect for the Duke Clinical Research Institute (DCRI) and Senior Principal Architect at Duke Health. The goal of the Crucible App Platform (CAP) is to provide a building block service with containerized applications for deploying services in a simple and efficient manner. The model is similar to Heroku, which made it easy for developers to get things up and running quickly.

CAP includes:

·      Building blocks for web apps

·      Modular Kubernetes (K8s) & cloud PaaS stacks

·      Consistent security design

·      Enforcement of best practices

·      Unified monitoring

·      Multi-group support

·      Sensitive data and Protected Health Information (PHI) provisions

·      GitOps deployments where all code and configuration for deployments is captured in Git

The business problems that CAP seeks to address are:

·      High demand for secure web apps

·      Difficult to engineer private resources

·      Separate roles for Devs and Ops

·      Certificate management being poorly handled

·      Need for an audit trail of changes

·      Low agility in normal DHTS Demand/Bespoke Build process

·      High maintenance on Bespoke build

The entire CAP platform is built in Terraform. Terraform provides a configuration language (HashiCorp Configuration Language, HCL) through which the system calls the cloud vendor’s API. When dealing with protected data and external vendors, we can run a vendor’s software on their behalf against our sensitive data instead of sending our data to them.

Terraform works with many cloud platforms, including Amazon Web Services, Azure, and Google Cloud. App developers can build a database, tear it down, and stand it back up using Terraform programs. Anyone, including developers and external contractors, can use these programs.
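The build, tear-down, and rebuild cycle described above can be sketched as a short Python wrapper around the Terraform CLI. This is only an illustration and not part of the presentation: the directory name `envs/dev-database` and the function names are hypothetical, and the directory is assumed to already contain Terraform configuration for the database. The wrapper defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from pathlib import Path


def terraform_cycle(workdir: str) -> list[list[str]]:
    """Return the Terraform CLI calls for one build / tear-down / rebuild cycle."""
    # -chdir tells Terraform which configuration directory to operate on.
    base = ["terraform", f"-chdir={Path(workdir)}"]
    return [
        base + ["init", "-input=false"],
        base + ["apply", "-auto-approve", "-input=false"],    # build the database
        base + ["destroy", "-auto-approve", "-input=false"],  # tear it down
        base + ["apply", "-auto-approve", "-input=false"],    # stand it back up
    ]


def run_cycle(workdir: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for argv in terraform_cycle(workdir):
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical directory name; dry run only, nothing is created or destroyed.
    run_cycle("envs/dev-database")
```

Because the same configuration drives every step, whoever runs it (a developer or an external contractor) gets the same result, which is the point made in the presentation.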

David provides another example of how CAP increases efficiency. The Information Security Office (ISO) can sign off once on the networking and engineering for all web apps, so security signoff does not have to be revisited every time a new app is deployed. Application-specific information security review (pen testing, etc.) is still an ongoing requirement.

David then presents diagrams of the CAP architecture, which uses an Azure hub-and-spoke topology with Terraform and Kubernetes. End-to-end encryption is also provided, along with a proxy layer for communication.

Q. John Board – What about the idea of not getting locked into specific cloud vendors? What is your practice, and do you have insights into this?

A. David Burdick – Our goal was not to be cloud-agnostic. We do lean on DHTS for networking and security controls. HIPAA management groups are overlaid onto subscriptions. But CAP has been built for Azure and significant resources would be needed to rebuild it.

Q. John Board – How has the experience of the developers been in comparison with Heroku?

A. David Burdick – It has not been quite as easy as Heroku, but the platform is more secure. Developers need to learn more about provisioning their app but once the app is containerized, the app can be reused over and over.

Q. Colin Rundel – What about readiness for use and adoption rates for CAP?

A. David Burdick – So far CAP is not a generally available service offering but a blueprint. CAP is being used for compliant applications at present.

Q. David Burdick – What is the university doing for containers?

A. Charley Kneifel – OKD4 and OpenShift. Drupal 9 and Stevedore were built here, although Stevedore was retired in the last couple of days. Mark can talk about static containers.

A. Mark DeLong – OIT has 2500-2800 containers using GitLab. We’re doing some Docker Compose where it looks like Kubernetes would be overkill. Then, containers for coursework are almost entirely in Azure and about 50 of these are on campus.

Q. Robert Wolpert – So it doesn’t have to be Azure?

A. Charley Kneifel – No, but we use Azure because we use Microsoft. We can point students to a container and there’s no reason for the traffic to come through campus.

Q. Robert Wolpert – Will the choice of static containers cause problems?

A. Mark DeLong – In some cases, this is an advantage. If a student runs code with no limits, it can blow things up and we know students will do bad things occasionally. If you have something that dynamically sizes and the student writes some bad code, then well, you have to set limits, and that's the real art of this kind of thing.

Q. Colin Rundel – I am a big fan of containers. Having existing Terraform scripts for creating databases seems valuable. Having existing best practices as far as config and logging in, etc. would be greatly appreciated.

A. Kris K Hamilton – Our goal is to provide building blocks so that staff can take steps knowing that the steps have been stamped with approval. This is especially important on our Duke Health side with all the regulatory rigor. This type of oversight in place of audit has many advantages. The hope is to have these available building blocks so that you don’t have to build something exotic as well.

Q. Matthew Hirschey – I work part of my time in the School of Medicine and I’m wondering what a typical use case looks like. Does a PI come to you and say I have some data that I want to spin up in a container and make publicly available? Or is it the Centers and Institutes that come with data or app needs?

A. David Burdick – We are expanding more into digital health trials for the School of Medicine and the Cancer Center. For example, we’ve done some data marts where we pull data from multiple data repositories, transform the data into the FHIR standard, and then provide a visualization in a custom web app or Tableau. Then, there are simple CRUD apps like governance credit apps where you’re managing the state of some workflow. Another one that’s running is the Snowball application, which does respondent-driven sampling for Covid. We are sending out coupons to people in the community to get Covid tested. That’s a mixture of REDCap and a custom-built application.

Q. Matthew Hirschey – What is your cost recovery model?

A. David Burdick – We have a kind of retail rate for external users. We look at the multiple tenants and the specific tagged resources and we charge directly back. Then, if they have their own DevOps, that’s great. Otherwise, we charge back for our team’s time.
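The chargeback arithmetic described in this answer can be sketched in a few lines of Python. This is an illustration only: the tag name, tenant names, costs, hours, and hourly rate below are all hypothetical and do not come from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CostLine:
    tenant: str      # value of a hypothetical "tenant" tag on the cloud resource
    resource: str
    cost_usd: float


def chargeback(lines, team_hours=None, hourly_rate_usd=0.0):
    """Sum tagged resource costs per tenant, then add platform-team time if any."""
    totals = defaultdict(float)
    for line in lines:
        totals[line.tenant] += line.cost_usd
    # Tenants with their own DevOps staff simply have no entry in team_hours.
    for tenant, hours in (team_hours or {}).items():
        totals[tenant] += hours * hourly_rate_usd
    return {tenant: round(total, 2) for tenant, total in totals.items()}


# Made-up figures for two tenants; only "study-b" uses platform-team time.
usage = [
    CostLine("study-a", "postgres", 120.0),
    CostLine("study-a", "app-service", 80.0),
    CostLine("study-b", "storage", 40.0),
]
bill = chargeback(usage, team_hours={"study-b": 3}, hourly_rate_usd=100.0)
print(bill)
```

With these made-up numbers, `study-a` is billed only for its tagged resources, while `study-b` is billed for its resources plus three hours of team time.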

4:35 - 5:05 p.m. - Introduction to the Duke Center for Computational Thinking, Matthew Hirschey (20 minutes presentation, 10 minutes discussion)

 

What it is: The Duke Center for Computational Thinking is a developing resource for the Duke Community focusing on innovating computing majors and minors, infusing computational thinking across programs of study, enriching co-curricular opportunities, and bringing computational literacy to all.

 

Why it’s relevant: The Center is part of the Provost’s strategic initiative to enable broad computational education across Duke and beyond. Matthew Hirschey will be joining us to introduce the Center, speak on their goals, current initiatives, and future plans.

Matthew Hirschey introduces himself as an associate professor in the Department of Medicine, which is in the Carmichael Building across from the Power House where OIT is located. The other half of his time is spent directing the Duke Center for Computational Thinking, which was born of the vision of the Provost, Sally Kornbluth, the Dean of Faculty, Valerie Ashby, the Pratt Dean, and VP for Information Technology and CIO, Tracy Futhey. This group saw the need for both technical and Arts and Sciences students to have access to technical and applied computational training. There are many efforts across Duke to address this need.

Matthew was trained as a chemist. He then worked in the fields of biochemistry and physiology, which led to his need for bioinformatics (data science and statistics), which in turn led to the realization that every student needs to know this. Matthew taught classes and worked with Larry Carin, who has since moved on to be Provost for the King Abdullah University of Science and Technology.

Matthew defines computational thinking as “a creative and intellectual process that applies computational tools to ideas, challenges, and opportunities relevant to the betterment of society.” The mission of the Duke Center for Computational Thinking (CCT) is to enable computational education at Duke and beyond. The aim is to try to get more people to learn computation and computational thinking in an increasingly digital and complex world and for all Duke students to have digital literacy whether they have a technical major or are Arts and Sciences students. The goal is for CCT to reach all students. Matthew outlines four pillars to support this goal:

 

·      Pillar 1 – Computing majors – innovative, cutting-edge, research-based.

·      Pillar 2 – Every major – embedded computing, computational thinking, ubiquitous computing (e.g., examining text) for non-tech majors.

·      Pillar 3 – Any student (any time) – co-curricular, Data+, DTech scholars, coding and data science, on-demand, synchronous and asynchronous.

·      Pillar 4 – Every student – digital citizenship, digital intelligence certificate (University-wide course cuts across all departments: a class was provided on race that any student could enroll in; next year, the university-wide course will be on climate.) Digital citizenship = digital literacy.

This year, 30 faculty and staff have been working toward these pillars. Many experiments are running with Code+, a new fellowship, and various computing institutes. CCT is always open to and eager for engagement. You can join the group, which meets once per month, or you can participate by creating content.

Q. John Board – At ECE, we view one promise of CCT to be level setting. Not everyone knows everything coming into the various programs. There used to be only one path, so we are optimistic about skill-up modules with specific purposes as well as for fun. What is the production schedule and what is the capacity of CCT?

A. Matthew Hirschey – Well, no one takes the Git module for fun, just for level setting! CCT can set up a system for anyone to create a module; this will be a collaboration between CCT and Learning Innovation (LI). The idea is to take a one-hour lecture and convert it into a learning object that would form a chunk like an Intro to Git class. Some students might need an Intro to Git and others might just need a reminder of git commit. Faculty would then work with LI designers to modularize the class. The container for this is still an experiment, but so far we have had some eager and talented faculty who did videos; we want to enable all experts in a field to be able to do this. We are so busy during the school year but have carved out time in the summer. The goal is that when you teach students in the Fall and someone doesn’t know Git, you can say go watch these videos. That is the production schedule. As far as the queue, we have a lot going on but we are open to whoever wants to generate learning material.

Q. John Board – Some faculty want video classes as prerequisites.

A. Matthew Hirschey – Matt is working closely with Evan Levine to provide this.

Q. Robert Wolpert – Who owns the modules when faculty leave Duke?

A. Matthew Hirschey – Faculty own the intellectual property. CCT is also looking into treating this as a performance, where CCT will pay the faculty for the performance so that Duke owns the performance. Every video that has been made so far is on YouTube, but some of our material is not video and so is not on YouTube. The goal is to have an integrative learning system that would include slides, videos, exercises, etc. all behind Duke’s Shibboleth. In this case, technically Duke would be the owner but so far, class elements are open to all.

Q. Robert Wolpert – There must be many videos on YouTube on Git.

A. Matthew Hirschey – Yes, there are more than 1,000 Intros to Git but we want to teach Git in the way that students need to learn it. And we want to make it easy for John Board and his colleagues to say here is CCT’s vetted video, because faculty can’t vet everything. We want to provide opportunities for faculty to say watch this 3-minute video before class or this is a prerequisite for a specific class. The more CCT can collaborate with faculty, the more likely materials that are created will get into the curriculum. If a Duke syllabus says watch this YouTube video and this YouTube video and this YouTube video, then why is a student paying for Duke? The goal is to create synchronous and asynchronous ecosystems. For example, 30 faculty would make 3 videos each that are integrated into their curriculum.

Q. Robert Wolpert – The first two pillars sound like every student at Duke should know something about computation and have some skills. I’m always wary when something says every student because often students have complementary skills.

A. Matthew Hirschey – CCT can envision a curriculum spectrum where some need a skill to do a job and some faculty want to provide opportunities no matter where a student falls on that spectrum. Not every student needs to be a practitioner, but all students need to be aware of topics such as security, ethics, and whether AI is good for society. These present big questions that have no right answer and we want to have students with balanced liberal arts educations who can comment on these topics. Also, as John Board said, this can be used for level setting. I can imagine someone who doesn’t know anything about cybersecurity but whose data has already been hacked. So, there is something for everyone.

5:05 - 5:30 p.m. - Common Solutions Group Update, John Board, Charley Kneifel, Mark McCahill (15 minutes review, 10 minutes discussion)

What it is: The Common Solutions Group (CSG) works by inviting a small set of research universities to participate regularly in meetings and project work. These universities are the CSG members; they are characterized by strategic technical vision, strong leadership, and the ability and willingness to adopt common solutions on their campuses.

Why it’s relevant: CSG meetings comprise leading technical and senior administrative staff from its members, and they are organized to encourage detailed, interactive discussions of strategic technical and policy issues affecting research-university IT across time. We would like to share our experiences from the most recent meeting this month.

John Board begins by stating that CSG is still in the Covid online format, so the meeting was still shorter than usual. The next meeting may be hybrid.

The main takeaway from the meeting was that John now has an idea of what low code means. There are some major initiatives along this line at Duke that Charley and Mark are involved with. The notion of low code is democratized programming so that those who are not coders can create useful computational artifacts. John asks Charley and Mark to talk about Duke’s initiatives.

Mark McCahill – Low code is about the idea of citizen developers: people who are not professional coders can do useful work. An example of this is Kuali Build at the University of Washington, where they wanted to build forms and workflows without needing heavy-duty coders. One piece of low code is letting the business-process people have better tools and the opportunity to create the artifacts. At Duke, ServiceNow has low code for some add-ons and Salesforce has low code. If the application is being built for internal use, then any risks associated with low code are much lower. When exposure is to external users, the risks go up, so the need to enlist help is more critical.

Charley Kneifel – Our goal is to have fast reusable workflows. For example, if a student wants to change a major or drop a class, this requires a number of approvals. Low code can provide routing decisions without needing to invest a lot of money. This is very useful for departmental admin types who think about processes. Chris Derrickson is excited about this from a sys process standpoint. Another opportunity is to use PDFs as opposed to paper. So there are many opportunities and, in aggregate, they could increase time savings and satisfaction.

Robert Wolpert – In thinking about the life cycle of documents and structures, often a first draft is written by someone with a narrow focus and then, it evolves over time and after a while, nobody knows what is going on under the hood.

Charley Kneifel – Absolutely right. But the beauty of this is there is a graphical portion to this so you can see the flow and so the building of the process is cohesive. A problem can be solved with a throw-away form for a short-term ephemeral need but building a lasting process has value as well.

Mark McCahill – Having the tools for laying out workflows limits their complexity. There is an upper bound, so you can’t do anything super complicated here.
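The routing decisions discussed above can be pictured as a small lookup table that a non-coder could edit, plus one generic function that walks it. The Python sketch below is only an illustration of that idea; the request types, role names, and approval order are hypothetical and do not describe any actual Duke, ServiceNow, or Salesforce workflow.

```python
# Hypothetical routing table: each request type lists the roles that must
# approve it, in order. Changing the process means editing only this table.
ROUTES = {
    "change_major": ["academic_advisor", "new_department", "registrar"],
    "drop_class": ["instructor", "academic_advisor"],
}


def next_approver(request_type, approved_by):
    """Return the next role that must approve, or None when routing is complete."""
    for role in ROUTES[request_type]:
        if role not in approved_by:
            return role
    return None


# A change-of-major request that the advisor has already approved.
print(next_approver("change_major", {"academic_advisor"}))
```

Keeping the process in a flat table like this also reflects the upper bound on complexity mentioned above: there is no place to hide complicated logic.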

John says Charley Kneifel was a presenter on a session called “Recalibrating from the Covid Sprint.” Charley talks about a survey he did asking people what they wished had been done better. Responses included:

1. More standardized classroom technology.
2. Better data.
3. Better hygiene.
4. Having policies in place about remote work in advance.
5. Better virtual desktops.
6. Realizing early on that this was a long-term project rather than a bunch of small sprints that were tiring and draining.
7. Better ways to think about data security and data integration.
8. Quicker decisions.
9. That the mental health of your employees is an important thing to start with.

John Board says another topic was the successful automation of research tools and classes. Many schools are embracing automation wherever possible.

Mark DeLong – Texas A&M is using ServiceNow which triggers a Terraform code generator to make things happen in Amazon Web Services.

John Board concludes with the topic of network modernization and reports that Duke is ahead on this front thanks to Will Brocklesby. It is great to be in Durham and to have our great network. The next CSG meeting will be at Cornell from May 11th through 13th.