ITAC Meeting Minutes - May 2, 2024
Via Zoom
4:00 – 4:05pm – Announcements (5 minutes)
Victoria Szabo: We're gathered for our first summer ITAC. We are now on the summer
schedule, and calendar invitations have been sent.
We had discussed the idea of having some sort of larger AI Round Table event, but we have
other AI events happening. Today, we're going to hear from Jen Vizas about Code+ and the
project lineup for the summer.
4:05 – 4:50pm – Code+ Project Line-up for 2024 – Jen Vizas (30 minutes presentation, 15
minutes Q&A)
What It Is: Code+ is a full-time, 10-week, project- and team-based summer coding experience for
undergraduate students. The program is part of the "+Programs" ecosystem with Data+ and
CS+, which has expanded in recent years to include Climate+ and Applied Ethics+. The
+Programs have outgrown Gross Hall, so this year Code+ will be located in the Edge. Working with the IT
professionals who lead the teams, students acquire both technical and professional skills, expanding their
experience outside the classroom and preparing them for future tech internships and careers.
Why It Is Relevant: Now in its seventh year, Code+ creates a unique learning experience for
students, engaging them with faculty and staff who serve as project stakeholders. Students
selected into the program are chosen based on their need for technical experience and their
interest in tech; they typically have a STEM major and/or minor, though it is not required. By the end
of the program, students have the confidence and skills to pursue coveted industry internships
and continue in their chosen fields.
Link to presentation: https://acrobat.adobe.com/id/urn:aaid:sc:VA6C2:ed491b3f-0e39-4af6-b0cb-49fa0ba4e650
Jen Vizas: For those unfamiliar with Code+, the program is a summer coding experience
traditionally for Duke undergraduate students. We choose students based on their lack of skills,
so to speak. We know Duke students are extremely talented; we typically choose students who
haven't had internships, or who are just dipping their toe into coding. The outcomes for the
students are: 1) they learn that they can quickly pick up the tech; 2) they learn to code in a
collaborative environment working on project teams; 3) we want them to develop long-lasting
friendships and relationships that keep them connected to Duke. And for the community
we're doing this for, they get a proof of concept or minimum viable product at the end of
the ten weeks.
Some changes for this year:
- Workshops are focused on applied topics, not necessarily specific languages. We provide the lineup for classes, and team leads can provide additional resources for their students to do a deeper dive into some languages.
- We are spreading the workshops out over the first few weeks and not frontloading – this was overwhelming the students.
- We have also increased the number of events. We want them to get connected with one another, especially at the beginning of the 10 weeks.
- The +Programs continue to grow (AE+, History+, Arts+, etc.) and we have outgrown Gross Hall. This year, the Edge in Bostock Library will be our home. We plan to be in Gross Hall several times a week to keep the connectedness back to the +Programs.
And just a little bit about the numbers:
This is the seventh year of the program, with about three hundred participants having gone
through it. This year we have 60 students: 32 she/her, 26 he/him, 2 they/them; 32 first-years,
26 sophomores, 2 juniors.
We have four universities participating. This is the first time we're opening the program up to a
handful of students outside of Duke: Duke (53), NCCU (2), Davidson (4), NC State (1).
This is the first year we're fully financially self-sufficient, with all student stipends covered
($5k/student). Approximately 70% of funding comes through corporate or Duke departmental project
sponsorship. Project Sponsors ($302K): Cisco, Microsoft, Transact, Apple, CCT, ORI, Provost
Office, CREATE, MISTRAL Grant, NCShare Grant.
Jon Reifschneider: I'm the Executive Director of the AI for Product Innovation Master of
Engineering Program and a faculty member in Pratt. I also run a small applied research lab
called CREATE, the Center for Research and Engineering of AI Technology and
Education. We work on projects that explore use cases of AI to improve teaching or learning at
Duke and beyond. One of our current projects has been working with the university registrar to
develop a system that makes it easier for students to discover courses that better align with
their interests or the skills they want to build. One of the pain points is that it's very difficult to
find courses at Duke that align with your interests. Typically, students resort to talking to
other students, alumni, etc., and getting personal recommendations. This project seeks to
bring that process online and open it up so all students have that type of information.
We partnered with the registrar's office, and we've built a site called Duke Atlas. The interface
is a chatbot-like interface where, as a student, you can go in and ask questions. You can say
something like, "What classes are offered about machine learning?" and it will give you
some recommendations of various courses throughout the university that relate to machine
learning. Ultimately, we'll filter that based on graduate/undergraduate level, school, etc.
Then you can ask follow-up questions. You can ask it to compare a couple of classes in terms of
the content or the skills you'll build, or things like that, to hone in on a selection of
classes to take. We built and launched a proof of concept this semester. We did a
small soft launch with a couple of graduate programs in Pratt and collected some student
feedback. We're excited to engage with the Code+ team this summer to do a couple of things.
One is making some enhancements to prepare for a broader launch this fall; we hope to open it up
to more students across Duke on a pilot basis. Secondly, we plan to work on
building in a mechanism to collect additional input from students on courses. The best way we
could think of to do that is to crowdsource from our students, so we're going to ask the team
to work on a mechanism to accomplish that and build it into our platform. The
third thing we hope to do is a mapping of courses to a structured skills taxonomy, a discrete set
of different skills. You could go and tag a skill and find a list of all the courses that map back to
that skill. Ultimately, you should be able to go in and say, "I want to be a data scientist. What skill
set do I need?" And then you should be able to click on one of those skills and get a list of all the
related courses at Duke. That's the long-term vision.
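To illustrate the kind of skill-to-course lookup described above, here is a minimal sketch; the course codes, skill names, role mapping, and function are hypothetical illustrations, not Duke Atlas's actual schema.

```python
# Minimal sketch of a skill-to-course taxonomy lookup (hypothetical data,
# not the actual Duke Atlas schema).
from collections import defaultdict

# Each course is tagged with the discrete skills it teaches.
course_skills = {
    "COMPSCI 371": ["machine learning", "python"],
    "STA 521": ["statistical modeling", "r"],
    "ECE 684": ["machine learning", "natural language processing"],
}

# A target role maps to the skill set it requires.
role_skills = {
    "data scientist": ["machine learning", "statistical modeling", "python"],
}

# Invert the course->skills mapping so that tagging a skill returns
# all the courses that map back to it.
skill_courses = defaultdict(list)
for course, skills in course_skills.items():
    for skill in skills:
        skill_courses[skill].append(course)

def courses_for_role(role: str) -> dict[str, list[str]]:
    """For a target role, list the courses that map back to each required skill."""
    return {skill: skill_courses.get(skill, []) for skill in role_skills[role]}

print(courses_for_role("data scientist"))
# {'machine learning': ['COMPSCI 371', 'ECE 684'],
#  'statistical modeling': ['STA 521'], 'python': ['COMPSCI 371']}
```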
Jen Vizas: There are a couple of other projects that are Duke community related. Blue Devil
Bridges (BDB) is working with Alumni Engagement and Development. BDB was piloted in the
fall, and they were very pleased with the turnout: they had 1,200 students sign up and
roughly the same number of alumni volunteers. The technology used was Qualtrics.
Next, we'll hear about the +Durham projects, first from Anna Hung.
Anna Hung: I'm an assistant professor in the School of Medicine, Department of Population
Health Sciences. I'm also an investigator at the Durham VA Healthcare System. I've got a five-
year career development award about prescription insurance for veterans who have both
VA and Medicare Part D insurance. Essentially, the plan is to build a tool that allows veterans
(for now, focusing on veterans with diabetes) to understand what their drug costs would be
through the VA versus through Medicare, and what some of the formulary coverage
restrictions would be. There are a lot of different ways this project could go, but I'm
excited about the different angles we can tackle. I think it's an important problem – you've
probably heard about drug costs – and something that's meaningful for veterans trying to figure
that out.
Jen Vizas: Thank you, Anna. We are excited to see the breadth of stakeholders in this project
and to have a Duke student who is a veteran serving on this project.
Another Durham-related project is Visualizing the Assets of the Public-School Communities.
Amy Anderson: My name is Amy Anderson. I'm faculty in our Program in Education, and I've come
today as a representative of a Bass Connections team. One of the Bass Connections sub-teams
is working on this project, creating an asset-focused data dashboard of resources in our
Durham community. Imagine you are a parent wondering what childcare offerings exist. How
could you find that out? What's the sidewalk coverage around a certain school? We've
leveraged resources from various platforms in Durham to put together a dashboard. We started
with Data+ three years ago. This year Code+ has been a great partner as we move this to HTML
and also create a template that other communities might use.
Jen Vizas: This effort has received national recognition for the work they have done.
Many of you know Terry Oas. He has joined us to talk about his project.
Terry Oas: Hi, thank you for inviting me to tell you about our project. It's based on a project
that's been ongoing, in various forms, between Scott Schmidler, who's in Statistical Science, and myself
for fifteen years. To briefly describe AlphaFold 2 and why it has gotten so much attention:
it's an AI-based way to predict the three-dimensional structure of proteins from their amino
acid sequence. It has been remarkably successful at doing so, much more successful than any of
the previous efforts to predict three-dimensional structure, and there have been many. It
uses very powerful AI, but also some unique approaches that had not previously been used. It
has made a big impact on my field of protein biochemistry, because now we can take the
sequence of a protein that we're interested in and have a pretty good idea what it looks like
three-dimensionally. What AlphaFold 2 can't do is determine or predict the structure of a
protein that isn't a fixed structure.
The problem that Scott and I have been working on for quite a while is how to describe the
structure of a sequence that isn't a unique structure, but rather a structure that flickers in and
out, over time and over a collection of molecules, between alpha helix and coil. The tool that we
have developed takes the sequence and predicts an ensemble of structures, a collection of
different configurations that this particular sequence might adopt. There are a very large
number of possible configurations, and this tool is able to enumerate the most probable, or
most populated, ones.
This is very useful to people around the world for taking an amino acid sequence and predicting
this kind of helix-coil structure. And so, what I proposed to Code+ (and thank you to Mark
McCahill and Jeremy Bandini for working on this project) is that we put up a web-based
API that allows people from all over the world to submit their amino acid sequences and
get back a prediction of this helix-coil flickering structure. One purpose for taking a sequence
like this and trying to predict how helical it will be is that it allows us to predict the distribution
of the distance between the ends of the sequence. The reason that's important is
that these kinds of sequences are being used by an increasing number of biochemists and
biotechnologists around the world to link antibodies together. By linking antibodies together,
you can get a higher affinity, a higher tendency to bind whatever the antibodies are being
designed to bind. There are many antibody-based drugs being developed now, including some
targeting, for instance, the spike protein on the COVID virus, that take advantage of the
extra affinity that linking antibodies together in a chain can give. What we need to know is what
the distance between those antibodies might be when they're linked by a particular sequence.
Using a tool developed by Aaron Blanchard, who's a postdoc in Brent Hoffman's lab, we can
convert each one of those enumerated configurations into an end-to-end probability
distribution, weight each by its relative probability or population, combine them, and predict
what the overall end-to-end distribution might be. That will help people
design better linkers for linking antibodies together. The sub-project would be to add this last
column of the table (on slide) to the API, so that people could use it to predict linker
end-to-end distance probability distributions.
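A minimal sketch of the weighting step described above, combining per-configuration end-to-end distance distributions into one ensemble prediction; the distributions and populations here are invented illustrative numbers, not output from the actual tool.

```python
import numpy as np

# Hypothetical example: combine per-configuration end-to-end distance
# distributions into one ensemble-averaged distribution, weighting each
# configuration by its relative population (illustrative numbers only).
distances = np.linspace(0.0, 10.0, 101)  # end-to-end distance grid (nm)

def gaussian(x, mean, sd):
    """A normalized Gaussian over the distance grid."""
    p = np.exp(-0.5 * ((x - mean) / sd) ** 2)
    return p / p.sum()

# Each enumerated configuration contributes its own distance distribution.
config_distributions = [
    gaussian(distances, mean=3.0, sd=0.8),  # mostly helical, compact
    gaussian(distances, mean=6.5, sd=1.5),  # mostly coil, extended
]
populations = np.array([0.7, 0.3])  # relative populations, sum to 1

# Ensemble distribution: population-weighted sum over configurations.
ensemble = sum(w * p for w, p in zip(populations, config_distributions))

mean_distance = float((distances * ensemble).sum())
print(f"expected end-to-end distance: {mean_distance:.2f} nm")
```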
Jen Vizas: Up next, we have security-related projects. Alex Merck will discuss a project that's
part of the MISTRAL Grant.
Alex Merck: A little background on the MISTRAL grant before we get into this. There are three
problems that we're trying to solve with the MISTRAL grant: 1) create an architecture where we
can easily and dynamically monitor various parts of the Duke network, including research labs,
to understand the threats that those particular labs may be facing; 2) collect a set of data –
flow data and IDS data – that allows us to dynamically detect the threats that are targeting
research labs; 3) provide a corpus of data for researchers, so they can actually work with live
network traffic data and build their own analyses around that data.
The Code+ project is building out a way to present that data to researchers, so they're easily able
to query the data, see what type of data is available, and then pull down targeted chunks of
that data as they see fit. The team will also be building in some default visualizations around
the data, along with different methods of pulling it down and tagging it – ways of
identifying whether something may be an attack or is file-transfer traffic – and providing a tagging
system so researchers can target those specific data sets. They'll be working very closely with a
Data+ team, which will be analyzing the data, detecting threats, and
baselining that traffic. This is a joint effort, and they'll both be building off each other.
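As a rough illustration of the kind of tagged query interface described above (the record fields, tags, and function are hypothetical, not the actual MISTRAL design):

```python
# Rough sketch of querying tagged network-flow records (hypothetical schema,
# not the actual MISTRAL data platform).
from dataclasses import dataclass

@dataclass
class FlowRecord:
    src_ip: str
    dst_ip: str
    dst_port: int
    bytes_sent: int
    tags: set  # e.g. {"possible-attack"} or {"file-transfer"}

flows = [
    FlowRecord("10.0.0.5", "192.0.2.9", 22, 120, {"possible-attack"}),
    FlowRecord("10.0.0.8", "198.51.100.3", 443, 5_000_000, {"file-transfer"}),
    FlowRecord("10.0.0.5", "192.0.2.9", 22, 80, {"possible-attack"}),
]

def query_by_tag(records, tag):
    """Pull down only the targeted chunk of data carrying a given tag."""
    return [r for r in records if tag in r.tags]

for r in query_by_tag(flows, "possible-attack"):
    print(r.src_ip, "->", r.dst_ip, "port", r.dst_port)
```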
Jen Vizas: We also have a couple of other security-related projects, and all of the security
projects are sponsored by Cisco. The +Durham projects, for the most part, are being sponsored
by Microsoft, which seeks to be a good steward in the Durham community.
Alex Merck: We worked on developing and prototyping honeypots last year, and we're doing it again
this year, but we're changing it up a bit. Hackers will post information about the honeypots on
the dark web, making them discoverable. What we want to do is try to change up the behavior of
the honeypots so that the attackers won't be able to recognize what they are.
Before attackers try to run anything, they run a set of commands to try to determine whether
they are on a honeypot or a legitimate system. If they find that they're on a honeypot, they
usually back out and quit whatever they're doing. So the goal here is to figure out what those
attackers are checking for to determine that those hosts are honeypots, and then figure out a
way to mix those signals up to screen the honeypots from the attackers so we can collect more information.
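As a toy illustration of the idea (the specific commands and responses are invented for this sketch, not details of the actual project): an attacker might fingerprint a host by the output of common commands, so a honeypot can vary those outputs from session to session.

```python
# Toy sketch of varying a honeypot's responses so command-based fingerprinting
# fails (the commands and responses here are invented for illustration).
import random

# Plausible, varied outputs for commands attackers often use to probe hosts.
RESPONSE_POOL = {
    "uname -a": [
        "Linux web01 5.15.0-101-generic #111-Ubuntu SMP x86_64 GNU/Linux",
        "Linux db02 6.1.0-18-amd64 #1 SMP Debian 6.1.76-1 x86_64 GNU/Linux",
    ],
    "nproc": ["4", "8", "16"],
}

def new_session():
    """Pick one consistent persona per attacker session."""
    return {cmd: random.choice(outputs) for cmd, outputs in RESPONSE_POOL.items()}

session = new_session()
print(session["uname -a"])
```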
Jen Vizas: It's a very exciting group of projects. I'm looking forward to seeing what the students
accomplish this year. We will be presenting back to ITAC towards the end of July, and maybe we
can have a few of the teams from today return to discuss the outcomes.
Victoria Szabo: Thanks so much, Jen. It was very exciting.
4:50 – 5:10pm – AI For Media Production - Stephen Toback (10 minutes presentation, 10
minutes Q&A)
What It Is: How AI is changing Media Production
Why It Is Relevant: AI is being used in a number of ways by staff, faculty, and students around
Duke. Steve Toback, Senior Producer for Academic Media Production and Enterprise Media
Architect, will provide details on the tools and methods he used to produce his National
Association of Broadcasters convention daily vodcasts. The presentation will showcase the
remarkable improvement in AI music, image, and text generation, as well as the current
challenges and shortcomings of the technology.
NAB 2024 BEHIND THE SCENES VODCAST:
https://youtu.be/242KHUS5AJ8?si=yeEeQmZILftGKL0l
https://youtube.com/playlist?list=PLN7e7wVxurHd85ekuIDly16jL86rHI26c&si=zudBw3z46AacN_CT
Stephen Toback: I recently attended the National Association of Broadcasters Convention in Las
Vegas, and every year I do a daily vodcast. This year I decided to see which AI tools could
help me with that process. This music was written by AI, and we'll talk a little bit about how that
was done. AI is everywhere, with everyone adding it onto their products.
AI did help me overall in producing my daily vodcast. I wanted to speed up the production
process and make it better, and to use AI to generate better graphics. For the video, the music,
voiceovers, logo, and script draft were AI. That was the biggest time savings. None of the
voiceovers in any of my videos were actual humans talking; it was all done via AI.
Let's talk about the music. A product called Suno was released a few months ago; you give it a
prompt and for this one I hit Instrumental and it came up with this.
The next thing is script generation. I used Harpa AI, a really good product; it makes things easier. I just gave
it a prompt to write the script as a paragraph. Sometimes I'll have to go through and take out some of
the marketing terms. Then I'll copy that and paste it into a text editor, and that's the basis of my
script. I edit it, and then the script is pretty much done.
For voiceover, I went with ElevenLabs, which has improved a lot since last year. Last year it was
easier for me to just do the voiceover myself, but this year they've upgraded and updated, and there's
something called professional. It basically took thirty minutes of my voiceover from last year's
vodcast, fed it into the system, and came up with what I thought was absolutely amazing.
Nobody that I've talked to thought that it was AI. Nobody even questioned that it wasn't me.
But it's not perfect. It did take some tweaking. Some of the punctuation it didn't get; it even
pronounced macOS wrong, saying "Macos." It spelled out AJA because it was capitalized, but in the
first part of the sentence it said "Aja" correctly. I'd have to go back in and tweak. It didn't really
respond to emphasis, commas, or pauses that well, but ultimately it just took some
tweaking to understand how to write things so that it would read them correctly for me.
For the intro, I wanted something lively and spicy, and with the other tools there's no way to change the emotion of
the voice. This is called Speakeatoo, an emotion-based AI voice generator. This is another
way to look at it: for some things, you may want to make voiceovers as realistic and human as
possible, but for other things you want something cartoony. I picked one of their voices and then added
some specific effects. I did "excited," and it was perfect. And there's a blooper: it cleared its
throat. That wasn't written; it just did that, and I had to render it again. As for the voice cloning,
it generated some text that I had to read, and then it matched that to the voice I was trying to digitize.
That's a bit of a safety net, so that I couldn't just download audio of some person on the Internet and
train a voice on it.
For the logo update – this was the logo that I've been using, but I wanted to do something
snazzy. I used OpenAI's ChatGPT to build me a logo for the Duke Digital Media Community.
It did a pretty good job. It still has problems with text, and I did have to bring it into Photoshop, but
it gave me an idea that I liked, and all I had to do was edit it. I changed the text and added a
representation of the Duke Chapel. I went into Copilot in Microsoft Bing and told it to build
me a pop-art style image of the top of Duke Chapel. It did a much better design than I would
have come up with myself.
I wanted to add some graphics for the sections. I wanted to talk about a really nerdy but
important change in video standards, so I thought it'd be funny, before I did that, to do a "nerd
alert." I went into ChatGPT and said, draw an abstract picture of a nerd alert with alarm lights
and formulas. I gave it a few prompts to get close to what I wanted. I took one of the pictures
and used the generative AI in Photoshop to add details and reshape it.
One more thing: OpenAI is set to release a new model called Sora, which is going to change
everything all over again. This is a 100% rendered video from OpenAI, not edited – no video
processing at all. It's just based on prompts. It's amazing. They've been giving it a limited
release. Adobe is really pivoting; they have some new generative AI stuff, but I think that's
going to change things for synthetic video.
Victoria Szabo: Wow! I mean, I've seen it before, and I'm still wowed. Thank you, Steve. We
have a few minutes for questions or comments.