February 16, 2023 Minutes

4:00 - 4:05pm: Announcements and Approval of 11/10/22 and 12/8/22 Minutes (5 minutes)

Robert Wolpert – (Substitute chair) Call for approval of 11/10 and 12/8 minutes.

Tracy Futhey - The Provost Lecture Series is tomorrow, 1pm-3pm in Page Auditorium, featuring Frances Haugen; some tickets are still available.

4:05 - 4:45pm: Talking About the Digital You: A Vision for the New University Course – Matthew Hirschey (20-minute presentation, 20-minute discussion)

What it is: This fall’s University Course has the working title Let’s Talk About Digital You and will cover technical and ethical aspects of our data-centric world. Matthew Hirschey will provide general information about the course and its structure, including the 7 main topics. In conjunction with this, he will also give a short update on the goals of the Center for Computational Thinking (CCT).
Why it’s relevant: Since the course is still being planned, our feedback can be of real value. There is opportunity to become involved in the current planning process or in the execution of the course in the fall. Discussion of this topic will also provide an opportunity for feedback regarding other potential uses of the materials developed for this course.

Matt Hirschey - I am here representing the Center for Computational Thinking. You can find information covered in today’s presentation here.

The Center for Computational Thinking is a university-wide Provost initiative to enable more computational thinking across Duke. It has an educational mission, and we have several different programs and projects that we are working on. I am here to discuss a University-wide course about technology.

President Price often says that students, especially Duke students, need to be equipped to handle the data-centric 21st century that they live in. The technologies are becoming more pervasive and complex; how can we as a university meet that need so students are better equipped?

Currently there are two University-wide programs ... through the Office of Undergraduate Education.

The first one was called “Let’s Talk About Race.”
The second one was “Let’s Talk About Climate Change.”

The next course on technology will be:

University 103 – Let’s talk about Digital YOU

The course spans the 14 weeks of the semester, with 1 meeting a week and 7 topics covered over the 14 weeks (2 weeks per topic). Each week is structured as:

Lecture Section – large auditorium or theater-style space, ~200 students
Discussion/Lab Section – followed by breakout rooms/sessions

We decided we wanted to anchor our topics on some sort of integrated experience that the students might have some intuition about or direct experience with.

For example, the 1st topic is “I Made a Digital Avatar.” You or someone you know may have gone on Instagram or some other social media platform to make an avatar generated by an AI diffusion model. In the last 3 months or so it has taken off like wildfire. The model is trained for this, so you might say you want a Matt Hirschey in the style of Picasso.

This will be about exploring the technology behind AI and art. That’s one topic covered over 2 weeks.

1st week – the technology behind that topic:

After a break, the students will create and fine-tune a model for themselves, so they could have a Tracy Futhey in the style of Picasso, and then play with the results in a lab session.

OIT’s services will help generate the infrastructure for the sandboxes for the students to play in.

2nd week – the implications/impact of that technology:

The topic is still the same, but now they focus on the implications of these digital avatars: ethical, social, policy, and copyright implications. They are then guided through a case study on these implications. The questions are open-ended, to prompt a conversation. So maybe there is no copyright on Picasso, but there would be a copyright on a Disney character.

You could have, if you will, a Matt Hirschey holding two bags of money and running away from a bank, which leads to a conversation about deepfakes and their implications for news and for what’s real and not real.

The idea is that the entire first class gives students a visceral understanding of the technology, so that when they have the conversation about implications in the second class, they cannot say that they have heard of it but don’t know enough about it to interpret the implications.

One other goal for the course is to allow any student, faculty member, staff member, or continuing studies participant to experience this coursework, even if not in the for-credit course. Partnering with Academic Media Technology, we will pre-record all of the lecture portions of the materials (about 60 minutes each).

On the day of the lecture, our professor or guest lecturer will act as an “MC” to lead students through the video. We will pause along the way and ask, “What did we see there?”

That allows those videos to be available to anyone that wants to experience the course, even if not for credit– you just don’t get the professor live, but you get access to the data and also the sandboxes and some of the interaction around that.

The experiential aspect of “I Made A Digital Avatar” is more relatable to the students than if we’d cast Lecture 1 as “Exploring stable diffusion models in art.”

The classes are designed around topics and topics are modules, and they’re designed to be flexible (substitutable) over time since stable diffusion didn’t exist a year ago and next year we may have something else to drop in as more current technology.

The modular approach gives us the flexibility to identify when an important new topic comes in and then deprioritize one of the other topics.

Questions:

Brandon Ley – Have you figured out how to handle the frequency of dropping topics in and out? Could we design something that would last 5 years? If the frequency is too high, it is concerning.

Matt Hirschey - We can put our heads in the sand… and say that what was relevant 20 years ago is still relevant today, or we can be changing all the time. So, the idea would be to find a middle ground: we know that, by the nature of technology, the content will need to be updated, so when we see a change we can adjust the modules.

John Board - What is the target audience size for this class?

Matthew Hirschey – All we can go on is the size of the other two University courses in their first year, and that is about 150-200 per class. We are concerned and aware that we are asking students to come to watch videos. So we plan to incentivize them by tying lecture attendance to a lab held right after the class.

Colin Rundel – I appreciate diffusion models and ChatGPT as the new shiny things and as a way to engage students. I’d love to hear about some of the more foundational topics and subjects that underlie these things: What is data? Bias? Data ethics?

Matt Hirschey - The target audience is 1st-year students – no prerequisites. Ideally we will have a range of students across divisions. I hope it will appeal to a broad audience. It will spread across years, but this is not purely for the CS student.

The hook is ChatGPT – in order to talk about ChatGPT they need to understand what’s deterministic and what’s probabilistic. So, you can sneak in stats… You have to educate folks on this stuff, so the foundational concepts will be covered. This is zero-entry, even if you don’t know anything about ChatGPT other than using it to write your last essay. This allows anyone to come in and get the foundational concepts in the conversation. So, this class will be run out of Pathways within Co-Lab and OIT.

Robert Wolpert - How can you achieve this with faculty not having extra free time?

Matt Hirschey - The model is to do what the other university classes have done already, with some enhancements. So, the ask is to come and teach one lecture. There is compensation from the Office of Undergraduate Education to that faculty member for that time. What we are asking is a little bit above and beyond that, so we take 3-6 months and leverage some of the partnerships. We are asking them to pre-record their materials.

Robert Wolpert – Wait, there is a lot more to it than showing up at a recording studio.

Matt Hirschey – Generating the material, yes. So if they have 60 minutes of lecture, we have a partnership with Duke Learning Innovation to help modularize that.

Robert Wolpert – The 60 minutes of lecture that they planned to put on a board is very different from what they have to do here. It’s a big ask.

Matt Hirschey - It’s a big ask. What we can do to offset it is leverage Learning Innovation, Academic Media Technologies, and our existing partnerships across Duke, so that we don’t have to ask the faculty members to do more than that. So, a lot of the infrastructure things we can hopefully handle. Beyond this big ask, we are not asking for anything that the existing resources can’t support. It’s not an entire course: 60 minutes of lecture may take more like 2X, 5X, or 10X that in effort, but not a full course.

So, the last thing I want to say:

Overview of Course:

I made a digital avatar – exploring generative art through AI models
Classes 2-6 (order doesn’t matter):
- I wrote an essay using ChatGPT – exploring large language models
- These shoes followed me around the internet – exploring the surveillance economy
- I entered a password on a suspicious site – exploring cybersecurity and digital assets
- I scanned a suspicious mole – exploring AI in healthcare
- I interviewed for a job – exploring automated decision-making in business
- Culmination – what the internet knows about me – exploring digital data and identity

One of the challenges is not to have 7 lectures about ML and/or AI. There are plenty of topics in computation that are important for students to explore and to know.

Yakut Gazi – From the social scientist’s perspective, all of this takes place in a highly connected, global, intercultural world that is highly unequal and highly biased. I didn’t hear you say anything related to diversity, equity, and inclusion. I know you are engaged with my team, but I think an update is needed; from an instructional design perspective it’s important to know what the goals of the course are, even if the modules are changing.

Matt Hirschey - Quentin, Megan, and I are connecting about equity and access. Ken Rogerson, who is sitting next to you, brought this up, and we are trying to and will continue to keep equity as part of the conversation. I think we have topics that are both broadly familiar and highlight the obvious inequity in data modeling.

Sunshine Hillygus - As a political scientist: we are going to be in the middle of a campaign, and there is so much rich research about global, political, and digital democracy. There is a need to focus on digital democracy and social engagement.

Matt Hirschey - Yeah – I think that is absolutely an important topic. One of the things that got deprioritized in our list was “I posted on social media” – wouldn’t it be awesome to do something on this about polarization? We can also survey students at the end of the course to find out which topics worked best and which worked least, and tweak accordingly. The one that you mentioned is on our list and is still on our radar.

Robert Wolpert – Is this going to be rolled out in September ’23?

Robert Wolpert – Is this going to be rolled out in September ’23

Matthew Hirschey – Yes

Tracy Futhey – I wonder, from the students’ perspective – would you take this course? Would you imagine it to be more compelling if it covered something else instead?

Chase Barclay – I had a conversation with Jenny Wood Crawley - I do think students will quickly engage with it. I think the challenge is how this would relate to your own personal life. I am also on the Provost forum committee, so I will be involved with tomorrow’s forum with Frances Haugen. There are lots of conversations about reproductive technology. I would love to have a conversation with you.

Sunshine Hillygus – So instead of scanning a mole, maybe you could do something with reproductive health which brings in political discussions etc.

Zoe Tischaev – Will it satisfy any [curriculum fulfillment] codes?

Matthew Hirschey – Once it is approved, yes it will, but for which ones we have to wait for the Curriculum review to be completed.

Jaz Naley – I’m not sure how involved you are with the Curriculum 2000 re-review. Where do the new curriculum review courses fit into this?

Matthew Hirschey – By then there will be a new Provost and a new vision for university courses. We are glad to be in the queue as course number 3, but the new leadership will determine how many of these are eventually developed.

John Board – You will get underrepresentation from engineering, depending on what curriculum codes this satisfies.

Chase Barclay – University 101 – satisfies one of those codes – my conversation with Jennifer Wood Crowley is about how could we satisfy these requirements.

Evan Levine – Some engineers may want to take this course, there is the technical aspect, but there is also privacy, ethics, etc.

John Board – But their first years are highly prescribed already.

Tracy Futhey – It’s perhaps targeted at but is not limited to first years.

Yakut Gazi - To be able to have this conversation is a first-world privilege - mobile phones are all that is ubiquitous, but everything you described is a first-world privilege. I think we need to understand that privilege.

Lindsay Glickfeld - On evaluation: is this a pass/fail class or letter grade?

Matthew Hirschey – Pass/Fail has not been as popular. So, our grading is inspired by other university courses, with which we have the same types of activities and workload.

Lindsay Glickfeld – What does that look like: reading ahead, papers?

Matthew Hirschey - Keep a digital journal - and then ask the students to go through it on their own. The final chapter is to go through a technology on their own.

Matthew Hirschey - We have considered that they would have to record on their cellphone or in person to stand and defend their activity.

Quick Announcement:

Robert Wolpert - Welcome back Chase Barclay, a student rep who returns after a semester abroad.

4:45 – 5:15pm: ChatGPT and Natural Language Processing (NLP)* – led by John Board (30-minute discussion)

What it is: ChatGPT is a large language model developed by OpenAI that uses natural language processing (NLP) to generate human-like responses to text-based communication. It has been trained on massive amounts of data, allowing it to understand and generate text in a wide range of topics and contexts. NLP is a field of artificial intelligence that focuses on enabling computers to understand, interpret, and generate human language, and it is a key technology behind ChatGPT's impressive capabilities.
Why it’s relevant: While ChatGPT and NLP offer many potential benefits to the higher education space, they also pose some challenges. One challenge is the potential for these technologies to perpetuate biases that exist in the data they are trained on, leading to discrimination and unequal treatment of certain groups of people. Additionally, the use of these technologies may raise concerns around privacy, data security, and ethical considerations related to the use of student data. Finally, as these technologies continue to evolve rapidly, there may be a need for higher education institutions to invest in ongoing training and development to keep up with the latest developments and to ensure that they are using these tools in a responsible and effective manner.
*The above portion of the agenda announcement was written by ChatGPT as a demonstration of its capabilities using the questions “Why are ChatGPT and NLP relevant to higher education and universities? Summarize in 3 or 4 sentences” and “What challenges do ChatGPT and NLP pose to the higher education space?”

Minutes:

John Board - ChatGPT: The way I wanted to introduce this topic was torpedoed by today’s NY Times article by Kevin Roose. Microsoft is a repeat investor in OpenAI, the maker of ChatGPT, and the journalist indicated that he has completely flipped his opinion of ChatGPT. Reference: NY Times article.

Roose wrote a glowing review about how he had ditched Google altogether, and then a week later he pivoted completely, with a 180-degree change based on deeper experience, including with a personality called Sydney… Sydney loves us… then Sydney started to share its dark fantasies. The full transcript is posted in the NY Times [link above].

John introduces Professor Leslie Collins from ECE, who directs the Duke Applied Machine Learning Lab.

Leslie Collins - Chatbots have been around since 1966, so why, since November, are we so excited about them? Let’s break this out. In ChatGPT, GPT stands for:

G – Generative
P – Pre-trained
T – Transformer

G – means it is designed to generate its answer based on probabilities of what word might come next, based on the entire corpus of data it has been trained on. This is different from a deterministic response; in other words, the generative aspect of ChatGPT means it could come up with different answers for the same basic question asked slightly differently.

P – is the unsupervised pre-training part – and that’s a big deal because no person was required to sit there and label the data used in its training

T – is the transformer network, a new type of neural network structure that is phenomenal at NLP (natural language processing).
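The “generative” point above can be sketched in a few lines of Python. This is only a toy illustration, not how ChatGPT is implemented: the prompt, the word probabilities, and the function names are all invented for the example. It contrasts a deterministic choice (always the most probable next word) with a generative one (a random draw weighted by probability), which is why the same question can yield different answers.

```python
import random

# Hypothetical next-word probabilities after the prompt "Duke is located in".
# The numbers are made up for illustration only.
NEXT_WORD_PROBS = {
    "Durham": 0.80,
    "North": 0.15,
    "the": 0.05,
}


def greedy_next_word(probs):
    """Deterministic: always return the single most probable word."""
    return max(probs, key=probs.get)


def sampled_next_word(probs, rng):
    """Generative: draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so a run can be repeated
    print("greedy: ", greedy_next_word(NEXT_WORD_PROBS))
    print("sampled:", [sampled_next_word(NEXT_WORD_PROBS, rng) for _ in range(10)])
```

The greedy function returns the same word every time, while repeated sampled draws will usually include some of the less likely words as well.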

The other big thing about ChatGPT is that it uses 1 billion – 1.75 billion parameters and 45 TB of training data - billions of texts, articles, and places on the internet. Its power comes from the sheer amount of material it is trained on, married with the transformer network and NLP. One of my students told me it took 5-6 GPU years to train this sucker. It’s powerful for a reason.

Shamyla Lando – Will they be able to keep up with the cost?

John Board – It’s currently $20/month for the subscription service. The free service will be offered on a best-effort basis, depending on availability.

John Board - We had it write its own description for being on the agenda today. To actually form your own thoughts about it, you have to interact with it and have a conversation with it.

ChatGPT could pass the Turing Test, a test proposed by Alan Turing for whether an AI is perceived as human versus machine.

What should Duke be thinking about in leveraging the technologies in ChatGPT? The same underlying stuff is in DALL-E, the image generator produced by OpenAI. How do we assess applicants, say if they are using ChatGPT for their college entrance exam and subsequent essays? We still need to know if the student can write, and how do we assess that? Put them in a closed room and monitor them? No one can produce a 10k-word essay under that constant supervision.

Shamyla Lando – Will it change the way that you ask for what students write?

John Board – To a degree... Never create a bioweapon unless you have the antidote to cure that weapon. So OpenAI claims that they have a way to detect text that ChatGPT wrote.

David MacAlpine – Won’t our skill sets change as well? It’s about how we interact with ChatGPT.

John Board – It’s here, it’s coming, and it already affects what we are seeing.

Quincy Garbutt – There is an article I read this morning about a 22-year-old Princeton student developing an algorithm, called GPTZero, to determine if an article was created with ChatGPT.

Matt Hirschey – Have you tested it yet?

Quincy Garbutt – No, I have not.

Matt Hirschey – You can test it, and it will report some probability that the text was generated. But if you take that output and run it through another algorithm, the detection score will come out as zero. I would argue that they will just cancel each other out or ratchet up like an arms race.

Shamyla Lando - Are there discussions about what careers will remain for humans in the future? I mean, it’s developing code; we have people in my group asking it to write code in different languages, and it’s very good at that.

John Board - Yes, it is great at writing code; it’s a whole separate hell for us in CS and ECE to teach code.

Leslie Collins - ChatGPT makes mistakes. Ask it, “What is the fastest marine mammal?” and it says a peregrine falcon.

Robert Wolpert - Can it write Bach cantatas or Beethoven symphonies? If it can do art, I am sure it will be able to do something in the style of a Bach cantata.

Steve Toback – I’ve been working with a bunch of people in the security office on guidelines for how to use this in a safe way. So, we need to think about what we put into these systems.

John Board – If Duke were to come up with guidelines, does Duke put formal guardrails on what is allowed or not allowed and what is best practice? OIT can point out risks, but is it this group, or which subset of people, that decides whether, how fast, and how to do this?

Ken Rogerson – We have a couple of faculty members who are embracing it, including one who wants students to generate the first draft and then use MS Word with track changes to show how they edited it. One consideration: for students whose first language is not English, this may be a good thing.

Robert Wolpert – What about FERPA concerns?

Ken Rogerson – It actually does pass the Authenticate [plagiarism checking software] test well.

Colin Rundel – Some of those tools are problematic: when you tune them for sensitivity, you can get a lot of false positives.
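The false-positive concern can be made concrete with some back-of-the-envelope arithmetic. The sketch below is a hypothetical illustration: the class size, the share of AI-written essays, and the detector’s sensitivity and false-positive rate are all assumed numbers, not measurements of any real tool. The function name is likewise invented.

```python
def flag_breakdown(n_essays, share_ai, sensitivity, false_positive_rate):
    """Return (true flags, false flags, share of flags that are correct)."""
    ai_written = n_essays * share_ai
    human_written = n_essays - ai_written
    true_flags = ai_written * sensitivity             # AI essays correctly flagged
    false_flags = human_written * false_positive_rate  # human essays wrongly flagged
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision


if __name__ == "__main__":
    # Assumed scenario: 1000 essays, 5% AI-written, a detector that catches
    # 90% of AI text but also flags 10% of human-written text.
    true_flags, false_flags, precision = flag_breakdown(1000, 0.05, 0.90, 0.10)
    print(f"correct flags: {true_flags:.0f}")
    print(f"false flags:   {false_flags:.0f}")
    print(f"share of flags that are correct: {precision:.0%}")
```

Under these assumed numbers, about 45 AI-written essays are flagged correctly while about 95 human-written essays are flagged wrongly, so roughly two out of three flags would point at students who did nothing wrong.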

John Board – What do our undergraduate and graduate students think about this?

Jax Naley – We need to know how to write… But when do you, and when will you, need to write? It’s good to know that the university is not trying to lock this away when it’s already out of the bag... but we still need to know how to do basic things.

Colin Rundel - It would be nice to go into an office-hour session and say: here’s my assignment for my class; what would responses from ChatGPT look like, and what would those that are not from ChatGPT look like? That would help me determine whether I need to think about how or what was done via ChatGPT.

Elise Mueller – Sooner or later not every student will be able to afford ChatGPT, and then how someone writes will run into inequality.

David MacAlpine - It’s a great tool for helping you with writing and laying out things in LaTeX; it’s such a great assistant. I asked it to compose a letter and describe the Department of Pharmacology at Duke, and it did a really good job.

Yakut Gazi - I will share an email I sent this morning about DLI’s activities relevant to AI. Elise is a contact for DLI on this. What Elise brought up about equity is of great concern. What kind of people do we need to produce to go out into the world in this era?

John Board - How do ITAC’s faculty and student members want to stay engaged with this, as one source of guidance to University leadership on whether, when, and what we should do?

Evan Levine – I don’t think we can sit back and be spectators – I appreciate the work that Steve is doing already. You can’t put it back. The only reasonable way to do it is to keep going forward - the technology is available, and as IT support we can’t avoid getting hundreds or thousands of questions; that ship has already sailed.

Robert Wolpert - This feels like a CSG thing for peer institutions.

Evan Levine – This group can help lead some of this, but Duke does need to lead on things related to equity issues.

Steve Toback – I started a “Staff AI Teams” group about 6 weeks ago – it’s fairly active and it’s a good resource.

David MacAlpine – Do we see major differences coming?

Leslie Collins & John Board – The graph goes back to 2018; it’s like Moore’s law on a new log graph. They are already extrapolating to things that are in the pipeline, such as something called Megatron-Turing.

Tracy Futhey – Big-picture paranoia: to Sunshine’s point, we already have so many issues with digital material that does not come from real people, and so eventually it’s going to train on itself, and that training happens in its own self-created echo chamber.