Duke ITAC - August 30, 2018 Meeting Minutes
4:00 – 4:05 – Announcements
Collin Rundel, special Guest Chair for the day and faculty member from the Department of Statistical Science, convened the meeting and welcomed everyone to the first ITAC meeting of the Fall Semester.
All attendees introduced themselves and their department affiliations. Collin also acknowledged ITAC members who recently completed their committee terms. New student representatives will also be joining the committee as the semester progresses.
4:05 – 4:45 – Data+ Student IT Projects, Paul Bendich, Robert Calderbank, Michael Faber, Eidan Jacob, Evan Levine, Jen Vizas
What it is: Now in its fourth year, Data+ is a summer research experience for Duke Undergraduates interested in exploring new data-driven approaches to interdisciplinary challenges. This summer, two Data+ project teams partnered with OIT to visualize Duke wireless network data and map co-curricular pathways with an “e-advisor.” This presentation will review the evolution of Data+ and take an in-depth look at these IT-focused projects, from the perspective of the advisors, students, and customers.
Why it’s relevant: Data+ encourages collaborations across disciplines, and these particular projects brought diverse students and staff together at the intersection of two expanding fields: data science and information technology. Such co-curricular experiences have the potential to reap enormous benefits for the Duke community. We will discuss these case studies and how their findings can be applied across campus.
Data+ is a 10-week summer undergraduate program in data science made up of 24 or 25 teams, each with two to three undergraduates and a graduate student or postdoc project manager. Projects range from sophisticated deep learning on medical data to wireless mapping, and all teams are co-located in the same space at the same time, so students get a real sense of the depth and breadth of what data science can be.
There were two projects this summer sponsored by OIT:
- Mapping wireless networks
- Building a recommendation engine, the “e-advisor,” for co-curricular activities at Duke
Each year the organizers learn how to make the program more successful. This year they discovered that breakfast was a better idea than lunch, as it gets the students in early and ready to work throughout the day. The students are expected to have a question and a dataset that enables them to answer that question, and they must prove that they actually have access to the data. The team leads work with the students to analyze the data, show them what is possible and feasible, help their ideas along, and polish their proposals. The students are trained to be effective data science consultants in the real world. The two student projects above were excellent examples of data analysis.
At the end of the program, the students are asked to complete a brief survey. One student reflection included this quote: “The most surprising thing I gained from Data+ was patience. Starting the project, I imagined an explosion of results around every corner, beautiful graphs with clear trends, and machine learning models with near perfect performance, but then I realized it isn’t like picking plum fruit off the tree. It’s more like hoeing potatoes from the ground. You need the persistence to keep digging and the humility to get dirty.”
An RFP for next year is up on the website and the application process is very simple with only two pages. The goal is to vet the proposal by mid-December and advertise the project before the students make other choices for the summer.
- Wireless Networks Mapping
The video highlighted the following:
- Duke OIT engaged with Data+ to work with one of the core constituencies on campus, the students, to identify problems with the wireless network on campus.
- The main goal was to investigate why students in Perkins Library could not stay connected to the wireless network.
- Within a week, the students mapped all the devices that passed through the wireless network, ported that data into RStudio, and used R Shiny to create an interface resembling a heat map of where people end up throughout the day.
- This public-facing app accepted any building as input and showed which Wi-Fi nodes were being used more heavily than others.
- The maps were then used to compare the number of devices in each building to the number of access points, and to compare the amount of information passing between the devices and the network.
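As a rough illustration of the aggregation behind such a heat map, the sketch below counts distinct devices and raw events per access point per hour. It is written in Python rather than the R Shiny stack the team used, and the record layout, access point names, and device identifiers are all hypothetical placeholders, not the team's actual data or schema.

```python
from collections import defaultdict

# Hypothetical association events: (hour of day, access point, device id).
# These values are illustrative only, not real network data.
events = [
    (9, "perkins-ap-01", "dev-a"),
    (9, "perkins-ap-01", "dev-b"),
    (9, "perkins-ap-02", "dev-a"),
    (10, "perkins-ap-01", "dev-a"),
    (10, "perkins-ap-01", "dev-a"),  # repeat event from the same device
]


def devices_per_access_point(records):
    """Return {(hour, access_point): number of distinct devices seen}."""
    seen = defaultdict(set)
    for hour, access_point, device in records:
        seen[(hour, access_point)].add(device)
    return {key: len(devices) for key, devices in seen.items()}


def events_per_access_point(records):
    """Return {(hour, access_point): number of events, repeats included}."""
    counts = defaultdict(int)
    for hour, access_point, _device in records:
        counts[(hour, access_point)] += 1
    return dict(counts)


device_heat = devices_per_access_point(events)
event_heat = events_per_access_point(events)
```

Each (hour, access point) cell of either dictionary could then be colored by its count to approximate the heat-map view described in the video.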
Richard Biever brought attention to the following 3 key points:
- Working with students was very refreshing, as they took different tacks in looking at the data from their own perspective and provided a frame of reference for analyzing it.
- Visualization of the wireless network was found to be the most valuable as it showed us the network activity in buildings and across campus in terms of usage and events, and the possibilities of future expansion.
- OIT is keenly aware of the privacy concerns; therefore, all precautions were taken as we worked with this data and with the team.
In summary, the visual cues gave us valuable information about what was happening on the wireless network in various buildings, as opposed to looking at rows of data. The app should be scalable and adaptable to any building, and we hope to develop it further and apply it more widely.
Questions and Comments:
Q. Do the data points on the map represent bandwidth or total number of connections?
A. This was the number of events at the wireless access points. It’s more or less the same.
Comment: So there’s a high correlation between the number of associations and the amount of bandwidth.
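One way to check a claim like this is to compute the Pearson correlation coefficient between per-access-point association counts and bandwidth. The following is a minimal sketch under that assumption; the sample numbers are invented for illustration and are not measurements from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-access-point figures (made up for this example).
associations = [12, 40, 75, 30]
bandwidth_mb = [110, 420, 700, 310]

r = pearson(associations, bandwidth_mb)
```

A value of `r` close to 1 would support treating event counts as a reasonable proxy for bandwidth; a value near 0 would not.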
- Recommendation Engine “e-advisor”
The e-advisor project evolved from the premise that the number of co-curricular activities on Duke’s campus keeps increasing, but students don’t have enough time to engage in everything. They want to make the most of their time by choosing the activities that are most advantageous to them.
A video was played highlighting the following:
- The Recommendation Engine project was to create some sort of tool to help students discover co-curricular programs that align with their interests.
- Co-curricular programming refers to all the learning that happens outside of the academic classes, all the clubs, all the projects, independent work, things that are not academic but are still considered learning opportunities.
- Since Duke is such a large campus with so many resources, new students often don’t know where to start.
- The idea was to build a system where new students could input their general interests into the program and based on their interests, the program would provide them with general recommendations.
- This app was designed to hold all of the activities and programs offered at Duke. To build it, the students used R Shiny to create the web interface on top of R code and built a recommendation algorithm from user-to-user and user-to-item matrices, combining collaborative filtering and content-based filtering to produce better recommendations.
- Students input their data to create a profile of all the programs they’ve participated in and this profile data is then fed back into the system to create a match for new students.
- A great tool to connect students based on their interests and help them navigate through Duke’s co-curricular ecosystem.
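The combination of collaborative and content-based filtering described above can be sketched as follows. This is an illustrative Python example, not the team's R implementation: the participation data, program tags, the `recommend` function, and the equal weighting of the two scores are all assumptions made for the sketch.

```python
from math import sqrt

# Hypothetical user-to-item data: user -> programs they participated in.
participation = {
    "u1": {"Data+", "Code+", "Roots"},
    "u2": {"Data+", "Roots"},
    "u3": {"Bass Connections"},
}

# Hypothetical content descriptions: program -> interest tags.
program_tags = {
    "Data+": {"data", "research"},
    "Code+": {"coding", "apps"},
    "Roots": {"coding", "data"},
    "Bass Connections": {"research", "teams"},
}


def cosine(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user_programs, interests, weight=0.5):
    """Rank programs the user has not done, best first.

    Collaborative score: summed similarity to existing users who did the
    program. Content score: similarity between stated interests and the
    program's tags. `weight` (an assumed knob) blends the two.
    """
    scores = {}
    for program, tags in program_tags.items():
        if program in user_programs:
            continue
        collab = sum(
            cosine(user_programs, other)
            for other in participation.values()
            if program in other
        )
        content = cosine(interests, tags)
        scores[program] = weight * collab + (1 - weight) * content
    # Sort by descending score, then by name for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


ranking = recommend({"Data+"}, {"coding"})
```

Here a student who has done Data+ and lists "coding" as an interest sees Roots ranked first, because similar users took it and its tags match. A brand-new student with no history falls back on the content score alone, which is one common way to handle the cold-start case.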
In summary, three of the four students who worked on this project will be hired to continue the project. Recommendations from this engine are adequate for now and should get better and smarter over time as the tool continues to be used and gathers more data.
Questions and Comments:
Q1. Does it explore correlations or sequences of programs, such as how many students took Roots classes before taking other programs?
A1. No, because we did not have data to connect that to anything else.
Q2. Did you try to break the algorithm’s model so as to get random recommendations?
A2. Not implemented so far but would be interesting to test it.
Q3. How many students are you shooting for next summer?
A3. 25 projects to cover enough topics and to have all groups co-located in the same space.
Q4. Were there any philosophical discussions about the idea of students following other students’ paths?
A4. No, but it would be interesting to see how the algorithms are developed to capture those aspects.
Q5. From the Computer Science perspective, as a co-sponsor with us on this effort, what are thoughts from that end?
A5. The challenge is getting the data and in this case it worked out well because the students ended up gathering the data.
Q6. Was there a naming challenge to choose a better name than “e-advisor” to attract more users?
A6. No, we plan to make it a development challenge.
- Having an MVP version of an app is better than having a polished version.
- This is a perfect example of Duke students working with Duke data to make their Duke experiences better.
- The students would have failed to prove that they had access to the data, but they turned the project into a data-gathering exercise, figured out how to collect the data via the app, and still made it happen.
4:45 – 5:00 – Code+ Wrap-up, Hugh Thomas, Jen Vizas
What it is: The Code+ student internship team presented its Park Duke mobile application at an ITAC meeting earlier in the summer. Now, with the program’s inaugural year under its belt, Code+ reflects on its successes, lessons learned, and next steps.
Why it’s relevant: As a new offering in Duke’s growing co-curricular catalog, Code+ connected students with technology in a way that not only helped the students further their professional skill sets outside the classroom but also yielded surprising results for the Duke community. We will discuss ideas for expansion and how this model could translate to other areas of focus.
Code+ 2018 Update
- 10-week summer internship supported by OIT in conjunction with DTech
- 6 female undergrad Duke Students (CS/EE majors)
- With no prior mobile app development experience...
- OIT managed with 2 part time resources (1 FTE)
- Initial goal was to investigate the use of a mobile app to address the parking challenges at Duke
- Team ended up creating an iOS application “ParkDuke” that will help improve the parking experience for University students & employees and Hospital visitors
Video highlights were the following:
- There were no guidelines for the project and the students had to think of everything on their own and write code from scratch. They also found an appreciation for all the work that goes into developing an app before writing a single line of code.
- The students were able to design the App exactly the way they wanted it.
- The students enjoyed working together and felt proud of their achievements and gained self-confidence.
Next steps for ParkDuke
- Convert the “prototype” app into a “product”
- 2-3 months of work for 2-4 interns (part time) plus OIT staff
- Integrate backend with Parking server & add real time data, user alerts and permit pass purchase features
- Improved security & standard Shib support for login & user authentication
- Beta test with select user group in Q4
- Target public launch early 2019
- Port the app to Android
- Explore other revenue opportunities by selling excess capacity
Code+ in 2019
- Growing from 6 interns to ~25 interns
- Potential areas for expansion:
- Mobile apps
- Security apps
- Incorporating Artificial Intelligence / Machine learning
Questions and Comments:
Q1. What are your thoughts on evaluating these programs? Is there a plan to do long term surveys or follow the students who have participated in Data+ or Code+?
A1. DTech is trying to take a somewhat longitudinal look at the DTech Scholars. A recent reference indicated that, over the three or four years DTech has existed, every single student who was a DTech Scholar has continued in the tech space. Given the attrition we might otherwise see among female students in tech, that is really impressive, although it reflects only three or four years of data.
Q2. Is it the intent that Code+ will continue to have the focus on underrepresented groups?
A2. Code+ will borrow whatever it can from Data+ in its approach, but we have supplemented it to also reach out to underrepresented minorities and female students. We have also had a strong internship program in OIT for several years that draws on students from NCCU and other places, so we see DTech as a good complement to that.
Q3. Are there mechanisms to de-identify the data as in genomics?
A3. There are two different issues: hesitancy in sharing data for all kinds of reasons, and security (HIPAA-protected data, etc.).
- Each of these teams has gone from being scared and not knowing how to do things to being confident and proud. The intense community experience in this effort has been great.
- We want to line up the same kind of ideas and projects for next summer in terms of how we help Duke Students with Duke’s data to improve Duke and in general to improve the IT experience at Duke in a very broad way.
5:00 – 5:30 – Celebration & Reception