Duke ITAC - February 8, 2018 Minutes

Agenda – February 8, 2018 4:00 ‐

Announcements

The minutes for 11/30/2017 and 2/23/2017 were approved.

Research Computing and the Research Computing Symposium – Mark Delong, Mark McCahill (20 minute presentation, 10 minute discussion)

What it is:  Research Computing offers services that are useful to researchers across Duke and often in collaboration with researchers at other institutions.  The group provides cluster computing services through the Duke Compute Cluster and offers storage, a protected computing environment for researchers using sensitive data, and well‐tailored virtual machines of various configurations.  Duke Research Computing offers wide‐ranging education and training opportunities to the Duke community and seeks in its activities to bring together researchers who apply computational and quantitative methods to every field represented in the University’s faculty.

Why it’s relevant:  The fourth annual Duke Research Computing Symposium was held in January 2018, when 170 researchers from across Duke converged on Penn Pavilion for research talks, poster competitions, and information gathering.  These presentations will give an overview of the services and resources available for research and the outcomes of events such as the Research Computing Symposium.

The goals of Research Computing Support are to improve workflows for researchers and to provide a flexible and responsive infrastructure that can change in minutes rather than weeks. Researchers want self-service and on-demand options for the hardware and software they need for analysis and research. Containerized tool suites that offer easy migration across environments are key factors in flexible and responsive workflows. Research Computing staff have made field trips to researchers and it is evident needs have changed over the years. The infrastructure has grown from a cluster of servers to multiple options that provide an infrastructure tailored to projects.

The Duke Compute Cluster consists of 318 servers with over 10,000 CPU cores and options for InfiniBand low-latency interconnects and Nvidia GPUs. Historically, the Duke Compute Cluster is used by submitting "jobs" that are processed, with the researcher notified when the task is completed. Unlike the cluster, the Research Toolkits service allows researchers to interact with the "job" or task. Researchers spin up virtual machines (VMs) which they completely control and use to execute tasks immediately rather than wait for jobs to complete. Research Toolkits provides 155 servers with 800 CPU cores and 3.9 TB of RAM, with self-service, on-demand provisioning of virtual machines of up to 40 CPU cores and 480 GB of RAM.

Storage is greater than 4 PB for the University and Duke Health and is presented as NAS and SAN for scratch storage. The Duke Data Commons has 1.5 PB for NIH-supported researchers, which comes from a grant established in 2014. Such grants traditionally last for 3 years, but it has been extended for an additional year with a service contract. The Duke Data Commons will be retired in September of 2018 and other options are being explored cooperatively.

As for communications, there are special-purpose networks, including protected data enclaves that provide locked-down environments for sensitive data. Work has been done to provide self-service provisioning of special circuits that bypass the core network for higher speed. There are protected data enclaves in the network for both University and Duke Health use, and the science "DMZ" has high-throughput access to off-campus resources.

Resources not owned by Duke include the Open Science Grid, which provides an opportunity for certain kinds of computing research. The XSEDE system is a more formalized shared computing resource that we know is being used because a researcher inadvertently used all of the available startup allocation. Researchers are thinking broadly about where they can compute, which is adding richness to the infrastructure that can be consumed.

The population of the Duke Compute Cluster user community is split almost evenly between Trinity, Pratt, and the School of Medicine. For Research Toolkits, the percentages are nearly the same except there is a larger "Other/Undefined" community. The Research Toolkits user base is one-third the size of the Duke Compute Cluster community.

Research Computing does cross the institution and is not limited to a particular department, although participants are expected to adhere to policies and guidelines (e.g., handling PHI). Research Computing has options for protected data using Protected Data Enclaves. Duke Health has the Protected Analytic Computing Environment (PACE) and the University has the Protected Research Data Network (PRDN). The virtualized compute resources can be presented to either of these environments. By design, PACE and PRDN limit access to the internet, which means it is hard to build or update tools that are inside the enclave. For this reason, we encourage packaging your tools in a container so that the container can be copied into the secure enclave.

A state-of-the-art example is a researcher in the PACE environment who is researching machine learning models using clinical data stored inside PACE. The researcher is using virtual machines provisioned by OIT but running in PACE. The tools are housed in a GitLab repository. Whenever changes are committed, a script builds a new Singularity container that, after a short delay, is copied onto the VM in PACE. This type of "Singularity container" has the benefit that it can also be used in other places like Research Toolkits, the Duke Compute Cluster, and third-party providers such as Azure and Amazon Web Services. It also has huge implications for research reproducibility since these containers include an entire software environment. Specialized libraries or particular versions of software that need to be consistently used across the life of the project can be replicated in the containers. Software environments can then become research artifacts and ancillary data when the work is published.
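The commit-triggered build described above can be sketched roughly as follows. This is a hypothetical illustration, not the researcher's actual script: the file names, the destination host, and the use of scp for the copy step are all assumptions, and the function only assembles the commands (a dry run) rather than executing them. The one external command it relies on is the Singularity command-line form `singularity build <image> <definition file>`.

```python
from typing import List


def container_update_commands(definition: str, image: str, destination: str) -> List[List[str]]:
    """Return the commands a commit-triggered script might run, in order.

    All three arguments are hypothetical placeholders chosen for illustration.
    """
    return [
        # Build a fresh container image from the definition file kept in the repository.
        ["singularity", "build", image, definition],
        # Copy the finished image to the VM (scp is an assumed transfer method).
        ["scp", image, destination],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try anywhere.
    for command in container_update_commands(
        "Singularity", "tools.simg", "user@vm.example.org:/containers/"
    ):
        print(" ".join(command))
```

Because the image is a single file that carries its whole software environment, the same copy step could target other places where the container is used.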

The Research Computing Symposium was held on January 22, 2018 at Penn Pavilion. Talks included "Duke Research Support," "Data Enabled Precision Medicine for Tomorrow’s Health Care @ Duke" and "Scale and Ambiguity in the Digital Analysis of the Spaces of the Holocaust (or Why Bother Making an Art Historian a Member of Your Team)". Speakers also came from different backgrounds. Some were experts for whom research computing is an integral part of their work environment. An art historian, expert in his field, spoke of his challenges understanding some of the basic building blocks of computational research.

The departments and schools of participants included Trinity Arts and Sciences, Pratt Engineering, and the School of Medicine. Posters were also presented in two tracks: Research Computing, which was very broad, and Scholars@Duke Visualization, which used Scholars@Duke plus a dataset provided by the Graduate School (memberships of dissertation committees). The idea was to get at the functional interdisciplinarity of the education and research endeavor at Duke. There were twenty-five Duke departments represented in the poster session, eleven collaborating institutions world-wide including a couple from India, and a total of thirty-six posters.

What's next? There will be another Research Computing Symposium in January of 2019, with one speaker already confirmed. There will be workshops on automation, containers, and tools. There is work on setting up GitLab repositories with automatic builds, making it easier to use Singularity containers for research workloads. There is exploration of what would be involved in automated "cloud bursting" for short-term rental of machines that are normally more expensive, especially if they are not kept busy. A use case would be extremely large-scale resource pools containing GPUs to be used during machine learning training.

Q: What was the type of XSEDE allocation that was accidentally used up by the researcher?

A: It was an introductory offering which lessened the impact of the incident.

Q: There are commercial services like Azure and Amazon Web Services (AWS) that have an educational branch with discounts but these are not well advertised. How can Duke take advantage of these?

A: Duke is actively working on a contract with Amazon but does not have an announcement at this time. The goal is a contract that covers both the University and Duke Health. There is also Globus Connect, which provides fast delivery of data from outside the Duke network with GridFTP.

Q: Automation and containers are great but are created via web interfaces. Is there a way to use a command-line interface to spin up and bring down instances?

A: This could be done. A lot of the web calls are using command language and if there is a need, we could look into developing this as an alternative.

HackDuke – Stephanie Ding (10 minute presentation, 5 minute discussion)

What it is:  HackDuke is Duke University’s largest undergraduate tech community.  Our overarching goal is to build a collaborative and socially conscious tech environment at Duke.  Each fall, we organize what we are best known for, our Code for Good hackathon, the nation’s first student‐led social good hackathon which has since inspired many other college hackathons to incorporate similar themes.

Why it’s relevant:  HackDuke, one of the largest student hackathons in the Southeast, is held annually throughout Duke’s engineering campus.   It started in the fall of 2013, becoming the nation’s first philanthropic hackathon with its emphasis on social good.  In the Fall of 2017, the annual hackathon had grown to more than 500 student hackers representing about 23 schools across the nation who spent 48 hours coding for good on Duke’s campus.

HackDuke started in 2013 with a small group of students who wanted to start a Hackathon at Duke. HackDuke is now established on campus with a membership of thirty-five students. Each year, HackDuke organizes a Hackathon in the fall. A "hackathon" is a 24 to 48 hour event with technical students and their mentors gathering to attend workshops and complete tasks or challenges. Hacking in this case involves building a project and demonstrating it at the end of the hackathon. Attendees also benefit from the hands-on experience of coding. Organized hackathons have been gaining popularity in the last few years.

What sets HackDuke apart from other organizations like it is that HackDuke focuses on social good. One of the founders felt that although hackathons were great, they could do more to effect real change. The Duke Hackathon was organized into four tracks: education, health & wellness, inequality, and energy & environment. Attendees hack along those tracks working with mentors, both technical and non-technical, to learn how technology and social good can be used together. Prize money is donated to a charity of the winner's choice. Duke was one of the first schools to combine hacking with social good and other learning institutions are interested in learning how these two areas can be combined.

Previously, HackDuke offered an education series during the year but other technical organizations have begun doing this and HackDuke may discontinue the education series so content is not duplicated. HackDuke did see a unique area to explore and that was in design. The art classes on campus do not cover product design and HackDuke decided to organize a conference focused on design-thinking and applications for design. This gave an alternative to students who liked the software but were not as attracted to engineering studies. The third Ideate Design Conference is a one-day event at the Rubenstein Arts Center with design classes and workshops focused on what product design is.

HackDuke Partners was started last spring with a focus on practical approaches to social good by reaching out to local non-profits in search of technical needs that HackDuke could address. They worked with the Community Engagement Fund by building a web scraper to find sources for affordable housing in the area. With Meals on Wheels, they built an optimizer for routes that saves time and fuel when delivering meals.
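The minutes do not say how the Meals on Wheels route optimizer works. As a rough illustration of the general idea only, the sketch below uses a simple nearest-neighbor heuristic on made-up coordinates; the function names, the straight-line distance measure, and the sample points are all assumptions, and a real tool would more likely work from road distances or drive times.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def route_length(route: List[Point]) -> float:
    """Total straight-line distance travelled when visiting the points in order."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


def nearest_neighbor_route(start: Point, stops: List[Point]) -> List[Point]:
    """Order the stops greedily: always go to the closest stop not yet visited."""
    route = [start]
    remaining = list(stops)
    while remaining:
        closest = min(remaining, key=lambda stop: math.dist(route[-1], stop))
        remaining.remove(closest)
        route.append(closest)
    return route


if __name__ == "__main__":
    kitchen = (0.0, 0.0)                                # hypothetical starting point
    deliveries = [(5.0, 0.0), (1.0, 0.0), (3.0, 0.0)]   # hypothetical delivery stops
    print("listed order:", route_length([kitchen] + deliveries))
    print("reordered:   ", route_length(nearest_neighbor_route(kitchen, deliveries)))
```

A greedy ordering like this is not guaranteed to be the shortest possible route, but it is easy to explain and usually shorter than visiting stops in the order they were entered.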

One issue is that the HackDuke club is small with only 35 members and a lot of time is spent planning the Hackathon and the Ideate Design Conference. They are talking to Duke Catalyst, a tech social organization with a rush period each semester where members pledge. Duke Catalyst was started by HackDuke members two years ago. There are hopes that these two groups can partner on activities and events.

The Hackathon style has been positive for the HackDuke members, including a "Girl Hack" event. It was rewarding to build something with other women and while the end result was not as good as they hoped, it did encourage some participants to become computer science majors. The experience was enhanced by mentors who worked with individuals through the night. This type of focused assistance differed from experiences in classes and was a huge benefit of hackathon-type events.

Hackathons are also team-based, which teaches the participants to work with others in order to reach a goal. Sponsorship is also an important part of these events. Tech companies participate, which allows them to see what Duke students can accomplish. Hackathons also include community involvement. At last year's Hackathon, the organizers invited entries from the "Code for Durham" chapter, a volunteer organization of IT industry staff. A sample project included developing a tool that reported on the lines and wait times at election polling locations. This type of collaboration gets the HackDuke members out of the Duke "bubble" and into the community.

It has been challenging to combine technology and social good in the space of a one-weekend hackathon. The HackDuke members are a small body and there is a need to reflect on what they've been doing and get feedback on how they can be effective while meeting the needs of the club. Some may be interested in technology without the social good component. HackDuke is trying to partner with more Duke technical organizations so they can have more of an impact. In the end, the challenge is integrating social good into the hackathon without detracting from the technology aspect.

Q: Are you still getting good products out of the weekend? Is the energy good? If yes, keeping the theme makes sense but if the "well" is starting to dry then that may be an argument for broadening your theme.

A: In previous years, the products were disappointing and not thought out. This year the organizers made changes and added a lot of resources to the website in preparation for the hackathon. This year's products were some of the best the organizers had seen. This is encouraging but there is still some uncertainty and this might be something the organizers can try to measure.

Q: How many participants in the hackathon?

A: In the past, they've had over 800 but have scaled down to around 500 attendees. Users must apply to join the event (there are over 2000 applicants). Each submission is reviewed which is a time-consuming process. This year, Duke students were automatically accepted since this was a Duke organization (75% of the attendees were Duke students). The rest were students from other colleges.

Q: Have you explored having more of these events or ways to expand beyond the hackathon weekend? What support would you need?

A: Time was in short supply this year but they are exploring working with startup accelerators for funding or mentorship, perhaps with some kind of automatic acceptance or entry as a startup for hackathon winners.

Q: You mentioned trying to find those connections in the community to keep up that social good aspect. Have you heard of a group called "Social Innovators"? Each semester, they partner with a local non-profit business or city office. They may need tech talent.

A: HackDuke will look into that.

Q: You mentioned some resistance in the administration to supporting HackDuke. Are there things that could make it easier or ways to partner in order to help your mission?

A: The biggest collegiate hackathons receive monetary funds that could be in the tens of thousands of dollars. The prize for the HackDuke hackathon this year was $5000 (up from $1000). This is important because raising the money is time-consuming and that time could be better spent working on the event itself, including recruiting mentors and integrating the social good aspect in a better way. HackDuke would also love to see more professors at the hackathon. Also, some students cannot commit to the weekend because they have midterms or assignments. At some engineering schools, the professors acknowledge the hackathon and try to lessen assignments or defer midterms. It was suggested they reach out to the engineering or computer science deans to see if this could be a possibility.

IoT (Internet of Things) – Libby Evans, Maria Liberovsky, Evan Levine (20 minute presentation, 5 minute discussion)

What it is:  The Internet of Things is a network of physical “smart” devices ranging from appliances used in your home or dorm to cars to heart implant monitors that are able to inter‐operate using various technologies.  As consumer‐level and enterprise device options continue to grow at an incredible rate, the Duke Digital Initiative is partnering with the Innovation Co‐Lab to embrace and prepare for this influx of technological possibilities.  

Why it’s relevant:  During 2017‐2018, the Duke Digital Initiative (DDI) is exploring the opportunities and challenges of IoT and how this massive connectivity of devices might impact Duke inside and outside the classroom.  As a part of this research, the Innovation Co‐Lab has sponsored three IoT kit distribution events.  The kits included a Photon microcontroller, a small computer that can control a variety of sensors (light, temperature, moisture, motion, etc.) and actuators (buttons, LEDs, motors, etc.).  What types of creative masterpieces were invented using these toolkits?  What role will Duke and the Innovation Co‐Lab continue to have in the IoT initiative?  Those working on these endeavors will provide answers for these questions and more!

The Duke Digital Initiative or "DDI" traces back to 2004 and "The iPod Experiment" with Apple, in which every incoming student at Duke received an iPod. Observers saw the benefits of digital experimentation and DDI was formed in 2005. DDI was a collaboration between OIT and the Center for Instructional Technology (now Learning Innovations). DDI funded a variety of faculty projects between 2014 and 2018. The process involved a faculty member submitting a request to DDI, which was reviewed by the DDI admin team to determine whether or not to fund the project. Some staff requests were also funded in the areas of emerging technologies (hardware, software, etc.).

In 2017-2018, the Internet of Things or "IoT" Initiative was formed as a result of staff inquiries into this technology. There were IoT devices available at local department stores and the technology was showing up in unexpected places like light bulbs and appliances. There were four components to the initiative. The first was to install consumer IoT devices in the Technology Engagement Center where anyone on campus could come into the TEC and try them out. The second was to host events related to IoT for the campus community. The third was to purchase IoT kits that by request could be available for faculty use in courses and students in classes. The fourth was to distribute up to 500 IoT kits to any Duke student who wanted one.

As to why an initiative was needed, the number of devices with internet connectivity is rapidly increasing and is expected to exceed 50 billion by 2020. These items are showing up in dorm rooms and offices and are becoming part of digital literacy. We wanted to learn more about the challenges and opportunities of this technology. That type of investigation is a core component of the Duke Digital Initiative. There is also the concept of "coding for all" regardless of a student's academic program affiliation so that a user can have a basic understanding of what coding is, what it means, and how it's done. Duke as an institution also needs to know how to handle these types of devices and how they impact our security and network. A variety of IoT devices have been added to the TEC including smart lightbulbs, weather stations, home assistants, padlocks, and air quality monitors, and more items are coming. Staff at the TEC are available to show users the devices. Users can also suggest other smart objects for consideration as they become available.

Several options were considered and evaluated for the IoT kits that would be purchased for the students. Considerations included cost, documentation, market relevancy, and ease of use regardless of technical background. Once a kit decision was made, the DDI bought some for localized testing. Staff from DDI and the Learning Innovation Center participated in a "group build" using the IoT kits. The experience was very useful in that it revealed challenges in getting the kits onto the Duke network. This included identifying the hardware address, a required value for registering these kits on the network. Testers had to connect wirelessly to each kit to capture that value and everyone attempting to do this at once resulted in uncertainty about which broadcasted name was associated with each kit. In the end, the testers determined it was best to have only one person attempting to connect to a device at a time. The testers were able to pass on this information to subsequent users and the CoLab added this to the documentation.

One of the elements of the IoT initiative was to provide these devices for use by professors and instructors in the classroom. Three curriculum-based courses were interested in participating, one in the fall of 2017 ("Computing and the Internet of Things") and two in the spring of 2018 ("Performance and Technology" and "Engineering Innovation"). Students from the fall course gave project demonstrations in an evening event at the TEC. Kits have also been given to independent study users. The DDI does plan to host a showcase of how these kits have been used by the recipients some time this year.

The DDI distributed kits in several ways during the initiative. Kits were provided as part of a Roots class with three more scheduled in the spring. They are considering a class beyond beginner for those who've received the kit but are having trouble setting it up or want to move farther along in their experimentation. DDI had a studio night in the fall and plans another in the spring and also did a popular Internet of Things class as part of "HackDuke". There have been four distribution events where almost all the kits were given out. The participants were checked in using an IoT device built by student developers at the CoLab. Students could swipe their DukeCard and the device captured the netid and tracked who already had a kit. Interest at the first distribution event was very high and participants were lining up 45 minutes before the start of the two-hour event. The group self-organized by assigning numbers to those in the line as a way of keeping distribution fair. Before the event started, all the kits had been claimed. The staff learned after the first event to publicize only a start time. Otherwise, students expected to come any time during that two-hour window and get a kit.
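The check-in logic described above (capture the netid on a swipe, and track who already has a kit) could look something like the sketch below. It is only an illustration: the class name, the lowercasing of the netid, and the kit counter are assumptions, and the actual device built at the CoLab is not described in these minutes.

```python
class KitCheckIn:
    """Remember which netids have already received a kit (hypothetical sketch)."""

    def __init__(self, kits_available: int) -> None:
        self.kits_available = kits_available
        self.recipients = set()

    def swipe(self, netid: str) -> bool:
        """Return True, and record the hand-out, only for a new netid while kits remain."""
        netid = netid.strip().lower()  # assumed normalization so "ABC123" matches "abc123"
        if netid in self.recipients or self.kits_available == 0:
            return False
        self.recipients.add(netid)
        self.kits_available -= 1
        return True


if __name__ == "__main__":
    desk = KitCheckIn(kits_available=2)
    for card in ["abc123", "abc123", "xyz9", "new1"]:  # made-up netids
        print(card, "gets a kit" if desk.swipe(card) else "is turned away")
```

Keeping the recipients in a set makes the "already has a kit" check a single lookup, which matters when a long line forms before the event starts.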

The outcomes from this initiative include a better understanding of the implications of IoT devices for cybersecurity, skills around circuitry and electrical engineering, basic coding skills, and wireless security. There was also exposure to this technology for faculty, staff, and students regardless of academic background, even if they were not building with the kit. One student employee in the CoLab built a facial recognition lock for his dorm room. Another developed a module for an Alexa device so that certain spoken commands or keywords provided specific information about the Innovation CoLab. There have also been a few "Open Houses" where anyone could come and learn about and use IoT devices.

Students were sent a survey 4-6 weeks after they received a kit. The survey was relatively short with 7 questions and out of the 110 who were contacted, 50 completed the survey. This was a good result considering the difficulties usually encountered in getting recipients to respond to surveys. Users can contact the DDI team if there is a device they would like to see included at the CoLab or events they would like to see on campus around this topic.
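For reference, the response rate implied by the figures above can be worked out directly; the variable names are only illustrative.

```python
contacted = 110  # students who were sent the survey (from the minutes)
completed = 50   # students who completed it (from the minutes)

response_rate = completed / contacted
print(f"Response rate: {response_rate:.1%}")
```

That is a little under half of the students contacted.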

Discussion

These types of devices have been an issue on the Health system side. In the past, access was set using login and password authentication. However, the wireless networks now use a certificate model which is not possible for many of these devices. They are still working through the best ways to include IoT objects, such as using a wired connection instead of wireless if that is possible.

One of the first devices DDI tested was a light bulb but the directions required that a button on the Wi-Fi device be pushed. The tester recognized this would be an issue on the Duke network and it was not possible to bypass this requirement. Another issue was that these devices were registered on the DukeOpen network but the networking group preferred the devices use the secure Dukeblue network. And finally, a lot of IoT devices required that the user create an account with a vendor, and management of that account information in shared environments was problematic.