Duke ITAC - July 12, 2018 Minutes
Note: ITAC meetings are digitally recorded for meeting minute generation; audio files for each topic are posted to an authentication-protected repository available on request to any ITAC member. Presenters are welcome to request to have their audio program excluded from the repository.
Agenda – July 12, 2018
4:00 - 4:05 – Announcements (5 minutes)
The minutes from 1/11/18 were approved.
4:05 – 4:40 – Code+ Summer Internship Program, Hugh Thomas, Carolyn Blumberg, Jane Li, Teresa Mao, Alanna Robinson, MacKenzi Simpson, Qingyang Xu (25 minute presentation, 10 minute discussion)
What it is: Code+ is a new, 10-week summer internship program available to Duke undergraduates to apply their education while gaining real-world experience on technology-related projects. This summer, six students are working with Hugh Thomas in OIT’s Enterprise Mobile Applications team, on Park Duke: a mobile app to improve the parking experience at Duke.
Why it’s relevant: This internship provides an opportunity for students to work with a dynamic team of technology professionals, while making an impact on Duke’s digital landscape. In particular, this year’s mobile app project introduces them to innovative mobile technologies. The interns will join us to share their experiences with the program.
The Code+ program, modeled on Data+, included summer interns who were all rising juniors or rising sophomores who had not known each other before this experience. Six interns participated with a project goal to conceptualize an app for parking at Duke. The interns were asked to identify what the parking experience was like and how it could be improved through the use of an app. The audience included faculty, students, and staff as well as visitors to Duke Hospital and Duke Campus. Expectations of the results were modest but the interns exceeded them. The initial thought was that the Code+ team would go and interview people, establish requirements, define the features, and then perhaps create a PowerPoint presentation or slide deck that would illustrate how such an app would work. During the third week of the program, the Code+ team attended an introductory class on iOS programming. The interns decided to try to build the app.
The team members were able to go beyond design to actually build an almost fully functional iOS app. App functionality goals included the purchase of permits in the app, analysis of historical parking trends data, the use of push notifications in the app, and even the identification of new revenue by selling excess parking capacity on campus. Over the past eight weeks, the students have been exposed to a variety of modern app development tools and techniques including the use of Photoshop to create the logo, AutoCAD, Git which was used to store the integrated app files, Ruby on Rails which was used for the server on the back-end, and Xcode which was the integrated development environment used to code the app.
The end result is an app called “Park Duke”. For hospital patients or visitors, there is a button exclusively for this user community. When selected, the visitor can indicate the Duke destination which shows an overlay of all of Duke Hospital. This is populated using Duke Hospital's online map which marks all the entrances, bus routes, and parking options available. The parking options also show the driving distance from the user’s current location and walking distance from the parking garage to the Duke facility. The user can click a “Start trip” button which activates the Maps app for directions to Duke. The developers also want to set off a notification to hospital staff for any special needs such as wheelchairs. The part of the trip that is driven is in one color and the walking directions another. Entrances that offer valet service are also identified. There is an emergency button the user can click and if the “Start trip” is tapped, it will take you from your current location to the Emergency department.
“Park Duke” also provides options for Duke faculty, staff, and students as well as campus visitors. The developers hope to support Shibboleth authentication but that feature is not currently implemented. Users in this part of the app are taken to a map page. The developers intend for the category of user to dictate what they see. Campus visitors would only see parking available to them while students and those employed by Duke would see other restricted parking areas. Duke parking permit holders would see only lots that are accessible to the permit they purchased. The developers are still working with the data in order to support this feature.
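The visibility rule described above (what a user sees depends on the user category and permit) can be sketched as a simple filter. This is only an illustration in Python, not the interns' implementation; the lot names, permit codes, and the `visible_lots` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lot:
    name: str
    open_to_visitors: bool
    permits: frozenset = frozenset()  # permit codes accepted by this lot


# Hypothetical sample data, for illustration only
LOTS = [
    Lot("Visitor Garage", True),
    Lot("Lot A", False, frozenset({"A"})),
    Lot("Lot B", False, frozenset({"A", "B"})),
]


def visible_lots(lots, permit=None):
    """Return the lot names a user should see on the map.

    permit=None models a campus visitor; otherwise only lots that
    accept the given permit code are returned.
    """
    if permit is None:
        return [lot.name for lot in lots if lot.open_to_visitors]
    return [lot.name for lot in lots if permit in lot.permits]


print(visible_lots(LOTS))        # visitor view
print(visible_lots(LOTS, "B"))   # permit holder view
```

Keeping the rule in one function means the map page only needs to know who the user is, not how each lot is restricted.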
Clicking on a parking garage pin will generate a call-out window showing the total capacity and the number of available spaces, along with the driving time from the current location to the garage. Users can click a button to see parking trends for that location based on data from the Duke Parking office (e.g., days of the week that are less busy). There is also a button for payment options. A user can provide vehicle information, indicate the amount of parking time needed, get information on the cost, and pay on the device using a credit card or Apple Pay. The app includes a function similar to Waze in that users can report time-stamped information for other users such as “spaces available on the third floor”. “Park Duke” includes an option to call Duke Parking from your phone or send them an email. You can also submit a report (for example, report potholes or obstructions, including pictures). In the future, these reports could even include location information. There is also an option to report bugs or submit feature requests.
The app includes push notifications generated by a server running Ruby on Rails. This infrastructure allows faster communication between Duke Parking and its users. Currently, most communication is done over email or posted on the Parking website. “Park Duke” users can see alerts that have already been submitted and can subscribe to other kinds of notifications. For example, a Blue Zone customer could get notifications for parking conditions on a game day.
There is also a possible revenue-generating feature: for the parking lots that are restricted by permit but have available spaces either visible or based on trending data, app users could purchase a temporary pass. Envisioned originally for permit holders, the app developers believe this could be expanded for campus visitors. The transaction will take place on the device without needing to go to the Parking Office. The purchaser would end up with a QR or “quick response” code admitting them into the lot. Of course, parking enforcement could be an issue where a user’s time expires but this would be something handled by the Parking Office.
For permit holders, “Park Duke” provides account information based on records from Duke Parking including registered vehicles. The app allows users to update vehicle information. The QR code for the permit is also available with parking locations for that permit highlighted. “Park Duke” also has a process for citations. A common complaint from users is that they are not aware they received a citation or are unsure why they were cited. This information could be made available in the app including payment options and steps for appealing.
In summary, the Code+ interns learned about the variety of technologies that modern developers use and were able to implement Duke-specific features in their app. They were able to uncover new revenue opportunities such as selling excess parking lot capacity. Two more weeks are left in the program and the participants are eager to continue working and feel passionately about what they have done and how it could improve the experience of the Duke community. The team sponsor indicated the team's progress was a huge surprise and they managed to produce an excellent proof-of-concept product with little-to-no coding experience. Finally, there are discussions with Duke Parking about the feasibility of the app.
Q: What is DTech?
A: This is an acronym for Duke Technology Scholars, a program that was started around four years ago with a goal of placing Duke female students in STEM internships, initially in Silicon Valley, then expanding to Chicago and RTP. It was a way to address retention problems. Silicon Valley interns all lived in the same housing. Even though they worked in different jobs, they were having shared experiences.
Q: Would the parking options change depending on the time of day?
A: Yes. We have some exciting ideas around changing the color of the pins based on if the lot permits parking after hours.
Q: For the notifications, would this data come from Duke Parking?
A: Yes, they would control this information.
Q: Could notifications also warn when your parking window is about to expire?
A: Yes. We would love for the app to do this. Much of the payment information was outside of our skill level since so much of it was dealing with credit card information and Apple Pay. But we envisioned the app allowing a user to even buy more time using the app.
Q: For the QR code and permits, does this mean I could give my physical pass to someone else and we could both be parking at the same time? Does the system know the pass has already been scanned?
A: The Parking Office already tracks this and if a user attempts this, one of them could end up stuck in the lot or be denied entrance.
Q: What if I utilized a “universal pass” for coming to campus? Would the app show me all the places I could park?
A: Yes, this information would show up when you select permit information.
Q: For GoPass users, could the app address situations where a GoPass holder who normally bikes to work has to drive? This would not be a purchase.
A: This is an excellent idea for the developers. It is important to note that the interns were able to do far more than they expected. This is a proof-of-concept app. There is much opportunity to build even more interesting features.
Q: Why not build a web application? Why develop for the iPhone?
A: Notifications are difficult in a web application. A spreadsheet ranked the features and requirements for the app and notifications were near the top; not just alerts but social media as well. A native app addressed this and payment features.
We have had multiple incarnations of a mobile app for Duke. We started with a native app and then went to a web-based tool in a browser using responsive design but have come back to native apps because too much of the integration that is needed is not available in a web-based app.
The promise of a single web app that could meet all platforms has not materialized at this time.
Q: For the social media aspect, do you anticipate these posts expiring? And what about including verification or “likes”?
A: Yes and yes. Initially we thought daily.
4:40 – 5:05 – Open Science Grid and Globus, Steffen Bass, Charley Kneifel (15 minute presentation, 10 minute discussion)
What it is: The Open Science Grid (OSG) is a service for running calculations on a variety of systems across the US that takes advantage of unused compute capacity. Another service, the Globus project, has been running for more than 20 years and has brought us services like Grid FTP, the general concept of grid computing, and Globus Connect for moving large files reliably and securely across the internet.
Why it’s relevant: This presentation will give you an idea of the capabilities of Globus, how OIT has implemented an initial set of services, and how both OIT and DHTS plan to implement them in support of shared campus research missions. Professor Bass will discuss how his lab has taken advantage of OSG to harvest hundreds of thousands of hours of compute time to help move his research into the first microseconds of the universe forward.
The Open Science Grid, or "OSG" (opensciencegrid.org), provides community-contributed computing services. It is funded by the National Science Foundation and the Department of Energy. Its software stack makes it easy to run jobs anywhere. A wide variety of national resources have been committed, and the OSG takes advantage of those resources as well as underutilized resources at national labs and at any participating community members. The OSG is a virtual organization designed to support things like high energy physics, structural biology, and community virtual organizations such as ALICE at CERN. There are also member contributions of staff time, computing resources, and storage resources. The OSG is made up of 125 institutions with 500,000 computing jobs per day and 925 million CPU hours last year (equivalent to 3000 servers with 36-core CPUs).
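The server equivalence quoted above can be checked with a few lines of arithmetic. The sketch below assumes every core is busy around the clock for a 365-day year, which is an idealization rather than a statement about how the OSG actually schedules work.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def cpu_hours_per_year(servers, cores_per_server):
    """CPU hours delivered if every core runs all year (idealized)."""
    return servers * cores_per_server * HOURS_PER_YEAR


capacity = cpu_hours_per_year(3000, 36)
print(capacity)                        # 946080000
print(round(925e6 / capacity, 3))      # 0.978
```

So 925 million CPU hours is roughly 98% of what 3000 fully loaded 36-core servers would deliver in a year, which matches the comparison in the minutes.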
Duke is using the OSG for research including a study investigating the form of matter as it has existed in the known universe. By colliding heavy atomic nuclei at near the speed of light, we can create temperatures and pressures that equal those of the early universe in a controlled lab environment. The way to study this is with particle accelerators such as the Large Hadron Collider, which sits underground near Geneva on the border between Switzerland and France. There are four dedicated experiments running on the collider, one of which is the ALICE experiment. Pictures of the tracks of the particles that emerge from the collisions are taken. The scales of the problem are such that you are not observing the collision itself but the tracks of the remnants from the collision. The difficulty comes in identifying and analyzing the track of each particle remnant.
Duke is running computer models for these collisions and creating physics simulations. A few seconds of simulation took an hour to run on a single CPU. Rendering was also done with the actual physics calculation taking over an hour. These are time scales and size scales that cannot be resolved with any kind of camera system. To gain insight, you compare the outcome of the measurement to the outcome of the simulation; if the two agree well, you can turn the clock back in your simulation to the hot and dense phase of the system with the properties you want to study.
On one side there is a large volume of data, several petabytes per year, generated as output by these experiments; on the other side there are computer simulations. By comparing the outcome of these computer simulations to the reduced data, we can gain insights on the properties of the system. The problem that we have is that these computer simulations are complex, with a dozen parameters or more, and the data are reduced into observables (summary forms of these data) in multiple different ways. Each of the parameters acts as a "knob" that can be turned in the computer simulation and that will affect the simulation predictions for several of these observables. In the language of physics, the problem does not factorize. The researcher has to simultaneously tune all of the parameters in order to proceed. This is where the collaboration with the statistical sciences provides assistance. Previously we determined the model parameters, made a rough estimate for these parameters, ran the physics model, calculated the observables, and then compared the data. Now we essentially let the data inform us what the optimum parameters are. By doing an analysis that explores the full multidimensional parameter space simultaneously to find the optimum probability distributions for each of the parameters, we can simultaneously calibrate model parameters and determine the parameter values that best describe the data.
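A toy illustration of why the knobs must be tuned together: in the sketch below, both observables depend on both parameters, so the best values are found by scoring every combination against the data at once. The two-parameter `model`, the "measured" values, and the uncertainties are all invented for illustration, and the brute-force grid is a stand-in for the statistical analysis described above, not the group's actual method.

```python
def model(a, b):
    """Toy stand-in for the physics model: each observable depends on both knobs."""
    return (a + b, a * b)


# Hypothetical "measured" observables and their uncertainties
data = (5.0, 6.0)
sigma = (0.1, 0.1)


def log_likelihood(a, b):
    """Gaussian log-likelihood (up to a constant) of the data given the knobs."""
    predicted = model(a, b)
    return -0.5 * sum(((p - d) / s) ** 2 for p, d, s in zip(predicted, data, sigma))


grid = [i / 10 for i in range(51)]  # knob values 0.0, 0.1, ..., 5.0

# Score every (a, b) pair jointly. The toy model is symmetric in its
# knobs, so only pairs with a <= b are considered.
best = max(
    ((a, b) for a in grid for b in grid if a <= b),
    key=lambda pair: log_likelihood(*pair),
)
print(best)  # (2.0, 3.0)
```

Fixing one knob and scanning the other would miss this solution unless the fixed value happened to be correct, which is the practical meaning of "the problem does not factorize". With a dozen parameters a full grid is no longer affordable, which is why the sampling and interpolation techniques discussed in these minutes matter.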
Only 15 years ago, this research was too complex and would have taken billions of years to complete. Statistical sciences was able to show the inefficiencies of this approach and help find techniques to reduce the CPU time needed for the simulations. The solution was Gaussian process emulators, which drastically reduced the time needed. Given model runs at different points in parameter space, an emulator is a fast way to interpolate between those points as far as the model output, or the observables that we calculate, are concerned. Thanks to the Gaussian process emulators, we are down to 2 to 3 million CPU hours per analysis, which is still a lot of computing but is manageable. This is where the OSG comes into play. In a year, over 109 million CPU hours were distributed over 132 million jobs. Of the total hours, Duke has used 10.6 million and is the second largest user in this virtual organization. This research is running on dozens of facilities across the United States that have hosted our jobs and have contributed to the CPU time that we have harvested from the OSG. Over the past five years, Duke has used 51.8 million CPU hours, for an average consumption of about 10 million hours per year. Duke is not directly paying for this usage and money is not coming from a grant (there are indirect costs). The initial application for usage of the OSG, which granted Duke access, was written 10 years ago, and the service is relatively low maintenance aside from adding or removing users. Additional applications have not had to be submitted.
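The interpolation idea behind the Gaussian process approach can be sketched in one dimension: run the expensive model at a handful of design points, then predict its output elsewhere from those runs alone. Everything below is an assumption made for illustration (the sine function standing in for the costly simulation, the squared-exponential kernel, its length scale, and the seven design points); it shows only the posterior-mean interpolation and is not the research group's code.

```python
import math


def rbf(x1, x2, length=1.0):
    """Squared-exponential kernel: similarity of two parameter values."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def expensive_model(x):
    """Hypothetical stand-in for a simulation that takes hours per run."""
    return math.sin(x)


# Run the "expensive" model only at a few design points.
design = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
outputs = [expensive_model(x) for x in design]

# Precompute weights = K^-1 y once; each prediction is then a cheap dot product.
K = [[rbf(a, b) for b in design] for a in design]
weights = solve(K, outputs)


def emulate(x):
    """Gaussian process posterior mean at x, given the design runs."""
    return sum(w * rbf(x, d) for w, d in zip(weights, design))


print(emulate(2.5), expensive_model(2.5))
```

Once the weights are computed, each call to `emulate` costs a few multiplications instead of a full model run, which is where the savings in CPU time come from. A production emulator would also report the predictive uncertainty and handle many parameters at once.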
One of the disadvantages of the OSG is reliable availability. If the computer clusters are being used, there is no space left. Jobs are processed when the resources are not in use or are being used lightly. While we are averaging 10 million CPU hours per year, this is not a constant stream and our demand fluctuates as a result. Duke sometimes has to wait a month or two to accumulate enough CPU hours. For reliability, you must use dedicated resources such as NERSC (National Energy Research Scientific Computing Center). NERSC requires you submit an application and demonstrate that you can run on multiple CPUs in an efficient way. This year's allocation for the researcher was 10 million hours, significant for a group in an organization. However, the OSG is as important to the researcher as NERSC in terms of resources. NERSC is a Cray with 5500 nodes and 24 cores each that maxes out at 2.5 petaflops. The researcher’s workflow is to submit a job that runs on 1000 nodes (48,000 cores) which runs simultaneously. They can run the 30 million events that they need for one calculation in a day. At NERSC on one particular day, the research was running in excess of 100,000 cores.
The usage profile of the research is that computations run as multiple jobs on separate CPUs and accumulate the results without cross-talk between the jobs, which is a perfect use case for distributed computing. In comparing OSG and NERSC, OSG allows for unlimited CPU usage depending on availability, without the need to apply for each use and with very low administrative overhead. This is perfect for students who can experiment without penalties. We can average 10 million CPU hours per year, which is comparable to a sizable NERSC allocation. NERSC requires a lot of administrative overhead and it is necessary to spread the allocation over the year. Every quarter, your usage is analyzed and if you haven't used a certain amount of it, NERSC will reduce your allocation and give it to the public pool. This requires planning, preparation, and rigorous testing so that you efficiently use your allocation. That said, this is an excellent, reliable and predictable resource. The OSG and NERSC are complementary. We rely on the OSG when we have used our NERSC allocation.
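The "independent jobs, accumulate at the end" pattern described above can be sketched as follows. The workload here (a Monte Carlo estimate of pi) is only a placeholder for the physics events, and a local thread pool stands in for the grid; on the OSG each job would be a separate batch submission. The function names and job sizes are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_job(seed, n_events=10_000):
    """One independent job: simulate n_events and return a partial tally."""
    rng = random.Random(seed)  # per-job generator, no state shared between jobs
    hits = sum(
        1 for _ in range(n_events)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return hits, n_events


def accumulate(results):
    """Merge the partial tallies once all jobs have finished."""
    hits = sum(h for h, _ in results)
    total = sum(n for _, n in results)
    return hits / total


seeds = range(8)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_job, seeds))

estimate = 4 * accumulate(results)  # Monte Carlo estimate of pi
print(estimate)
```

Because each job depends only on its own seed, jobs can run in any order, on any machine, and be rerun individually if a node disappears, which is what makes opportunistic resources like the OSG usable for this kind of work.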
This research generates terabytes of data which must be transferred for analysis and Globus is the resource we use for the storage. A set of Globus nodes are being deployed on campus for data movement. The Globus project has a current focus on data management (transferring data). It can be used to share the data that you have generated, publish your data in a discoverable way, and develop applications and gateways for workflows. Globus is also migrating to APIs (application programming interfaces) which we are exploring for automation. We have a primary Globus Connect node in our Science DMZ which is connected to the Computing Research storage and Data Commons storage. We can do other connectivity as needed including an object store such as Amazon Blob or a NAS.
Duke has a Science DMZ. This is a network location that is not behind a firewall and is not subject to the normal inspection devices that slow down transfers. When a user transfers files from outside of Duke to a lab machine, typically the user will encounter the IPS (intrusion prevention system) on the edge of our network as well as other firewalls. This can slow the transfer speed from a 10 Gb network connection to 2 or 3 Gb. The Science DMZ preserves the dedicated 10 Gb speed by bypassing the IPS and other firewalls. Security is provided indirectly through intrusion detection services that check outside of the dataflow, thus preserving speed. The Science DMZ has connections to Internet2's AL2S (a layer 2 service), RENCI's BEN (breakable experimental network), a 10 Gb network connection to UNC/RENCI/NCSU, and an SDN (software defined network) distributed throughout campus which keeps the data in the Science DMZ and off the commodity campus network.
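To put the speeds above in perspective, the sketch below estimates how long one terabyte takes at each rate. It assumes the link is sustained at the quoted rate with no protocol overhead and uses decimal units (1 TB = 10^12 bytes); real transfers will be somewhat slower.

```python
def transfer_hours(size_terabytes, rate_gbps):
    """Hours needed to move size_terabytes at a sustained rate in Gb/s."""
    bits = size_terabytes * 8e12          # 1 TB = 10^12 bytes = 8 * 10^12 bits
    seconds = bits / (rate_gbps * 1e9)
    return seconds / 3600


for rate in (10, 3, 2):
    print(f"{rate:>2} Gb/s: {transfer_hours(1, rate):.2f} h per TB")
# 10 Gb/s: 0.22 h per TB
#  3 Gb/s: 0.74 h per TB
#  2 Gb/s: 1.11 h per TB
```

For a data set of tens of terabytes, the difference between the inspected path and the Science DMZ path is therefore the difference between a few hours and most of a day.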
Globus is able to take advantage of the Science DMZ for data transfers at high speed (9 gigabits per second or greater). For those who have issues with data transfer speeds on campus, we're working on a project to facilitate transfers from new facilities such as the CryoEM (electron microscope) in CEIMAS and the Light Sheet Microscopy Core, which straddles the Campus and Health system (currently transferring data using portable storage such as thumb drives). We are investigating options to transfer data from the instrument as it is captured through a Globus connection into a repository that is accessible from both sides of the network. Data from these instruments is essentially digital movies, meaning the files can be quite large.
There is a user-friendly GUI for Globus but there is also a command-line environment for automation. Users can create scripts that use the Globus client to move data around and then delete the content or do computations on it. The Globus organization is working on making it so that you can use tokens for short-term transfer. There is also a Globus Connect Personal client that can be installed and used to move data from a central repository to a local computer or vice versa with speeds of around 2 Gb. The Globus client also supports resuming interrupted transfers, whether the interruption is caused by an error or is "as needed" (for example, the user can start a transfer in one location, close the laptop, and continue the upload at a different location). There is also support for encryption although data transfer speed is affected. And there is support for synchronization of data between locations. Anyone at Duke can access this service by going to Globus.org and selecting Duke.
Q: Do you use the OSG as a testbed for your NERSC jobs?
A: Somewhat. The environments are different and they don't use the same containers. For the science part of the workflow, yes this is possible.
Q: How secure is Globus for those not in our time zone?
A: The service is protected by Shibboleth authentication and the data is encrypted, either in transit or at rest. We are working with Duke Health to define policies for transferring sensitive data.
Q: Is Globus available to all the campus?
A: Yes. We have a campus "Globus Connect" license with no direct cost to users. There is a BAA under development.
5:05 – 5:20 – Cellular Phone Plans and Improvements, Bob Johnson, David Mixon (10 minute presentation, 5 minute discussion)
What it is: Duke is rolling out improvements to Duke-owned cellular phone plans, including new international and unlimited calling options. Bob and David will discuss what’s next for Duke’s mobile phone offerings, as well as strategic plans for 5G enhancements.
Why it’s relevant: Though the number of university-funded cell phones has been greatly reduced, legitimate use cases continue to exist, and Duke’s global presence in particular creates demand for more flexible and economic plan options.
We recently renegotiated our Verizon contract for Duke-funded mobile devices. One of our goals was to increase data allowances on our pools. We have seen an increase in data usage and wanted to provide insulation from overage charges. Another goal was to simplify plans and features. We had one standard plan but if someone needed unlimited data or the “hot spot” feature, they had to select from other plans offered by Verizon. This could be overwhelming to the user and expensive to Duke. We also wanted to reduce overall costs to Duke, especially for non-standard rate plans for users who required unlimited data or other special features that were not included with our negotiated plan. These specialized plans had a significantly higher cost. Another pain point was overages in general, both domestic and international. International calling has become more important and end-users have been asking for help getting these costs down. There was demand for a clear and easy path for selecting international features.
There are about 300 users who have basic phones. Under the new plan, these users have unlimited domestic minutes and unlimited international minutes if they are calling from within the US. They also have a 100 MB data allowance to add to the Duke pool. This plan provides more services with a price break from the previous plan. For smart phone users, the existing plan included 3 GB of data allowance shared in a pool among the Duke community. The previous plan also had 450 domestic minutes and unlimited domestic messaging. The new Standard plan has the same price point but now has a 4 GB data allowance. This plan also has unlimited domestic voice minutes and unlimited messaging.
Duke users on what we are calling “legacy plans” could be paying upwards of $100 to $120 a month. Duke now has a new unlimited plan which includes unlimited data, unlimited domestic voice, unlimited messaging, and support for mobile hotspots. That said, "unlimited" does have limits in that the speed may be throttled as part of network optimization, especially if there is network congestion. We want to get Duke users to switch to the new unlimited plan instead of continuing to pay for customized plans.
We also have better international features. Our existing plans could not utilize the TravelPass feature, which charges a daily rate while you are abroad and draws on your existing data allowance. Under the old plan, users paid as much as $85 for a 250 MB allowance. Under the new plan, this can also be automated or added “on demand”.
For tablets and "MiFi" devices, the previous plan included 5 GB of data. The new "custom unlimited" for the same price provides unlimited data (after 6 GB of usage, network optimization may apply). There is also a Business Unlimited plan that gives you unlimited data and 22 GB of usage before network optimization. There are smaller plans ranging from 100 MB data allowance to Mobile Broadband which includes 6 GB of data allowance.
We have been working with five test groups across the organization for the initial migration. While the new plans appear to be simple, we want to make sure the process between the vendor and Duke operates smoothly. The test groups are working with Verizon and getting detailed information about accounts, each of which is analyzed to make sure the device is on the correct plan. Verizon will handle the migration. We will gather feedback from the test groups to identify and resolve any conversion issues. We can then make this available to the rest of the institution. There will be customizations on our portal to make it easier to locate these plans.
Q: Is there a timeline on the expansion of the new phone plans to the rest of the campus?
A: If testing with the initial groups goes well, we may be able to offer this within the next 2 to 3 months.
The DAS or “Distributed Antenna System”, which handles the cellular signals on campus over a Duke-owned network funded by the carriers, is available in around 140 buildings. If a building was larger than 20,000 square feet, DAS equipment was installed because of the outdoor signal penetration into the building. Carriers have been adding sectors (a grouping of users) and we are up to 38 sectors, moving to 42. This is significant because the fewer users you have in a sector, the more bandwidth is available to each.
The football stadium is reaching completion. There was crushed conduit under the stadium which was a long process to repair but it has been fixed. We are in the midst of installing the last equipment and will do the certification with the carriers. Once this is finished, the stadium will be live and working for Wi-Fi and for cellular communications for the first game of the year. This project has taken 2 1/2 years to complete.
T-Mobile is being added to the DAS. We have gone through the design phase which involves an approval process including not only T-Mobile but also the other carriers because they share the system. This has been completed. We will see significant progress in adding T-Mobile in the coming year. This is an incremental project in that the head-end is brought online and then we can begin activating T-Mobile on individual buildings.
The current data speeds on campus for the 4G network are 50 Mb download and 10 Mb upload, which is acceptable, but anticipation is high for 5G, which could give us 10 Gb download (realistically closer to 1 Gb download). This is a challenge for the carriers outside Duke campus. Carriers are projecting late 2020/early 2021 as the earliest date for general availability. Within six months of a feature or upgrade reaching "general availability" within a region, the carriers must have this active at Duke. We would like to have 5G sooner and are working with the carriers to make Duke a pilot, especially since there are some exciting features with 5G such as possibly replacing Wi-Fi networks.
There are some gaps in coverage around campus. We are investigating getting "macro towers" installed near the parking lot by Ninth Street where we have almost no coverage. This is also an issue between East and West Campus and near Swift Avenue. The macro tower will address this. If everything goes well with permitting, this should be completed in the first quarter of 2019 for AT&T and Verizon customers.
Q: What are compelling use cases for 5G to help justify why Duke would be a good pilot?
A: Outdoor coverage is especially problematic for Wi-Fi. 5G could address this need.
Mobile devices are currently not able to absorb a 5G feed since it has not been necessary to include this. This becomes a “chicken and the egg” question. Some laptops may be able to support 5G but handheld devices are not in a position yet to absorb data at those speeds.
There is a lot of installation and burying of fiber by both Duke and the carriers, all to support a 5G network. There are exciting times ahead but it will take a while to get there.