ITAC Meeting Minutes
March 27, 2014 Minutes
Tonight at 7pm the decisions will be released to the 30,000+ students who applied to Duke for 2,600 spots. Last year 10,000 people accessed the system during the first 5 minutes. There have been no issues since the first year. 9% of applicants are accepted under regular decision.
Terry Follmer, the new Director of Internal Audits - IT, was introduced. He was previously at Boeing in Seattle, WA.
II. Agenda Items
4:05 - 4:20 – Splunk, Richard Biever, Phillip Batton, Jeremy Hopkins, Eric Hope (10 minute presentation, 5 minute discussion)
What it is: Splunk is an application used for management and analysis of server information found in logs. Duke has begun using Splunk to facilitate the review and investigation of logs from various systems, devices and applications.
Why it’s relevant: We will demo how the teams with OIT and the Security Office have been using Splunk to identify, troubleshoot and remediate issues affecting the University.
What is Splunk? It is a log analysis tool that allows you to search logs from multiple servers and services to correlate and find data in a quick and efficient manner. It was purchased last summer and has since been put to use by a variety of OIT, ITSO and campus groups.
How has it been used in the last 3 months?
The Enterprise Internet Services (EIS) group in OIT supports email for the university and health system.
- With Splunk, EIS was able to see that over time there were consistent spikes in usage and delays, which led to procuring additional servers to accommodate the load. The infrastructure grew from 6 machines to 16, of which 4 are used for international delivery.
- EIS has also been able to use Splunk to help pinpoint connections from bad actors who have compromised Duke mail accounts.
- Lastly, Splunk has been used to catalog what phishing messages Duke has received.
- ITSO is using Splunk to tell who has received phishing messages (based on the information above). The next step is to identify who fell for a phishing message by determining the source of the messages and seeing which Duke NetIDs are logging in from the known "bad" IP addresses. This method is largely dependent on our community letting us know when they receive a phishing message.
- Other types of things that we look for include failed login attempts or multiple logins from a single IP address.
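The login checks described above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not Splunk search syntax and not ITSO's actual process; the record fields (netid, ip, success), the sample addresses and the thresholds are all hypothetical.

```python
from collections import defaultdict


def flag_suspicious_logins(events, bad_ips, max_failures=5, max_netids_per_ip=3):
    """Summarize login events worth a closer look (thresholds are illustrative)."""
    from_bad_ip = set()              # NetIDs that logged in from a known "bad" IP
    failures = defaultdict(int)      # failed login count per NetID
    netids_by_ip = defaultdict(set)  # distinct NetIDs that logged in from each IP
    for event in events:
        if event["success"]:
            netids_by_ip[event["ip"]].add(event["netid"])
            if event["ip"] in bad_ips:
                from_bad_ip.add(event["netid"])
        else:
            failures[event["netid"]] += 1
    return {
        "logged_in_from_bad_ip": sorted(from_bad_ip),
        "many_failed_logins": sorted(
            netid for netid, count in failures.items() if count >= max_failures
        ),
        "ips_with_many_netids": sorted(
            ip for ip, ids in netids_by_ip.items() if len(ids) >= max_netids_per_ip
        ),
    }


# Hypothetical sample data using documentation-only IP ranges.
sample_events = [
    {"netid": "abc1", "ip": "203.0.113.7", "success": True},
    {"netid": "xyz9", "ip": "198.51.100.4", "success": False},
    {"netid": "xyz9", "ip": "198.51.100.4", "success": False},
    {"netid": "def2", "ip": "192.0.2.10", "success": True},
    {"netid": "ghi3", "ip": "192.0.2.10", "success": True},
]
report = flag_suspicious_logins(
    sample_events, bad_ips={"203.0.113.7"}, max_failures=2, max_netids_per_ip=2
)
```

In practice the list of "bad" IP addresses would come from reported phishing messages, which is why the approach depends on the community reporting them.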
Questions and Comments:
Concerns were raised about privacy related to NetID searches. Currently used only for investigating security incidents, by a very small group of people. The same rules apply as if someone were requesting to see someone else’s email per the provisions of Duke's AUP. Also working on a way for users to view their own login activity within the OIT Self-Service page.
How to notify ITSO of phishing? Users should contact their local IT, service desk or ITSO. The current email to contact ITSO directly is email@example.com. If you contact them, please include as much information as possible including full email headers if available. Suggestion was made to create firstname.lastname@example.org and/or email@example.com.
Can this tool be used for other types of logs? Yes. It can use pretty much any data source you can think of. It’s very useful for both security and operational (e.g. debugging) purposes. The current Splunk license covers research use as well.
4:20 - 4:45 – Duke Alumni Network, Scott Greenwood, Brett Walters (15 minute presentation, 10 minute discussion)
What it is: Duke is building a new online alumni network to replace the current Alumni Directory. The project is being co-led by Alumni Affairs and OIT, and will launch in Fall 2014. The new site will offer expanded features, including faceted alumni searches, group pages, event setup and registration, and integration with social networks. The new network will include current DAA sites, rebranding the entire site as “Duke Alumni” to be inclusive of all professional school alumni.
Why it’s relevant: The network will leverage the new Duke social login being developed by OIT’s Identity Management Group, connect with the SAP Customer Relations Management Module using a web service layer, and be hosted and supported by OIT.
Vision: To create the world’s most connected and engaged alumni network that effectively and efficiently delivers value, impact and measurable connections.
- Most of the alumni are not located locally (approximately 10% living in the Triangle).
- Currently have an alumni directory, but the contract with the company that supports that is expiring. Duke Connect and Event Registration will also go away very soon.
- Goal is to serve all undergraduate, graduate and professional alumni under one umbrella.
- Connecting to other institutional resources.
- Current contract and services ending
- Current system was not sophisticated enough to meet our growing needs
- Huge demand for services that we were unable to deliver with the existing system
- 15,000 interviewers across the country could not be integrated well in the current system; a lot of manual intervention
- Interest in improving the end-user experience
- Unique Value that doesn’t exist outside of Duke with LinkedIn or Facebook; Duke profile that isn’t a copy of what you might find on LinkedIn
- Activities they were in while a student as well as current activities
- Events (25,000 alumni and friends attend 400-500 events per year); allow alumni volunteers to create and push out registration forms as needed
- Groups for regions, professional schools, user-created
- Single sign-on (Social SAML, Oracle Identity Manager and Shibboleth)
- Conscious of privacy, transparency and ease of changing settings
- Measuring success of the network
- Flexibility as needs change (a recent request for LGBT Mentoring, launched in January, is a mostly manual process)
- Improved communication with and between alumni (4.5 million emails sent this fiscal year); moderation is not required for all inter-alumni communications
- Collaboration with students (DukeConnect)
- Mobile Apps (e.g. campus check-in)
- Recommender System (“You might be interested in…”); currently undifferentiated communications, but would prefer to match alumni interests and background with events and opportunities that are more specifically targeted and pertinent to them
- Social network integrations (use LinkedIn to populate initial profile, Duke-wide license for Gigya which manages all the APIs for various social networks), badges (e.g. rewarding volunteerism, event attendance, etc.)
- August 2011 project was started, goals and outcomes defined
- November 2013 received approval to proceed with project
- Build phase has been kicked off
- Beta testing in Summer 2014
- Launch in September 2014
Questions and Comments:
Do Alumni get to keep their NetIDs? The current policy is they can keep their NetIDs (and email address) for a year. At the end of that year, they receive a notification of termination of their NetID and are allowed to create an alumni.duke.edu email alias.
Would there be any benefit in retaining the NetID for life? Forwarding ID for life is becoming more common among most institutions. Maintaining username/password for users in perpetuity presents an organizational and security concern.
ACES integration – students need to log in after they have graduated to request transcripts.
Currently there are 32,000 forwarding email accounts with about 2,000-3,000 additional created each year.
How will success be measured? % of profiles completed, time on site, overall traffic, tracking of attendance at events and activities. There will also be some feedback forms for users.
How will events be linked with the Duke Calendar? Some of the categories were borrowed and the calendar events are shown as a feed. A lot of the activities are not in the Triangle so some things will fall outside this scope.
Connection with the student records is important, especially for faculty who would want to follow their students through the years and what they’re up to after they leave Duke.
5:15 – 5:30 – Archival Storage, Charley Kneifel (10 minute presentation, 5 minute discussion)
What it is: OIT is working with various researchers to implement storage systems that support large amounts of data at a low cost per year. A recent National Institutes of Health (NIH) grant has received favorable review and the library has expressed interest in storage that will allow them to store large data sets as well. This solution is optimal for data that does not change after it is collected or curated.
Why it’s relevant: In addition to the other storage offerings from OIT (SAN/NAS/Archive), this will offer a lower cost option that will allow researchers to store large quantities of data. The solution will also meet data retention policies required by the granting agencies or Duke.
Multiple storage options:
- Tier 1 storage for SAP which does replication of all writes between two different data centers. This is very costly.
- Tier 2 SAN is block level storage, replication between data centers
- Network Attached Storage (NAS)
- Common Internet File System (Windows, NFS)
- Secure NAS for the protected network
- Research storage: High performance storage, large repositories and Tier 4 archival storage
NIH S10 (equipment) grant for storage infrastructure was re-submitted about a year ago. Request was made for 1.5 PB (petabyte) of storage (disk and tape, EMC Isilon) for a data commons for data-producing core facilities within the Life Sciences. Currently this data is being stored on multiple drives belonging to each of the faculty members in this group. The data is spread out; it would be preferable to have it co-located so that it could be cross-indexed.
HPC needs (Duke Shared Cluster Resource has 30 TB of storage) – NetApp storage is hard and costly to scale up and is currently about 10 times too small for the current needs of Duke researchers.
A lot of departments and schools are still hosting their own storage. As long as the cost per gigabyte is reasonable relative to what is possible through OIT, they will most likely continue hosting their own. Most of this data is internal, day-to-day files, not high performance data.
Library doesn’t have a high performance computing need, but this is offset by the need to keep things forever. Digitizing of materials may net 1 PB within 3 years and then 1 PB each year thereafter. Materials may not need to be available at all times, but should be available if and when they are needed.
Prefer dynamic, as-needed storage provisioning (e.g. 100 TB in 3 years broken down to automatic provisioning in 10 TB chunks as needed).
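A rough sketch of what as-needed provisioning in fixed chunks could look like, using the 10 TB chunk and 100 TB total from the example above. The function name and the headroom threshold are hypothetical and not part of any OIT system.

```python
def chunks_to_provision(used_tb, provisioned_tb, chunk_tb=10, headroom_tb=2, cap_tb=100):
    """Return how many chunks to add so free space reaches headroom_tb.

    Never provisions beyond cap_tb. headroom_tb is an assumed trigger level:
    a new chunk is added whenever free space drops below it.
    """
    chunks = 0
    while (provisioned_tb - used_tb < headroom_tb
           and provisioned_tb + chunk_tb <= cap_tb):
        provisioned_tb += chunk_tb
        chunks += 1
    return chunks


# 9 TB used of 10 TB provisioned: only 1 TB free, so one more chunk is added.
needed_now = chunks_to_provision(used_tb=9, provisioned_tb=10)
```

The point of the design is that the researcher commits to the total up front but capacity is only allocated as usage approaches what has already been provisioned.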
Currently have 1.7 PB in the backup environment (complete duplicate). Streaming media cannot be compressed or deduplicated. Current capacity is 4-6 TB per tape.
Working on object stores now: file storage without the file system. This makes it easier to move applications into the cloud. There is relatively low cost to doing this (lots of cheap disks).
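For a sense of scale, the backup and tape figures above imply roughly 284 to 425 tapes for a single copy. The arithmetic below assumes decimal units (1 PB = 1,000 TB), a single copy and no compression; these assumptions are illustrative and were not stated in the meeting.

```python
def tapes_needed(data_tb, tape_capacity_tb):
    """Whole tapes required to hold data_tb (ceiling division on integers)."""
    return -(-data_tb // tape_capacity_tb)


BACKUP_TB = 1700  # 1.7 PB, assuming 1 PB = 1,000 TB

fewest_tapes = tapes_needed(BACKUP_TB, 6)  # at 6 TB per tape
most_tapes = tapes_needed(BACKUP_TB, 4)    # at 4 TB per tape
```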
Questions and Comments:
Is it possible to have a system where less frequently used data is migrated to slower storage? Yes, but this can be tricky because other bottlenecks typically emerge.
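As a minimal illustration of the first step in such a scheme, the sketch below lists files that have not been accessed for a given number of days and would therefore be candidates for migration to slower storage. The function name and the 180-day threshold are hypothetical, and it relies on file access times, which some file systems are configured not to update.

```python
import os
import time


def cold_files(root, min_idle_days=180, now=None):
    """Return paths under root whose last access is older than min_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

Identifying candidates is the easy part; as noted above, the migration itself tends to expose other bottlenecks.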
Cheap archival solutions are critical for freeing up space that can be used by other projects.
3 year life cycle for disk arrays (can be replaced by something cheaper that’s twice as large)
How do you locate things when data storage capacity is constantly increasing? How do you make sense of what is stored there? Do we need to store everything (e.g. sequence data)? It may be cheaper to reproduce the data set than it would be to store the data. Some users and groups don’t have or want to take the time to weed through their data to determine what needs to be kept and what does not.
4:45 - 4:55 – BOX, Tracy Futhey, Billy Willis, Charley Kneifel (5 minute presentation, 5 minute discussion)
What it is: BOX is a cloud based service that allows data storage across all clients, including Web, iOS, Macs and Windows.
Why it’s relevant: Duke plans to offer BOX.com services to all campus and medical system users. Features of BOX include easy upload of content, organizing data into folders, sharing links to files and managing file/folder permissions.
After a two-year process, received final BAA from Box, and the contracts have been taken to the university's and hospital’s respective legal counsels. Currently working out pricing models with Box. Effective April 1st, we will have an active contract with Box to provide institution-wide (campus and health system) storage.
Currently about 2,000 accounts with a duke.edu email on Box that will be migrated into the Duke account. Email will be going out to these users to notify them of this change and request they move their account to a personal email if they do not want to be migrated for any reason. Shibboleth has already been configured. Still need to work out user quotas and account provisioning. Student and class sharing might be available for the summer term.
A check of procurement card purchases for last year shows many individual departments were using Dropbox, so as we migrate to Box, please inform the other departments and faculty that it is available. Box has Shibboleth authentication (single sign-on) and is encrypted both in transit and at rest, so it is a more secure option than Dropbox.
Questions and Comments:
Is there an efficient way to convert Dropbox data to Box? Should be able to drag and drop files from one to another, but there probably isn’t an export utility. There might be an issue with folder security when doing this.
Concerns that the Mac OS X client is not as good as the Windows client and that no client is currently available for Linux.
4:55 – 5:15 – NC Next Generation Network, Elise Kohn (10 minute presentation, 10 minute discussion)
What it is: North Carolina Next Generation Network (NCNGN) is a consortium of 4 universities and 6 communities working together to encourage the private sector to deliver affordable, gigabit speed broadband to their communities. This presentation will provide a brief update on the project and discuss how NCNGN efforts fit with other recent announcements about potential gigabit speed network deployments in the Triangle or other parts of North Carolina.
Why it’s relevant: Affordable, gigabit speed broadband connections can allow Duke faculty and students engaged in data-intensive projects to work more seamlessly between the classroom, office, lab, and home.
We ran out of time so this topic has been postponed for another meeting.