Duke ITAC - May 22, 2014 Minutes
ITAC Meeting Minutes
May 22, 2014 Minutes
Previous meeting minutes have been circulated. Please let us know if anything requires correction. Otherwise, they’ll be accepted as a true and accurate record.
II. Agenda Items
4:05- 4:20 – SOURCEfire, Richard Biever, Nick Tripp (10 minute presentation, 5 minute discussion)
What it is: SOURCEfire is an Intrusion Prevention product offered by Cisco. The ITSO is working to implement the SOURCEfire product, which will provide real-time threat prevention and the ability to analyze an intrusion in depth: which systems are impacted, how extensive the outbreak is, and what actions are necessary for recovery.
Why it’s relevant: We will discuss the timeline for the implementation as well as the expected benefits for using this product.
- We currently run Intrusion Prevention Systems (IPS) in our network. The old version, Tipping Point, consisted of four devices capable of handling 5Gbps. Q: What does an Intrusion Prevention System do? The Intrusion Prevention System helped to stop malware and bad traffic from coming into Duke. We had it positioned such that it was looking at traffic coming in from the Internet and traffic crossing the network VRFs (network segments). We had an IPS, firewall services and general switching occurring at the VRF layer. If data is sent from one site to another and crosses the VRF, it crosses through both the IPS and the firewall. This same process occurs when data is sent from a site to the Internet. Since all traffic was getting inspected, more load was placed on the IPS devices. In January, a compromised system caused a collapse in the IPS device functionality. Attackers used a compromised host to mine Litecoins (a variation of Bitcoin) and initiate outbound Denial of Service attacks. The IPS devices were consequently overloaded, allowing bad traffic to leak through and start impacting the remainder of the network. Q: One compromised machine impacted tens of thousands of machines? Yes. Lots of small packets were sent and the IPS devices attempted to inspect each individual packet, causing a failure.
- A few weeks later, a flaw in NTP (Network Time Protocol) allowed attackers to use our NTP servers to initiate Denial of Service attacks, resulting in IPS devices failing multiple times a day. We were restarting the IPS devices about every 6 hours. Denial of Service traffic was overwhelming the IPS devices. Even after receiving replacement equipment from Hewlett Packard, moving from 5Gbps to 20Gbps, we still saw spikes of 14-15Gbps. During one of the attacks, traffic even bypassed the IPS devices because the upper limit was reached. In addition, we had concerns about Tipping Point and their ability to deliver quality definitions and updates to the devices. Q: When you say quality definitions and quality updates, what do you mean? There were several instances where the operating system required an upgrade. In one case, the upgrade did not go as planned and caused an abnormally long outage window to restore normal functionality. In other cases, the vendor shipped definitions to us that impacted network traffic. Q: So their software wasn’t as reliable as expected? No, there were definite concerns surrounding their QA (Quality Assurance) processes.
- After upgrading the equipment to 20Gbps, we also partnered with a new Cisco company, SOURCEfire. SOURCEfire makes Intrusion Prevention and Intrusion Detection appliances. Intrusion Prevention blocks bad traffic while Intrusion Detection identifies bad traffic. The equipment was installed initially at the edge of our network to compare its findings against those of Tipping Point. As the OIT Communications & Data Center Services team brought up the new core network, the devices were transitioned into prevention mode to block bad traffic. The new Tipping Point devices worked as expected. The SOURCEfire devices could be configured in prevention or detection mode and exceeded Tipping Point’s malware detection capabilities. Once the SOURCEfire devices were moved to the new core and departments were added, we identified 14 compromised hosts within the first week, and now have about 20 that Tipping Point was previously unable to detect. SOURCEfire also allows us to take advantage of customizations. We are a member of a higher education information sharing group that keeps a list of blacklisted Internet sites that colleges and universities have encountered. While it took an excessive amount of time to add this information in Tipping Point, SOURCEfire makes lists like this easy to include.
- After a few weeks of running SOURCEfire devices at the edge of the network, we found our students are very interested in video: Netflix, Google Video, and Twitch TV (a video site to watch other people play video games; Google reportedly bought them this week for $1B) were among the top 10 applications. Chrome, Firefox, and Safari were among the top browsers on the network, even though we are most commonly instructed to use Internet Explorer for enterprise applications. There were some security implications surrounding the position of Dropbox. During one of the snow events of 2013, people downloaded almost 4TB of Netflix data. Q: Why did you say you were concerned that Dropbox was being used so heavily? Dropbox has a history of security issues. There have been 2 public cases in the past 2-3 years where a configuration change was made. If someone else knew what your Dropbox account was, they could log in to it without your password. People are discouraged from using Dropbox if they can find an alternative. We are in the process of deploying what we know is a much more secure product called Box, which has encryption at rest, but it’s just being deployed.
- Many faculty members use Dropbox today. We are well aware of its high usage across the university, but until a more secure alternative is available, we have not publicly asked users to discontinue its use. Dropbox’s data is not encrypted at rest and the company can see everything users do. Another example of the security concerns is that Dropbox was scanning stored data for content. If a song was found that matched their search criteria for downloaded songs, it was removed from a user’s Dropbox storage.
- The new design will consist of moving the IPS functionality to the top of the network, sitting above our connection to the Internet. Instead of internal network traffic being sent to the Data Center and inspected by the IPS, it will move through the normal switching, firewall, and VRF core switching. We’ll then mirror the traffic through an IDS detection sensor, giving us the visibility to detect a security issue without impacting traffic flow. Q: We won’t stop it and we won’t delay other transmissions on campus? Correct.
- Why do we like SOURCEfire from a security perspective?
- SOURCEfire gathers bandwidth and usage statistics. The product provides us with actionable information out of the box. The priority 1 events displayed here are actual infections, malware or compromised machines. Q: There are 5,000 of them? The numbers increased drastically after wireless traffic was added to the new core. This information is representative of 5,000 security risk events on the network in the last day. For example, some of these consist of chatty malware which may only actually be stemming from a few machines. However, each event is most likely a different infection. We are able to drill down into a specific host. In this example, we know the host is communicating with a known command and control server for a botnet that the malware is using. We can now view the host’s profile. SOURCEfire creates a profile for every host it sees on the networks we tell it to watch. We can now see the NetBIOS name, associated MAC addresses, the O/S it’s running, and sometimes a DNS name depending on what SOURCEfire captures. This functionality allows us to see a machine’s location and its owner, allowing us to contact them. Tipping Point only provided an IP address, which has proven difficult to translate into actionable information on campus.
- SOURCEfire stores a small amount of connection data, allowing us to see network connections as they’re happening in real-time – not just intrusion events. This information is crucial when an incident occurs to assist us with troubleshooting what has been compromised, why, and where the traffic is going. This information did not exist in Tipping Point.
- We also purchased a product called FireAMP (Advanced Malware Protection) that looks at files as they’re coming across the wire. It detects malware, disassembles it, creates a virtual host instance, and provides information on what the malware is, what it does and where it’s contacting. We did not have visibility into this information previously; only the ability to block a hit. Q: Is this sort of a honey pot on demand, being able to watch what the malware actually does? Yes. This helps a lot with false positives, where Tipping Point previously identified something bad, and a machine was removed from the network and/or an account was locked. AMP tells us not only that something bad occurred, but why it was bad, and allows us to make more informed decisions on which action to take next. If this product had been in place during the incidents in January, we would’ve been able to identify the infected machine much more quickly.
- Another desirable feature of SOURCEfire is the idea of roles and views. For example, as the number of security events climbs, rather than hire a staff of 20 people to parse them, different views can be created for, say, the OIT Service Desk and/or different departments to display where network problems are occurring, allowing a quicker resolution.
- Another reason we like the SOURCEfire solution is the flexibility to create rules for routing network traffic. For example, since all the Netflix network traffic is routed through a single MCNC appliance/proxy, we’ll use custom rules to route the traffic around the IPS devices. The feature allows us to eliminate some of the noisy traffic and create more meaningful analysis around the other high usage applications on the network.
- Q: What should we be afraid of? Many of the attacks being catalogued are not aimed at servers. Part of that is because we do not have the Data Centers behind the new core network. The 5,000 events displayed here are all related to end-user devices. What that tells us is the attackers are going after laptops and desktops. For example, if a Trojan infects an end-user device, it eavesdrops and captures passwords that could be used to log in to a server.
- Q: You mentioned SOURCEfire is being used to monitor the wireless traffic. Is it safe to assume, the Visitor network is also being monitored using SOURCEfire? Out of those 5,000 events, is there a way to distinguish which ones are from visitors? We will put rules in place to block bad traffic to end-user devices. We’re treating the Visitor network as a hostile network and it will exist on the outside and all Visitor network traffic will pass through the IPS. We are able to determine which direction attacks are coming from since the Visitor network appears like any other Internet host.
- Q: How can we educate members of the Duke community on how to make their machines more secure when connecting to the various networks available when traversing between the University Visitor and Medical Center Guest networks? Is it a security issue that internal Duke users cross between the Medical Center Guest network and the University Visitor network? The Visitor network is not treated as an internal network because users are not easily identified. The Visitor network does not currently have the same controls as internal networks. For example, we are looking to introduce WPA2 (Wi-Fi Protected Access 2), which encrypts wireless network traffic, later this year. In effect, the Visitor network is equivalent to that in a public library, hotel, coffee shop, etc. Our recommendation is to be prepared to connect to the Duke VPN to ensure data encryption and integrity.
- We’ve talked about how many of the devices being attacked and infected are end-user devices. Historically, we have been fairly flexible in our approaches to security and have based the height of the threshold on how sensitive the data is. Since we’ve already seen multiple examples of how a single infected host can compromise the network and beyond, there may come a time when we’ll need to be more vigilant and intensify expectations for security, the rigor of updates and the like, even on devices for people not using sensitive data.
- The Box agreement was signed April 1st. A technical group has been working through account creations and ensuring default settings are suitable and consistent for the Health System, along with examining the configuration and documentation changes needed for deployment. We have reached out to current Box users, and are developing a product information sheet, along with links to the website content and provisioning process. We are looking to auto-provision folders and sub-folders related to classes through service accounts to ensure they are not revoked when a student or graduate student who might’ve provisioned them previously leaves the institution. Q: Will a “Box for Dropbox Users” document be created? We have the names of some of the Dropbox big users/spenders in the last year. Q: It’s been 7 weeks since the contract was signed. Are we half-way through? We have been working to get policies well documented and are planning to do soft launches with internal pilots at the Law School and other places. We are expecting the product to be in production in the next 4-6 weeks.
- As the new core network is brought up, we have some infrastructure to monitor throughput in various places on campus. We’ve had some issues with inter-VRF traffic on the IPS devices as we transition to the new network.
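The point made above about the 5,000 priority 1 events (that chatty malware on a few machines can account for many of them) can be illustrated with a short sketch. This is a minimal example only, not SOURCEfire's interface: the event records, their field names (`host`, `priority`, `signature`), and the sample addresses are all hypothetical and assume events have already been exported as simple dictionaries.

```python
from collections import Counter


def events_per_host(events):
    """Count priority 1 events per source host, busiest host first.

    `events` is a list of dicts with hypothetical keys "host" and
    "priority"; this is an assumed export format, not a vendor schema.
    """
    counts = Counter(e["host"] for e in events if e["priority"] == 1)
    return counts.most_common()


# Hypothetical sample: four priority 1 events, three from one chatty host.
sample_events = [
    {"host": "10.0.0.5", "priority": 1, "signature": "botnet command and control"},
    {"host": "10.0.0.5", "priority": 1, "signature": "botnet command and control"},
    {"host": "10.0.0.9", "priority": 1, "signature": "trojan download"},
    {"host": "10.0.0.5", "priority": 1, "signature": "botnet command and control"},
    {"host": "10.0.0.7", "priority": 3, "signature": "policy violation"},
]

for host, count in events_per_host(sample_events):
    print(host, count)
```

Grouping by host first gives a list of machines to follow up on, which is closer to the drill-down view described in the presentation than a raw event count.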
4:20- 4:35 – Secure Data Policy, Richard Biever (5 minute presentation, 10 minute discussion)
What it is: The Secure Data Policy formalizes data classifications, responsibilities of data stewards, managers and users as well as other key data security requirements.
Why it’s relevant: We will review the final draft policy with ITAC. The policy is attached to this email and will be distributed at ITAC.
- Since we’re contending with laws, regulations, research contracts, and data in different areas, the purpose is to formalize the framework for data classifications, responsibilities, and users.
- Data is classified as sensitive, restricted, or public, and should be made available to people who have to use it or who require its use as part of their job. We are defining the responsibilities for data consumers, Duke as a steward of the data, and researchers as stewards of their data.
- Q: Has the Duke Data Classification Standard Document been moved? The link is broken. A Google search will bring up the correct document.
- Q: From a faculty perspective, this comes up every time we apply for a grant. Are you in contact with them? We work closely with Lorna Hicks to ensure information is communicated. Please email Richard Biever (email@example.com) with questions, comments or concerns.
4:35- 4:55 – Ivy+ Update, Charley Kneifel (Infrastructure), Chris Meyer (Administrative Systems) (10 minute presentation, 10 minute discussion)
What it is: Representatives from Ivy League schools meet on an annual basis to discuss and share information in various areas. Topics range from overall university directions, budgets, projects, online learning tools and daily operations.
Why it’s relevant: Sharing experiences and discussing challenges with our peers helps to provide a collaborative environment where ideas are formed and problems are solved. OIT recently participated in the Ivy+ meetings for Infrastructure and Administrative Systems and would like to provide an update on those meetings.
- Almost all the participating schools are being subjected to phishing attacks and are implementing mandatory multi-factor authentication for employee self-service. Even the schools using Workday in the Cloud are being impacted.
- Most schools are migrating from Remedy to ServiceNow after failed Remedy upgrade attempts.
- Cornell reported that the Cloud and on-premises SharePoint solutions are different. Migrating on-premises SharePoint to the Cloud has proven difficult at best.
- Princeton is planning to create a center for excellence for reporting and data warehousing.
- MIT’s Project Atlas has found a way to replace the dated SAPweb and SAPweb Self Service. They have a pool of developers that have written SAP Gateway services to expose underlying data within SAP, along with Java front ends.
- Schools are gradually moving from Blackboard and Sakai to Canvas, another open source solution.
- Workday continues to gain traction among peer institutions. The trend appears to be engaging IT earlier in the project lifecycle. Reporting remains a challenge in the absence of a data dictionary and schema within Workday.
- The two predominant student systems are PeopleSoft Campus Solutions and Ellucian Banner. Workday is developing a cloud solution for students, but it’s so far down the roadmap, most schools are not considering it as a viable option.
- PeopleSoft upgrade roadmap – The pain points around the product are the upgrades and the user interface. PeopleSoft has decided to address some of this in the 9.2 release. The fluid U/I is a refresh of their user interface and enables personalization by roles. The U/I will be available as part of the Human Capital Management and the Financial and Supply Chain Management modules, but will not be a part of Campus Solutions. Updates to date have been labor intensive, requiring manual intervention. PeopleSoft is addressing this issue with the PeopleSoft Update Manager (PUM). Customers using PeopleTools 8.54 and PeopleSoft 9.2 can select either parts of or an entire upgrade from cumulative upgrade images released every 10 weeks. Major cumulative releases will be available every 3 years. PeopleTools releases will be made available every 6-18 months. PeopleTools 8.54 will be made available in July 2014. The earliest Duke would be able to perform a Tools upgrade would be summer 2015. Campus Solutions 9.2 will be released in Q4 of 2015 with PUM, but will not have the fluid U/I. Perhaps the earliest Duke would upgrade to version 9.2 would be summer of 2016. PeopleSoft has not yet committed to a roadmap for releasing the fluid U/I for Campus Solutions. They are planning to start with the self-service functions. They have committed to releasing what are called feature packs after the upgrades, which are expected to be available in early 2016.
Charley Kneifel - Duke hosted the Ivy+ Infrastructure Group last week
- Representatives from all the schools were in attendance. (Duke, MIT, Chicago, Stanford)
- There was a large amount of change in senior leadership at many of the schools. Where there were new CIOs or new EVPs of Finance, there was more discussion about cost assessments and cost recovery visibility. The majority were moving toward allocation-based models rather than detailed models, with the exception of a new EVP wanting to drill down into significant cost analysis.
- Network virtualization continues to grow. MIT has virtualized their PeopleSoft environment and is running the virtualized network on top of their core network.
- The automation update we gave was well-received, with many schools expressing interest while still determining whether they will use it. Q: You’re referring to Clockworks? Yes Cornell
- Data Center Growth – some schools have built out large research data centers.
- VDI – Virtualized Desktop Infrastructure –
- There were disconnects between networking and service delivery.
4:55- 5:15 – CSG Update, John Board, Mark McCahill (10 minute presentation, 10 minute discussion)
What it is: The Common Solutions Group works by inviting a small set of research universities to participate regularly in meetings and project work. These universities are the CSG members; they are characterized by strategic technical vision, strong leadership, and the ability and willingness to adopt common solutions on their campuses.
Why it’s relevant: CSG meetings comprise leading technical and senior administrative staff from its members, and they are organized to encourage detailed, interactive discussions of strategic technical and policy issues affecting research-university IT across time. We would like to share our experiences from the recent May 2014 meetings.
- Identity Landscaping: What is the next big thing Higher Ed should do to enable shared identity around (listen to recording) - Trusted Identity in Education Research Reference IdM SAN Box and creating facades to make it easier for cloud vendors to talk to. Increasing inclination of identity proofing. Federated attribute release for research work – an end-user interface for controlling. Harvard Catalyst – Sees who is collaborating
- Cloud Migration Campfire Stories – What are the common lessons learned? Stanford is heavily invested in AWS (Amazon Web Services). HIPAA leak –
- Topcoder – similar to Co-Lab gone pro – a way to throw a project out to the world of programmers and the winners of the project submissions will
- Analytics – Accessibility: Faculty members are no longer permitted to produce media unless they produce a corresponding version that will be accessible. Notre Dame is working on a POC called Data driven decision making – however, if you didn’t have access, it tells the end-user what kind of access is required to gain access. Is there a difference between public and private since the funding is driven by performance, student attendance and the like?
- Teaching and Learning Spaces – UC Berkeley did surveys by walking through the space with a camera.
- Were there discussions around when Cloud Vendors go out of business?