Duke ITAC - November 3, 2016 Minutes

I.   Announcements

  • The minutes for the September 22, 2016 and October 6, 2016 meetings were approved as written.
  • Ricky Bloomfield, Director of Mobile Technology Strategy (DHTS), recently accepted a position with Apple.  Dr. Bloomfield presented at the September 9, 2015 ITAC meeting on a new screening tool for autism developed at Duke using Apple HealthKit.
  • DDoS Update:  Mirai is malware that turns Internet of Things (IoT) devices, also known as smart devices, into “bots” that can carry out large-scale network attacks.  Mirai has been used in 3 recent large-scale attacks: the September DDoS attack on the krebsonsecurity.com site, the October attack on DNS service provider Dyn that affected high-profile websites like Twitter and Netflix, and the latest attack on the African nation of Liberia.  These attacks can be followed on Twitter @MiraiAttacks.

II.   Agenda Items

4:05- 4:25 – July Security Review, Richard Biever (10 minute presentation, 10 minute discussion)

What it is:   This past summer, the IT Security Office engaged 2 Chief Information Security Officers and a former Chief Information Officer from peer universities along with a security industry expert from Cisco to review Duke’s security program, assess the current state, and recommend strategic initiatives to consider in the coming years.

Why it’s relevant: The assessment provided immediate feedback on recommended steps the security program can take to improve Duke’s security posture, as well as confirmed how the current initiatives in flight will continue to provide an appropriate level of security to Duke’s network, systems, and data.

IT Security Review:  Duke took part in an external IT security review this past July.  The review team evaluated Duke’s framework for critical security controls as well as its higher education information security control framework.

 

Questions and Discussion

Question:  There seems to be a lack of awareness among faculty regarding which cloud services they can use and which ones they should stay away from.  Can this be communicated in a simple, straightforward way?

Answer:  Duke Security offices have developed a guide to help users determine which services can be used based on the sensitive nature of their data.  The guide is located at https://security.duke.edu/policies/duke-services-and-data-classification.  The IT Security Office can discuss this guide in greater detail at a future meeting.  In the meantime, we will continue to look at ways to better communicate this to the Duke community.

Question:  People are not satisfied with Box due to interface issues.  Are there alternatives to Box?

Answer:  We are urged to use Box rather than Dropbox, as Duke has a contract with Box and it has been approved for use with sensitive data.

Question:  How should Duke use the cloud?  Should we be using it more than we are? 

Answer:  The reviewers weren’t recommending that we use it more.  We just need a strategy for how we use it and for the security implications that follow.

4:25- 4:50 – Computational Journalism: iCheck: Jun Yan, Ph.D., Brett Walenz (15 minute presentation, 10 minute discussion)

What it is:  In the past, we have come to rely on traditional news organizations for investigative reporting to hold governments, corporations, and individuals accountable to society.  With technological advances, movement towards transparency, and the decline in the support of traditional investigative reporting, there has been a profound impact on the well-being of democracy.  Computational journalism aims at developing computational techniques and tools to increase effectiveness and broaden participation for journalism to help preserve its watchdog tradition.

Why it’s relevant:  Are the claims you hear in the media based on good analysis of data? Is there a difference between “correct” and “right”? Just in time for Election Day, join Jun and Brett to learn how they built a system called iCheck – with the help of many wonderful OIT services – to combat dubious claims on congressional voting records that make you say, “yeah, but…”

The Problem with Numbers:  Brett Walenz, a graduate student from the Department of Computer Science working in the area of computational journalism, said that numbers are used in a lot of areas in journalism to drive up viewer interest and engagement.  And while facts and numbers can be correct, they can also be misleading.   

In politics, numbers are used to make facts appear more indisputable and to help persuade or support a point.  During this election season, we’ve seen data and claims proliferating, leaving many who lack the skills and patience for analyzing and critically interpreting results to accept the points being made. 

Computers Can Help:  Computational fact-checking is a way to formalize intuition in a mathematical framework using computer algorithms.  Brett and Dr. Jun Yan created iCheck (http://icheckuclaim.org), a new tool to make fact-checking easier, faster and more accurate.  iCheck lets users quickly explore data and compare how legislators have voted under different contexts: over time, among groups of peers, and for “key votes” identified by lobbying/political organizations.

The Making of iCheck:  iCheck is based on “perturbation” analysis – evaluating the effect that parameters such as time have on conclusions when all other data is fixed.  Data and Visualization Services provided helpful advice and feedback in the design of the website, which is hosted on the public site rtoolkits.web.duke.edu.  The source code lives in gitlab.oit.duke.edu and development takes place within the vm-manage.oit.duke.edu environment.

Demonstration:  Brett gave a demonstration of iCheck, showing how changing a parameter such as time can tell a complete story.  By simply dragging and/or resizing the time bar on the Trend View tab, Brett showed how cherry-picking a specific time period can result in a conclusion that supports a specific view.  Viewing the data over a larger timeframe showed a different conclusion.
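The time-window perturbation described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not iCheck’s actual implementation: the yearly figures are made up, and the function names (window_average, perturb_time_window, claim_rank) are hypothetical.  The idea is to hold the data fixed, slide the time window, and see how unusual the claimed window is compared with every other window of the same length.

```python
from statistics import mean

# Hypothetical data: percent of votes on which a legislator agreed with
# their party, by year. These numbers are invented for illustration.
AGREEMENT_BY_YEAR = {
    2008: 70, 2009: 72, 2010: 95, 2011: 96,
    2012: 71, 2013: 69, 2014: 73, 2015: 70,
}


def window_average(data, start, end):
    """Average of the yearly values for start..end inclusive."""
    return mean(data[year] for year in range(start, end + 1))


def perturb_time_window(data, length):
    """Return {(start, end): average} for every window of `length` years.

    Assumes the years in `data` are contiguous.
    """
    years = sorted(data)
    result = {}
    for i in range(len(years) - length + 1):
        start, end = years[i], years[i + length - 1]
        result[(start, end)] = window_average(data, start, end)
    return result


def claim_rank(data, start, end):
    """Fraction of same-length windows whose average is below the claimed one.

    A value close to 1.0 suggests the claimed window is the most
    favorable one available, i.e. possibly cherry-picked.
    """
    claimed = window_average(data, start, end)
    windows = perturb_time_window(data, end - start + 1)
    below = sum(1 for value in windows.values() if value < claimed)
    return below / len(windows)
```

With these invented numbers, the 2010 to 2011 window averages 95.5 while the full 2008 to 2015 range averages 77, and the claimed window beats 6 of the 7 two-year windows.  That is the kind of gap the time bar in the demonstration makes visible.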

 

Questions and Discussion

Question:  How heavily is the site being used?  Answer:  The site was finished late in the election so it was not used extensively. There were some views by individuals from the Library of Congress. 

Question:  Who funded the project?  Answer:  It was funded by a grant.  The project has a broad range of applications, not just political ones; politics was used as a proof of concept.  It is about data analysis.

Question:  There isn’t a lot of coverage on voting at the state level.  Does the data exist?  Answer:  Yes, but the data would require extensive cleanup.

 

4:50- 5:10 – Research Toolkits, Mark McCahill (10 minute presentation, 10 minute discussion)

What it is:  Research Toolkits provides Duke faculty researchers and their designees with self-service, on-demand access to Virtual Machine (VM) environments and scratch storage. Research Toolkits also allows faculty to set up projects, groups and their members and define the roles of each person involved in a project. This allows per-project management of access to IT resources. 

Why it’s relevant:  The Research Toolkits service has moved from pilot to production status, and all Duke regular rank faculty can claim a basic allocation of computing and storage via the Research Toolkits website.  The basic VM/storage allocations can be augmented for those who need more resources, and faculty can pool their resources for joint projects.  There are plans to add services to Research Toolkits, including more data analysis suites, data sets, and applications.

What is Research Toolkits:  Research Toolkits is a service that recently moved from pilot to production, giving Duke Faculty researchers and their designees (exceptions are granted by Mark DeLong) access to on-demand short-term Virtual Machine (VM) environments and scratch storage.  Specifically, Research Toolkits allows users to:

    • Define projects and people’s roles within projects.
      • Faculty can define as many projects as they want and name them whatever they like.
      • Faculty can define roles once in a project and apply those roles across all resources.
      • Roles are set up as groups which populate Grouper/LDAP/Active Directory and can be used outside of Research Toolkits.
    • Allocate resources such as compute and storage.
      • Data commons scratch space - SMB/CIFS file store which can be mounted on Windows, Mac or Linux.
      • RAPID (Research Accelerating, Preconfigured, Individual, Dynamic Machines) – A VM reservation service, with project role-based permission management via Research Toolkits.
    • Share resource allocations across projects.
      • There is a base allocation, but faculty can pool resources (started in September).

Question:  How are resource allocations handled?  Answer:  Faculty have the primary role of allocating resources; however, they can delegate that role to others on their team.  There is a base allocation of resources.  The base allocation is 4 cores, 16 GB RAM and 25 GB hard drive storage.  Additional resources can be requested by contacting Mark DeLong. 
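To make the pooling arithmetic concrete, here is a minimal Python sketch.  The base figures (4 cores, 16 GB RAM, 25 GB storage) come from the answer above.  The assumption that pooling is a simple sum of base allocations, and the names Allocation, BASE and pooled, are illustrative only and do not describe how Research Toolkits is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """A bundle of compute and storage resources."""
    cores: int
    ram_gb: int
    storage_gb: int

    def __add__(self, other):
        # Combine two allocations component by component.
        return Allocation(
            self.cores + other.cores,
            self.ram_gb + other.ram_gb,
            self.storage_gb + other.storage_gb,
        )


# Base allocation per faculty member, as stated in the minutes.
BASE = Allocation(cores=4, ram_gb=16, storage_gb=25)


def pooled(n_faculty):
    """Total resources if n_faculty each contribute one base allocation.

    Assumption: pooling is a plain sum of the contributed allocations.
    """
    total = Allocation(0, 0, 0)
    for _ in range(n_faculty):
        total = total + BASE
    return total
```

Under that assumption, three faculty members pooling their base allocations for a joint project would have 12 cores, 48 GB RAM and 75 GB of storage to divide among the project’s VMs.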

Getting Started:  Faculty can set up a new project by logging into https://rtoolkits.web.duke.edu, creating a new project, adding a project description and agreeing to the terms of the service.  The new project will show up in the project list.  Roles can be defined and resources allocated by clicking on the new project.  Faculty can manage access by adding or removing members and teams and by delegating administrative rights to others.  A demo of the service can be viewed at https://warpwire.duke.edu/w/uvoAAA/.

Service Usage:

    • 230 users, 95 projects
    • 207 total administrative roles named across all projects
    • 3 projects where faculty pooled allocations

Resource Usage:

    • 29 data commons scratch storage instances
      • 24 instances of 100 GB, 4 instances of 50 GB
    • 84 virtual machine instances
      • 60 (Ubuntu 14.04), 9 (Ubuntu 16.04), 6 (Win10), 4 (RHEL7), 3 (RStudio), 1 (Jupyter), 1 (Galaxy)
      • 4 core VM (>50%), 8 core (~40%), plus 1, 2, 16, 24 core instances
    • 2 Jenkins continuous integration service instances

What’s Next:  There will be more VM templates and applications.  Other possibilities include common datasets, genome databases, and integration with Duke Data Service to document provenance of analysis tools and data.  Duke Data Service (DukeDS) is a service that provides a secure, central data store for researchers to access research data.  If there are other needs in addition to the ones listed here, email research_toolkits@duke.edu.

 

Questions and Discussion

Question:  Docker is becoming popular both nationally and on campus.  How can we leverage our local technology with what is happening nationally? Answer:  If you start using Docker outside Duke, we will find a way to run it on campus. We are already using Docker for many applications.

Question:  Can faculty invite collaborators from outside Duke?  Answer:   Once you deploy a VM, you can give access to others outside Duke because you have administrator rights on the VM and because you can create/manage accounts local to that VM.  The provisioning account the faculty member uses to create the VM must be a Duke account, but accounts you create on the VM can (optionally) be accessed by collaborators outside of Duke and without a Duke account.

 

5:10- 5:30 – HackDuke Update, Yixin Lin, HackDuke Director (10 minute presentation, 10 minute discussion)

What it is:  HackDuke is about collaboration, exploring the intersection between technology and social good, and giving back.  Undergraduate and graduate students from across the country are divided into teams of up to 5 and are challenged to merge technology with social tracks of impact.  This year’s tracks were Inequality, Energy & the Environment, Health & Wellness, and Education.

Why it’s relevant:  HackDuke is not just about building meaningful projects.  It’s an open forum to discuss, share and bring to life ideas that aim to make a positive impact on social issues.  The annual event challenges students to think beyond the classroom to make a difference in the lives of others by “Coding for Good.”  Yixin will discuss the overall goals and progress for HackDuke 2016 as well as the expansion of HackDuke’s community presence as an umbrella organization.

HackDuke Update:  Yixin Lin, junior and HackDuke Director, gave an update on HackDuke.  HackDuke is a student-run organization started in 2012 that pushes the boundaries of technology use for social good and giving back to the community.  This year’s 24-hour hackathon takes place on November 19th-20th.  The entire event will be free and will include everything needed to complete projects, including all hardware.  The winning team of each track will be awarded prize money that they will donate to a nonprofit of their choice.  This year’s tracks include:

    • Energy & Environment
    • Health & Wellness
    • Inequality
    • Education

Participants:  Hundreds of students from all over the country, and sometimes the world, come to Duke to participate in the hackathon, a technology festival and competition.  This year, we are aiming for 50% Duke students and 50% non-Duke students.  Teams of 5 members will compete in one of the four tracks.  Target attendance at this year’s hackathon has been scaled back from 800 in previous years to 500-600 this year.  Attendance was growing exponentially, and we didn’t want the size to affect the event’s core goals.

Goals:  One of the goals of the hackathon is to showcase Duke engineering, computer science and related majors to sponsoring companies and non-profits.  Through their participation and sponsorship, companies are able to market their products and recruit.  Another goal is to partner with non-profits to provide speakers and technical mentors to run workshops throughout the year so that everyone at Duke has the opportunity to learn or improve their technical skills.  Other initiatives include reaching out to the Durham community to be an advocate for technical skills.

Sponsors:  The sponsors for this year’s event are impressive.  They include Google, Facebook, Duke Student Government, and Coinbase, to name a few.  The full list of sponsors can be viewed at https://hackduke-2016.devpost.com/.

 

Questions and Discussion

Question:  What did you do to scale back attendance?  Answer:  We get thousands of applications each year.  We limited the number of schools, buses, and travel reimbursements.  The application process is more rigorous this year.

Question:  Which disciplines are sponsors most interested in recruiting?  Answer: Software engineers.

Question:  What is the budget this year?  Answer: $70,000.