Austin Schedule

September 19, 2018

8:30 - 9:15am

Breakfast and Registration

9:15 - 9:25am

Welcome and Opening Remarks

9:25 - 10:55am

Workshop: A Data Science Playbook for Navigating Interpretable vs. Predictive Models

This workshop will go way beyond, “Hey, you should use LIME.” Josh will share concrete tips for how to navigate the need to balance interpretable and predictive modeling in the wild. Specific goals for the workshop are to:

1) Explain the difference between interpretable and predictive models, and the difference between local and global interpretability.
2) Cover a few concepts about data collection and data preparation related to interpretable vs. predictive models.
3) Dive into which algorithms predict well, which interpret well, and which do a bit of both.
4) Share a decision flow chart showing how to approach a project based on the need for interpretability and/or predictive power.
5) Review Python code examples of what it means to navigate between interpretability and predictive models (yes, LIME is included in the examples).

Perhaps the most pervasive argument for model interpretability today is that model consumers and model stakeholders need to trust model recommendations and understand how to incorporate them into their decision making. Without trust and understanding, your model runs a real risk of becoming shelf-ware or being misused. Because of this reality, an understanding of the concepts discussed in this talk is absolutely vital to the success of a data scientist.

11:15 - 11:45am

Product(ive) Data Science

Data scientists are the people behind the scenes, helping others deliver better, smarter results in their daily work. This is especially true for product data scientists, who must hone their craft to determine what is working for a given product, where to take it next, and how to make product decisions aligned with what the company is trying to build. Cross-functional communication is critical to success in this role; you need to be able to craft a message that is born out of math to make compelling arguments that are digestible by stakeholders across the business. In this session, we’ll define Product Data Science and discuss contributing factors to success.

12:10 - 1:15pm

Lunch

1:15 - 1:45pm

Detecting and Preventing Cybersecurity Threats Using Machine Learning

In this talk, Stefano will outline some of the challenges involved in solving one of the most difficult problems in cybersecurity: automated threat detection. Without an adaptive model, detecting and blocking malicious actors is exceptionally difficult because of the wide variance of both normal and suspicious behavior. The Duo Security Data Science team built a novel machine learning pipeline that leverages Apache Spark to process authentication data, extract usage and behavioral information, and match it against both historical data and threat models. Stefano will discuss examples of the current state of the art of threat detection, why it is a challenging problem to tackle for most traditional anomaly detection methods, and some of the future directions to improve detection rates.

1:45 - 2:15pm

Panel: Creating a Data Science Flywheel

As data science teams scale, they’re constantly generating new insights and knowledge that often aren’t adequately captured, stored, or leveraged. This leads to re-work and missed opportunities for research breakthroughs, which frustrate data scientists and can tarnish the team’s ability to make a business impact. The leaders of data science teams are tasked with building and retaining a team of rockstars, while implementing systems and processes that will help them deliver meaningful results at scale. They must figure out how to create a data science flywheel. In this panel, we will discuss best practices for instilling knowledge management into the data science team’s culture. Attendees will leave with practical advice to help them build a team that accelerates its output with scale, rather than succumbing to complexity.

2:15 - 2:45pm

Fostering Collaborative Learning Environments and Loyal Data Science Teams

Part of being a data scientist means learning all the time. Additionally, aspiring data scientists are already part of many organizations looking to grow their data science expertise. With previous experience as faculty, a co-organizer of Women in Data Science ATX, and mentor to junior data scientists at Dell, Randi has had the opportunity to help craft collaborative learning environments in many organizations. She will share tips and tricks for how to enable people to hone their skills and increase confidence in their contributions to data science within the business.

2:45 - 3:00pm

Break

3:00 - 4:30pm

Workshop: Best Practices for Managing the Data Science Lifecycle

In this workshop, we will walk through a framework for successfully managing data science in the enterprise that covers people, process, and technology. We will step through the key stages of the data science lifecycle, from ideation through to delivery and monitoring, discussing common pitfalls and best practices in each based on Domino’s experience working with leading data science teams. Attendees will be provided with examples of Domino’s Lifecycle Assessment and be guided through an interactive exercise to evaluate the bottlenecks in their own organizations. They will leave with a customized physical artifact that can be used to prioritize investment in hiring, process management, or technology acquisition.