The xBK team is responsible for security trading at Mellon Bank. As part of that work, we analyze execution costs. Execution data is observational rather than experimental. For a variety of practical, legal, and ethical reasons, it is not possible to perform randomized trading experiments. Without recourse to experiment, we must contend with confounding among the variables in our data. In this study, we use Causal Random Forests, an extension of traditional Random Forest algorithms, to control for the confounding relationships and to estimate treatment effects. Details of the motivation, algorithm, and outcome are presented.
Although there is an increasing number of commercial AutoML products, the open-source ecosystem has been innovating here as well. In the early days of the AutoML movement, the focus was on citizen data scientists: those looking to leverage the power of ML models without a background in data science. Today, however, AutoML tools have a lot to offer experts too. In this presentation, Domino Chief Data Scientist Josh Poduska will dive into popular open-source AutoML tools such as auto-sklearn, TPOT, MLBox, and AutoKeras. He will walk through hands-on examples of how to install and use these tools, and highlight special features of each while providing Jupyter notebooks so you can start using these technologies in your work right away. Those who wish to follow along interactively during the presentation or download the notebooks can do so by signing into Domino’s trial version.
Over the last few years the application of Machine Learning (ML) has proliferated. The application of ML across multiple problem domains is a broad topic. In this session, we will cover varied applications of ML models in the consumer and industrial space. At a minimum, Manimala will discuss the use of conversational ML, image recognition, and diagnostics ML. If time permits, she will add arbitrage ML, speech recognition, search, and scheduling to the discussion. She will share a glimpse of use cases, problem-solution approaches, tools, categories of problem types, and their commonalities and differences. You will leave with a bird’s-eye view of applied ML in action.
The relationship between Data Science and Engineering, IT, and other organizations can be complicated. The end goal is aligned — to help the business win through model-driven innovation. Data scientists drive innovation by building models that automate or inform business processes. Technology groups provide and manage the technology landscape that makes it all possible. Business stakeholders must take Data Science outputs and make them actionable.
All stakeholders need to partner on the journey to become model-driven. In particular, Data Science and Technology groups must create a mutually beneficial environment: a shared space that brings together infrastructure, data, and tooling to foster efficient model building, testing, validation, deployment, and monitoring. But there’s tension between fueling innovation with an open environment that embraces the most cutting-edge tools, and providing a place to work that is safe, governed, cost-controlled, scalable, and compliant.
In this session, our speakers will share their experiences, lessons learned, and even some battle scars as they address questions around successes and failures they’ve seen partnering with stakeholders across the enterprise for Data Science programs. The goal of this engaging session is to help attendees learn from collective past experiences in order to establish clearer communication lines between Data Science and other groups moving forward.
Determining the best organizational structure for data science teams is often debated. Should the data science team be an independent function, reporting into a head of Analytics or IT, providing support to various lines of business? Should data scientists be federated, reporting into business units and operating somewhat independently from each other? Or does a matrix organizational structure provide the best of both worlds? One approach that’s becoming increasingly common is to establish a Data Science Center of Excellence (CoE), a core team of data science experts that sets the course for a company’s data science strategy, best practices and operating procedures, and technology ecosystem. This team can offer guidance to citizen data scientists, data scientists reporting directly into business functions, and other stakeholders leveraging data science across the organization. But how do you go about establishing a Data Science Center of Excellence? During this workshop, we’ll walk through organizational designs and a how-to guide for getting started, and share best practices and lessons learned based on first-hand experience building a CoE in the field. We’ll discuss what capabilities belong in a CoE, and whether and how to balance deep ML expertise against broad analytics capabilities. Attendees will leave with an actionable plan to get started on this path.
Many organizations today are challenged to teach the broad range of professionals in their workforce the skills necessary to thrive in a data science and/or business analytics environment. This talk will offer some key ideas on how to teach data science and analytics. The focus will be on teaching non-technical users to understand the basics of data science using case studies, in a hands-on, student-led environment. Special attention will be paid to infusing a data science culture into traditionally non-technical business functions and to using analytics in business decisions and operations.