Computable content: Notebooks, containers, and data-centric organizational learning
Director, Learning Group at O’Reilly Media
Known as a “player/coach” data scientist, Paco has led innovative data teams building large-scale apps for several years. A recognized expert in distributed systems, machine learning, and enterprise data workflows, he is also an advisor for Amplify Partners. He has 30+ years of technology industry experience, ranging from Bell Labs to early-stage start-ups. Newsletter and “official” web site: http://liber118.com/pxn/
“Computable Content”:https://bids.berkeley.edu/events/computational-thinking-and-pedagogy-computable-content was described by Dr. Lorena Barba in a “2015 lecture”:https://bids.berkeley.edu/events/computational-thinking-and-pedagogy-computable-content at the UC Berkeley Institute for Data Science. The approach uses “Jupyter notebooks”:https://jupyter.org/ to make learning materials more powerful by integrating compute engines, data sources, etc.
O’Reilly Media extended this approach, “publishing notebooks”:https://www.oreilly.com/ideas/jupyter-at-oreilly from authors along with video timelines to create a new “Oriole”:http://www.oreilly.com/oriole/index.html online tutorial medium. A free public tutorial, “Regex Golf”:https://www.oreilly.com/learning/regex-golf-with-peter-norvig by Peter Norvig, demonstrates what this integration of technologies makes possible.
Each user session launches a “Docker container”:https://www.docker.com/ on a “Mesos cluster”:http://mesos.apache.org/ to provide a fully personalized compute environment. The UX is entirely browser-based. The platform is also instrumented for data collection and analytics, so it can serve as an _assessment_ platform.
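As a rough illustration of the one-container-per-session idea, the sketch below builds a @docker run@ command line for a single notebook session. It is not the production system: it leaves out Mesos scheduling entirely, and the function name @session_container_command@, the container naming scheme, the default port, and the choice of the public @jupyter/minimal-notebook@ image are all assumptions made for the example.

```python
import uuid


def session_container_command(session_id, image="jupyter/minimal-notebook", host_port=8888):
    """Build a `docker run` argument list for one isolated notebook session.

    Hypothetical helper for illustration only: the naming scheme, image,
    and port mapping are assumptions, not details of the system described.
    """
    name = f"notebook-{session_id}"  # one container per user session
    return [
        "docker", "run",
        "--rm",                              # remove the container when the session ends
        "--detach",                          # run in the background
        "--name", name,
        "--publish", f"{host_port}:8888",    # expose the notebook server to the browser
        image,
    ]


if __name__ == "__main__":
    # Print the command rather than executing it, so the sketch runs without Docker.
    print(" ".join(session_container_command(uuid.uuid4().hex[:8])))
```

Returning an argument list rather than running the command keeps the sketch side-effect free; a real deployment would hand an equivalent container specification to the cluster scheduler instead.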
This talk will present:
* the system architecture based on Jupyter as middleware, plus Thebe, Docker, Mesos, Nginx, etc.
* data analytics and project experiences based on delivering _computable content_ at scale
* supporting theory for this pedagogical approach, including Knuth’s _Literate Programming_
* media production techniques that use the video as _subtext_
We will also consider the use of notebooks (Jupyter and others) in an organizational context: How do notebooks help teams share and learn? What impact might notebooks have on developer collaboration that is currently focused on IDEs? The resulting medium provides highly effective tooling for a data-centric organization.