Launching a JupyterHub Instance

In this post I walk through the steps of creating a multi-user JupyterHub sever running on an AWS Ubuntu 18.04 instance. There are many ways of setting up JupyterHub including using Docker and Kubernetes - but this is a pretty staight forward mechanism that doesn’t have too many moving parts such as TLS termination proxies etc. I think of this as the baseline setup. Note that this setup has a few pros or cons depending on how you look at them....

October 9, 2019 · 8 min · 1644 words · Benjamin Bengfort

Exception Handling

This short tutorial is intended to demonstrate the basics of exception handling and the use of context management in order to handle standard cases. These notes were originally created for a training I gave, and the notebook can be found at Exception Handling. I’m happy for any comments or pull requests on the notebook. Exceptions Exceptions are a tool that programmers use to describe errors or faults that are fatal to the program; e....

November 21, 2016 · 10 min · 2105 words · Benjamin Bengfort

SVG Vertex with a Timer

In order to promote the use of graph data structures for data analysis, I’ve recently given talks on dynamic graphs: embedding time into graph structures to analyze change. In order to embed time into a graph there are two primary mechanisms: make time a graph element (a vertex or an edge) or have multiple subgraphs where each graph represents a discrete time step. By using either of these techniques, opportunities exist to perform a structural analysis using graph algorithms on time; for example - asking what time is most central to a particular set of relationships....

November 4, 2016 · 6 min · 1178 words · Benjamin Bengfort

Text Classification with NLTK and Scikit-Learn

This post is an early draft of expanded work that will eventually appear on the District Data Labs Blog. Your feedback is welcome, and you can submit your comments on the draft GitHub issue. I’ve often been asked which is better for text processing, NLTK or Scikit-Learn (and sometimes Gensim). The answer is that I use all three tools on a regular basis, but I often have a problem mixing and matching them or combining them in meaningful ways....

May 19, 2016 · 11 min · 2208 words · Benjamin Bengfort

Building a Console Utility with Commis

Applications like Git or Django’s management utility provide a rich interaction between a software library and their users by exposing many subcommands from a single root command. This style of what is essentially better argument parsing simplifies the user experience by only forcing them to remember one primary command, and allows the exploration of the utility hierarchy by using --help and other visibility mechanisms. Moreover, it allows the utility writer to decouple different commands or actions from each other....

January 23, 2016 · 9 min · 1753 words · Benjamin Bengfort