Decorating Nose Tests

I was introduced to an interesting problem today while decorating tests that need to be discovered by the nose runner. By default, nose explores a directory looking for things named test or tests and then executes those functions, classes, modules, etc. as tests. A standard test suite for me looks something like:

    import unittest

    class MyTests(unittest.TestCase):

        def test_undecorated(self):
            """
            assert undecorated works
            """
            self.assertEqual(2+2, 4)

The problem came up when we wanted to decorate a test with some extra functionality, for example loading a fixture: ...

May 22, 2017 · 1 min · 184 words · Benjamin Bengfort

In Process Cacheing

I have had some recent discussions regarding cacheing to improve application performance that I wanted to share. Most of the time those conversations go something like this: “have you heard of Redis?” I’m fascinated by the fact that an independent, distributed key-value store has won the market to this degree. However, as I’ve pointed out in these conversations, cacheing is a hierarchy (heck, even the processor has varying levels of cacheing). Especially when considering micro-service architectures that require extremely low latency responses, cacheing should be a critical part of the design, not just a bolt-on afterthought! ...

May 17, 2017 · 5 min · 877 words · Benjamin Bengfort

Unique Values in Python: A Benchmark

An interesting question came up in the development of Yellowbrick: given a vector of values, what is the quickest way to get the unique values? OK, so maybe this isn’t a terribly interesting question; however, the results surprised us and may surprise you as well. First we’ll do a little background, then I’ll give the results and then discuss the benchmarking method. The problem comes up in Yellowbrick when we want to get the discrete values for a target vector, y, which is common in classification tasks. By getting the unique set of values we know the number of classes, as well as the class names. This information is necessary during visualization because it is vital in assigning colors to individual classes. Therefore in a Visualizer we might have a method as follows: ...
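The method itself is not shown in this excerpt. As a minimal sketch of the underlying idea, assuming y is a plain Python sequence of hashable labels and using a hypothetical helper named unique_classes, the distinct classes could be collected like this. It is one of several possible approaches and not necessarily the quickest one:

```python
def unique_classes(y):
    """Return the distinct values in y, sorted for a stable ordering.

    Hypothetical helper for illustration only; it is not Yellowbrick's API.
    """
    return sorted(set(y))


# A small, made-up target vector for a classification task.
y = ["spam", "ham", "ham", "spam", "eggs"]

classes = unique_classes(y)
print(classes)       # ['eggs', 'ham', 'spam']
print(len(classes))  # 3
```

The length of the result gives the number of classes and the values give the class names, which is what color assignment needs.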

May 2, 2017 · 5 min · 963 words · Benjamin Bengfort

Measuring Throughput

Part of my research is taking me down a path where I want to measure the number of reads and writes from a client to a storage server. A key metric that we’re looking for is throughput — the number of accesses per second that a system supports. As I discovered in a very simple test to get some baseline metrics, even this simple metric can have some interesting complications. ...
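To make the metric concrete, here is a rough sketch of counting accesses per second. It assumes an in-memory dict stands in for the storage server and uses a hypothetical helper named measure_throughput; a real client would issue network reads and writes, which is where the complications tend to appear:

```python
import time


def measure_throughput(operation, duration=0.05):
    """Call operation() repeatedly for roughly `duration` seconds.

    Returns (count, elapsed, rate), where rate is accesses per second.
    Hypothetical helper for illustration only.
    """
    count = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        operation()
        count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed, count / elapsed


# Stand-in for a storage server: a dict with a single counter key.
store = {}


def write():
    # One "access": read the current value and write back the increment.
    store["key"] = store.get("key", 0) + 1


count, elapsed, rate = measure_throughput(write)
print(f"{count} accesses in {elapsed:.3f}s = {rate:.0f} accesses/sec")
```

Dividing the count by the measured elapsed time, rather than by the requested duration, keeps the rate honest when the final operation overruns the deadline.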

April 28, 2017 · 5 min · 965 words · Benjamin Bengfort

OAuth Tokens on the Command Line

This week I discovered I had a problem with my Google Calendar — events accidentally got duplicated or deleted and I needed a way to verify that my primary calendar was correct. Rather than painstakingly go through the web interface and spot check every event, I instead wrote a Go console program using the Google Calendar API to retrieve events and save them in a CSV so I could inspect them all at once. This was great, and very easy using Google’s Go libraries for their APIs, and the quick start was very handy. ...

April 20, 2017 · 4 min · 719 words · Benjamin Bengfort

Gmail Notifications with Python

I routinely have long-running scripts (e.g. for a data processing task) and want to know when they complete. It seems like it should be simple for me to add in a little snippet of code that will send an email using Gmail to notify me, right? Unfortunately, it isn’t quite that simple for a lot of reasons, including security, attachment handling, configuration, etc. In this snippet, I’ve attached the notify() function that I constantly copy and paste, written into a command line script for easy sending on the command line. ...

April 17, 2017 · 2 min · 398 words · Benjamin Bengfort

A Benchmark of Grumpy Transpiling

On Tuesday evening I attended a Django District meetup on Grumpy, a transpiler from Python to Go. Because it was a Python meetup, the talk naturally focused on introducing Go to a Python audience, and because it was a Django meetup, we also focused on web services. The premise for Grumpy, as discussed in the announcing Google blog post, is also a web focused one — to take YouTube’s API that’s primarily written in Python and transpile it to Go to improve the overall performance and stability of YouTube’s front-end services. ...

March 23, 2017 · 7 min · 1489 words · Benjamin Bengfort