Online Distribution

This post started out as a discussion of a struct in Go that could keep track of online statistics without keeping an array of values. It ended up being a lesson on over-engineering for concurrency. The spec was to build a data structure that could track statistics of values over time in a space-saving fashion. The primary interface was a method, Update(sample float64), so that a new sample could be passed to the structure, updating its internal parameters. At the end, the structure should be able to describe the mean, variance, and range of all values passed to the update method. I created two versions: ...

August 28, 2017 · 3 min · 480 words · Benjamin Bengfort
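
As a minimal sketch of the idea in the teaser above, the struct below tracks mean, variance, and range in constant space using Welford's method. The type name OnlineStats and its fields are assumptions for illustration, not the post's actual code, and the sketch is not safe for concurrent use.

```go
package main

import "math"

// OnlineStats is a hypothetical name for a constant-space statistics tracker.
// No samples are stored, only running aggregates.
type OnlineStats struct {
	n        uint64  // number of samples seen
	mean     float64 // running mean
	m2       float64 // running sum of squared deviations from the mean
	min, max float64 // smallest and largest samples seen
}

// Update folds one sample into the running aggregates (Welford's method).
func (s *OnlineStats) Update(sample float64) {
	s.n++
	if s.n == 1 {
		s.min, s.max = sample, sample
	} else {
		s.min = math.Min(s.min, sample)
		s.max = math.Max(s.max, sample)
	}
	delta := sample - s.mean
	s.mean += delta / float64(s.n)
	s.m2 += delta * (sample - s.mean)
}

// Mean returns the mean of all samples seen so far (0 if none).
func (s *OnlineStats) Mean() float64 { return s.mean }

// Variance returns the sample variance (n-1 denominator),
// or 0 when fewer than two samples have been seen.
func (s *OnlineStats) Variance() float64 {
	if s.n < 2 {
		return 0
	}
	return s.m2 / float64(s.n-1)
}

// Range returns max - min of all samples seen so far.
func (s *OnlineStats) Range() float64 { return s.max - s.min }
```

A caller would create a zero-value `OnlineStats`, call `Update` once per sample, and read `Mean`, `Variance`, and `Range` at the end. Making it safe for multiple goroutines would need a mutex or a channel in front of `Update`.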

Rapid FS Walks with ErrGroup

I’ve been looking for a way to quickly scan a file system and gather information about the files in the directories it contains. I had been doing this with multiprocessing in Python, but figured Go could speed up my performance by a lot. What I discovered when I went down this path was the errgroup.Group (from golang.org/x/sync/errgroup), an extension of the sync.WaitGroup that helps manage the complexity of multiple goroutines but also includes error handling! ...

August 18, 2017 · 6 min · 1206 words · Benjamin Bengfort

Buffered Write Performance

This is just a quick note on the performance of writing to a file on disk using Go, and it raises a question about a common programming paradigm that I am now suspicious of. I discovered that when I wrapped the open file object with a bufio.Writer, the performance of my writes to disk increased significantly. OK, so this isn’t about simple file writing to disk; this is about a complex writer that seeks within the file, writes to different positions, and maintains the overall state of what’s on disk in memory. However, the question remains: ...

August 3, 2017 · 5 min · 1065 words · Benjamin Bengfort
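
One plausible reason for the speedup described above is that a bufio.Writer batches many small writes into fewer calls on the underlying writer. The sketch below illustrates that effect. The countingWriter type is a hypothetical stand-in for an open file, and the function names are assumptions for illustration.

```go
package main

import (
	"bufio"
	"io"
)

// countingWriter is a hypothetical stand-in for an *os.File. It records how
// many times Write is called, which approximates the number of write syscalls.
type countingWriter struct {
	calls int // number of Write calls received
	bytes int // total bytes received
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.bytes += len(p)
	return len(p), nil
}

// writeRecords writes n copies of record to w, one Write call per record.
func writeRecords(w io.Writer, record []byte, n int) error {
	for i := 0; i < n; i++ {
		if _, err := w.Write(record); err != nil {
			return err
		}
	}
	return nil
}

// writeRecordsBuffered does the same through a bufio.Writer, so the
// underlying writer only sees a Write when the buffer fills or is flushed.
func writeRecordsBuffered(w io.Writer, record []byte, n int) error {
	bw := bufio.NewWriter(w) // default buffer size is 4096 bytes
	if err := writeRecords(bw, record, n); err != nil {
		return err
	}
	// Without Flush, the tail of the data would stay in the buffer.
	return bw.Flush()
}
```

Writing a thousand 16-byte records directly produces a thousand calls on the underlying writer, while the buffered version produces only a handful. A writer that seeks, like the one in the post, would need to flush the buffer before each seek so that buffered bytes land at the intended position.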

Event Dispatcher in Go

The event dispatcher pattern is extremely common in software design, particularly in languages like JavaScript that are primarily used for user interface work. The dispatcher is an object (usually a mixin to other objects) that can register callback functions for particular events. Then, when a dispatch method is called with an event, the dispatcher calls each callback function in order of registration and passes each one a copy of the event. In fact, I’ve already written a version of this pattern in Python: [Implementing Observers with Events]({% post_url 2016-02-16-observer-pattern %}). In this snippet, I’m presenting a version in Go that has been incredibly stable and useful in my code. ...

July 21, 2017 · 3 min · 468 words · Benjamin Bengfort
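
The pattern described above can be sketched in a few lines of Go. This is an illustration under assumed names (EventType, Event, Callback, Dispatcher), not the post's actual snippet. Callbacks run in registration order, and each receives the event by value, so it works on its own copy.

```go
package main

// EventType identifies a kind of event (hypothetical name).
type EventType uint16

// Event is passed by value, so every callback receives its own copy.
type Event struct {
	Type  EventType
	Value interface{}
}

// Callback handles an event and may return an error to stop the dispatch.
type Callback func(Event) error

// Dispatcher maps event types to callbacks in registration order.
// The zero value is ready to use. It is not safe for concurrent use.
type Dispatcher struct {
	callbacks map[EventType][]Callback
}

// Register appends cb to the callbacks for the given event type.
func (d *Dispatcher) Register(etype EventType, cb Callback) {
	if d.callbacks == nil {
		d.callbacks = make(map[EventType][]Callback)
	}
	d.callbacks[etype] = append(d.callbacks[etype], cb)
}

// Dispatch calls each callback registered for e.Type in order and
// returns the first error encountered, if any.
func (d *Dispatcher) Dispatch(e Event) error {
	for _, cb := range d.callbacks[e.Type] {
		if err := cb(e); err != nil {
			return err
		}
	}
	return nil
}
```

Stopping at the first error is a design choice in this sketch. An alternative is to run every callback and collect the errors. Embedding a `Dispatcher` in another struct gives a rough equivalent of the mixin usage mentioned in the teaser.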

Lazy Pirate Client

In the [last post]({% post_url 2017-07-13-zmq-basic %}) I discussed a simple REQ/REP pattern for ZMQ. However, by itself REQ/REP is pretty fragile. First, every REQ requires a REP, and a server can only handle one request at a time. Moreover, if the server fails in the middle of a reply, then everything hangs. We need a more reliable REQ/REP, which is actually the subject of an entire chapter in the ZMQ book. ...

July 14, 2017 · 2 min · 331 words · Benjamin Bengfort

Simple ZMQ Message Passing

There are many ways to create RPCs and send messages between nodes in a distributed system. Typically when we think about messaging, we think about a transport layer (TCP/IP) and a protocol layer (HTTP) along with some message serialization. Perhaps best known are RESTful APIs, which allow us to GET, POST, PUT, and DELETE JSON data on a server. Other methods include gRPC, which uses HTTP/2 and protocol buffers for interprocess communication. ...

July 13, 2017 · 2 min · 355 words · Benjamin Bengfort

PID File Management

In this discussion, I want to propose some code to perform PID file management in a Go program. When a program is backgrounded or daemonized, we need some way to communicate with it in order to stop it. All active processes are assigned a unique process ID by the operating system, and that ID can be used to send signals to the program. Therefore, a PID file: The PID file contains the process ID (a number) of a given program. For example, Apache HTTPD may write its main process number to a PID file, which is a regular text file and nothing more, and later use the information contained there to stop itself. You can also use that information (just do a cat filename.pid) to kill the process yourself, using cat filename.pid | xargs kill. ...

July 11, 2017 · 3 min · 432 words · Benjamin Bengfort
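
The PID file mechanics described above can be sketched with the standard library alone. The function names writePID, readPID, and removePID are assumptions for illustration rather than the post's code, and the sketch does not check whether an existing PID file belongs to a process that is still running.

```go
package main

import (
	"os"
	"strconv"
	"strings"
)

// writePID writes the current process ID to path as plain text and
// returns the ID that was written.
func writePID(path string) (int, error) {
	pid := os.Getpid()
	data := []byte(strconv.Itoa(pid) + "\n")
	if err := os.WriteFile(path, data, 0644); err != nil {
		return 0, err
	}
	return pid, nil
}

// readPID reads path and parses its contents as a process ID.
func readPID(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.Atoi(strings.TrimSpace(string(data)))
}

// removePID deletes the PID file, typically on clean shutdown.
func removePID(path string) error {
	return os.Remove(path)
}
```

A daemon would typically call `writePID` at startup and `removePID` on shutdown. A separate stop command could call `readPID` and then signal that process, for example with `os.FindProcess` followed by `Signal`.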