FUSE Calls on Go Writes

For close-to-open consistency, we need to implement a file system that can detect atomic changes to a single file. Most programming languages implement open() and close() methods for files, but what they are really manipulating is a handle to an open file that the operating system provides. Writes are buffered asynchronously so that the operating system and the user program don’t have to wait for the spinning disk to figure itself out before carrying on. Additional file calls such as sync() and flush() let the user hint to the OS about what should happen to the data relative to the disk, but the OS provides no guarantee that it will happen. ...
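
To make the buffering and flushing semantics concrete, here is a minimal Go sketch of the write path the paragraph describes: a write lands in OS buffers, Sync asks for the data to reach stable storage, and Close releases the handle. It is an illustration only, not the FUSE code the post discusses, and the /tmp/example.txt path is made up.

```go
package main

import (
	"log"
	"os"
)

func main() {
	// Create (or truncate) a file; the OS hands back a handle, not the bytes on disk.
	f, err := os.Create("/tmp/example.txt")
	if err != nil {
		log.Fatal(err)
	}

	// The write goes to buffers; the kernel may not have touched the disk yet.
	if _, err := f.Write([]byte("hello, close-to-open consistency\n")); err != nil {
		log.Fatal(err)
	}

	// Sync asks the OS to flush the file's contents to stable storage; as the
	// post notes, this is a hint about ordering, not a promise about timing.
	if err := f.Sync(); err != nil {
		log.Fatal(err)
	}

	// Close releases the handle; under close-to-open consistency this is the
	// point at which other readers should observe the update.
	if err := f.Close(); err != nil {
		log.Fatal(err)
	}
}
```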

January 26, 2017 · 5 min · 921 words · Benjamin Bengfort

Message Latency: Ping vs. gRPC

Building distributed systems means passing messages between devices over a network connection. My research specifically considers networks that have extremely variable latencies or that are partition prone. This led me to the natural question, “how variable are real-world networks?” To get real numbers, I built a simple echo protocol using Go and gRPC called Orca. I ran Orca for a few days and collected latency measurements as I traveled around with my laptop. Orca does a lot of work, including GeoIP lookups, IP address resolution, and database queries and storage. This post, however, is not about Orca. The latencies I was getting were very high relative to the round-trip latencies reported by the simple ping command, which implements the ICMP protocol. ...
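
For context, a minimal sketch of an application-level round-trip measurement in Go follows. It is not Orca (which speaks gRPC and does GeoIP lookups and database work); it just spins up a toy TCP echo server and times the round trip in user space, which is the kind of measurement that comes out higher than the kernel-level ICMP numbers ping reports.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

// startEcho starts a toy TCP echo server on a random local port and returns
// its address. Illustration only; Orca uses gRPC, not raw TCP.
func startEcho() string {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) {
				defer c.Close()
				buf := make([]byte, 64)
				for {
					n, err := c.Read(buf)
					if err != nil {
						return
					}
					c.Write(buf[:n])
				}
			}(conn)
		}
	}()
	return ln.Addr().String()
}

func main() {
	addr := startEcho()

	conn, err := net.Dial("tcp", addr)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	buf := make([]byte, 64)
	for i := 0; i < 5; i++ {
		start := time.Now()
		if _, err := conn.Write([]byte("ping")); err != nil {
			log.Fatal(err)
		}
		if _, err := conn.Read(buf); err != nil {
			log.Fatal(err)
		}
		// Application-level RTT: includes user-space scheduling and the TCP
		// stack, costs that ICMP ping does not pay.
		fmt.Printf("echo %d: %v\n", i, time.Since(start))
	}
}
```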

November 2, 2016 · 5 min · 987 words · Benjamin Bengfort

Computing Reading Speed

Ashley and I have been going over the District Data Labs Blog, trying to figure out how to make it more accessible to readers (who are at various levels) and to encourage writers to contribute. To that end, she’s been exploring other blogs to see if we can put up multiple forms of content: long-form tutorials (the bulk of what’s there) and shorter idea articles, possibly even as short as the posts I put on my dev journal. One interesting suggestion she had was to mark the reading time of each post, something the Longreads Blog does. This may give readers a better sense of the time commitment and help them engage more easily. ...
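
As a rough illustration, the sketch below estimates reading time from a word count by dividing by an assumed words-per-minute rate and rounding up. The 213 wpm figure is an assumption chosen because it reproduces the minute counts shown in these listings; the post itself may settle on a different rate.

```go
package main

import (
	"fmt"
	"math"
)

// readingTime estimates the minutes needed to read a post of the given
// length at an assumed reading speed in words per minute, rounding up.
func readingTime(words int, wpm float64) int {
	return int(math.Ceil(float64(words) / wpm))
}

func main() {
	// Word counts taken from the listings on this page; 213 wpm is an
	// assumed rate that happens to match the minutes shown above.
	for _, words := range []int{921, 987, 659, 1020, 823} {
		fmt.Printf("%4d words ~ %d min\n", words, readingTime(words, 213))
	}
}
```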

October 28, 2016 · 4 min · 659 words · Benjamin Bengfort

Visualizing Distributed Systems

As I’ve dug into my distributed systems research, one question keeps coming up: “How do you visualize distributed systems?” Distributed systems are hard, so it feels like being able to visualize the data flow would go a long way toward understanding them in detail and avoiding bugs. Unfortunately, the same things that make architecting distributed systems difficult also make them hard to visualize. I don’t yet have an answer to this question. However, in this post I’d like to state my requirements and highlight some visualizations that I think are important. Hopefully this will be the start of a more complete investigation, or at least it will allow others to comment on what they’re doing and whether or not visualization is important. ...

April 26, 2016 · 5 min · 1020 words · Benjamin Bengfort

Lessons in Discrete Event Simulation

Part of my research involves the creation of large-scale distributed systems, and while we do build and deploy these systems, we find that simulating them for development and research gives us an advantage in trying new things out. To that end, I employ discrete event simulation (DES) using Python’s SimPy library to build very large simulations of distributed systems, such as the one I’ve built to inspect consistency patterns in variable-latency, heterogeneous, partition-prone networks: CloudScope. ...
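
The post uses SimPy, but the core of any discrete event simulation is the same: a priority queue of timestamped events and a clock that jumps from one event to the next rather than ticking. The Go sketch below shows that loop with a made-up two-message latency scenario; it is not CloudScope or SimPy code.

```go
package main

import (
	"container/heap"
	"fmt"
)

// event is a callback scheduled at a simulated time.
type event struct {
	at     float64
	action func()
}

// agenda is a min-heap of events ordered by time: the heart of a DES.
type agenda []*event

func (a agenda) Len() int           { return len(a) }
func (a agenda) Less(i, j int) bool { return a[i].at < a[j].at }
func (a agenda) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a *agenda) Push(x any)        { *a = append(*a, x.(*event)) }
func (a *agenda) Pop() any {
	old := *a
	n := len(old)
	ev := old[n-1]
	*a = old[:n-1]
	return ev
}

// sim holds the simulated clock and the pending events.
type sim struct {
	now    float64
	events agenda
}

func (s *sim) schedule(delay float64, action func()) {
	heap.Push(&s.events, &event{at: s.now + delay, action: action})
}

// run pops events in time order; the clock jumps straight to each event.
func (s *sim) run() {
	for s.events.Len() > 0 {
		ev := heap.Pop(&s.events).(*event)
		s.now = ev.at
		ev.action()
	}
}

func main() {
	s := &sim{}
	heap.Init(&s.events)

	// Hypothetical scenario: a replica's message arrives after 42ms of
	// latency; the peer's reply takes another 58ms.
	s.schedule(42, func() {
		fmt.Printf("t=%5.1fms message delivered to peer\n", s.now)
		s.schedule(58, func() {
			fmt.Printf("t=%5.1fms reply received\n", s.now)
		})
	})

	s.run()
}
```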

April 15, 2016 · 4 min · 823 words · Benjamin Bengfort