Compression Benchmarks
One of the projects I’m currently working on is the ingestion of RSS feeds into a Mongo database. It’s been running for the past year, and as of this post has collected 1,575,987 posts from 373 feeds across 8,126 jobs. That equates to about 585GB of raw data, which makes compression a firm requirement for exchanging it. Recently, @ojedatony1616 downloaded the compressed zip file (53GB) onto a 1TB external hard disk and attempted to decompress it. After three days he tried to cancel the job, and ended up restarting his computer when it wouldn’t stop. His approach was simply to double-click the file on OS X, which got me thinking: it shouldn’t have taken that long, so why did it choke? Inspecting the export logs on the server, I noted that it took 137 minutes to compress the directory; shouldn’t decompression take roughly that long as well? ...
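To put numbers like that 137-minute compression run on a comparable footing, the simplest harness is to time both directions of the round trip on the same machine. Below is a minimal Python sketch along those lines; the paths (`feeds/`, `feeds.zip`, `feeds_out/`) are hypothetical stand-ins for the real export directory, not the actual layout of this project.

```python
import time
import zipfile
from pathlib import Path


def compress(src_dir, archive_path):
    """Zip every file under src_dir and return the elapsed seconds."""
    start = time.time()
    # allowZip64 matters here: archives and members over 4GB need
    # Zip64 extensions, which not every extraction tool handles well.
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        for path in Path(src_dir).rglob("*"):
            if path.is_file():
                zf.write(path, path.relative_to(src_dir))
    return time.time() - start


def decompress(archive_path, dest_dir):
    """Extract the archive and return the elapsed seconds."""
    start = time.time()
    with zipfile.ZipFile(archive_path, "r") as zf:
        zf.extractall(dest_dir)
    return time.time() - start


if __name__ == "__main__":
    # Hypothetical paths; substitute the real export directory
    # and a destination on the target disk.
    print("compressed in {:.1f}s".format(compress("feeds/", "feeds.zip")))
    print("decompressed in {:.1f}s".format(decompress("feeds.zip", "feeds_out/")))
```

Timing both halves this way also surfaces the variable the double-click hides: decompression speed depends heavily on the destination disk, so writing 585GB of extracted data onto an external drive can dominate the wall-clock time regardless of the codec.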