Building a fast config parser for large files

Recently I was asked to develop a config parser, or more specific, a config converter, in Python. The purpose of this tool was to parse a particular config file and feed it into an API. This relatively simple task was complicated by a few parameters:

  • The input file sizes are in the range 10-15G
  • The input is a undocumented proprietary format
  • The output needs to be inserted in a certain order because of dependencies
  • This process needs to be as fast as possible

This is a broad overview of how I attempted to solve some of these issues.

Continue reading “Building a fast config parser for large files”