In-Memory reference-data MDM/Lookup system
(Performed while engaged by UBS)

The first of the following two code samples is an Abstract-Base-Class (named Lookup) that defines an in-memory Master Data Management (MDM) and lookup system. It is stored in the ./lookup.py module. Concrete-Lookup-Classes such as FileLookup (./file_lookup.py) and DatabaseLookup (./db_lookup.py) are implemented by subclassing that Abstract-Base-Class.

The second code sample is DatabaseLookup, a Concrete-Lookup-Class that implements the Lookup Abstract-Base-Class.
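
To illustrate the relationship between the Abstract-Base-Class and a Concrete-Lookup-Class, here is a minimal sketch. It is not the code from lookup.py: the method names _load_rows and get, the two-column (key, value) row shape, and the use of CSV text in place of a real file are all assumptions made for this illustration.

```python
from abc import ABC, abstractmethod
import csv
import io
from typing import Dict, Iterable, Optional, Tuple


class Lookup(ABC):
    """Hypothetical base class: subclasses supply rows, the base serves lookups."""

    def __init__(self) -> None:
        # Build the in-memory table once, at construction time.
        self._table: Dict[str, str] = dict(self._load_rows())

    @abstractmethod
    def _load_rows(self) -> Iterable[Tuple[str, str]]:
        """Yield (key, value) pairs from the permanent data store."""

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        """Return the value stored for key, or default if the key is unknown."""
        return self._table.get(key, default)


class FileLookup(Lookup):
    """Hypothetical concrete class reading two-column CSV text."""

    def __init__(self, csv_text: str) -> None:
        # Must be set before the base class calls _load_rows().
        self._csv_text = csv_text
        super().__init__()

    def _load_rows(self) -> Iterable[Tuple[str, str]]:
        for key, value in csv.reader(io.StringIO(self._csv_text)):
            yield key, value


countries = FileLookup("CH,Switzerland\nUS,United States\n")
print(countries.get("CH"))  # prints "Switzerland"
```

In this sketch the base class owns the in-memory table and the lookup method, so a database-backed subclass would only need to override _load_rows.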

Consumed by downstream analytics applications, this MDM/Lookup system loads reference data from a permanent data store (Oracle, Postgres, a CSV file, etc.) and initializes a vetted in-memory representation of that data. During initialization it performs various normalizations (currency, country, nulls, etc.), links synonym data, deduplicates records, evicts ambiguous records (i.e. records having the same key but different values), and so on. Once initialization is complete, the object exposes a Lookup API that downstream analytics consumers rely on for vetted, cleansed, conflict-free reference data.
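
The initialization steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the released implementation: the names normalize, build_table and NULL_TOKENS are hypothetical, normalization is reduced to trimming and upper-casing, and synonym linking is assumed to mean mapping alias keys onto one canonical key.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Assumed spellings that should be treated as null.
NULL_TOKENS = {"", "null", "none", "n/a"}


def normalize(text: Optional[str]) -> Optional[str]:
    """Trim and upper-case a field; map null-like spellings to None."""
    if text is None:
        return None
    cleaned = text.strip()
    return None if cleaned.lower() in NULL_TOKENS else cleaned.upper()


def build_table(
    rows: Iterable[Tuple[Optional[str], Optional[str]]],
    synonyms: Dict[str, str],
) -> Dict[str, str]:
    """Return a conflict-free key -> value table built from raw rows."""
    table: Dict[str, str] = {}
    ambiguous: Set[str] = set()
    for raw_key, raw_value in rows:
        key, value = normalize(raw_key), normalize(raw_value)
        if key is None or value is None:
            continue  # drop rows with a null key or value
        key = synonyms.get(key, key)  # link alias keys to the canonical key
        if key in ambiguous:
            continue  # key was already evicted; keep it out
        if key in table and table[key] != value:
            # Same key, different value: evict the key entirely.
            del table[key]
            ambiguous.add(key)
        else:
            table[key] = value  # exact duplicates collapse into one entry
    return table


rows = [
    (" chf ", "Swiss Franc"),
    ("CHF", "swiss franc"),           # duplicate after normalization
    ("SFR", "Swiss Franc"),           # synonym of CHF
    ("USD", "US Dollar"),
    ("USD", "United States Dollar"),  # conflicts with the row above
    ("EUR", "N/A"),                   # null value
]
table = build_table(rows, synonyms={"SFR": "CHF"})
print(table)  # prints {'CHF': 'SWISS FRANC'}
```

Evicting a conflicting key outright, rather than keeping the first or last value seen, matches the stated goal that consumers receive only conflict-free data.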

The samples below are early MVP / POC / R&D Lab snapshots; the actual releases incorporated further changes and refactoring. The reader will notice many in-line comments, which facilitated context-switching between this and other project areas.

Abstract-Base-Class: Lookup (stored in ./lookup.py)

Concrete-Lookup-Class: DatabaseLookup (stored in ./db_lookup.py)