MD5 or SHA1 (or similar) encoding in Adrenaline Dataflows

Users sometimes use MD5 or SHA1 encoding to create a surrogate key for dimensional modelling workflows.

In our use case, we needed to create a dimension table by performing a set of operations on a distinct set of values spread across several columns.

The dataset itself was 50M+ rows, and the number of distinct values was around 20k. We need MD5 or SHA1 encoding to create a surrogate key that links the transactions to the set of distinct values.
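To illustrate the idea outside of Magic or Adrenaline, here is a minimal Python sketch of how a hashed surrogate key can link transaction rows to a small table of distinct values. The column names (`region`, `channel`, `amount`), the `dim_key` field, and the `surrogate_key` helper are all hypothetical and are not taken from our actual dataset or from any Domo API.

```python
import hashlib

def surrogate_key(values, algorithm="md5", sep="|"):
    """Hash an ordered sequence of column values into a hex surrogate key."""
    # Treat None as an empty string so that nulls always hash the same way.
    joined = sep.join("" if v is None else str(v) for v in values)
    return hashlib.new(algorithm, joined.encode("utf-8")).hexdigest()

# Hypothetical transaction rows; the real dataset has 50M+ of these.
transactions = [
    {"region": "EMEA", "channel": "web", "amount": 10.0},
    {"region": "EMEA", "channel": "web", "amount": 12.5},
    {"region": "APAC", "channel": "store", "amount": 7.0},
]

# Assumed columns whose distinct combinations make up the dimension.
key_columns = ("region", "channel")

dimension = {}
for row in transactions:
    key = surrogate_key(row[c] for c in key_columns)
    row["dim_key"] = key  # key stored on the transaction side
    # Keep one dimension record per distinct combination of values.
    dimension.setdefault(key, {c: row[c] for c in key_columns})
```

Because the key is derived only from the values themselves, it can be computed independently on the transaction side and on the dimension side without a lookup join. One caveat with this sketch: the separator must not appear inside the values, otherwise two different combinations could produce the same joined string.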

Currently, this pipeline starts in Magic (generate the surrogate key) -> Adrenaline (perform the ETL) -> Magic (reprocess the data so we can accumulate history using partitions).

