Using latest record for daily monitoring

Hi,


So in a credit customer database, I basically have two tables:

  • Customer/portfolio table -> this would record daily update of portfolio movement (outstanding amount, fees, days past due, etc.)
  • Credit limit usage table -> this would record changes in customer's credit limit utilization, meaning it would get updated only if there's change

What I want to do is simple: by joining credit limit usage table to the customer/portfolio table, I'd like to add a column that'd show a customer's current credit limit for each day. If there's no change in credit limit, it would simply show the latest one.

Since the credit limit table doesn't update daily while customer table does, how am I going to join the two tables?

Thanks a bunch.

Answers

  • MarkSnodgrass
    MarkSnodgrass Portland, Oregon 🟤

    In Magic ETL, I would do a LEFT JOIN with the customer table to the credit limit table so that you would have rows for every date. Next, use the Rank & Window tile and use the LAG function pull the credit limit from the previous date. You would then replace nulls with the last value

    This will get you close, but will be an issue if there are multiple days with no credit limit. Here's the KB article on Rank & Window.

    https://domohelp.domo.com/hc/en-us/articles/360042922814-Magic-ETL-Tiles-Rank-and-Window

    Hopefully, this will help and you can find a way to handle the multiple days with nulls.

  • @MarkSnodgrass, thanks for your suggestion. I'm not quite familiar with LAG function so I'll be sure to explore and see if it'd solve the issue. Though by how you explain it, it seems like it would, but I'm not sure I understand what you mean by "will be an issue if there are multiple days with no credit limit". Wouldn't these be the nulls? And they would be replaced with last value?

    If you mean by it that for certain customers there are no credit limit to begin with, that wouldn't be the case as every record in the customer table must have a limit as first entry.

  • jaeW_at_Onyx
    jaeW_at_Onyx Budapest / Portland, OR 🟤

    @MarkSnodgrass on paper you're probably right, but i think i would try to ascertain the granularity (what one row in each table represents) of @pangeran_zuko 's dataset.

    IF the granularity of Credit Limit Usage is one row per day per customer IF the credit limit has been used, then yes, the LAG() approach would absolutely be appropriate. If you PARTITION BY the User instead of just Order By Date desc then you avoid issues where credit limit usage is null if a row wasn't reported for yesterday.

    IF the granulairty of Credit Limit Usage is one row per customer (and the table gets updated every day) then the LAG approach would not be appropriate.


    In both cases, it's still a simple JOIN. Heck, maybe instead of LAG (and filter on rowNum = 1) you can just do GROUP BY Customer No, and take the Last Non Null Value. You can't explicitly set the sort order on your data, so technically that is risky.