Comparing Data Sets

If I have a data set on an inventory of items and a second data set on the sale of those items, how might I compare those data sets with Domo to discover 1) which items in our inventory have never been sold and 2) the average length of time before items added to our inventory are sold?​

 

Thanks!

Best Answer

  • quinnj
    Accepted Answer

    In order to compare datasets and build cards in Domo, you'll need to combine the datasets in meaningful ways. There are a few different ways to combine datasets including DataFlows, Datafusion, and Magic ETL. Each has its own strengths depending on your level of comfort with SQL. For your points, I can imagine a solution in DataFlow involving the following:

     

    1. In order to see items that have never sold, you can do an SQL LEFT JOIN. Something along the lines of:

     

    select a.* -- all the columns from our inventory table

    from inventory a

    left join sales b on a.item = b.item

    where b.item IS NULL

     

    Basically, we're trying to match the two tables on item numbers, but in this case, we're also adding a filter to only leave inventory items where there is no match in our sales table.

     

    2. This one is a little more involved and depends a little on having the right columns in your datasets, but one solution could look like the following:

     

    select

    a.item, a.item_created_date, MIN(b.sales_date) as 'first sales date'

    from inventory a

    join sales b on a.item = b.item

    group by 1, 2

     

    In this case, we're again joining the "inventory" and "sales" tables, but we're then reducing the dataset to be unique by item (the `group by 1, 2` clause). What we're joining from the sales table is the "sales_date" and we're returning the earliest (i.e. MIN) sales_date for the item. This leaves us with a dataset with all the information we need to get what you want. At the card level, I could plot the "item_created_date" along with the "first sales date" column. We could also create a beast mode column that was the length of time between the two dates: DATEDIFF(`item_created_date`, `first sales date`)