Joining Datasets with different numbers of rows
We’re looking to analyze staff utilization data. There are two data sets:
Data Set 1: Raw data from a timekeeping system with daily time entries from each staff member, there are multiple rows per person per month.
Data Set 2: The total available hours per person per month, there is one row per person per month.
We’re looking to calculate utilization per month per person by summing all of their hours entries for a specific month from Data Set 1, and dividing by their corresponding available hours for that month in Data Set 2.
When we joined the two data sets, using their unique ID# and the identifier, the output includes the monthly available hours value for every row in which there is a raw data entry. We need to create a Beast Mode that is the sum of all raw data hours per month divided by just one instance of the available hours per month.