Want a dataset that allows individual row writes. Does such a thing exist?

I have a job running on one of my services that writes data to S3 around 10k times per day, and the traffic is pretty bursty. I used to use an S3 connector to get this data into Domo, but it broke after months of working just fine, and support has been really slow to work with (each question to them takes around a week to get an answer, and that's just too long).

I have looked at the Stream API, which could work for what I am trying to do but requires a fair bit of extra work on my side. An easier solution would be a dataset I can write individual rows to as often as I want. I'd probably do this by having a super simple Lambda spin up on each object creation in S3 and write the row straight to the dataset in Domo. I can't tell for sure, but it seems like such a dataset type doesn't exist. I feel like this is a really basic thing that should exist, though, so I'm asking here in case I am blind and simply can't find it. Thanks!

Comments

  • Hi,

     

    As far as I know, how a dataset gets written or prepared depends on the conditions you set when creating it. If it comes from Workbench and you have supplied a query in the Source tab (a SQL statement using views or a procedure), you can add a condition to pull one record at a time and schedule it to run every minute or as required. Another workaround is to create a dataflow against the existing dataset, add a condition there to get one row, and schedule it to run as required.

     


     

    Thanks,

    Neeti

  • jaeW_at_Onyx (Budapest / Portland, OR)

     

    I'm not sure I follow your requirement exactly, but I think there are a couple of things you might consider.

     

    I don't think Domo really shines in use cases where you push individual rows frequently, because you're not sending data directly to a database. The data gets collected in a data-lake-esque environment first (Vault) before it's loaded into Adrenaline (the database layer). A better approach might be to process incrementally smaller batches of CSVs.

     

    Domo probably wouldn't like it if you pushed updates 10k times a day, but you could:

    1) Create a loop where you aggregate rows in a CSV on-prem (or in a VM) as they come in from your streaming source.

    2) Periodically push that data into Domo (e.g. every 15 minutes) via a Workbench job or a JavaCLI job.

    3) Clear all the rows in that CSV once it has been successfully pushed into Domo.
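
The three steps above could be sketched roughly like this in Python. This is only an illustration of the buffering pattern, not a Domo client: `RowBatcher` and the `push` callable are hypothetical names, and `push` stands in for whatever actually uploads the CSV (for example a wrapper that hands the file to a Workbench or JavaCLI job). The 900-second default simply mirrors the "every 15 minutes" suggestion.

```python
import csv
import io
import time
from typing import Callable, List, Sequence


class RowBatcher:
    """Collect rows locally and hand them off as one CSV payload per interval."""

    def __init__(
        self,
        push: Callable[[str], None],      # hypothetical uploader; must raise on failure
        interval_seconds: float = 900.0,  # 15 minutes, as suggested above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._push = push
        self._interval = interval_seconds
        self._clock = clock
        self._rows: List[Sequence[str]] = []
        self._last_flush = clock()

    def add(self, row: Sequence[str]) -> None:
        """Step 1: aggregate incoming rows; flush once the interval has elapsed."""
        self._rows.append(row)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> int:
        """Steps 2 and 3: push the batch, then clear it only if the push succeeded."""
        if not self._rows:
            self._last_flush = self._clock()
            return 0
        buffer = io.StringIO()
        csv.writer(buffer).writerows(self._rows)
        # If push raises, the lines below never run, so the rows stay buffered
        # and are retried on the next flush.
        self._push(buffer.getvalue())
        sent = len(self._rows)
        self._rows.clear()
        self._last_flush = self._clock()
        return sent
```

Because the buffer is only cleared after `push` returns, a failed upload leaves the rows in place for the next attempt. The buffer lives in memory here; writing it to a file on disk, as the steps describe, would also survive a process restart.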

     

    If you're using a straight APPEND instead of a REPLACE, you should have a reasonable experience. Alternatively, if you're using JavaCLI or Workbench, it should be relatively easy (partner with your CSM) to use UPSERT to push the data into Domo if you're worried about duplicate rows across different pushes.
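
To make the APPEND vs. UPSERT distinction concrete, here is a tiny in-memory sketch in Python. It does not call Domo at all; `append_rows`, `upsert_rows`, and the `id` key column are made up for illustration, and Domo's own UPSERT behavior may differ in its details.

```python
from typing import Dict, List

Row = Dict[str, str]


def append_rows(existing: List[Row], batch: List[Row]) -> List[Row]:
    """APPEND: every pushed row is added, so a re-sent row shows up twice."""
    return existing + batch


def upsert_rows(existing: List[Row], batch: List[Row], key: str) -> List[Row]:
    """UPSERT: a pushed row replaces any existing row with the same key value."""
    merged = {row[key]: row for row in existing}
    for row in batch:
        merged[row[key]] = row
    return list(merged.values())


existing = [{"id": "1", "value": "a"}]
batch = [{"id": "1", "value": "b"}, {"id": "2", "value": "c"}]

appended = append_rows(existing, batch)        # 3 rows, id "1" appears twice
upserted = upsert_rows(existing, batch, "id")  # 2 rows, id "1" now holds "b"
```

With APPEND, a batch that gets re-sent after a partial failure leaves duplicates behind; with a keyed UPSERT the same re-send is harmless.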

     

    As an extension of this, to get faster loads into Domo: if you're using the JavaCLI, after you push the data into Domo you could ping the dataset and ask "are you ready yet?" As soon as it's ready, you could kick off the next job.
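
The "are you ready yet?" idea is a plain polling loop. A minimal sketch in Python, assuming you supply your own `is_ready` check (a hypothetical callable, for instance one that shells out to the JavaCLI or inspects the dataset's status; no specific Domo command is implied here):

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],  # hypothetical readiness check you provide
    timeout_seconds: float = 300.0,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

A caller would push a batch, call `wait_until_ready(...)`, and start the next job only when it returns `True`; a `False` result means the timeout passed and is worth logging rather than silently retrying.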

     

    You CAN use the APIs on developer.domo.com, but I strongly recommend you try using the JavaCLI to interact with Domo. Support, Labs, and PS will be way happier (and better equipped) to support a JavaCLI problem, whereas they might not be willing or able to read your cURL request.