grouparoo / grouparoo

🦘 The Grouparoo Monorepo - open source customer data sync framework

Home Page: https://www.grouparoo.com

License: MIT License

Languages: Shell 0.03%, JavaScript 95.00%, TypeScript 4.96%, SCSS 0.01%, Procfile 0.01%
Topics: marketing, marketing-automation, marketing-tools, marketing-analytics, nodejs, typescript, events, email, push-notifications, communication

grouparoo's Introduction

Grouparoo

Sync, Segment, and Send your Product Data Everywhere

Grouparoo is an open source framework that helps you move data between your data warehouse and all of your cloud-based tools. Learn more at www.grouparoo.com

[Figure: Grouparoo Data Bowtie]

This is the Grouparoo Monorepo, containing the source code for @grouparoo/core and many plugins. If you are looking for an example about how to run or deploy Grouparoo, please visit https://github.com/grouparoo/app-example

Documentation and Guides

  • 🦘 Ready to Try Grouparoo?
    • Grouparoo is Open Source, and easy to run on your laptop or in the cloud.
    • → View the Getting Started Docs.
  • 📚 Want to learn more about how to configure and use Grouparoo?
  • 👨‍👩‍👧‍👧 Want to collaborate with the Community to enhance Grouparoo?
    • Join other Grouparoo Community members to share best practices and tackle problems.
    • → Join the Community.
  • ⚙️ Want to learn more about how Grouparoo works?
    • Grouparoo is Open Source, and we welcome community contributions. You can add your own plugins to connect to new Sources and Destinations.
    • → View the Development Guide.

Running a Grouparoo Application

This is an abbreviated version of the "Grouparoo Installation Guide". The full version can be found here.

Run Locally with Node.js

Use the Grouparoo CLI to initialize a new Grouparoo Project:

# Assuming you have node.js (https://nodejs.org) v12+ installed
npm install -g grouparoo
grouparoo init .
grouparoo config

This will generate a package.json and .env file and launch our Config UI for you to begin configuring your Grouparoo instance.

🦘

grouparoo's People

Contributors

andyjih, bleonard, boardfish, dependabot-preview[bot], dependabot[bot], edmundito, evantahler, grouparoo-bot, krishnaglick, mwflaher, nigelkibodeaux, parthiv11, pauloouriques, pedroslopez, rwfeather, seancdavis, tealjulia

grouparoo's Issues

Guard better against missing required ENV variables.

@evan master seems to crash for me locally. Let me know if there's something I should do differently?

[1:51 PM] code: 'ERR_HTTP_INVALID_HEADER_VALUE'
}
TypeError [ERR_HTTP_INVALID_HEADER_VALUE]: Invalid value "undefined" for header "X-GROUPAROO-SERVER_TOKEN" code: 'ERR_HTTP_INVALID_HEADER_VALUE'
}
TypeError [ERR_HTTP_INVALID_HEADER_VALUE]: Invalid value "undefined" for header "X-GROUPAROO-SERVER_TOKEN"
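
One way to guard against this, sketched under assumptions: check the required variables once at startup and fail with a readable message, instead of letting an undefined value reach an HTTP header. The helper name requireEnv and the variable name SERVER_TOKEN are hypothetical and are not taken from the Grouparoo codebase.

```typescript
// Hypothetical helper, not part of Grouparoo's API.
type Env = Record<string, string | undefined>;

// Returns the requested variables, or throws one error listing every
// variable that is unset or blank.
function requireEnv(names: string[], env: Env): Record<string, string> {
  const found: Record<string, string> = {};
  const missing: string[] = [];

  for (const name of names) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      missing.push(name);
    } else {
      found[name] = value;
    }
  }

  if (missing.length > 0) {
    throw new Error(`Missing required ENV variables: ${missing.join(", ")}`);
  }
  return found;
}
```

In a Node.js process this could be called as requireEnv(["SERVER_TOKEN"], process.env) before the server starts, so the crash names the missing variable rather than the header.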

CSV Sources can handle CSV changes with different data

How do I update my CSV in a CSV source without deleting and recreating Sources, Properties, Groups -> Destinations?

Switching to another file in the Source UI

  • Validates that all profile property rules are present in the new csv
  • Doesn't necessarily require all the same columns (new ones could be present or old ones that weren't used)
  • A run is kicked off at that point
  • Schedule might not be necessary

Below the Sources MVP line because we'll wait for customers to give us more details on requirements

Profile property rule preview does not take the filter into account

I can't quite figure out how to see a preview of this rule that takes the filter into account.
The 10.38 in the screenshot is without the filter. Nothing I change in the filter, including adding another one, has any effect on the preview.

It works when I save it and go back in. Ideally, it works before then.

Rather than chain events from profile -> import -> export, rely on timers

This might be an optimization that slows down the happy-path case but reduces the duplication of import and export jobs. The slowdown comes from needing to "wait" between steps for the next processor to come around, but this "waiting" would let us be smarter about what we do in batch.

In the events table, we have timestamps for associatedAt, importedAt, exportedAt, etc. We could have a single recurring task that looks for events that need to be processed via this table rather than chaining the events together.

There are a number of possible heuristics we could use to determine that an event is "ready" for the next step:

  • It's been X seconds since the last step?
  • It's been X seconds since this or any other event for the same profile was processed on the previous step
  • The schedule or group run which created the event is done
  • There are no other runs running
  • etc...

Each event would need to be "claimed" for processing (see how delayedJob does it)
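
A minimal in-memory sketch of the second heuristic above combined with claiming, under stated assumptions: importedAt and exportedAt come from the events table described earlier, while claimedBy, claimReadyEvents and the worker id are hypothetical names. A real implementation would need to make the claim atomic in the database.

```typescript
// Hypothetical shape; only importedAt and exportedAt are named in the issue.
interface PendingEvent {
  guid: string;
  profileGuid: string;
  importedAt: Date | null; // when the previous step finished
  exportedAt: Date | null; // when the next step finished
  claimedBy: string | null; // hypothetical claim marker
}

// Claims events whose profile has had no import activity for waitSeconds.
function claimReadyEvents(
  events: PendingEvent[],
  workerId: string,
  waitSeconds: number,
  now: Date
): PendingEvent[] {
  const cutoff = now.getTime() - waitSeconds * 1000;

  // Latest import time seen for each profile.
  const lastImportByProfile = new Map<string, number>();
  for (const event of events) {
    if (event.importedAt === null) continue;
    const time = event.importedAt.getTime();
    const previous = lastImportByProfile.get(event.profileGuid);
    if (previous === undefined || time > previous) {
      lastImportByProfile.set(event.profileGuid, time);
    }
  }

  const claimed: PendingEvent[] = [];
  for (const event of events) {
    // Skip events that are not imported yet, already exported, or already claimed.
    if (event.importedAt === null || event.exportedAt !== null || event.claimedBy !== null) {
      continue;
    }
    const last = lastImportByProfile.get(event.profileGuid);
    if (last !== undefined && last <= cutoff) {
      event.claimedBy = workerId;
      claimed.push(event);
    }
  }
  return claimed;
}
```

Because claimed events are marked, a second worker running the same recurring task would skip them, which is the property the "claimed" requirement is after.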

Destination connections can declare their parallelism and rate limits

There are multiple facets to this:

  • daily rate limits
  • per-second/per-minute rate limits
  • parallelism

It's likely that a plugin either knows this statically (Mailchimp: parallelism of 10) or can determine it from a single option on the plugin's App (Hubspot: API usage tier). This implies 2 new methods for a plugin: getRateLimitOptions({appOptions}) and getRateLimit({app, appOptions, rateLimitOptions})

// mailchimp: no volume limits, but at most 10 simultaneous connections
function getRateLimit() {
  return {
    daily: Infinity,
    minute: Infinity,
    second: Infinity,
    parallel: 10
  }
}

Then, it's up to Core to keep track of incrementing and decrementing a few stats for each app (in redis):

  • grouparoo:app:{appGuid}:rateLimit:daily:{date}
  • grouparoo:app:{appGuid}:rateLimit:second:{second}
  • grouparoo:app:{appGuid}:rateLimit:minute:{minute-second}
  • grouparoo:app:{appGuid}:rateLimit:parallelRequests

Every call to import or export a profile starts by checking each of the limits. Then, if the call is allowed, Core increments and later decrements the values above. Each key should be given a TTL of 2*period.

If a rate limit would be violated, the task is re-enqueued to try later (with a configurable delay in settings). Waiting 10 seconds seems like a good starting default.

Outstanding Question: How does this work when exporting to or importing from multiple sources? Does the whole task fail and then retry? Do we need to make per-app tasks?
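
The check-then-count flow described above could look roughly like this. It is a sketch under assumptions: a Map stands in for redis, TTLs are not modelled, the time-bucket formats in the keys are guesses, only the parallel counter is decremented when a call finishes, and createLimiter, tryStart and finish are hypothetical names.

```typescript
interface RateLimit {
  daily: number;
  minute: number;
  second: number;
  parallel: number;
}

function createLimiter(appGuid: string, limit: RateLimit) {
  const counters = new Map<string, number>(); // stand-in for redis
  const prefix = `grouparoo:app:${appGuid}:rateLimit`;
  const names = ["daily", "minute", "second", "parallel"] as const;

  function keysFor(now: Date) {
    const iso = now.toISOString(); // e.g. 2020-06-01T09:12:27.613Z
    return {
      daily: `${prefix}:daily:${iso.slice(0, 10)}`,
      minute: `${prefix}:minute:${iso.slice(0, 16)}`,
      second: `${prefix}:second:${iso.slice(0, 19)}`,
      parallel: `${prefix}:parallelRequests`,
    };
  }

  // Returns true and records the call if every limit has room.
  // A false result means the task should be re-enqueued to try later.
  function tryStart(now: Date): boolean {
    const keys = keysFor(now);
    for (const name of names) {
      if ((counters.get(keys[name]) ?? 0) >= limit[name]) return false;
    }
    for (const name of names) {
      counters.set(keys[name], (counters.get(keys[name]) ?? 0) + 1);
    }
    return true;
  }

  // Releases one parallel slot once the import or export call has finished.
  function finish(): void {
    const key = `${prefix}:parallelRequests`;
    counters.set(key, Math.max(0, (counters.get(key) ?? 0) - 1));
  }

  return { tryStart, finish };
}
```

The separate check and increment here are not atomic. Against a real redis, several workers could pass the check at once, so the increment and comparison would need to happen in a single atomic step.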


Mailchimp has a concurrency limit of 10
Is the fact that tasks will retry and back off enough?

To improve connections and experiences for all our users, we use some connection limits when we see suspicious activity or overload. Each user account can have up to 10 simultaneous connections. You will receive an error message if you reach the limit. We do not throttle based on volume. Note: currently there are no options to raise the limit on a per-customer basis.

https://mailchimp.com/developer/guides/get-started-with-mailchimp-api-3/

http://localhost:3000 not working

Hi Grouparoo team,
I have followed the instructions to run the app and reached step 4:
Run npm start to start the server and visit http://localhost:3000 to get started. Follow the on-screen instructions to create your account and first team.

Here is the result:

@grouparoo/[email protected] start /mnt/d/Project/grouparoo/apps/local-public
cd node_modules/@grouparoo/core && GROUPAROO_MONOREPO_APP=local-public ./api/bin/start

2020-06-01T09:12:27.613Z - info: registering grouparoo plugin: @grouparoo/core/manual
2020-06-01T09:12:27.615Z - info: registering grouparoo plugin: @grouparoo/core/events
2020-06-01T09:12:27.617Z - notice: pid: 424
2020-06-01T09:12:27.618Z - notice: environment: development
2020-06-01T09:12:27.619Z - info: *** Starting Actionhero ***
2020-06-01T09:12:27.622Z - info: using path "/mnt/d/Project/grouparoo/apps/local-public/node_modules/@grouparoo/core/api/files/development" for Grouparoo file storage
2020-06-01T09:12:27.633Z - info: actionhero member 192.168.1.53 has joined the cluster
2020-06-01T09:12:27.689Z - notice: server ID: 192.168.1.53
2020-06-01T09:12:27.690Z - notice: *** Actionhero Started ***

When I open the browser to access http://localhost:3000, nothing happens and I get the message: This site can't be reached

Is there a step I am missing? Please help me correct it.

Thanks,
Cuong Tran

Aggregation for event profile property seems off

SELECT sum(CAST("value" AS FLOAT)) AS "value" FROM "eventData" AS "SequelizeEventData" INNER JOIN "events" AS "sequelizeEvent" ON "SequelizeEventData"."eventGuid" = "sequelizeEvent"."guid" AND "sequelizeEvent"."type" = 'itemAddedToCart' AND "sequelizeEvent"."profileGuid" = 'pro_0eb34610-3fd7-464f-866c-92f56ee595fe' WHERE "SequelizeEventData"."key" = 'price' GROUP BY "value"
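
One possible reading of why this looks off, which the issue itself does not confirm: the query groups by "value", and in PostgreSQL, for instance, an ambiguous GROUP BY name resolves to the input column rather than the output alias. That would give one sum per distinct price instead of a single total. The sketch below mirrors both behaviours on hypothetical price values.

```typescript
// Mirrors: SELECT sum(value) ... GROUP BY value
// Produces one sum per distinct value.
function sumGroupedByValue(values: number[]): Map<number, number> {
  const sums = new Map<number, number>();
  for (const value of values) {
    sums.set(value, (sums.get(value) ?? 0) + value);
  }
  return sums;
}

// Mirrors: SELECT sum(value) ... with no GROUP BY
// Produces a single total.
function sumAll(values: number[]): number {
  return values.reduce((total, value) => total + value, 0);
}

const prices = [10, 10, 5]; // hypothetical 'price' values for one profile

console.log(sumGroupedByValue(prices)); // two rows: 10 → 20 and 5 → 5
console.log(sumAll(prices)); // 25
```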

loading pattern needed to prevent confusion

In general the UI shows something while we are fetching data.
This can be confusing if it is sort of like an error case.

For example, the source/mapping tab shows the attached picture while it is fetching example data. It would be better if this said it was loading until there was an error (or it really didn't allow previews).

In general, we could look for other spots with the same possible confusion.
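
As a sketch of what a shared loading pattern might look like, the preview could carry an explicit status so the UI never shows the "no preview" message while a fetch is still in flight. The type and function names and the message wording are hypothetical, not taken from the Grouparoo UI.

```typescript
// Hypothetical state for any panel that fetches example data.
type PreviewState<T> =
  | { status: "loading" }
  | { status: "unavailable" } // previews really are not allowed here
  | { status: "error"; message: string }
  | { status: "ready"; data: T };

// Picks the text to show in place of the data; empty when data is ready.
function previewMessage<T>(state: PreviewState<T>): string {
  switch (state.status) {
    case "loading":
      return "Loading example data...";
    case "unavailable":
      return "Previews are not available for this source.";
    case "error":
      return `Could not load example data: ${state.message}`;
    case "ready":
      return "";
  }
}
```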

Table Sources should have way to sort

Let's say I have a table called Purchases where a user has many purchases with a date column (I made this in a test BigQuery database).

In the table property rule builder, I should be able to create a rule that says "the value of the name of the most recent purchase" -> Apple

In SQL terms, this is adding an "ORDER BY x DESC/ASC" to an exact query.

Note that it's not just the most recent date or something. It's another value when sorted by that date. It could also be "most expensive purchase" which would sort by price.
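
The idea can be illustrated in memory: sort by one column, then return a different column from the first row. This is only a sketch of the behaviour, not how a table source would build its query. The Purchase fields and the pickBySort helper are hypothetical.

```typescript
// Hypothetical row shape for the Purchases table.
interface Purchase {
  name: string;
  price: number;
  purchasedAt: string; // ISO date
}

// Sorts by sortKey in the given direction and returns value() of the first row,
// the in-memory analogue of adding "ORDER BY x DESC/ASC" and taking one row.
function pickBySort<T, V>(
  rows: T[],
  sortKey: (row: T) => number,
  direction: "ASC" | "DESC",
  value: (row: T) => V
): V | undefined {
  const sign = direction === "ASC" ? 1 : -1;
  const sorted = rows.slice().sort((a, b) => sign * (sortKey(a) - sortKey(b)));
  const first = sorted[0];
  return first === undefined ? undefined : value(first);
}

const purchases: Purchase[] = [
  { name: "Banana", price: 3, purchasedAt: "2020-05-01" },
  { name: "Apple", price: 1, purchasedAt: "2020-06-01" },
  { name: "Laptop", price: 900, purchasedAt: "2020-04-15" },
];

// Name of the most recent purchase.
console.log(pickBySort(purchases, (p) => Date.parse(p.purchasedAt), "DESC", (p) => p.name)); // Apple
// Name of the most expensive purchase.
console.log(pickBySort(purchases, (p) => p.price, "DESC", (p) => p.name)); // Laptop
```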
