COMmerce DAta GENerator will build a Commerce Cloud site import file tailored to your specification

License: BSD 3-Clause "New" or "Revised" License

comdagen

Comdagen (_Com_merce _Da_ta _Gen_erator) is a tool that will build you a site import file tailored to your specification.

Comdagen is developed primarily by Moritz Müller and Matti Bickel. Version 1.0 was written almost entirely by Oskar Jauch (@ojauch). The initial ideas came from Rene Schwietzke (@rschwietzke). Martin Klaus and Matti Bickel contributed to earlier versions.

Note: historically, this project has been known as undagen (Unified Data Generator). We want to be very specific about the type of data this project can produce: you'll get a full-fledged Commerce Cloud data set. Nothing more, nothing less.

Build

This tool uses Maven to build. From a command line, switch to your checkout directory and issue the package command:

    $ mvn package

The executable JAR file will be placed in target/comdagen-<VERSION>-SNAPSHOT.jar.

Usage

Execute comdagen by calling one of:

    $ java -jar target/comdagen-1.1-SNAPSHOT.jar --zip
    $ ~/comdagen/bin/comdagen.sh --zip

This will automatically generate a zip file containing site data in the output folder.

You can tweak how comdagen runs in two ways:

Command line parameters

  --config               : Use this config file to specify which sites to generate content for 
                           (default: $configDir/sites.yaml)
  --configDir            : Generate xml files for all configs in this dir (default: ./config)
  --output               : Generate xml files in this directory (default: ./output)
  --zip-output           : Where to put the final site import zip file (default: ./output/generated.zip)
  --zip                  : Whether to create a zip file at all (default: false)

If you give a configDir option but no config, the directory must contain a sites.yaml. All other configuration files will be read from the sites.yaml file (see the next section).
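The lookup rule described above can be sketched in a few lines of Python. This is only an illustration of the documented defaults, not comdagen's actual option handling; the function name `resolve_sites_config` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_sites_config(config: Optional[str] = None,
                         config_dir: str = "./config") -> Path:
    """Return the path of the sites config file.

    Hypothetical helper that mirrors the documented defaults:
    --config defaults to <configDir>/sites.yaml, --configDir to ./config.
    """
    if config is not None:
        # An explicit --config value takes precedence over the directory default.
        return Path(config)
    # Without --config, the config directory must contain a sites.yaml.
    return Path(config_dir) / "sites.yaml"


# No options given: fall back to ./config/sites.yaml.
print(resolve_sites_config())
# Only --configDir given: sites.yaml is expected inside that directory.
print(resolve_sites_config(config_dir="my-configs"))
# --config given: use that file directly.
print(resolve_sites_config(config="custom/sites.yaml"))
```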

Config files

The main config file is sites.yaml. This source comes with a default one in the config directory. Please see its comments for more information.

By removing the respective Config entries from the site entry in sites.yaml you can avoid generating data you don't need. For example:

sitesConfig:
  sites:
    - regions: [Generic, German, Chinese]
      currencies: [USD, EUR, CNY]
      outputFilePattern: "site.xml"
      outputDir: "sites"
      customerConfig: "customers.yaml"
      pricebookConfig: "pricebooks.yaml"
      catalogConfig: "catalogs.yaml"

This will generate customers, pricebooks and the corresponding catalog.

On the other hand, if you are after a specific element, consider:

sitesConfig:
  sites:
    - regions: [Generic, German, Chinese]
      currencies: [USD, EUR, CNY]
      outputFilePattern: "site.xml"
      outputDir: "sites"
      customerConfig: "customers.yaml"
#      pricebookConfig: "pricebooks.yaml"
#      catalogConfig: "catalogs.yaml"

Commenting out the pricebook and catalog config entries results in only a site XML file and a customer XML file being generated. You can find the output in the output directory.
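To make the rule concrete, here is a small Python sketch of how the presence of a Config entry decides what gets generated. The key names come from the sites.yaml examples above; the mapping, the function `enabled_outputs` and the plain dict standing in for a parsed site entry are assumptions made for illustration and do not reflect comdagen's internals.

```python
# Config keys as they appear in sites.yaml, mapped to a label for the
# data they enable. The labels are illustrative only.
OPTIONAL_CONFIGS = {
    "customerConfig": "customers",
    "pricebookConfig": "pricebooks",
    "catalogConfig": "catalogs",
}


def enabled_outputs(site: dict) -> list:
    """List what would be generated for one parsed site entry."""
    outputs = ["site"]  # the site file itself is always produced
    for key, label in OPTIONAL_CONFIGS.items():
        if key in site:  # a removed or commented-out entry is simply absent
            outputs.append(label)
    return outputs


# Equivalent of the second example: pricebook and catalog entries commented out.
site = {
    "regions": ["Generic", "German", "Chinese"],
    "currencies": ["USD", "EUR", "CNY"],
    "outputFilePattern": "site.xml",
    "outputDir": "sites",
    "customerConfig": "customers.yaml",
}
print(enabled_outputs(site))  # ['site', 'customers']
```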

Open and edit the respective config files to adjust the generation result. Two entries are present everywhere:

  • the initialSeed attribute controls the generated content - comdagen will generate the same content as long as the seed stays the same

  • the elementCount attribute controls how many elements get generated. Note that if you only increase this value and leave initialSeed unchanged, the elements that were generated before stay the same and the new ones are appended after them.
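The interplay of the two attributes can be illustrated with any seeded pseudo-random generator. The Python sketch below uses the standard library's `random.Random` as a stand-in; comdagen's own generator is different, and `generate_ids` is a hypothetical function used only to show the behaviour.

```python
import random


def generate_ids(initial_seed: int, element_count: int) -> list:
    """Generate element_count pseudo-random IDs from a fixed seed."""
    rng = random.Random(initial_seed)  # same seed, same sequence
    return [rng.randrange(10**9) for _ in range(element_count)]


small = generate_ids(initial_seed=1234, element_count=3)
large = generate_ids(initial_seed=1234, element_count=5)

# Re-running with the same seed and count reproduces the same content.
assert generate_ids(1234, 3) == small
# Raising the count keeps the first three elements and appends two new ones.
assert large[:3] == small
# A different seed gives different content.
print(generate_ids(4321, 3) != small)
```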

Data sources

Comdagen uses data files to pick realistic names, cities or zip codes for the generated data sets. You can find those files in your checkout under src/main/resources/contentfiles. They are not meant to provide accurate data. The main goal is to get the character and word distribution right for our internal localized indexers.

We used data from the following web-sites to compile our files:

For the "generic" locale, we use randomly generated words/sentences.

Wiktionary sites are subject to CC-BY-SA. The Chinese name list is by Chih-Hao Tsai and is used with the author's permission.

For longer text excerpts we use snippets from copyright-free books downloaded from Project Gutenberg. The headers are left intact, so you can see which books are used.

comdagen's Issues

Test cases are data file dependent

I changed data files such as books* to reduce data variance, and the test cases started to fail after that. Please make the test cases independent of the base data files.

Use more meaningful ID for a site

Sites typically have IDs such as FoobarUS. The tool currently creates a site ID of 1. Because the ID is also part of the site's URL, this is a little off from reality.

Generate minimum and maximum value of range

This is a historic proposal from @ojauch:
When generating product sets, the user defines a range for how many products should be part of each product set. For example:

productSets:
    # number of product sets to generate
    elementCount: 10

    # define range of products per product set
    minSetProducts: 3
    maxSetProducts: 8

In this configuration the user configured that 10 product sets should get generated with 3 to 8 products each. In the current implementation it is not guaranteed that comdagen generates at least one product set with the minimum number of products and one with the maximum number of products.

Should this be changed so that the result of the product set generation is more predictable? The same applies to product bundles, variation products, etc.
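One possible way to implement the proposal is sketched below in Python. It is not what comdagen does today: the function `set_sizes` and its parameters are hypothetical, and Python's `random.Random` stands in for comdagen's generator. The sizes are drawn from the range as before, then two distinct sets are overwritten with the minimum and the maximum.

```python
import random


def set_sizes(seed: int, element_count: int, min_size: int, max_size: int) -> list:
    """Draw a size per product set; guarantee min and max each occur once."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    rng = random.Random(seed)
    # Plain draw: every size lies in [min_size, max_size], but neither
    # bound is guaranteed to appear.
    sizes = [rng.randint(min_size, max_size) for _ in range(element_count)]
    # With at least two sets and a non-trivial range, force both bounds
    # onto two distinct, randomly chosen sets.
    if element_count >= 2 and min_size != max_size:
        i, j = rng.sample(range(element_count), 2)
        sizes[i] = min_size
        sizes[j] = max_size
    return sizes


# Mirrors the example config: 10 product sets with 3 to 8 products each.
sizes = set_sizes(seed=42, element_count=10, min_size=3, max_size=8)
print(min(sizes), max(sizes))  # 3 8
```

The trade-off is that the bounds become slightly over-represented, and with a single product set only the range itself can be guaranteed.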

Name catalogs

Please use more meaningful catalog IDs so that it is easier to tell the master catalog from the navigation catalog. Right now we only get long numbers, likely the hash or seed.

Include ContentAsset on Homepage so it's not completely empty

Currently, on SG sites, the Homepage consists of a search bar and a footer. The main display is via some content assets that are normally part of the initial site import.
Because Comdagen doesn't include content assets, the Homepage looks broken, even if it isn't, technically.

To fix this perception, we'll include a single content asset that states that this site is using Undagen generated data (a kind of "Hello World" message).

If we want to get extra fancy, include information on the configuration, comdagen version number and seed used to generate this particular site import.
