mozreport's Introduction

mozreport

mozreport is a CLI tool that helps streamline the process of preparing an experiment report.

Installing mozreport

Homebrew users can install mozreport with brew install mozilla/mozreport/mozreport.

You can also install mozreport using pip3 install mozreport. (You might consider using pipsi.)

Mozreport requires Python 3.6+.

Using mozreport

$ mozreport --help

Usage: mozreport [OPTIONS] COMMAND [ARGS]...

  Mozreport helps you write experiment reports.

  The workflow looks like:

  * `mozreport setup` the first time you use Mozreport

  * `mozreport new` to declare a new experiment and generate an analysis
  script

  * `mozreport submit` to run an analysis script on Databricks

  * `mozreport fetch` to download the result

  * `mozreport report` to set up a report template

  The local configuration directory is /Users/tsmith/Library/Application Support/mozreport.

What's a template?

A report template is any collection of code that operates on a file named summary.sqlite3 in the current working directory, and renders a report. To add a template, add a folder to the mozreport/templates folder in this repository, or the templates folder inside your local configuration directory (see the bottom of mozreport --help).

You may wish to adopt the convention of including a script named build.py that performs the necessary steps to render the report.
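
As a rough illustration of that convention, here is a minimal sketch of what a template's build.py could look like. The layout of summary.sqlite3 is not described here, so the sketch assumes nothing about it: it only lists the tables it finds and counts their rows. The function names and the report.html output file are hypothetical, not part of mozreport.

```python
"""build.py: a hypothetical template build script (illustrative only)."""
import html
import sqlite3
from pathlib import Path


def summarize_tables(db_path):
    """Return (table_name, row_count) pairs for every table in the database."""
    conn = sqlite3.connect(str(db_path))
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        summary = []
        for name in names:
            # Quote the identifier; the names come from the database itself.
            quoted = '"{}"'.format(name.replace('"', '""'))
            count = conn.execute("SELECT COUNT(*) FROM {}".format(quoted)).fetchone()[0]
            summary.append((name, count))
        return summary
    finally:
        conn.close()


def render_report(db_path="summary.sqlite3", out_path="report.html"):
    """Write a tiny HTML report (assumed output format) and return its path."""
    items = "\n".join(
        "<li>{}: {} rows</li>".format(html.escape(name), count)
        for name, count in summarize_tables(db_path)
    )
    Path(out_path).write_text(
        "<h1>Experiment report</h1>\n<ul>\n{}\n</ul>\n".format(items)
    )
    return out_path


if __name__ == "__main__":
    # Templates operate on summary.sqlite3 in the current working directory.
    if Path("summary.sqlite3").exists():
        print("Wrote", render_report())
    else:
        print("summary.sqlite3 not found in the current directory")
```

A real template would replace the table listing with whatever queries and plots the report needs; the only fixed points taken from the text above are the input file name and the build.py entry point.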

Hacking on mozreport

To run unit tests only:

tox -- -m "not integration"

To run all tests, including integration tests that hit our live Databricks account:

  • Run mozreport setup once
  • tox

mozreport's People

Contributors

mozilla-github-standards, robhudson, tdsmith

mozreport's Issues

Write an example ETL notebook

We should generate ETL notebooks instead of flat scripts. We should decide what the basic ETL notebook should look like. (It'll probably look like the ETL script, except values will be baked in, instead of passed as arguments.)

Write a README

  • add travis swag
  • document running mozreport setup to run tests locally
  • link to design documents

Wiki changes

FYI: The following changes were made to this repository's wiki:

These changes were made as the result of a recent automated defacement of publicly writable wikis.

Make it easier to test an analysis script

Ben mentioned that it's hard to revise the ETL script iteratively; submitting to Databricks and getting an opaque error back isn't as nice as hitting enter in a notebook and getting an exception on the screen.

One possibility is to implement the script in a way that facilitates copypasta debugging, where you can iterate on a thing in a notebook until it works, and then save it to the script.

Accept a friendly cluster name instead of a slug

mozreport submit accepts a --cluster_slug argument that allows you to specify which cluster to submit a job to, like 1003-151000-grebe23. We should add a --cluster_name argument that supports looking clusters up by name, like shared_serverless.
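
One possible shape for the lookup, as a sketch only: it assumes the cluster list has already been fetched and is a list of dicts with cluster_id and cluster_name keys (the shape of entries in the Databricks cluster listing). The function name resolve_cluster_id is hypothetical, and pairing the example slug with the example name is purely illustrative.

```python
def resolve_cluster_id(clusters, name):
    """Return the ID of the single cluster whose name matches exactly."""
    matches = [c["cluster_id"] for c in clusters if c.get("cluster_name") == name]
    if not matches:
        raise LookupError("No cluster named {!r}".format(name))
    if len(matches) > 1:
        # Names may not be unique, so refuse to guess between candidates.
        raise LookupError(
            "Cluster name {!r} is ambiguous: {}".format(name, ", ".join(matches))
        )
    return matches[0]


# Illustrative data only; the pairing of this slug and name is an assumption.
example_clusters = [
    {"cluster_id": "1003-151000-grebe23", "cluster_name": "shared_serverless"},
    {"cluster_id": "0000-000000-dummy1", "cluster_name": "scratch"},
]
print(resolve_cluster_id(example_clusters, "shared_serverless"))
```

Failing loudly on ambiguous names keeps a job from being submitted to an unintended cluster.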

Get a better versioning story

  • The version should be single-sourced / shouldn't live in setup.py
  • The version and git revision should be present in all generated files
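
For the second point, a minimal sketch of stamping generated files might look like the following. The header wording and the helper names are hypothetical. Note that git_revision reports the revision of whatever repository the current directory belongs to, which is only meaningful when mozreport is run from a source checkout.

```python
import subprocess


def git_revision(default="unknown"):
    """Return the short git revision of the current checkout, or a default."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], stderr=subprocess.DEVNULL
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, or we are not inside a repository.
        return default
    return out.decode("ascii").strip() or default


def stamp(text, version, revision):
    """Prepend a one-line provenance comment to a generated Python file."""
    header = "# Generated by mozreport {} (git revision {})\n".format(version, revision)
    return header + text


print(stamp("print('hello')\n", "0.0.0", git_revision()), end="")
```

The version argument would come from wherever the single-sourced version ends up living; the sketch takes no position on that.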

CODE_OF_CONDUCT.md file missing

As of January 1 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:

  1. Required Text - All text under the headings Community Participation Guidelines and How to Report is required and should not be altered.
  2. Optional Text - The Project Specific Etiquette heading provides a space to speak more specifically about ways people can work effectively and inclusively together. Some examples of those can be found on the Firefox Debugger project, and Common Voice. (The optional part is commented out in the raw template file, and will not be visible until you modify and uncomment that part.)

If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].

(Message COC001)

Handle file-exists errors in `report`

When we run mozreport report, emplace throws a FileExistsError if it would overwrite files.

Because the CLI doesn't know a priori which files would be overwritten, it can't just check whether the files exist before doing the copy.

What we can do:

  • try running emplace
  • catch the FileExistsError
  • prompt the user whether they want to overwrite their template or not
  • if they do, pass an overwrite=True argument to emplace
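
The steps above can be sketched as follows. The real signature of emplace is not shown in this issue, so the sketch takes it as a parameter and assumes it accepts a template, a destination, and the proposed overwrite keyword. confirm is any callable that asks a yes/no question and returns a bool (for example click.confirm, if the CLI is built on click, which is an assumption). The wrapper name emplace_with_prompt is hypothetical.

```python
def emplace_with_prompt(emplace, template, destination, confirm):
    """Try a non-destructive copy first; ask before overwriting.

    Returns True if the template was written, False if the user declined.
    """
    try:
        emplace(template, destination)
    except FileExistsError as exc:
        target = exc.filename or "A file"
        if not confirm("{} already exists. Overwrite?".format(target)):
            return False
        # The user agreed, so retry with the proposed overwrite flag.
        emplace(template, destination, overwrite=True)
    return True
```

Passing emplace and confirm in as arguments keeps the wrapper easy to unit test with fakes, without touching the filesystem or a terminal.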

Support different ETL templates?

@rjweiss suggested that mozreport should be able to support performing different kinds of analyses out of the box.

What are some concrete examples of these, and how do we snap these in to the UI?

mozreport.experiment.generate_etl_script should generate an ETL notebook instead of an ETL script

mozreport.experiment.generate_etl_script should generate an ETL notebook instead of an ETL script.

  • Decide what the ETL notebook should look like: #52
  • Decide how to represent the ETL notebook in the repository: as a .ipynb file directly? As a Python source file with blocks separated by ###? We can use nbformat to help us write a notebook on the fly if we prefer. The latter sounds easier to look at in our repository.

For a Python script, we could pass values into the script "on the command line" when we run it in Databricks. That won't work for the notebook; we'll have to embed certain values (e.g. the experiment slug) into the notebook document we upload to Databricks. We can do that by embedding special tags (e.g. _MOZREPORT_EXPERIMENT_SLUG_) in the template and searching for and replacing them when we generate the notebook.
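
A minimal sketch of the tag substitution, combined with the ###-separated source layout mentioned above, could look like this. The function names, the exact tag grammar (_MOZREPORT_ followed by an upper-case name and a trailing underscore), and the choice to insert values as Python literals via repr are all assumptions for illustration.

```python
import re

# Matches tags such as _MOZREPORT_EXPERIMENT_SLUG_ and captures EXPERIMENT_SLUG.
TAG = re.compile(r"_MOZREPORT_([A-Z0-9]+(?:_[A-Z0-9]+)*)_")


def render_template(template, values):
    """Replace every tag with the repr() of its value; fail on unknown tags."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("No value supplied for tag {}".format(match.group(0)))
        # repr() yields a valid Python literal, so quoting is handled for us.
        return repr(values[name])

    return TAG.sub(substitute, template)


def split_cells(source, separator="###"):
    """Split a source file into notebook cell bodies on separator-only lines."""
    cells, current = [], []
    for line in source.splitlines():
        if line.strip() == separator:
            cells.append("\n".join(current))
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current))
    return [cell for cell in cells if cell.strip()]


template = "slug = _MOZREPORT_EXPERIMENT_SLUG_\n###\nprint(slug)\n"
rendered = render_template(template, {"EXPERIMENT_SLUG": "example-slug"})
print(split_cells(rendered))
```

Each resulting cell body could then be wrapped with nbformat (for instance nbformat.v4.new_code_cell and nbformat.v4.new_notebook) to produce the document uploaded to Databricks. Raising on an unknown tag avoids uploading a notebook that still contains a placeholder.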

Blocks #50.
