github / scientist

7.4K 418.0 438.0 226 KB

:microscope: A Ruby library for carefully refactoring critical paths.

License: MIT License

Ruby 98.02% Shell 1.98%

refactoring ruby scientist rubygem

scientist's Issues

What do you call the high-level concept of scientist?

Scientist is fantastic for Ruby development, but how would I find a similar library for other languages? I'm looking to do this in PHP and JS.

Is there a name for this technique or pattern of development? Having a control and a candidate and running both under load, to analyze the results - this is a pattern I have also used, but without good terminology to quickly communicate to others.

The Control / Candidate Pattern?
Candidate Driven Development?

If there is a common name for this, it would be good to add it to the README.
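For readers comparing ports in other languages, the control/candidate idea described above can be sketched in a few lines of plain Ruby. This is a minimal illustration, not Scientist's implementation: `run_experiment` and its `publish` callback are hypothetical names introduced only for this example.

```ruby
# Hypothetical helper (not part of Scientist): runs both code paths,
# reports whether they matched, and always returns the control's value.
def run_experiment(name, control:, candidate:, publish:)
  control_value = control.call
  candidate_value =
    begin
      candidate.call
    rescue StandardError => e
      e # a raising candidate is recorded, never propagated to the caller
    end
  publish.call({ name: name,
                 matched: control_value == candidate_value,
                 control: control_value,
                 candidate: candidate_value })
  control_value
end

reports = []
result = run_experiment(
  "sum",
  control:   -> { [1, 2, 3].reduce(0) { |acc, n| acc + n } },
  candidate: -> { [1, 2, 3].sum },
  publish:   ->(report) { reports << report }
)
```

The caller only ever sees the control's result; the comparison is a side channel that can be logged or graphed.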

Support error comparison

Performance impact

Hello and thanks y'all for all the awesome work you do.

I wanted to share something I started looking into after @brntbeer pointed out Scientist's usage, during the last workshop here in Stockholm.

During the presentation, I started writing an Erlang port[0] and, assuming Scientist ran the candidate and control concurrently, made the port do exactly that.

But wanting to make sure, and after a quick github-search+grep for fork and other rubyisms for spawning processes, I ran a test: https://github.com/ostera/scientest/tree/master/test.rb

It's a very naive test, but the results are less than promising:

Running 3 tests with factor 100...
{:name=>:candidate, :time=>6.157206773757935}
{:name=>:candidate_no_try, :time=>0.7435848712921143}
{:name=>:control, :time=>0.7296631336212158}
>> Fastest run:
{:name=>:control, :time=>0.7296631336212158}
done.

The version that runs the try block is unreasonably slower on every single run.

Maybe around here we can fork off the candidates and results publishing and immediately return the value of the control (or raise the exception if it has one)? I'll give it a shot tomorrow morning and see what I can come up with :)

[0] I'm aware of fuge, but what better way to learn than by hacking

Technical changes to secure protocol

CHECK

#131

Add diffing support to results

The internal GitHub developer tooling includes custom views of published mismatch data that calculate diffs of the values. It could be helpful to include that diffing support directly via Diff::LCS, generating diffs of pretty-inspected (and cleaned) values.

Ref: #84

A

Question: Is the Ruby 2.1 requirement required?

I tried installing the gem on ruby 1.9.3 and the gemspec says it requires 2.1. Is this something that could be relaxed?

Some doubts about scientist

Hi Guys,

Sorry if this isn't the best place for questions. I read the documentation but did not find answers to the following:

How does scientist decide when to execute the try block?
How often is the try block executed?
Does scientist always compare the try block with the use block?

Thanks

Updated changelog

Hi, I noticed the changelog isn't updated for the last releases. Is that just an oversight? Or did nothing meaningful change?

Thanks very much for contributing! Your pull request has been merged 🎉 You should see your changes appear on the site in approximately 24 hours.

Originally posted by @github-actions[bot] in github/docs#1697 (comment)

Support timeouts for candidates

For our use case, we're using Scientist to replace a portion of our application with a separate service. A problem we encountered is that the service we built died, so in the candidate portion of our experiments the network call just waited and eventually timed out, but the page sat for a long time until that point (which is our problem, not a concern of scientist).

So a cool feature would be supporting timeouts in candidates. Thoughts?
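Until something like this exists in the library, one possible user-side approach is to bound the candidate with Ruby's standard `Timeout` module. The sketch below is an assumption-laden illustration: `with_candidate_timeout` is a hypothetical helper, and the `sleep` stands in for a slow network call.

```ruby
require "timeout"

# Hypothetical helper: runs the block and raises Timeout::Error
# if it takes longer than `seconds`.
def with_candidate_timeout(seconds)
  Timeout.timeout(seconds) { yield }
end

# Simulate a candidate whose backing service never answers in time.
outcome =
  begin
    with_candidate_timeout(0.05) { sleep 1; :finished }
  rescue Timeout::Error => e
    e
  end
```

Inside an experiment this would presumably be used as `e.try { with_candidate_timeout(0.5) { call_new_service } }`, where `call_new_service` is a placeholder. The page would still wait up to the timeout, because the candidate runs in the same request. `Timeout.timeout` can also interrupt code at arbitrary points, so a client library's own timeout options are usually the safer choice when available.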

Version bump?

Looks like it's been almost a year since 1.0.0, and there are some changes (especially #58) that would be nice to get into a new release. Obviously I can just use GH master if a release is not feasible at this time.

cheers

Result mismatched even though the result is the same

Hi guys, I am trying to compare the SQL queries (using to_sql) of the candidate and control, but the result is reported as mismatched. When I compare them manually, they match. Any clue about this problem?

I really need help, thanks.

Ruby 2.1 compatibility still mentioned in the readme

95c4bef removes the dependency on Ruby 2.1, right? Could you update that documented requirement at the bottom of the readme? Is it now Ruby 1.9 compatible?

Support incremental rollouts

Possibly make these (optionally) configurable via zookeeper

Exceptions in compare block shouldn't break normal execution

The great thing about Scientist is being able to test out new code without breaking your existing functionality. However, I recently discovered that if you have a custom compare block, unhandled errors in that block will get raised up. I feel like comparing your control and candidate should also not break the normal execution if something goes wrong (i.e. you want your experiment to have as little impact as possible). I think this could be fixed by updating these lines to:

if comparator
  begin
    comparator.call(value, other.value)
  rescue StandardError
    false
  end
else
  ....

I don't have time to open a PR for this right now, but was wondering if this was a good addition or not.
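As a user-side stopgap, the same idea can be applied by wrapping the comparator itself. This is only a sketch: `safe_comparator` is a hypothetical helper, and treating a raising comparison as a mismatch is one possible policy, mirroring the `false` in the snippet above.

```ruby
# Hypothetical wrapper: turns any comparator block into one that
# reports a mismatch (false) instead of raising.
def safe_comparator(&comparator)
  lambda do |control, candidate|
    begin
      comparator.call(control, candidate)
    rescue StandardError
      false
    end
  end
end

by_id = safe_comparator { |a, b| a.fetch(:id) == b.fetch(:id) }

same    = by_id.call({ id: 1 }, { id: 1 }) # comparison succeeds
missing = by_id.call({ id: 1 }, {})        # KeyError is rescued
```

The resulting lambda could then be handed to the experiment with `e.compare(&by_id)`, assuming the experiment calls the comparator with the two observed values.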

Scientist for PHP

Hi Github! Thanks for everything that you do for us!

Scientist looks amazing. Unfortunately, I'm somewhat attached to PHP still.

Don't worry though, I decided to port Scientist to PHP, and here it is:

https://github.com/daylerees/scientist

It's fairly similar, apart from the obvious quirks that arise from a difference in language.

I'd love it if you'd consider linking it as an alternative from your readme, and if not.. well.. thanks for the library anyway!

Have a great weekend.

Dayle.

How to get `Scientist.run` to run a candidate block?

Hi! I just tried to set up a quick experiment using the new Scientist.run block, but noticed that the try candidate block wasn't executing at all. I dug into the code a bit and noticed that the default experiment has been changed so that it doesn't run try paths anymore (as of #74)

This was quite confusing, and I only realized it because I wrote a paranoid RSpec example to make sure the try block was being called. It seems to me that #79 and #74 are incompatible changes, since I don't see a way to override enabled? for the experiment you get from running Scientist.run.

Thoughts? Is there a way to get an experiment running with Scientist.run that I'm missing?

maintenance

Scientists

Ideas for backwards-compatible solutions (none of them particularly clean):

ignore could pass more arguments to the block, which is backwards-compatible assuming clients use a proc.
Add a new method ignore2 which receives observation objects instead of values.
Pass a custom nil-value as observation value to ignore if an exception is raised. This special nil-value should expose the exception.

Support different return types and custom comparators between objects of different types

What is the current behavior?

Some tables do not appear to render correctly on some translated pages.
ex: https://docs.github.com/ja/github/setting-up-and-managing-organizations-and-teams/repository-permission-levels-for-an-organization
The old commit ( 01b5d89 ) seems to have no problem.

at en ( no problem)

at ja

at cn

What changes are you suggesting?

Add the missing trailing | and fix the {% endif %} position (s/**X** \n/**X** |{% endif %}\n/).

ng (now)
| 公開済みリリースの表示                                                                                                                                                                                                         | **X** | **X**  | **X** |  **X**   |                                              **X** |{% if currentVersion == "free-pro-team@latest" %}
| [[GitHub Actions workflow runs](/actions/automating-your-workflow-with-github-actions/managing-a-workflow-run)] の表示                                                                                                 | **X** | **X**  | **X** |  **X**   |                                                                 **X** 
{% endif %}
(maybe) ok
| 公開済みリリースの表示                                                                                                                                                                                                         | **X** | **X**  | **X** |  **X**   |                                              **X** |{% if currentVersion == "free-pro-team@latest" %}
| [[GitHub Actions workflow runs](/actions/automating-your-workflow-with-github-actions/managing-a-workflow-run)] の表示                                                                                                 | **X** | **X**  | **X** |  **X**   |                                                                 **X** |{% endif %}
Fixed rendering sample ( my local env )

(I thought about making a fix pull request, but the process for fixing translated docs does not seem simple 🙄 )

Additional information
Originally posted by @hainusii in github/docs#4480 (comment)

Originally posted by @hainusii in https://github.com/github/opensource.guide/issues/2269#issuecomment-799700420

PR blitz

I'll be giving a talk on Science at Big Ruby in a couple of weeks. @jbarnette mentioned there might be something new to share around that time.

Let me know if there's any particular content you'd like me to share in that talk. I'm pretty much just taking the README for the old project and giving it a candy coating of GitHub Flow.

Create an alternate interface for publishing results

In the making science useful section of the README, the following code is used to override the Scientist::Experiment#new method:

# replace `Scientist::Default` as the default implementation
module Scientist::Experiment
  def self.new(name)
    MyExperiment.new(name: name)
  end
end

This makes it possible for the experimenter to publish results, and without this change scientist isn't as useful.

I don't understand the inner workings of this gem, but it would be fantastic to have the ability to customize my experiment without having to override the class. Is it possible to create a class level method in Scientist::Experiment which turns on the experiment? something like create_this_experiment with: :name?

😬 I know this is a big ask, and I would be glad to help with developing a feature like this.

Allow setting raise_on_mismatch per Experiment

It would be really useful for a certain project I'm working on to be able to set the raise_on_mismatch setting at the Experiment level, instead of a class level attribute. Something like:

science("my-experiment") do |e|
  e.raise_on_mismatches = true
  e.control {}
  e.candidate {}
end

What do you think?

Update a little sir

Looks like it's been almost a year since a 1.0.0, and there are some changes that would be nice to get into a new release. Obviously I can just use GH master if a release is not feasible at this time.

cheers

stop makeing github and gooooooooooooooooooooooooooooooooooooooooooooooooo

hello

[email protected]

Build is failing

It seems that the gems required for running the tests are not installed.
No idea why though.

[Feature request] Allow `Scientist::Experiment` classes to not be default Scientist experiment class

I'm wondering what the original inspiration was for making any class that includes the module the default experiment (code here). Is it possible to create a class that includes the Scientist::Experiment module and not make it the default scientist experiment?

Looking here it doesn't seem like it is possible.

From docs as well:

When Scientist::Experiment is included in a class, it automatically sets it as the default implementation via Scientist::Experiment.set_default. This set_default call is skipped if you include Scientist::Experiment in a module.

The reason I ask is because recently we ran into an issue with our Rails app:

We had a class that was running Scientist via include Scientist with the default experiment, but was not publishing anything, oops.
We created a new class that included Scientist::Experiment, but since our tests don't eager load our classes, the problem only showed up as a flaky test and was not surfaced during our build.
When running in production, we eager loaded and overrode the default experiment and then surfaced issues.

So a few questions I was wondering:

Why default experiment, this seems a little aggressive IMO?
Can we actually find a way to either make default the normal behavior and have a way to pass a flag to override it?
Or is it that the way we were using it was wrong, and that the intention is to actually just include Scientist and override the default methods for that class? If so, the only reason I held back from doing so is better separation of responsibilities. It was going to look a bit messy to clog up one class's specs with experiment specs and I'd rather separate the two easily 🤔
Relating to point 3, is it the case that by the current implementation we can't be running multiple experiments at once?

I looked into it a bit myself, but it doesn't seem like there is a reasonably clean way. Although I think we can add a class method (something like):

class Foo
  include Scientist::Experiment

  default_scientist_experiment(false) # new
end

What do y'all think? If so - I can try to take a stab at it.

Per-experiment sampling?

When using the science method in the Scientist module, I want to supply an extra parameter that would be used inside the enabled? method to determine whether to run the experiment.

The intention is to be able to control, on a per-experiment basis, the percentage of time that the experiment will actually run.

Is something like this possible?
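One way this could look, sketched in plain Ruby: the experiment object carries its own sampling percentage and consults it in `enabled?`. `SampledExperiment`, `sample_percent`, and the injectable `random` argument are hypothetical names for illustration; a real class would include Scientist::Experiment and define `publish` as well.

```ruby
# Hypothetical stand-in for a class that would include Scientist::Experiment.
class SampledExperiment
  attr_reader :name

  def initialize(name, sample_percent:, random: Random.new)
    @name = name
    @sample_percent = sample_percent
    @random = random
  end

  # True for roughly `sample_percent` percent of calls:
  # rand(100) yields an integer from 0 to 99.
  def enabled?
    @random.rand(100) < @sample_percent
  end
end

never  = SampledExperiment.new("never",  sample_percent: 0)
always = SampledExperiment.new("always", sample_percent: 100)
```

For gating that depends on the call site rather than the experiment class, the README's `run_if` block may already be enough.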

Specifying order for how the experiment runs

I have a use case where I'd like the experiment's blocks to run in the order they are written, rather than in random order.

Would this be something that would be welcome as a PR if I created this feature?

Thanks!

Scientist::Experiment.new creates Scientist::Default instance even when it should have been overridden

Hi friends! Thank you for your great work with this gem ✨

I am having a problem in Rails apps where Scientist::Experiment defaults to the original Default object until the custom one is called - leading to some head-scratching about why the try block is not running even when enabled? is set to true.

I'm unsure whether this is a problem with the Rails load order because of how I've arranged my files, whether the examples in the README could be a little better, or whether there's genuinely a bug here.

$ bundle exec rails console
Loading development environment (Rails 5.1.6)
irb(main):001:0> Scientist::Experiment.new "something"
=> #<Scientist::Default:0x00007fd667bbcd40 @name="something">
irb(main):002:0> LdapExperiment.new(name: "something")
=> #<LdapExperiment:0x00007fd667b6ee10 @name="something">
irb(main):003:0> Scientist::Experiment.new "something"
=> #<LdapExperiment:0x00007fd667b34968 @name="something">

I've followed the instructions in the README, which are delightful and comprehensive.

# app/experiments/ldap_experiment.rb
require "scientist/experiment"

class LdapExperiment
  include Scientist::Experiment

  attr_accessor :name

  def initialize(name:)
    @name = name
  end

  def enabled?
    # ...
  end

  def publish(result)
    # ...
  end
end

module Scientist::Experiment
  def self.new(name)
    LdapExperiment.new(name: name)
  end
end

# app/models/whatever.rb
class Whatever
  include Scientist
  def do_something
    science "role lookup" do |e|
      e.use { do_one_thing }
      e.try { do_some_other_thing }
    end
  end
end

This is occurring in Rails 3.2.x and Rails 5.1.x applications, with version 1.2.0 of the gem.
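One plausible reading of the console session above is a load-order effect: an override placed in a lazily loaded file cannot take effect until that file has been loaded. The plain-Ruby sketch below shows the same shape with an `included` hook. `MiniExperiment` and `CustomExperiment` are hypothetical names, and the hook only imitates the documented "including the module in a class sets the default" behavior.

```ruby
# Hypothetical miniature of a module that registers a default
# implementation when it is included in a class.
module MiniExperiment
  class Default; end

  class << self
    attr_writer :default_class

    def default_class
      @default_class || Default
    end
  end

  def self.included(base)
    # Modules are skipped; only classes become the default.
    self.default_class = base if base.is_a?(Class)
  end
end

before = MiniExperiment.default_class # still the built-in Default

# Until this class body is evaluated (for example, by an autoloader
# on first reference), nothing has replaced the default.
class CustomExperiment
  include MiniExperiment
end

after = MiniExperiment.default_class # now CustomExperiment
```

If this is the cause, explicitly requiring the experiment file from an initializer would be one way to make the override take effect at boot; that is an assumption about the app's setup, not a confirmed fix.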

Thanks for opening this pull request! A GitHub docs team member should be by to give feedback soon. In the meantime, please check out the [contributing guidelines](https://docs.github.com/en/contributing).

Originally posted by @WELCOME[bot] in github/docs#33590 (comment)

Allow setting raise_on_mismatches to base class level for tests

We use a base experiment class to manage our publish logic, and all the actual experiments extend this base class. However, setting raise_on_mismatches on the base class has no effect on the subclasses, because the setting is stored in a class instance variable (rather than a class variable). As a result, people occasionally forget to enable this test helper for their newly created experiments.

It would be great if we could manage this test helper via a base class. To elaborate, this is roughly our structure:

class BaseExperiment
  include Scientist::Experiment

  def initialize
    # setting up some instance variables mainly needed for publish logic
  end

  def publish(result)
    # some custom logic
  end
end

class WidgetExperiment < BaseExperiment
  def initialize
    # setting up custom variables
    super
  end

  def enabled?
    # custom enabled logic
  end
end

class AnotherExperiment < BaseExperiment
  # similar context with widget experiment
end

What we want is to use BaseExperiment.raise_on_mismatches = true to ensure all child experiments are checked in tests without needing a separate test setup. Would such a need make sense to you?
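The underlying Ruby behavior can be reproduced without Scientist. In the sketch below, `strict` is a hypothetical stand-in for raise_on_mismatches, and the fallback lookup in `InheritingBase` is just one possible way to get inheritance, not the library's code.

```ruby
# A setting stored in a class-level instance variable is per class.
class Base
  class << self
    attr_accessor :strict
  end
end

class Child < Base; end

Base.strict = true
base_value  = Base.strict  # true
child_value = Child.strict # nil: Child has its own, unset @strict

# One possible workaround: fall back to the superclass when unset.
class InheritingBase
  class << self
    attr_writer :strict

    def strict
      return @strict if instance_variable_defined?(:@strict)
      superclass.respond_to?(:strict) ? superclass.strict : false
    end
  end
end

class InheritingChild < InheritingBase; end

InheritingBase.strict = true
inherited_value = InheritingChild.strict # true, found on the parent
```

A subclass can still override the inherited value by assigning its own `strict`, which keeps per-experiment control.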

No way to compare exceptions

If the control path and the candidate both raise an exception, there isn't a way to compare them today. The comparison is currently hardcoded here. Would there be any interest in accepting a patch that added the equivalent* method to experiments?

science "widget-permissions" do |e|
  e.use { raise "Foo" }
  e.try { raise "Bar" }

  # Only called when both paths raise an exception
  e.equivalent do |control_exp, candidate_exp|
    # Ignore the message contents
    control_exp.class == candidate_exp.class
  end
end

I'm not set on the name, please suggest anything better.

All exceptions — including signals — are caught during an observation

observation.rb does this:

def initialize(name, experiment, &block)
  ...

  begin
    @value = block.call
  rescue Object => e
    @exception = e
  end

  ...
end

Which is a well documented anti-pattern. In particular, it means that any signals (eg. <SignalException: SIGTERM>) will be treated as-if they were just an error in the candidate code (which usually means logging and ignoring).

Note: There's no difference between rescue Object and rescue Exception because raising a non-exception (eg. a string) will either raise a <TypeError: exception class/object expected> or a <RuntimeError: Some string>.

I think the standard pattern — rescue StandardError — is correct here. That will catch everything except SignalExceptions and other things which aren't meant to be dealt-with as part of standard error handling.

Although users could filter-out all non-StandardError Exceptions themselves, this feels like a footgun (since signals will be relatively rare — especially in development — most users won't notice any problems until they happen in production).
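The difference is easy to check in plain Ruby. The sketch below uses a hypothetical `observe` helper that rescues only StandardError, and raises `Interrupt` (a SignalException subclass) by hand to simulate a signal.

```ruby
# Signals sit outside the StandardError branch of the hierarchy.
signal_is_standard   = SignalException.ancestors.include?(StandardError)
interrupt_is_signal  = Interrupt.ancestors.include?(SignalException)

# Hypothetical observation helper that rescues only StandardError.
def observe
  yield
rescue StandardError => e
  e
end

# An ordinary error is captured (raising a string gives RuntimeError).
captured = observe { raise "boom" }

# A simulated signal passes straight through the helper.
escaped =
  begin
    observe { raise Interrupt, "simulated Ctrl-C" }
    nil
  rescue Interrupt => e
    e
  end
```

With `rescue Object` or `rescue Exception` in `observe`, the `Interrupt` would instead be swallowed and returned like any other error.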

Track memory usage

It'd be nice to be able to compare the memory usage of both the candidate and the control blocks.
Runtime Performance isn't always the optimization we're after.
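As a rough illustration of what such a measurement could look like on CRuby, object allocations can be counted around a block with `GC.stat`. This counts allocated objects, not bytes, and `allocations_during` is a hypothetical helper rather than anything Scientist provides.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# Relies on CRuby's GC.stat(:total_allocated_objects) counter.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Builds 1,000 strings plus the array that holds them.
control_allocs = allocations_during { Array.new(1_000) { |i| "item #{i}" } }

# Pure integer arithmetic allocates almost nothing.
candidate_allocs = allocations_during { 1_000.times { |i| i * 2 } }
```

The two counts could be attached to a published result alongside the timing data, which is roughly what this request is asking for.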

Ignoring specific exceptions

Currently observation values and not observation objects are being passed to ignore. This means it's not possible to ignore transient exceptions that occur in either candidate or control. When an exception occurs, nil is passed as value to ignore for that behavior.

Example:

e = Scientist::Experiment.new "foo"
e.use { rand > 0.001 ? 42 : raise(SomeConnectionTimedOut.new) }
e.try { rand > 0.001 ? 42 : raise(SomeConnectionTimedOut.new) }
e.ignore do |control, candidate|
  # control/candidate is nil if exception occurred in that behavior
end

It would be great if ignore received an observation object instead of only the observation value. That would allow examining the exception value.

Example:

# Proposed behavior! This is currently not possible.
ignore do |control, candidate|
  control.exception.is_a?(SomeConnectionTimedOut) ||
    candidate.exception.is_a?(SomeConnectionTimedOut)
end

I currently work around the above issue by rescuing inside try/use and return a sentinel value -- it's a terrible hack though. Would you accept a PR implementing this?
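For reference, the sentinel workaround mentioned above might look roughly like this. It is a sketch of the hack, not a recommendation: `TIMED_OUT`, `timeout_as_sentinel`, and the local `SomeConnectionTimedOut` class are stand-ins defined only for this example.

```ruby
# Stand-in for the transient error from the example above.
class SomeConnectionTimedOut < StandardError; end

# Unique marker object returned in place of the exception.
TIMED_OUT = Object.new.freeze

# Hypothetical helper used inside both `use` and `try` blocks.
def timeout_as_sentinel
  yield
rescue SomeConnectionTimedOut
  TIMED_OUT
end

# Logic for the ignore block: skip the mismatch if either side timed out.
should_ignore = lambda do |control, candidate|
  control.equal?(TIMED_OUT) || candidate.equal?(TIMED_OUT)
end

timed_out_value = timeout_as_sentinel { raise SomeConnectionTimedOut }
normal_value    = timeout_as_sentinel { 42 }
```

The drawback is visible here: when the control times out, the caller receives `TIMED_OUT` instead of an exception and has to handle it, which is why passing observation objects to ignore would be cleaner.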

How can we use scientist to experiment on a function that has side effects

First, thanks for the scientist gem; it really opened my eyes to maintaining software reliability. I have one problem: if the function under test has side effects (for example, it queries some data in a database and updates it), then running both the control and the candidate code will definitely cause problems, right? So is scientist only useful for testing functional-style code, or how could we use it to test functions with global side effects?

Basic experiment is missing control behavior

I saw that y'all deprecated dat-science in favor of scientist. I want to update, but I immediately hit a roadblock. I couldn't get a basic experiment, like the one in the readme, to work. Here's what I tried:

require 'scientist' # 0.0.3
Scientist::Default.new('experiment') do |e|
  e.use { true }
  e.try { true }
end.run
# Scientist::BehaviorMissing: experiment missing control behavior

Based on the source, I guessed that I might have to use e.control (like with dat-science) instead of e.use. That didn't work either.

Am I doing something wrong?

Provide hooks for testing supporting code

Before anything else: thanks for this fantastic library! It's very thoughtfully designed, and the README is exemplary. The existence of this issue probably says more about me than about y'all. :) <3

I've used Scientist to test out a few code changes now. Rather than write to logs (which are annoying to search), I'm using a few ActiveRecord models to persist experiments, results, and observations. (Since my work doesn't operate at roflscale like GitHub does, I have the luxury of being able to write to the DB.) This is mostly pretty great, but I've had to commit a few sins in order to test some of my features.

Issues I've had so far:

1. If a specific experiment doesn't provide its own #clean block, I'd like to always use a default. However, calling experiment.clean.nil? has the side effect of setting the clean block to nil. Instead, I've had to override the #run method so I can run this code before calling super: @_scientist_cleaner ||= ->(value) { Experiment::DefaultCleaner.call(value) }. Suggested change: add an accessor method that I can query instead of setting the ivar directly.
2. In order to test the code that compares control/candidate timing data, I have to stub Process.clock_gettime with a series of ordered values, then manually compute the difference between them. This works, but feels a bit unclean. Suggested change: add an internal method (call it #record_duration ?) that calls the timing method, yields, sets @duration, and returns, and call this method in Scientist::Observation#initialize. Including code can then stub this method to provide its own [canned, in my case] timing data.
3. Along with #2, I also have to provide branching logic in my tests, because I don't know whether the control or candidate will run first. Not only is this more difficult to read, it also introduces nondeterministic behavior to my test suite. 😱 Suggested change: extract the behaviors.keys.shuffle in Scientist::Experiment#run to an internal method (call it #wibble ?) that returns the list of behavior names to run in order. As with my suggestion for 2, this facilitates stubbing.

The following RSpec snippet illustrates items 2 and 3. (Note: the nested lists in the arguments to #stub_clock are purely for readability.)

# Scientist uses Process.clock_gettime for timing (as it should!)
def stub_clock(*ordered_values)
  allow( Process ).to receive( :clock_gettime ) \
    .with( Process::CLOCK_MONOTONIC, :float_second ) \
    .and_return( *ordered_values.flatten )
end

specify "the saved records contain timing data" do
  # The first run should take 0.5 seconds; the second 1.0
  # Again, note that order is random
  stub_clock [ 1.0, 1.5 ], [ 2.0, 3.0 ]
  Experiment.science "wibble" do |e|
    e.use { :wibble }
    e.try { :wobble }
  end

  case control.duration
  when 0.5
    expect( candidate.duration ).to eq(  1.0 )
    expect( result.time_delta  ).to eq( -0.5 )
    expect( result.speedup     ).to eq( -2.0 )
  when 1.0
    expect( candidate.duration ).to eq(  0.5 )
    expect( result.time_delta  ).to eq(  0.5 )
    expect( result.speedup     ).to eq( +2.0 )
  else
    fail "expected control duration to be one of the two stubbed values, but was #{control.duration.inspect}"
  end
end

I'm happy to submit a PR for these changes, but figured I should start a discussion here first, given that:

this mostly affects people working on [probably internal] Scientist-related infrastructure, and provides no benefit to people using Scientist as described in the README;
my suggested solutions for 2 and especially 3 could make it easier for users to do Bad Science, in that the existence of a control will always tempt someone to use it;
these changes would exist only for testability. (I personally don't mind that, but recognize the existence—and possibly even the validity—of other opinions. 😜)

List of ports in README?

Seems like a lot of people are writing ports of this library to other languages (and I'm planning on writing a Java port soon) - it would be nice to have links to those in the README.

Wiki?

I have some examples of how to use this tool that don't quite fit the README. I was wondering if y'all could open up the wiki for additional documentation. I'd love to put our use case out there as a code sample for others to draw inspiration from.
