Comments (7)

BaxterEaves commented on May 21, 2024

Some of these items are needless. Suggesting numbers (which I assume means n_models and n_iterations) is data dependent. People should use ANALYZE t_cc FOR <n> MINUTES rather than FOR <n> ITERATIONS if time is a concern and use the diagnostics in bdbcontrib if they are concerned about convergence or multimodality.

I'm not sure what long-running background analysis means. Does this mean start a background analysis and continue with the tutorial? If so, it seems a bad idea because the analysis will proceed before any useful answers can be obtained, potentially before a single ANALYZE iteration. If I'm wrong, please correct me.

As it stands now, I'm going to ignore everything other than assertions, which I'll implement with .assert in the shell:

bayeslite> CREATE TEMP TABLE tt AS
       ..>    ESTIMATE Name, PREDICTIVE PROBABILITY OF Expected_Lifetime AS f_life
       ..>    FROM t_cc;
bayeslite> .assert eq 'SELECT Name FROM tt ORDER BY f_life ASC LIMIT 1;' 'International Space Station'

or something along these lines. I don't really like the looks of that though...

from bayeslite.

riastradh-probcomp commented on May 21, 2024

Background analysis would be as in the old bayesdb, but combined with checkpointing so that the answers slowly get better as you wait.

I don't think assertions are actually appropriate. Better to write the expected outputs -- should be easy now that we have deterministic Crosscat -- and when compiling the documentation, fail if the outputs changed.

BaxterEaves commented on May 21, 2024

One of the things assert would be used for is to determine whether analyses are stable across seeds. The results we want to be talking about in the example analyses and tutorials should be easy for bayeslite to find in the data. If they become hard to find, that indicates a problem. For example, in the satellites analysis the ISS being the weirdest satellite by lifetime is constant across seeds and doesn't require many iterations to figure out; on the other hand, the ANALYZE time needed to ID the geosynchronous orbit period typos as anomalous varies considerably across seeds. In a case like this you might want to assert that an entry appears in the list somewhere. If, on average, it takes too long to get at a result, we'd want to provide the database or otherwise not mention that result.

tl;dr Comparing bayeslite output w/ expected output does not help us determine whether results are stable across seeds; to do that, we need a more robust way to check properties of the output.
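A rough sketch of what such a property check could look like, written as plain Python rather than as a shell command. Everything here is hypothetical: `stable_fraction` and `fake_ranking` are illustrative names, and `fake_ranking` is a stand-in for a seeded analysis, not a bayeslite call. The idea is to count how many seeds satisfy a property instead of comparing against one exact output.

```python
import random


def stable_fraction(run_analysis, seeds, holds):
    """Return the fraction of seeds whose output satisfies the property `holds`."""
    seeds = list(seeds)
    passed = sum(1 for seed in seeds if holds(run_analysis(seed)))
    return passed / len(seeds)


def fake_ranking(seed):
    """Hypothetical stand-in for a seeded analysis (not a bayeslite call).

    Returns names ranked from most to least anomalous.
    """
    rng = random.Random(seed)
    rest = ["Sat A", "Sat B", "Sat C", "Typo Sat"]
    rng.shuffle(rest)
    # The strongest anomaly comes first for every seed; the others move around.
    return ["International Space Station"] + rest


seeds = range(20)

# Strict property: the top entry is the same for every seed.
top_is_iss = stable_fraction(
    fake_ranking, seeds, lambda r: r[0] == "International Space Station")

# Looser property: an entry appears somewhere in the top three.
typo_in_top3 = stable_fraction(
    fake_ranking, seeds, lambda r: "Typo Sat" in r[:3])

print(top_is_iss, typo_in_top3)
```

A result would then be considered safe to mention in a tutorial only if its fraction stays above some chosen threshold; the threshold itself is a judgment call and is not specified anywhere in this thread.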

riastradh-probcomp commented on May 21, 2024

Not important for the tutorial: all we need for that is a quick test to make sure what the reader sees will match what is in the text, to alert us either to accidental bugs in each commit or to the effects of intentional semantic changes.

Tests of the distribution of results are a separate matter and should be part of long-running quality tests that we run occasionally.

gregory-marton commented on May 21, 2024

What's the status on this? Is there a start on a tutorial somewhere?

Did assertions happen? ".assert eq 3 3" said "unknown command .assert", but maybe it's on a branch I'm not tracking? Are the things we wanted to assert instead things we should put in a test suite? I'd be happy to try to turn some of the tutorial examples into portions of a test suite.

What crosscat diagnostics were desired? Is that going anywhere?

riastradh-probcomp commented on May 21, 2024

The thing that we wanted to test was: if, in the tutorial material, we write

...and if you run this query you'll get an amazing result:

bayeslite> ESTIMATE PROBABILITY OF cake = 1 IN DESSERTS;
3.1415926535897932384626433832795

and when the user actually runs the query, she gets 2.718281828, that would be a bummer. Further, if it suddenly starts coming out as 3.140, that should call our attention to a recent numerical mistake in the system. So this is a combination of an automatic testing issue and a documentation issue -- the documentation should fail to build if the answers come out differently.
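As a minimal sketch of that kind of build-time check, assuming the documentation embeds shell transcripts with a `bayeslite> ` prompt as above: the helper names `parse_transcript` and `check_transcript` are made up for illustration, and `run_query` is a caller-supplied function (here a fake one), not part of bayeslite. Multi-line queries with continuation prompts are not handled.

```python
PROMPT = "bayeslite> "


def parse_transcript(text):
    """Split a shell transcript into (query, expected_output) pairs."""
    pairs = []
    query, expected = None, []
    for line in text.splitlines():
        if line.startswith(PROMPT):
            # A new prompt ends the previous query's expected output.
            if query is not None:
                pairs.append((query, "\n".join(expected).strip()))
            query, expected = line[len(PROMPT):].strip(), []
        elif query is not None:
            expected.append(line)
    if query is not None:
        pairs.append((query, "\n".join(expected).strip()))
    return pairs


def check_transcript(text, run_query):
    """Return (query, expected, actual) for every output that differs."""
    failures = []
    for query, expected in parse_transcript(text):
        actual = str(run_query(query)).strip()
        if actual != expected:
            failures.append((query, expected, actual))
    return failures


doc = """bayeslite> ESTIMATE PROBABILITY OF cake = 1 IN DESSERTS;
3.1415926535897932384626433832795
"""


def fake_run(query):
    # Hypothetical stand-in for executing the query against a real database.
    return "2.718281828"


failures = check_transcript(doc, fake_run)
for query, expected, actual in failures:
    print("MISMATCH:", query, "expected", expected, "got", actual)
```

A documentation build step could exit with a nonzero status whenever `failures` is non-empty, which gives the "fail if the outputs changed" behavior without adding anything to the query language.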

vkm has been talking about assertions for months and I don't know what other properties, if any, he actually wants, and just about any realization of assertions I can imagine, whether in the query language or in shell commands, makes approximately zero sense.

As for Crosscat diagnostics, I have no idea what that was originally. Maybe something about presenting Crosscat internals to the user in the bayeslite shell, which is supported by commands added in bdbcontrib.

axch commented on May 21, 2024

The majority of this issue is settled by the Satellites and Malawi analyses being reasonably polished. I am separating the remainder out as #103 and closing this.
