datopian / datahub

🌀 Rapidly build rich data portals using a modern frontend framework

Home Page: https://datahub.io/opensource

License: MIT License

JavaScript 21.82% CSS 11.41% TypeScript 66.40% Shell 0.32% MDX 0.06%
data-portal data-portals ckan data-portal-frontends data-presentation data-fabric data-management-platform nextjs open-data-portal react

datahub's People

Contributors

abhishekgahlot, aliounedia, amercader, anuveyatsu, cotts, demenech, dependabot[bot], djw, domoritz, github-actions[bot], gutts-n, hychen, johnglover, johnmartin, krzysztofmadejski, lauragift21, luccasmmg, mattfullerton, max-mapper, mihi-tr, mpolidori, olayway, pudo, risenw, rufuspollock, sleeper, smth, steveoni, tavareshansen, teosibileau

datahub's Issues

More Extensive View Tests

Currently have one view test. Should have more.

  • Switching in data explorer
  • DataTable rendering etc

Replace jquery-ui with Bootstrap modal

Replace jquery-ui with Bootstrap modal ... (think that is all we use jquery-ui for!)

Update:

  • jQuery UI is used for creating draggable modal dialogs (the modal dialog per se is just CSS)

So question is: do we need draggability on our dialogs?

[super] Data Query Support

Query support would involve supporting things like:

  • size / limit (already supported in a limited way)
  • offset
  • order_by / sort
  • filter

This will clearly depend on backend capabilities.

Implementation

  • #49 - New Query model object attached to Dataset as query attribute (we make into a Model so we can have event handlers)
    • Dataset can then listen for changes to this object and trigger queries to backend as needed.
    • This also means other things, such as a Query editor view within or outside of Recline, can take care of updating this object
  • Support in Backends for various of the operations
    • Limit / offset
    • Sorting - Memory backend done in 1491ae5
    • Filtering
  • View support - query editor - #53
    • Limit and offset support (plus pagination?)
    • Simple search box
    • Full Query editor - #53
      • sort support in DataTable - done in 5fc4fa9
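
A rough sketch of the Query-as-model idea above, assuming nothing about the real Recline classes: Query, Dataset and MemoryBackend here are hypothetical, dependency-free stand-ins for what would be Backbone models. The dataset listens for changes on its query object and re-queries its backend.

```javascript
// Minimal stand-in for a Backbone-style model: holds attributes and
// notifies listeners when they change. All names here are hypothetical.
function Query(attributes) {
  this.attributes = Object.assign({ size: 100, offset: 0 }, attributes);
  this.listeners = [];
}

Query.prototype.onChange = function (listener) {
  this.listeners.push(listener);
};

Query.prototype.set = function (changes) {
  Object.assign(this.attributes, changes);
  var self = this;
  this.listeners.forEach(function (listener) {
    listener(self);
  });
};

// The dataset owns a query object and re-queries its backend on change.
function Dataset(backend) {
  var self = this;
  this.backend = backend;
  this.records = [];
  this.query = new Query();
  this.query.onChange(function (query) {
    self.records = self.backend.query(query.attributes);
  });
}

// A toy in-memory backend that only understands size and offset.
function MemoryBackend(rows) {
  this.rows = rows;
}

MemoryBackend.prototype.query = function (q) {
  return this.rows.slice(q.offset, q.offset + q.size);
};

var dataset = new Dataset(new MemoryBackend(['a', 'b', 'c', 'd']));
dataset.query.set({ offset: 1, size: 2 });
console.log(dataset.records.join(',')); // b,c
```

Anything that can reach dataset.query (a query editor view, a pager) can update it without knowing about the backend.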

Query Object Proposal

NB: It would be up to specific backends how to implement and support this query object. Different backends might choose to implement things differently or not support certain features.

Propose to base this directly on the ElasticSearch query language. The query object would therefore have the following key attributes:

Additions:

  • q: either straight text or a hash will map directly onto a query_string query in backend
    • Of course this can be re-interpreted by different backends. E.g. some may just pass this straight through e.g. for an SQL backend this could be the full SQL query
  • filters: dict of fields with for each one specified a filter like term, terms, prefix, range
    • Value for a field can just be text in which case this becomes a term query on that field
    • E.g. `my-field: 'abc'` would only match results with abc in that field
    • This is a quick way to do filtering.

Examples

{
   q: 'quick brown fox',
   filters: {
     'owner': 'jones'
   }
}
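
A possible way a backend could expand the shorthand filters above into ElasticSearch-style clauses. expandFilters is a hypothetical helper, and the shape assumed for non-text values (a hash keyed by the filter type) is an assumption, not something fixed by the proposal.

```javascript
// Hypothetical helper: expand shorthand filter values into
// ElasticSearch-style filter clauses.
function expandFilters(filters) {
  return Object.keys(filters || {}).map(function (field) {
    var value = filters[field];
    var clause = {};
    if (typeof value === 'string') {
      // Plain text is shorthand for a term filter on that field.
      clause.term = {};
      clause.term[field] = value;
    } else {
      // Assumed shape: { <filterType>: <spec> }, e.g. { prefix: 'jo' }.
      // The field name is nested inside the named filter.
      var type = Object.keys(value)[0];
      clause[type] = {};
      clause[type][field] = value[type];
    }
    return clause;
  });
}

var query = {
  q: 'quick brown fox',
  filters: { owner: 'jones' }
};
console.log(JSON.stringify(expandFilters(query.filters)));
// [{"term":{"owner":"jones"}}]
```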

References

Simple 'DIY' Proposal

A query object will have the following attributes:

  • q: a free text query string. It will be up to specific backend implementations how to interpret this.
    • Examples: could be a solr style free text query string
    • Example: could be SQL for an SQL style setup
  • limit: limit the number of items returned by the query
  • offset: offset for start of query results
  • filter: dict/hash of filters keyed by field (header) id. Each filter is a single text string which is used to match against that field name.
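
To make the 'DIY' attributes concrete, here is a sketch of how an in-memory backend might interpret them. applyQuery is hypothetical, and the matching rules (case-insensitive substring for q across all fields, plain substring for each filter) are assumptions; the proposal leaves interpretation to each backend.

```javascript
// Hypothetical in-memory interpretation of the query attributes.
function applyQuery(records, query) {
  var q = (query.q || '').toLowerCase();
  var filter = query.filter || {};
  var offset = query.offset || 0;

  var matched = records.filter(function (record) {
    // filter: every listed field must contain the given text
    var passesFilter = Object.keys(filter).every(function (field) {
      return String(record[field]).indexOf(filter[field]) !== -1;
    });
    // q: free text, matched here against any field value
    var passesQ = !q || Object.keys(record).some(function (field) {
      return String(record[field]).toLowerCase().indexOf(q) !== -1;
    });
    return passesFilter && passesQ;
  });

  // No limit means "everything from offset onwards".
  var end = query.limit === undefined ? matched.length : offset + query.limit;
  return matched.slice(offset, end);
}

var records = [
  { id: 1, owner: 'jones', title: 'Quick brown fox' },
  { id: 2, owner: 'smith', title: 'Lazy dog' },
  { id: 3, owner: 'jones', title: 'Quick red fox' },
  { id: 4, owner: 'jones', title: 'Slow fox' }
];
var result = applyQuery(records, {
  q: 'quick',
  filter: { owner: 'jones' },
  offset: 1,
  limit: 5
});
console.log(result.map(function (r) { return r.id; }).join(',')); // 3
```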

Dumping place for minor and miscellaneous items

2012-01-10 move notification from util into view, simplify and bootstrapify.

2012-01-09 auto draw something in the graph view so it isn't blank when you start (or give instructions ...) - done in #85

  • Even better would be to try and auto-select columns to draw

2012-01-05 p:? Move bulk transform / update onto Dataset / Backend object ... WONTFIX for the present

2012-01-06 p:med generate basic code docs with docco - now in #33

2012-01-06 p:high Move to having richer header attributes (?)

2012-01-06 p:low remove unwanted vendor deps (sammy, microevent (?), jquery.couch2.js (?), traverse (?))

2012-01-05 p:low pagination ... (at least offset) - see #27 for pagination

2012-01-05 p:low Tests for costco especially TransformPreview - not done but now in #114

Update theme

Could be nice to update theme (e.g. using bootstrap or jquery UI as in old data explorer).

Better notifications of activity and errors

  • Loading data can take a while so show a notification for this so user knows what is happening
  • Show an error notification when error from a backend (e.g. on data load -- e.g. 404 for DataProxy)

Wrap any JSONP requests in timeouts to catch errors

At the moment JSONP requests in Backends don't raise errors (because of the nature of JSONP). We may wish to wrap them in timeouts as a crude way to catch these errors:

    var timeout = 10000;
    var timer = setTimeout(function error() {
      callback({
        error: {
          title: 'Request Error',
          message: 'Had timeout with Backend (probably error) ' + (timeout / 1000) + ' seconds'
        }
      });
    }, timeout);

    // later
    $.ajax({
      url: url,
      dataType: 'jsonp',
      success: function(data) {
        clearTimeout(timer);
        callback(data);
      }
    });
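
One thing the snippet above does not guard against: if the response arrives after the timer has already fired, callback runs twice. A possible reusable wrapper is sketched below. withTimeout and its optional timers argument are hypothetical (the timers hook is only there so the behaviour can be exercised without waiting).

```javascript
// Wrap a callback so it is invoked exactly once: either with the data
// passed to the returned function, or with an error object on timeout.
function withTimeout(callback, timeoutMs, timers) {
  timers = timers || {
    set: function (fn, ms) { return setTimeout(fn, ms); },
    clear: function (id) { clearTimeout(id); }
  };
  var settled = false;

  var timer = timers.set(function () {
    if (settled) return;
    settled = true;
    callback({
      error: {
        title: 'Request Error',
        message: 'No response from backend after ' + (timeoutMs / 1000) + ' seconds'
      }
    });
  }, timeoutMs);

  return function (data) {
    if (settled) return; // late response after a timeout: ignore
    settled = true;
    timers.clear(timer);
    callback(data);
  };
}

// Intended use with the request above (not executed here):
//   $.ajax({ url: url, dataType: 'jsonp', success: withTimeout(callback, 10000) });
```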

Local Data Backend via HTML 5 file support

This would likely focus on CSV file import since that is all we could easily parse in browser. Could extend to more complex formats via pushing file out to online conversion service (gut based or otherwise).
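
For a sense of what in-browser parsing would involve, here is a minimal CSV parser sketch. parseCsv is hypothetical and deliberately small: it handles quoted fields, doubled quotes and newlines inside quotes, and nothing else. In a browser the text would come from the HTML5 File API (e.g. FileReader's readAsText); that part is omitted.

```javascript
// Minimal CSV parser sketch: handles quoted fields, escaped quotes ("")
// and newlines inside quotes. Not a complete CSV implementation.
function parseCsv(text) {
  var rows = [];
  var row = [];
  var field = '';
  var inQuotes = false;

  for (var i = 0; i < text.length; i++) {
    var c = text[i];
    if (inQuotes) {
      if (c === '"' && text[i + 1] === '"') {
        field += '"'; // doubled quote inside a quoted field
        i++;
      } else if (c === '"') {
        inQuotes = false;
      } else {
        field += c;
      }
    } else if (c === '"') {
      inQuotes = true;
    } else if (c === ',') {
      row.push(field);
      field = '';
    } else if (c === '\n') {
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else if (c !== '\r') {
      field += c;
    }
  }
  // Flush the last row if the text does not end with a newline.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

console.log(JSON.stringify(parseCsv('id,name\n1,"Smith, J"\n2,"say ""hi"""\n')));
// [["id","name"],["1","Smith, J"],["2","say \"hi\""]]
```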

Richer Field / Header attributes for Dataset

  • Rename headers to fields.
  • fields are objects / hashes (see below)
  • fields are an attribute on base of dataset rather than in dataset
  • (?) fields are Backbone models (and fields is a Backbone collection)

Fields and field attributes

fields: (aka columns) is the array of fields (column) to display. Each entry in fields is a hash having:

  • id: a unique identifier for this field
  • label: the visible label used for this field
  • type: the type of the data
  • data_key: the key used in the documents (usually this will be the same as id, but having this allows us to set more than one column reading the same field)
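
As an illustration of the field hash described above, a sketch that upgrades old-style header strings to field objects. normalizeFields is hypothetical and the defaults it fills in (label and data_key fall back to id, type falls back to 'string') are assumptions.

```javascript
// Hypothetical helper: accept old-style header strings or partial
// field hashes and return full field objects.
function normalizeFields(fields) {
  return fields.map(function (field) {
    if (typeof field === 'string') {
      field = { id: field };
    }
    return {
      id: field.id,
      label: field.label || field.id,       // assumed default
      type: field.type || 'string',         // assumed default
      data_key: field.data_key || field.id  // assumed default
    };
  });
}

var fields = normalizeFields([
  'date',
  { id: 'amount_eur', label: 'Amount (EUR)', type: 'number', data_key: 'amount' }
]);
console.log(JSON.stringify(fields[0]));
// {"id":"date","label":"date","type":"string","data_key":"date"}
```

The second entry shows the case data_key is meant for: a column whose id differs from the document key it reads.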

Do we need data_key?

Support for sorting

  • Sorting should be done in the backend (i.e. as part of the query, sorting over the entire result set)
  • Can be passed in as tuple (column, direction)
  • UI: make this a section of the dropdown - NOTE: this means not hiding the entire dropdown in read-only mode.
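
A sketch of what a memory-style backend could do with a (column, direction) tuple. sortRecords is hypothetical and the 'asc' / 'desc' direction strings are an assumption.

```javascript
// Sort records by a [column, direction] tuple without mutating the input.
function sortRecords(records, sort) {
  var column = sort[0];
  var direction = sort[1] === 'desc' ? -1 : 1;
  // slice() so the caller's array is left untouched
  return records.slice().sort(function (a, b) {
    if (a[column] < b[column]) return -1 * direction;
    if (a[column] > b[column]) return 1 * direction;
    return 0;
  });
}

var rows = [{ x: 2 }, { x: 3 }, { x: 1 }];
var sorted = sortRecords(rows, ['x', 'desc']);
console.log(sorted.map(function (r) { return r.x; }).join(',')); // 3,2,1
```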

DataProxy Backend

Create Backend (and hence data source support) for DataProxy.

This will allow us to display data from any CSV or XLS file that is on the public internet!

Read-only mode

Support for read-only mode where all editing options are disabled

Allow users to hide/add columns

Make it possible to individually enable and disable the display of columns in the table.

UI: There should be another drop-down in the top-left corner of the table with a list of checkable column names.
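
The state behind such a drop-down could be as small as a list of hidden column ids. A sketch, with toggleColumn and visibleFields as hypothetical names:

```javascript
// Return a new hidden-id list with fieldId added or removed.
function toggleColumn(hiddenIds, fieldId) {
  var index = hiddenIds.indexOf(fieldId);
  if (index === -1) {
    return hiddenIds.concat([fieldId]);
  }
  return hiddenIds.filter(function (id) {
    return id !== fieldId;
  });
}

// Fields the table should actually render.
function visibleFields(fields, hiddenIds) {
  return fields.filter(function (field) {
    return hiddenIds.indexOf(field.id) === -1;
  });
}

var fields = [{ id: 'date' }, { id: 'amount' }, { id: 'notes' }];
var hidden = toggleColumn([], 'notes'); // user unchecks "notes"
console.log(visibleFields(fields, hidden).map(function (f) { return f.id; }).join(','));
// date,amount
```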

Add CellRenderer option to DataGrid

This would allow for cell values to be innerHTML so we can have links, tooltips etc.

CellRenderer is a function that takes three arguments:

  • value: cell value (the value from our data object for this document and field)
  • document: the data for the entire row
  • field: field object for this field (column)

It must return HTML or a jQuery element object suitable for rendering into the view.
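
An example of what such a renderer might look like, returning an HTML string. linkCellRenderer and escapeHtml are hypothetical, and the argument order (value, document, field) simply follows the list above.

```javascript
// Escape text so it is safe to place inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Example renderer: turn URL values into links, escape everything else.
// The whole-row document (doc) is available but unused here.
function linkCellRenderer(value, doc, field) {
  var safe = escapeHtml(value);
  if (/^https?:\/\//.test(String(value))) {
    return '<a href="' + safe + '" title="' + escapeHtml(field.label) + '">' + safe + '</a>';
  }
  return safe;
}

console.log(linkCellRenderer('http://example.org/a?x=1&y=2', {}, { label: 'Source' }));
// <a href="http://example.org/a?x=1&amp;y=2" title="Source">http://example.org/a?x=1&amp;y=2</a>
```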

Datasets and Backends: finalize method for creation and use

We went with option 3

What is backend?

Backend as Backend / DataProvider
Backend as a single DataSource (e.g. a single spreadsheet or database)

Various options

Option 3 - Backend (for many DataSources)

var backend = MyBackend(backendOnlyConfig)
// optional - add to backend registry
recline.Model.backends[backend-id] = backend;

// either directly or some time later
var dataset = Dataset(metadata, backend);

// Note: backend may be a backend instance or a backend id (string)
// Info about where backend should get data will be in Dataset 'metadata' (e.g. the url field!)
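
A self-contained sketch of how Option 3 could behave, accepting either a backend instance or a registered id. The registry object, the Dataset constructor and EchoBackend below are hypothetical stand-ins, not the Recline implementation.

```javascript
// Hypothetical backend registry, keyed by backend id.
var backends = {};

function Dataset(metadata, backend) {
  this.metadata = metadata;
  // Accept either a backend instance or the id it was registered under.
  this.backend = typeof backend === 'string' ? backends[backend] : backend;
  if (!this.backend) {
    throw new Error('Unknown backend: ' + backend);
  }
}

// Where to get the data lives in the dataset metadata (the url field).
Dataset.prototype.fetch = function () {
  return this.backend.fetch(this.metadata.url);
};

// Toy backend configured with backend-only settings.
function EchoBackend(config) {
  this.config = config;
}

EchoBackend.prototype.fetch = function (url) {
  return 'fetched ' + url + ' via ' + this.config.name;
};

backends['echo'] = new EchoBackend({ name: 'echo' });
var byId = new Dataset({ url: 'http://example.org/data.csv' }, 'echo');
console.log(byId.fetch());
// fetched http://example.org/data.csv via echo
```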

Option 1

Somewhere else in the system:

var backend = MyBackend(backendOnlyConfig)
my.backends[backend-id] = backend;

// at some point later
var dataset = Dataset(metadata);
dataset.backendConfig = {
    type: 'my-backend',
    url: 'http://mybackend.org/api'
};

initialize:

    this.backend_config = backend_config;
    this.backend = my.backends[backend_config.type];
    // all subsequent code will just use this.backend 
    // of course backends can access model.backend_config and, of course, model (if they want!)

Option 2 (was Option 0 - sort of) - DataSource

var backend = MyBackend(backendConfig)
    What is backendConfig? = backendOnly, datasetRelated (e.g. dataset api url)

var dataset = Dataset(metadataConfig, backend)

metadata = id, title, url

Option 4

dataset = Dataset(metadata = {}, backend_config={})

[dataset = CKANDataset(metadata)]

JC: I construct a dataset (and I don't need to know about backends ...) -- why wouldn't it have sufficient information to create / find its relevant backend?

Either in metadata:

  url = backend url
  api_url = 
  backend_type 

Support multiple backends at once

At the moment only one backend can be in use at a given time. With a small amount of refactoring we can change this so sync switches on backend.

This would require Datasets to be configured with info about their relevant backend, but this is also a good idea in that we would then be able to do something like:

var dataset = new Dataset();
dataset.backend = {
  type: 'webstore',
  url: 'http://....'
}

This is cleaner and more intuitive than the current setup (where you set dataset endpoints in the backend!).
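
A sketch of what "sync switches on backend" might look like: one sync entry point that dispatches on the dataset's backend type. The backends map and the return values are hypothetical and only there to make the dispatch visible.

```javascript
// Hypothetical registry of backends keyed by type.
var backends = {
  memory: {
    sync: function (method, dataset) {
      return 'memory ' + method;
    }
  },
  webstore: {
    sync: function (method, dataset) {
      return 'webstore ' + method + ' ' + dataset.backend.url;
    }
  }
};

// A single sync entry point that switches on the dataset's backend type.
function sync(method, dataset) {
  var backend = backends[dataset.backend.type];
  if (!backend) {
    throw new Error('No backend registered for type: ' + dataset.backend.type);
  }
  return backend.sync(method, dataset);
}

var dataset = { backend: { type: 'webstore', url: 'http://example.org/db' } };
console.log(sync('read', dataset)); // webstore read http://example.org/db
```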

This change owes much to discussion with @jamescasbon

Export / Save of data

Export / save options including:

  • Export in JSON to a dialog
  • Save to CSV file on local file system (using File API). This functionality would go in new localcsv "backend" (not really a backend so much as convenience functions).
  • Export to the DataHub (or any other CKAN site)

Prompting a file download

You can use data URIs!

document.location.href = 'data:text/csv;charset=utf-8,' + encodeURIComponent(csvData);

Note IE8 & IE9 only allow data URIs for images: http://msdn.microsoft.com/en-us/library/cc848897(v=vs.85).aspx
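
A sketch of producing csvData and the data URI from rows of values. toCsv and csvDataUri are hypothetical helpers; the quoting rule (quote fields containing commas, quotes or newlines, and double any embedded quotes) is the usual CSV convention.

```javascript
// Serialize an array of rows (arrays of values) to CSV text.
function toCsv(rows) {
  return rows.map(function (row) {
    return row.map(function (value) {
      var text = String(value);
      // Quote fields containing separators, quotes or newlines.
      if (/[",\r\n]/.test(text)) {
        text = '"' + text.replace(/"/g, '""') + '"';
      }
      return text;
    }).join(',');
  }).join('\n');
}

// Build a data URI that a browser can navigate to in order to prompt a download.
function csvDataUri(rows) {
  return 'data:text/csv;charset=utf-8,' + encodeURIComponent(toCsv(rows));
}

console.log(csvDataUri([['id', 'name'], [1, 'Smith, J']]));
// data:text/csv;charset=utf-8,id%2Cname%0A1%2C%22Smith%2C%20J%22
```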

Saving to filesystem

Can only do this with HTML5 file apis and then only to a specific area (not anywhere on disk).

This thread http://stackoverflow.com/questions/2897619/using-html5-javascript-to-generate-and-save-a-file outlines some useful options including a data:uri hack that works for files < 256kb in FF and chrome and a mention of a flash app that can help: https://github.com/dcneiner/Downloadify

Extras

  • Write to any of our existing writable backends.
    Is this needed given that most backends have this functionality (and done better than we can)? (maybe we just want support for linking to the relevant dump?)

Support for pagination and/or offset

Can already configure total number of rows / documents to show but also want to:

  • Configure offset (where to start)
  • Pagination

Questions:

  • Is pagination needed?
  • Do we implement proper pagination or just 'scroll' pagination?
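
If proper pagination is chosen, it reduces to offset and limit. A sketch of the arithmetic, with pageToQuery and pageCount as hypothetical helpers and pages numbered from 1:

```javascript
// Translate a 1-based page number into offset / limit query attributes.
function pageToQuery(page, pageSize) {
  return { offset: (page - 1) * pageSize, limit: pageSize };
}

// Number of pages needed to show `total` rows.
function pageCount(total, pageSize) {
  return Math.ceil(total / pageSize);
}

console.log(JSON.stringify(pageToQuery(3, 25))); // {"offset":50,"limit":25}
console.log(pageCount(101, 25)); // 5
```

'Scroll' pagination would use the same offset and limit, just advancing offset by pageSize each time more rows are requested.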
