
basedosdados / mais


⚙️ Maintenance code for the datalake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/

Home Page: https://info.basedosdados.org/links

License: MIT License

Makefile 0.01% Python 18.01% Stata 24.40% R 13.77% Shell 0.05% TeX 0.02% SQL 43.71% Dockerfile 0.03%
dados-abertos bigquery transparencia open-data govtech data-science sql r python hacktoberfest

mais's People

Contributors

allcontributors[bot], arthurfg, aspeddro, crislanealves, d116626, fernandascovino, filipemsc, folhesgabriel, gabrielle-carv, gustavoairestiago, gustavoalcantara, hellcassius, hevsouza, hsxavier, ingridsrabelo, joaocarabetta, jpdonasolo, laura-l-amaral, lucascr91, lucasnascm, mavalentim, mfagundes, pedrocava, polvoazul, rdahis, rfdornelles, rlgabbay, tricktx, vmussa, vncsna


mais's Issues

Add Storage class

Should support

  • init : initialize new bucket and folders
  • create : to create folders given dataset_id and table_id
  • delete : delete folders given dataset_id and table_id
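
A minimal sketch of the three operations listed above. It is hypothetical: it tracks folder prefixes in an in-memory set instead of calling a real bucket, and the `<dataset_id>/<table_id>/` layout and the `gs://` return value are assumptions made for illustration, not the project's actual design.

```python
class Storage:
    """Hypothetical stand-in for a bucket: folder prefixes live in a set."""

    def __init__(self, bucket_name):
        self.bucket_name = bucket_name
        self.folders = None  # stays None until init() is called

    def _require_init(self):
        if self.folders is None:
            raise RuntimeError("call init() before create() or delete()")

    def init(self):
        """Initialize a new, empty bucket."""
        self.folders = set()
        return self

    def create(self, dataset_id, table_id):
        """Create the folder for a table and return its (assumed) URI."""
        self._require_init()
        prefix = f"{dataset_id}/{table_id}/"
        self.folders.add(prefix)
        return f"gs://{self.bucket_name}/{prefix}"

    def delete(self, dataset_id, table_id):
        """Delete the folder for a table; a missing folder is ignored."""
        self._require_init()
        self.folders.discard(f"{dataset_id}/{table_id}/")
```

A real implementation would replace the set operations with calls to the storage backend while keeping the same three entry points.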

Improve the 'data_update_frequency' field in table_config.yaml

  • Change the examples to lowercase: hora, dia, semana (hour, day, week), etc.
  • The field currently does not cover datasets that are updated on a 'recurring' basis or only when some relevant change happens, that is, irregular, event-driven frequencies.

Allow the CLI download to take only a folder

In this example:

basedosdados download --dataset_id br_suporte --table_id diretorio_municipios ~/Downloads/

it fails with an error saying that Downloads is a folder. It could instead download the table straight into that folder, named after the BQ table (in this case 'diretorio_municipios.csv').
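
One way the requested behavior could work, as a sketch: treat a path that ends in a separator or points at an existing directory as "save inside this folder" and append `<table_id>.csv`. The helper name `resolve_savepath` is hypothetical and not part of the package.

```python
from pathlib import Path


def resolve_savepath(savepath, table_id):
    """Return the file path to write to (hypothetical helper)."""
    path = Path(savepath).expanduser()  # also resolves a leading "~"
    # A trailing separator or an existing directory both mean "a folder".
    looks_like_folder = str(savepath).endswith(("/", "\\")) or path.is_dir()
    if looks_like_folder:
        return path / f"{table_id}.csv"
    return path
```

With this rule, `~/Downloads/` plus `diretorio_municipios` resolves to a file named `diretorio_municipios.csv` inside the user's Downloads folder, while an explicit file path is left unchanged.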

Error with download in Python

I ran:

import basedosdados as bd

bd.download(savepath="~/Downloads/test.csv",dataset_id='br_suporte',table_id='diretorio_municipios')

and got

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-1-c0f5299e96d1> in <module>
      1 import basedosdados as bd
      2 
----> 3 bd.download(savepath="~/Downloads/test.csv",dataset_id='br_suporte',table_id='diretorio_municipios')

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/basedosdados/download.py in download(savepath, query, dataset_id, table_id, project_id, limit, **pandas_kwargs)
     58 
     59     if (dataset_id is not None) and (table_id is not None):
---> 60         table = read_table(dataset_id, table_id, limit=limit)
     61     elif query is not None:
     62         if limit is not None:

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/basedosdados/download.py in read_table(dataset_id, table_id, project_id, limit)
    124         raise Exception("Both table_id and dataset_id should be filled.")
    125 
--> 126     return read_sql(query)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/basedosdados/download.py in read_sql(query)
     88         Query result
     89     """
---> 90     client = bigquery.Client()
     91     return client.query(query).to_dataframe()
     92 

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/google/cloud/bigquery/client.py in __init__(self, project, credentials, _http, location, default_query_job_config, client_info, client_options)
    178         client_options=None,
    179     ):
--> 180         super(Client, self).__init__(
    181             project=project,
    182             credentials=credentials,

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/google/cloud/client.py in __init__(self, project, credentials, client_options, _http)
    247 
    248     def __init__(self, project=None, credentials=None, client_options=None, _http=None):
--> 249         _ClientProjectMixin.__init__(self, project=project)
    250         Client.__init__(self, credentials=credentials, client_options=client_options, _http=_http)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/google/cloud/client.py in __init__(self, project)
    201         project = self._determine_default(project)
    202         if project is None:
--> 203             raise EnvironmentError(
    204                 "Project was not passed and could not be "
    205                 "determined from the environment."

OSError: Project was not passed and could not be determined from the environment.
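
The traceback shows `read_sql` calling `bigquery.Client()` with no `project`, even though `download` accepts a `project_id` argument. A possible direction, sketched under assumptions: resolve the project explicitly and fail with a clearer message. The helper `resolve_project_id` is hypothetical, and falling back to the `GOOGLE_CLOUD_PROJECT` environment variable is an assumption about how the environment is configured.

```python
import os


def resolve_project_id(project_id=None):
    """Pick the billing project from the argument or the environment."""
    project = project_id or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError(
            "No project id found: pass project_id or set GOOGLE_CLOUD_PROJECT."
        )
    return project


# The result could then be forwarded to the client, for example:
#   client = bigquery.Client(project=resolve_project_id(project_id))
# (`project` is the keyword visible in the Client.__init__ signature above.)
```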

Create a naming standard for tables

We need a standard for tables that repeat across observation levels.

Suggestion: <level>_<description>

Example: IBGE GDP data for municipalities, states, and the country. Within the dataset br_economia_IBGE, could the tables be municipio_PIB, estado_PIB, pais_PIB?
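
The suggested pattern can be written as a small helper. This is only an illustration of the proposal; the function name `table_name` is hypothetical.

```python
def table_name(level, description):
    """Build a table name following the proposed <level>_<description> pattern."""
    if not level or not description:
        raise ValueError("both level and description are required")
    return f"{level}_{description}"


# The three tables from the example, one per observation level.
names = [table_name(level, "PIB") for level in ("municipio", "estado", "pais")]
print(names)
```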

Where to put overall config?

  1. The CLI must run from anywhere on the computer
  2. secrets must live in this folder

config location: ~/.basedosdados

config.yaml

metadata_path: 
bucket_name:
templates_path:

user:
   name:
   email:
   website:
   photo:
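
A sketch of how the proposed layout could be modeled in code, assuming the file lives at `~/.basedosdados/config.yaml` as suggested above. The class names and the empty-string defaults are hypothetical, and reading or writing the YAML file itself is left out because it would need a third-party parser.

```python
from dataclasses import asdict, dataclass, field
from pathlib import Path

# Proposed location, independent of the current working directory.
CONFIG_DIR = Path.home() / ".basedosdados"
CONFIG_FILE = CONFIG_DIR / "config.yaml"


@dataclass
class User:
    name: str = ""
    email: str = ""
    website: str = ""
    photo: str = ""


@dataclass
class Config:
    metadata_path: str = ""
    bucket_name: str = ""
    templates_path: str = ""
    user: User = field(default_factory=User)


# asdict() yields a nested dict with the same keys as the config.yaml outline.
default_config = asdict(Config())
```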

Create quick commands in the CLI

Add simple tables:

basedosdados quick add_table DATASET_ID TABLE_ID FILEPATH[FOLDERPATH] --publish

Add new data to storage and update the table:

basedosdados quick add_table_data DATASET_ID TABLE_ID FILEPATH
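
The two proposed commands could be parsed roughly as follows. This sketch uses the standard library's `argparse` purely for illustration; the framework the actual CLI uses is not stated here, and the argument names are assumptions derived from the command lines above.

```python
import argparse


def build_parser():
    """Parser for the proposed `quick` subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="basedosdados")
    commands = parser.add_subparsers(dest="command", required=True)

    quick = commands.add_parser("quick", help="shortcut commands")
    quick_commands = quick.add_subparsers(dest="quick_command", required=True)

    # basedosdados quick add_table DATASET_ID TABLE_ID FILEPATH[FOLDERPATH] --publish
    add_table = quick_commands.add_parser("add_table")
    add_table.add_argument("dataset_id")
    add_table.add_argument("table_id")
    add_table.add_argument("path", help="file or folder to upload")
    add_table.add_argument("--publish", action="store_true")

    # basedosdados quick add_table_data DATASET_ID TABLE_ID FILEPATH
    add_data = quick_commands.add_parser("add_table_data")
    add_data.add_argument("dataset_id")
    add_data.add_argument("table_id")
    add_data.add_argument("filepath")

    return parser
```

Calling `build_parser().parse_args([...])` with the words of either command line returns a namespace whose `quick_command` attribute tells the caller which action to run.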
