Ronin Finances BI

Gluing together Python, Postgres, and Metabase.

Setup

Don't forget to delete postgres-data/ if you want to wipe everything and start fresh.

Questions

  • Transform data from multiple sources into a single source

  • Idempotent ingestion of data from sources

    Fetch by month:

    1. Drop all data from that month from that source
    2. Re-import data
    3. Re-clean/transform data

  • Data model: what should the raw[_umcu]_transaction -> transaction flow look like, and what is the end-goal data model?

Transaction:
  id: int
  account: str (or id)
  person: str (or id)
  amount: num
  new_balance: num
  date: datetime
  description:  # str
    second_party: str
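
The fetch-by-month steps listed under Questions could be sketched roughly as below. This is a minimal illustration, not the project's implementation: `sqlite3` stands in for Postgres so the sketch runs without a server, and the `raw_transaction` columns, the `reimport_month` helper, and the `"umcu"` source label are all assumptions made for the example.

```python
import sqlite3
from datetime import date


def month_bounds(year: int, month: int) -> tuple[str, str]:
    """Return ISO dates for the first day of the month and of the following month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


def reimport_month(conn: sqlite3.Connection, source: str, year: int, month: int, rows) -> None:
    """Replace one source's rows for one month inside a single DB transaction.

    rows: iterable of (date_iso, description, amount) tuples, already fetched.
    """
    start, end = month_bounds(year, month)
    with conn:  # commits on success, rolls back if anything raises
        # 1. Drop all data from that month from that source
        conn.execute(
            "DELETE FROM raw_transaction WHERE source = ? AND date >= ? AND date < ?",
            (source, start, end),
        )
        # 2. Re-import data
        conn.executemany(
            "INSERT INTO raw_transaction (source, date, description, amount) "
            "VALUES (?, ?, ?, ?)",
            [(source, d, desc, amt) for d, desc, amt in rows],
        )
        # 3. Re-clean/transform would run here, over the same month window.


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_transaction ("
    "id INTEGER PRIMARY KEY, source TEXT, date TEXT, description TEXT, amount REAL)"
)
rows = [("2024-05-06", "Purchase TST* QUARANTINOS", -37.48)]
reimport_month(conn, "umcu", 2024, 5, rows)
reimport_month(conn, "umcu", 2024, 5, rows)  # second run replaces, it does not duplicate
```

Doing the delete and the insert in one transaction means a failed import leaves the previous month's data in place rather than an empty month.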
    

Plan

0. Workflow or Orchestrate

Use Apache Airflow to schedule these as discrete steps, or just run them manually for now.

1. Scrape Transactions

Will need custom scrapers for each source. End result will either be in-memory data, JSON, or CSV.

2. Unify Transactions

Translate data into a common format. The data model specification still needs to be determined.

# Transaction Sources
Id, SourceName, AccountName, Account#, AccountType
# Transactions
Id, SourceId, Date, Description, Amount, Balance, Note, Raw_Description, CategoryId
# Category
Id, ParentCategoryId, Name
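
One way to express the three tables above in Python is as dataclasses. This is a sketch under assumptions: the field types (for example `Decimal` for money), the `unify` helper, and the keys of the raw input dict are hypothetical, since the data model is still undecided.

```python
import datetime as dt
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class TransactionSource:
    id: int
    source_name: str
    account_name: str
    account_number: str
    account_type: str


@dataclass(frozen=True)
class Category:
    id: int
    name: str
    parent_category_id: Optional[int] = None  # None for a top-level category


@dataclass
class Transaction:
    id: int
    source_id: int
    date: dt.date
    description: str
    amount: Decimal
    balance: Decimal
    raw_description: str
    note: str = ""
    category_id: Optional[int] = None  # unset until the categorize step runs


def unify(raw: dict, source: TransactionSource, transaction_id: int) -> Transaction:
    """Map one scraped row (hypothetical keys) onto the common format."""
    return Transaction(
        id=transaction_id,
        source_id=source.id,
        date=dt.date.fromisoformat(raw["date"]),
        description=raw["description"],  # cleaned later, in step 2a
        amount=Decimal(raw["amount"]),
        balance=Decimal(raw["balance"]),
        raw_description=raw["description"],  # kept verbatim for auditing
    )


# Hypothetical source and raw row, for illustration only.
source = TransactionSource(1, "umcu", "Checking", "4321", "checking")
tx = unify(
    {"date": "2024-05-06", "description": "Purchase TST* QUARANTINOS",
     "amount": "-37.48", "balance": "1200.00"},
    source,
    transaction_id=1,
)
```

Parsing amounts as `Decimal` from strings avoids the rounding drift that binary floats introduce when balances are summed.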

2a. Clean Transactions

If the input data has extra jargon, try to remove the noise here.

    For Example: "Purchase TST* QUARANTINOS 555-555-5555 MI Date 05/06/24 12345678901234567890123 5555 %% Card 52 #4321 MEMO Balance Change -$37.48"...
    ...Should be reduced to something like "Purchase TST* QUARANTINOS 555-555-5555 MI"
    
    * Date is already tracked in `"Date"` column
    * Long numbers are not relevant (What are they?)
    * Card Info should not be duplicated on each transaction.
      * This info could be stored more efficiently in a separate account/card info table.
    * Balance Change is already tracked in `"Amount"` column
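
The reduction described above might be done with a small list of regular expressions, one per kind of noise. The patterns below are assumptions derived only from the single example line; real statements from each source would need their own patterns, and `clean_description` is a hypothetical helper name.

```python
import re

# Each pattern targets one kind of noise seen in the example description.
NOISE_PATTERNS = [
    re.compile(r"\bDate \d{2}/\d{2}/\d{2}\b"),   # date, already in the "Date" column
    re.compile(r"\b\d{15,}(?:\s+\d+)?"),         # long reference number and its trailing code
    re.compile(r"%%\s*Card \d+ #\d+"),           # card info, belongs in an account/card table
    re.compile(r"\bMEMO Balance Change -?\$[\d,]+(?:\.\d{2})?"),  # already in "Amount"
]


def clean_description(raw: str) -> str:
    """Strip known noise fragments and collapse the leftover whitespace."""
    cleaned = raw
    for pattern in NOISE_PATTERNS:
        cleaned = pattern.sub(" ", cleaned)
    return " ".join(cleaned.split())


raw = (
    "Purchase TST* QUARANTINOS 555-555-5555 MI Date 05/06/24 "
    "12345678901234567890123 5555 %% Card 52 #4321 MEMO Balance Change -$37.48"
)
print(clean_description(raw))
# Purchase TST* QUARANTINOS 555-555-5555 MI
```

Keeping the original string in a `Raw_Description` column means the patterns can be tightened later and the cleaning re-run without re-scraping.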
    

3. Store Transactions

Save the cleaned data to a store.

4. Categorize Transactions

Manually, with ML, or rules-based (regex, keywords, etc.)?

# Category Rules
Id, RuleJsonBlob
// RuleJsonBlob JSONSchema
{
  "type": "object",
  "properties": {
    "Name": {"type": "string"},
    "Description": {"type": "string"},
    "CategoryId": {"type": "integer"},
    "Rule": {
      "type": "object",
      "properties": {
        "Type": {"type": "string", "enum": ["regex", "keyword"]},
        "Value": {"type": "string"}
      }
    }
  }
}
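
If the rules-based route is chosen, rule blobs of the shape above could be applied along these lines. This is a sketch only: the example rules, the category ids, the first-match-wins ordering, and the case-insensitive matching are all assumptions, not decisions recorded in this plan.

```python
import re
from typing import Optional

# Hypothetical rule blobs that follow the RuleJsonBlob shape.
RULES = [
    {
        "Name": "Restaurants",
        "Description": "Point-of-sale purchases at restaurants",
        "CategoryId": 10,
        "Rule": {"Type": "keyword", "Value": "quarantinos"},
    },
    {
        "Name": "Payroll",
        "Description": "Salary deposits",
        "CategoryId": 20,
        "Rule": {"Type": "regex", "Value": r"\bpayroll\b|\bdirect dep(osit)?\b"},
    },
]


def matches(rule: dict, description: str) -> bool:
    """Evaluate a single Rule object against a transaction description."""
    kind, value = rule["Type"], rule["Value"]
    if kind == "keyword":
        return value.lower() in description.lower()
    if kind == "regex":
        return re.search(value, description, flags=re.IGNORECASE) is not None
    raise ValueError(f"Unknown rule type: {kind!r}")


def categorize(description: str, rules: list) -> Optional[int]:
    """Return the CategoryId of the first matching rule, or None if none match."""
    for blob in rules:
        if matches(blob["Rule"], description):
            return blob["CategoryId"]
    return None


print(categorize("Purchase TST* QUARANTINOS 555-555-5555 MI", RULES))
# 10
```

Transactions that come back as `None` could be left for manual categorization, which keeps the manual and rules-based options compatible.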

5. Analyze Transactions

Metabase, Apache Superset, or others.
