MCSV

CSV-library in C++17 which is not a Microsoft compiler. Highlights:

  • Pandas-like manipulation
  • Extraction as Eigen::Array or Eigen::Matrix
  • Header-only

Usage

A CSV-library meant to allow easy dataframe-manipulation, inspired by the Python-library pandas.

Loading and viewing the data

Simply load a csv-file by passing its path as std::string or std::filesystem::path to the utility-function mcsv::read_csv. The resulting dataframe can then be printed, e.g. with std::cout.

auto df1 = mcsv::read_csv("test.csv");

std::cout << df1 << std::endl;

Output:

col1 col2 col3 col4
1    2    3    4
10   20   30   40
100  200  300  400

If the number of columns of the csv-file is known at compile-time, we can pass it as a template-parameter to enable additional compile-time checks. The constructor of the dataframe checks if the number is correct and throws an exception otherwise.

auto df1 = mcsv::read_csv<4>("test.csv");
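The check described above can be pictured roughly as follows. This is only a sketch of the idea, not mcsv's implementation: the function check_column_count, the sentinel dynamic_columns and the choice of std::runtime_error are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical sentinel meaning "column count not fixed at compile time".
constexpr std::size_t dynamic_columns = 0;

// Hypothetical check: compares the column count found in the file
// against the count given as template parameter and throws on mismatch.
template <std::size_t Expected = dynamic_columns>
void check_column_count(std::size_t actual)
{
    if (Expected != dynamic_columns && actual != Expected)
        throw std::runtime_error("expected " + std::to_string(Expected) +
                                 " columns, found " + std::to_string(actual));
}
```

With this sketch, check_column_count<4>(4) passes silently, check_column_count<4>(3) throws, and check_column_count<>(n) accepts any n.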

Some notes on the csv-file import:

  • All column headers must be unique
  • The number of columns is determined by the first row (the 'header')
  • If a row has fewer entries than the header, the row is filled with empty strings
  • If a row has more entries than the header, the row is simply cut at the end
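The padding and cutting rules can be illustrated with a small standalone sketch. The helpers split_line and fit_to_header are hypothetical and not part of mcsv, and the comma delimiter is an assumption.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line of the file at the delimiter.
std::vector<std::string> split_line(const std::string &line, char delim = ',')
{
    std::vector<std::string> cells;
    std::stringstream stream(line);
    std::string cell;
    while (std::getline(stream, cell, delim))
        cells.push_back(cell);
    return cells;
}

// Hypothetical helper: force a row to the width given by the header.
// resize() cuts rows that are too long and pads rows that are too
// short with empty strings.
std::vector<std::string> fit_to_header(std::vector<std::string> row, std::size_t n_cols)
{
    row.resize(n_cols);
    return row;
}
```

Under these assumptions, a row "1,2" fitted to a four-column header becomes { "1", "2", "", "" }, and a row "1,2,3,4,5,6" becomes { "1", "2", "3", "4" }.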

Filtering the data

There are several ways to filter rows and columns of the csv-file. The basic principle is the following: Each filter-operation returns a new dataframe-object. This works without copying the data, because all dataframes originating from the same file hold a std::shared_ptr to the actual data. The only thing these operations change is the information about which columns and rows are active.
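To make the principle concrete, here is a minimal sketch of such a non-copying view. The types Table and View and the member select_cols are hypothetical and only mimic the idea; mcsv's actual classes may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical view type: shares the table and only stores
// which rows and columns are currently active.
struct View
{
    std::shared_ptr<const Table> data; // shared between all views, never copied
    std::vector<std::size_t> rows;     // indices of the active rows
    std::vector<std::size_t> cols;     // indices of the active columns

    // Returns a new view restricted to the given columns.
    // Only the index list changes, the table itself is untouched.
    View select_cols(std::vector<std::size_t> keep) const
    {
        return View{data, rows, std::move(keep)};
    }

    // Access a cell by its position inside the view.
    const std::string &at(std::size_t r, std::size_t c) const
    {
        assert(r < rows.size() && c < cols.size());
        return (*data)[rows[r]][cols[c]];
    }
};
```

In this sketch, a view created with select_cols holds the same pointer as the view it was created from, so filtering costs only the copy of two small index vectors.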

  • Filter columns:
auto df2 = df1("col2","col4");

std::cout << df2 << std::endl;

Output:

col2 col4
2    4
20   40
200  400
  • Filter rows with comparison-operators:
auto df3 = (df2 > std::tuple(10,10));

std::cout << df3 << std::endl;

Output:

col2 col4
20   40
200  400
  • Combined filtering:
auto df4 = df3.select_rows( df3("col2") == std::tuple(20) );

std::cout << df4 << std::endl;

Output:

col2 col4
20   40
  • Filter with STL-containers:

Note: At the moment, this only works if the dataframe has exactly one column.

std::vector<int> vec = { 1,2,3,4,5,6,7,8,9,10 };
auto df5 = df1.select_rows( df1("col1").is_in(vec) );

std::cout << df5 << std::endl;

Output:

col1 col2 col3 col4
1    2    3    4
10   20   30   40
  • Use logical operators:

Note: So far, the logical operators only change the rows. The columns stay untouched.

auto df6 = df1.select_rows( df1("col2") < std::tuple(10) || df1("col3") > std::tuple(200) );

std::cout << df6 << std::endl;

Output:

col1 col2 col3 col4
1    2    3    4
100  200  300  400

Iterating through the data

The dataframe class provides easy-to-use iterables for range-based for-loops:

for( const auto &row : df1.row_iterable() )
    for( const auto &cell : df1.col_iterable(row) )
        do_something_with(cell);

Extracting data

  • Extract columns as std::vector. The result is given as a std::tuple< std::vector<T1>, ... >:
auto [col2, col4] = df1("col2","col4").cols_to_vectors<double, int>();
  • Extract rows as std::vector. The result is given as a std::vector< std::vector<T> >:
auto rows = df1( df1("col1") < std::tuple(100) ).rows_to_vectors<double>();

Extracting data to Eigen-Objects

  • If the header <Eigen/Dense> is found, export to Eigen::Matrix and Eigen::Array is enabled:
auto mat = df1.to_eigen_matrix<double>();
auto arr = df1.to_eigen_array<double>();

Extraction as fixed-size arrays is possible (df1.to_eigen_matrix<T,Row,Col>()), but then the number of rows and columns must be known at compile time.

Error handling

As many checks as possible are done at compile time. When filtering out columns, e.g. with df1("col2","col3"), the number of columns is stored as an integer template parameter, which can be used to ensure the validity of subsequent operations. What remains is handled by throwing exceptions at runtime.
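How a column count carried in the type can guard later operations is sketched below. Frame, select_columns and compatible are hypothetical names chosen for this illustration and are not mcsv's API.

```cpp
#include <cstddef>
#include <tuple>

// Hypothetical stand-in for a dataframe whose column count is part of its type.
template <std::size_t NCols>
struct Frame
{
    static constexpr std::size_t n_cols = NCols;
};

// Selecting K column names yields a Frame<K>; asking for more
// columns than exist is rejected by the compiler.
template <std::size_t NCols, typename... Names>
constexpr Frame<sizeof...(Names)> select_columns(const Frame<NCols> &, Names...)
{
    static_assert(sizeof...(Names) <= NCols, "cannot select more columns than exist");
    return {};
}

// A comparison against a tuple only compiles if the tuple
// has exactly one entry per column.
template <std::size_t NCols, typename... Ts>
constexpr bool compatible(const Frame<NCols> &, const std::tuple<Ts...> &)
{
    static_assert(sizeof...(Ts) == NCols, "tuple size must match the number of columns");
    return true;
}
```

With these definitions, selecting two columns from a Frame<4> gives a Frame<2>, and comparing that frame with a three-element tuple fails at compile time instead of at runtime.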
