rocketlaunchr / dataframe-go
DataFrames for Go: for statistics, machine learning, and data manipulation/exploration
License: Other
Hi, this looked really promising, but it seems it's no longer maintained. What do people use instead for pandas-style dataframes in Go?
Repo: rocketlaunchr/dataframe-go
The repo imports gotestyourself (indirectly) via its old path. This causes github.com/gotestyourself/gotestyourself and gotest.tools to coexist in this repo:
https://github.com/rocketlaunchr/dataframe-go/blob/master/go.mod (lines 20 & 40)
github.com/gotestyourself/gotestyourself v2.2.0+incompatible // indirect
gotest.tools v2.2.0+incompatible // indirect
That's because gotestyourself has already renamed its import path from "github.com/gotestyourself/gotestyourself" to "gotest.tools". When you use the old path to import gotestyourself, the new path is reintroduced through the "gotest.tools/..." import statements in gotestyourself's own source files.
https://github.com/gotestyourself/gotest.tools/blob/v2.2.0/fs/example_test.go#L8
package fs_test
import (
…
"gotest.tools/assert"
"gotest.tools/assert/cmp"
"gotest.tools/fs"
"gotest.tools/golden"
)
"github.com/gotestyourself/gotestyourself" and "gotest.tools" are the same repository. While this works in isolation, it brings potential risks and problems.
Add a replace statement in the go.mod file:
replace github.com/gotestyourself/gotestyourself => gotest.tools v2.2.0
Then clean up go.mod (e.g. with go mod tidy).
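After the replace and a tidy, the go.mod should carry only the new path. A sketch of the relevant portion (the exact surrounding require lines will differ):

```
// go.mod (sketch): old path redirected, only gotest.tools remains
replace github.com/gotestyourself/gotestyourself => gotest.tools v2.2.0

require (
	gotest.tools v2.2.0+incompatible // indirect
)
```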
go: github.com/sjwhitworth/[email protected] requires
github.com/rocketlaunchr/[email protected] requires
github.com/blend/[email protected]: reading github.com/blend/go-sdk/go.mod at revision v1.1.1: unknown revision v1.1.1
It looks like v1.1.1 of github.com/blend/go-sdk is missing. Are you seeing the same, or am I taking crazy pills today?
When importing a CSV in which a field has empty values in some rows, the import ignores the dictated data type and interprets the field as a string.
csvStr := `sometimes_empty,label
,"First"
2,"Second"
,"Third"
4,"Fourth"`
ctx := context.Background()
df, err := imports.LoadFromCSV(ctx, strings.NewReader(csvStr), imports.CSVLoadOptions{
DictateDataType: map[string]interface{}{
"sometimes_empty": int64(0),
"label": "",
},
})
fmt.Println(err)
fmt.Println(df)
This code produces the error:
can't force string: to int64. row: 0 field: sometimes_empty
How can I create a dataframe from a CSV with empty values that still follows the dictated type, with NaN for the empty values?
Expected output should be:
+-----+-----------------+--------+
| | SOMETIMES EMPTY | LABEL |
+-----+-----------------+--------+
| 0: | NaN | First |
| 1: | 2 | Second |
| 2: | NaN | Third |
| 3: | 4 | Fourth |
+-----+-----------------+--------+
| 4X2 | INT64 | STRING |
+-----+-----------------+--------+
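For comparison, here is a stdlib-only sketch (no dataframe-go involved) of the semantics I would expect: empty CSV fields in a numeric column become NaN instead of forcing the column to string. The parseColumn helper is hypothetical, not part of any library.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseColumn reads one CSV column as float64, mapping empty fields to NaN.
func parseColumn(csvStr, name string) ([]float64, error) {
	rows, err := csv.NewReader(strings.NewReader(csvStr)).ReadAll()
	if err != nil {
		return nil, err
	}
	col := -1
	for i, h := range rows[0] { // locate the column in the header row
		if h == name {
			col = i
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("column %q not found", name)
	}
	out := make([]float64, 0, len(rows)-1)
	for _, row := range rows[1:] {
		if row[col] == "" {
			out = append(out, math.NaN()) // empty field -> NaN, type preserved
			continue
		}
		v, err := strconv.ParseFloat(row[col], 64)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	csvStr := "sometimes_empty,label\n,First\n2,Second\n,Third\n4,Fourth\n"
	vals, err := parseColumn(csvStr, "sometimes_empty")
	if err != nil {
		panic(err)
	}
	fmt.Println(vals) // NaN at rows 0 and 2, numbers elsewhere
}
```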
For the ordered map:
When setting a value for an existing key, should we remove the old value and append the new one?
o.store[key] = val
o.keys = append(o.keys, key)
For the Delete function, if the key is not found, return immediately.
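A minimal sketch of those two semantics, assuming the ordered map is shaped like the store/keys pair above: Set on an existing key updates the value in place without re-appending the key (so insertion order is preserved), and Delete returns immediately when the key is absent.

```go
package main

import "fmt"

// OrderedMap keeps keys in insertion order alongside the backing map.
type OrderedMap struct {
	store map[string]interface{}
	keys  []string
}

func NewOrderedMap() *OrderedMap {
	return &OrderedMap{store: map[string]interface{}{}}
}

// Set updates in place for existing keys; only new keys extend the order.
func (o *OrderedMap) Set(key string, val interface{}) {
	if _, ok := o.store[key]; !ok {
		o.keys = append(o.keys, key)
	}
	o.store[key] = val
}

// Delete returns immediately when the key is not present.
func (o *OrderedMap) Delete(key string) {
	if _, ok := o.store[key]; !ok {
		return
	}
	delete(o.store, key)
	for i, k := range o.keys {
		if k == key {
			o.keys = append(o.keys[:i], o.keys[i+1:]...)
			break
		}
	}
}

func (o *OrderedMap) Keys() []string { return o.keys }

func main() {
	m := NewOrderedMap()
	m.Set("a", 1)
	m.Set("b", 2)
	m.Set("a", 3) // update: order stays [a b]
	fmt.Println(m.Keys(), m.store["a"])
}
```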
Hi, this is a very helpful package. I was wondering, are there plans to implement a TimeIndex for series/dataframes and a resample operation, just like pandas? It could be extremely useful. If no one is doing it, I am happy to contribute.
Hi! At the moment I have managed to plot a single dataframe column with this somewhat awkward method:
func main() {
ctx := context.Background()
// all values of df are strings representing floating-point numbers
df, err := imports.LoadFromCSV(ctx, r, imports.CSVLoadOptions{Comma: ';'})
if err != nil {
panic(err)
}
s := df.Series[2] // trying to plot column 2
series := dataframe.NewSeriesFloat64("test_name", nil)
i := s.ValuesIterator(dataframe.ValuesOptions{InitialRow: 0, Step: 1, DontReadLock: false})
for {
row, vals, _ := i()
if row == nil {
break
}
val, err := strconv.ParseFloat(vals.(string), 64)
if err != nil {
continue
}
series.Append(val)
}
Plot(series)
}
func Plot(ser *dataframe.SeriesFloat64) {
ctx := context.TODO()
cs, _ := wcharczuk_chart.S(ctx, ser, nil, nil)
graph := chart.Chart{
Title: "test_graph",
Width: 640,
Height: 480,
Series: []chart.Series{cs},
}
f, err := os.Create("graph.svg")
if err != nil {
panic(err)
}
defer f.Close()
plt := bufio.NewWriter(f)
_ = graph.Render(chart.SVG, plt)
}
Is there any simpler or more elegant way to do this? Also, can I plot several columns on one chart? If so, how? Thanks in advance.
I'm trying to concatenate two columns in a dataframe and put the result into a new column. The behavior is very inconsistent: sometimes the strings are concatenated into the new column, sometimes the value is just set to NaN.
In this run, the value for concat_contact_number in the resulting dataframe was correctly set to 97312345678. The map value for concat_contact_number also reflects the concatenated value.
Expected output:
$ go run main.go
INFO[0000] In applyConcatDf: vals[contact_number_country_code]: 973
INFO[0000] In applyConcatDf: vals[concat_contact_number]: 973
INFO[0000] In applyConcatDf: vals[contact_number]: 12345678
INFO[0000] In applyConcatDf: vals[concat_contact_number]: 97312345678
INFO[0000] In applyConcatDf: vals: map[0:973 1:12345678 2:<nil> concat_contact_number:97312345678 contact_number:12345678 contact_number_country_code:973]
INFO[0000] In prepareDataframe:
INFO[0000] +-----+-----------------------------+----------------+-----------------------+
| | CONTACT NUMBER COUNTRY CODE | CONTACT NUMBER | CONCAT CONTACT NUMBER |
+-----+-----------------------------+----------------+-----------------------+
| 0: | 973 | 12345678 | 97312345678 |
+-----+-----------------------------+----------------+-----------------------+
| 1X3 | STRING | STRING | STRING |
+-----+-----------------------------+----------------+-----------------------+
INFO[0000] In main:
INFO[0000] +-----+-----------------------------+----------------+-----------------------+
| | CONTACT NUMBER COUNTRY CODE | CONTACT NUMBER | CONCAT CONTACT NUMBER |
+-----+-----------------------------+----------------+-----------------------+
| 0: | 973 | 12345678 | 97312345678 |
+-----+-----------------------------+----------------+-----------------------+
| 1X3 | STRING | STRING | STRING |
+-----+-----------------------------+----------------+-----------------------+
In this run, the value for concat_contact_number in the resulting dataframe was incorrectly set to NaN. As in the correct run, the map value for concat_contact_number is set to the expected concatenated value.
Erroneous output:
$ go run main.go
INFO[0000] In applyConcatDf: vals[contact_number_country_code]: 973
INFO[0000] In applyConcatDf: vals[concat_contact_number]: 973
INFO[0000] In applyConcatDf: vals[contact_number]: 12345678
INFO[0000] In applyConcatDf: vals[concat_contact_number]: 97312345678
INFO[0000] In applyConcatDf: vals: map[0:973 1:12345678 2:<nil> concat_contact_number:97312345678 contact_number:12345678 contact_number_country_code:973]
INFO[0000] In prepareDataframe:
INFO[0000] +-----+-----------------------------+----------------+-----------------------+
| | CONTACT NUMBER COUNTRY CODE | CONTACT NUMBER | CONCAT CONTACT NUMBER |
+-----+-----------------------------+----------------+-----------------------+
| 0: | 973 | 12345678 | NaN |
+-----+-----------------------------+----------------+-----------------------+
| 1X3 | STRING | STRING | STRING |
+-----+-----------------------------+----------------+-----------------------+
INFO[0000] In main:
INFO[0000] +-----+-----------------------------+----------------+-----------------------+
| | CONTACT NUMBER COUNTRY CODE | CONTACT NUMBER | CONCAT CONTACT NUMBER |
+-----+-----------------------------+----------------+-----------------------+
| 0: | 973 | 12345678 | NaN |
+-----+-----------------------------+----------------+-----------------------+
| 1X3 | STRING | STRING | STRING |
+-----+-----------------------------+----------------+-----------------------+
It can be observed that in both cases the map value for key 2 is always <nil>. Is this expected?
Run this code several times to see deviations in the output. The issue may not show up immediately; sometimes it takes 10 runs, sometimes only 2. Again, the behavior is inconsistent.
Working code:
package main
import (
"context"
"fmt"
"strings"
dataframe "github.com/rocketlaunchr/dataframe-go"
"github.com/rocketlaunchr/dataframe-go/imports"
log "github.com/sirupsen/logrus"
)
// applyConcatDf returns an ApplyDataFrameFn that concatenates the given column names into another column
func applyConcatDf(dest_column string, columns []string) dataframe.ApplyDataFrameFn {
return func(vals map[interface{}]interface{}, row, nRows int) map[interface{}]interface{} {
vals[dest_column] = ""
for _, key := range columns {
log.Infof("vals[%s]: %s", key, vals[key].(string))
vals[dest_column] = vals[dest_column].(string) + vals[key].(string)
log.Infof("vals[%s]: %s", dest_column, vals[dest_column].(string))
}
log.Infof("vals: %v", vals)
return vals
}
}
// setupDataframe initializes the dataframe from a CSV string
func setupDataframe() *dataframe.DataFrame {
ctx := context.Background()
csvStr := `contact_number_country_code,contact_number
"973","12345678"`
df, _ := imports.LoadFromCSV(ctx, strings.NewReader(csvStr), imports.CSVLoadOptions{
DictateDataType: map[string]interface{}{
"contact_number_country_code": "",
"contact_number": "",
},
})
return df
}
// prepareDataframe applies the concatenation on the loaded dataframe
func prepareDataframe(df *dataframe.DataFrame) {
ctx := context.Background()
sConcatContactNumber := dataframe.NewSeriesString("concat_contact_number", &dataframe.SeriesInit{Size: df.NRows()})
df.AddSeries(sConcatContactNumber, nil)
_, err := dataframe.Apply(ctx, df, applyConcatDf("concat_contact_number", []string{"contact_number_country_code", "contact_number"}), dataframe.FilterOptions{InPlace: true})
if err != nil {
log.WithError(err).Error("concatenation cannot be applied")
}
fmt.Println(df)
}
func main() {
df := setupDataframe()
prepareDataframe(df)
fmt.Println(df)
}
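One possible (unconfirmed) explanation for the inconsistency: the logged vals map carries both the integer index 2 and the name concat_contact_number as keys for the same column, with different values (<nil> vs the concatenated string). If the library applies the returned map by ranging over it, Go's randomized map iteration order would decide which of the two wins on any given run. A stdlib demonstration of that nondeterminism:

```go
package main

import "fmt"

func main() {
	// Two keys standing in for the same column: the integer index
	// (still nil) and the column name (the concatenated value).
	vals := map[interface{}]interface{}{
		2:                       nil,
		"concat_contact_number": "97312345678",
	}
	var last interface{}
	for _, v := range vals { // iteration order is randomized per run
		last = v
	}
	// "last" is sometimes nil, sometimes the string, across runs.
	fmt.Println("value applied last:", last)
}
```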
How do I remove duplicate rows in a DataFrame? Like pandas' drop_duplicates().
Go modules make it easier to work on multiple forked packages at the same time. I suppose a simple go mod init github.com/rocketlaunchr/dataframe-go would be sufficient.
If you accept the proposal, I can submit a PR later.
Is this a fork of gota?
Licenses are like code: tricky to the point of being analogous to cryptographic algorithms. Writing a custom one often creates unintended consequences. For instance, as currently worded, even with the best of intentions, a web developer might deploy the package believing themselves to be in full compliance with the license, then later learn that a site user is non-compliant; this new knowledge causes the developer to become immediately non-compliant.
Other than a return to a standard MIT license or one of the others listed at https://pkg.go.dev/license-policy, I don't have any good suggestions for what to do about this; I don't know of any standard open-source licenses that satisfy the full intent of the current license. That doesn't mean there aren't any -- the JSON license, for instance, is on the pkg.go.dev list, and is an MIT derivative that tries to do the right thing. Here's a related conversation on a stack exchange site that covers some of the issues in more detail: https://softwareengineering.stackexchange.com/questions/199055/open-source-licenses-that-explicitly-prohibit-military-applications
I would like to know if I can read a CSV file in batches. If it is possible, how can I do it? I looked through the docs and examples and could not find the required information. Any help would be appreciated.
Thank you.
Thanks for creating this library!
I can get this code to work:
ctx := context.TODO()
// step 1: open the csv
csvfile, err := os.Open("data/example.csv")
if err != nil {
log.Fatal(err)
}
dataframe, err := imports.LoadFromCSV(ctx, csvfile)
Here's the data that's printed:
fmt.Print(dataframe.Table())
+-----+------------+-----------------+
| | FIRST NAME | FAVORITE NUMBER |
+-----+------------+-----------------+
| 0: | matthew | 23 |
| 1: | daniel | 8 |
| 2: | allison | 42 |
| 3: | david | 18 |
+-----+------------+-----------------+
| 4X2 | STRING | STRING |
+-----+------------+-----------------+
I cannot get this code working:
s := dataframe.Series[2]
applyFn := dataframe.ApplySeriesFn(func(val interface{}, row, nRows int) interface{} {
return 2 * val.(int64)
})
dataframe.Apply(ctx, s, applyFn, dataframe.FilterOptions{InPlace: true})
fmt.Print(dataframe.Table())
Here's the error message:
./dataframe_go.go:36:22: dataframe.ApplySeriesFn undefined (type *dataframe.DataFrame has no field or method ApplySeriesFn)
./dataframe_go.go:40:11: dataframe.Apply undefined (type *dataframe.DataFrame has no field or method Apply)
./dataframe_go.go:40:44: dataframe.FilterOptions undefined (type *dataframe.DataFrame has no field or method FilterOptions)
Here's the code: https://github.com/MrPowers/go-dataframe-examples/blob/master/dataframe_go.go
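A likely cause, inferred from the error messages rather than confirmed against the repo: the local variable named dataframe (assigned from LoadFromCSV) shadows the imported dataframe package, so dataframe.Apply resolves against the *dataframe.DataFrame value instead of the package. A stdlib illustration of the same pitfall, with the usual fix of renaming the variable:

```go
package main

import "fmt"

// describe shows the fix: keep variable names distinct from package names.
// If the variable below were named `fmt`, the Sprintf call would fail to
// compile with "fmt.Sprintf undefined (type string has no field or method
// Sprintf)" -- the same shape of error as in the issue above.
func describe() string {
	df := "df" // distinct name keeps the fmt package visible
	return fmt.Sprintf("renamed variable: %s", df)
}

func main() {
	fmt.Println(describe())
}
```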
Sorry if this is a basic question. I am a Go newbie!
Thanks again for making this library!
Hi,
I have a CSV file with four fields (USERID, MOVIEID, RATING, TIMESTAMP). LoadFromCSV loads all fields as string by default. I want them loaded as float64, so I created a CSVLoadOptions:
var csvOp imports.CSVLoadOptions
csvOp.DictateDataType = make(map[string]interface{})
csvOp.DictateDataType["USERID"] = float64(0)
csvOp.DictateDataType["MOVIEID"] = float64(0)
csvOp.DictateDataType["RATING"] = float64(0)
csvOp.DictateDataType["TIMESTAMP"] = float64(0)
ratingDf, err := imports.LoadFromCSV(ctx, file, csvOp)
but I get a load error and I don't know why. Is my use of CSVLoadOptions incorrect?
Hi, I am new to Go. Can someone please guide me on how to import this package locally?
The error is: exception recovered: reflect.StructOf: field 0 has invalid name
at ompluscator/[email protected]/builder.go:192
export.csv
dataframe = pandas.DataFrame({
"A": ["a", "b", "c", "d"],
"B": [2, 3, 4, 1],
"C": [10, 20, None, None]
})
dataframe.to_parquet("1.parquet")
func main() {
ctx := context.Background()
fr, _ := local.NewLocalFileReader("1.parquet")
df, err := imports.LoadFromParquet(ctx, fr)
if err != nil {
panic(err)
}
fmt.Println(df)
}
panic: names of series must be unique:
goroutine 1 [running]:
github.com/rocketlaunchr/dataframe-go.NewDataFrame({0xc0001f8000, 0x3, 0xc000149a10?})
.../rocketlaunchr/[email protected]/dataframe.go:41 +0x33c
github.com/rocketlaunchr/dataframe-go/imports.LoadFromParquet({0x1497868, 0xc000020080}, {0x1498150?, 0xc00000e798?}, {0xc0000021a0?, 0xc000149f70?, 0x1007599?})
.../go/pkg/mod/github.com/rocketlaunchr/[email protected]/imports/parquet.go:110 +0x8ae
main.main()
.../main.go:13 +0x78
imports.LoadFromParquet fails when names are empty; goName didn't have one, which may be why it can't find a name in the map. This is the first time I've used Go to read parquet files. Is this an error caused by parquet-go breaking changes, or something else?
Problem getting the package:
$ go get -u github.com/rocketlaunchr/dataframe-go
go get: github.com/rocketlaunchr/dataframe-go@none updating to
github.com/rocketlaunchr/[email protected] requires
github.com/blend/[email protected]: reading github.com/blend/go-sdk/go.mod at revision v1.1.1: unknown revision v1.1.1
Hello. Is there a simple way to join two dataframes of the same dimensions? Something like df = append(df, anotherDf) or similar.
|   | Name1 | Name2 |
|---|-------|-------|
| 0 | D     | E     |
| 1 | F     | G     |

and

|   | Name1 | Name2 |
|---|-------|-------|
| 2 | D     | E     |
| 3 | F     | G     |

=

|   | Name1 | Name2 |
|---|-------|-------|
| 0 | D     | E     |
| 1 | F     | G     |
| 2 | D     | E     |
| 3 | F     | G     |
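A stdlib sketch of the desired append semantics (appendFrames is hypothetical; it simply concatenates the two row sets, with identical columns assumed, and lets the row index continue):

```go
package main

import "fmt"

// appendFrames concatenates two row sets that share the same columns.
func appendFrames(a, b [][]string) [][]string {
	out := make([][]string, 0, len(a)+len(b))
	out = append(out, a...)
	out = append(out, b...)
	return out
}

func main() {
	a := [][]string{{"D", "E"}, {"F", "G"}}
	b := [][]string{{"D", "E"}, {"F", "G"}}
	// Row indices simply continue across the appended frame.
	for i, row := range appendFrames(a, b) {
		fmt.Println(i, row)
	}
}
```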
Hello,
Are there any plans to support reading a Parquet file into a dataframe? I have a need for this and am evaluating this library to use in an application.
Thanks!
The README says, "Once Go 1.18 (Generics) is introduced, the ENTIRE package will be rewritten." As Go 1.18 has been released for a while, I'm wondering whether work has started on the rewrite. If so, how is the progress?
Greetings!
Just a minor suggestion, but if you have the time, it could be useful to expand the docs a bit more to cover some additional common operations applied to dataframe-like structures, where supported.
For example:
Further, one other thing I noticed when using the package for the first time is that many of the dataframe.xx() function calls take nil as the first argument. From looking at the code in dataframe.go, these appear to relate to an optional Options struct, so it makes sense that it would be set to nil in many instances. It may be worth mentioning this explicitly in the examples for .Append() in the docs.
Finally two other things that could be useful to consider including in the docs:
Thanks for taking the time to put together and share this really useful package!
- container/list package for float64 data instead of an []float64 (put in xseries pkg): https://github.com/huandu/skiplist
- V(args ...interface{})
- dot
- Headings option
- fn := funcs.RegFunc("sin(2*𝜋*x/24)"); funcs.Evaluate(ctx, df, fn, 1)
I copied the following code from the README, but the chart has no values on the X and Y axes.
import (
chart "github.com/wcharczuk/go-chart"
"github.com/rocketlaunchr/dataframe-go/plot"
wc "github.com/rocketlaunchr/dataframe-go/plot/wcharczuk/go-chart"
)
sales := dataframe.NewSeriesFloat64("sales", nil, 50.3, nil, 23.4, 56.2, 89, 32, 84.2, 72, 89)
cs, _ := wc.S(ctx, sales, nil, nil)
graph := chart.Chart{Series: []chart.Series{cs}}
plt, _ := plot.Open("Monthly sales", 450, 300)
graph.Render(chart.SVG, plt)
plt.Display()
<-plt.Closed
Hi, I got an error importing a CSV and need help. The code is from README.md:
package main
import (
"context"
"fmt"
"strings"
"github.com/rocketlaunchr/dataframe-go/imports"
)
var ()
func main() {
csvStr := `
Country,Date,Age,Amount,Id
"United States",2012-02-01,50,112.1,01234
"United States",2012-02-01,32,321.31,54320
"United Kingdom",2012-02-01,17,18.2,12345
"United States",2012-02-01,32,321.31,54320
"United Kingdom",2012-05-07,NA,18.2,12345
"United States",2012-02-01,32,321.31,54320
"United States",2012-02-01,32,321.31,54320
Spain,2012-02-01,66,555.42,00241
`
fmt.Println(csvStr)
ctx := context.Background()
df, err := imports.LoadFromCSV(ctx, strings.NewReader(csvStr))
fmt.Println(df)
fmt.Println(err)
}
Here is the error when I run it:
$ go run main.go
# github.com/xitongsys/parquet-go/parquet
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:631:16: not enough arguments in call to iprot.ReadStructBegin
have ()
want (context.Context)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:637:37: not enough arguments in call to iprot.ReadFieldBegin
have ()
want (context.Context)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:649:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:659:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:669:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:679:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:689:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:699:30: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:704:28: not enough arguments in call to iprot.Skip
have (thrift.TType)
want (context.Context, thrift.TType)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:708:15: not enough arguments in call to iprot.ReadFieldEnd
have ()
want (context.Context)
C:\Users\hellogo\go\pkg\mod\github.com\xitongsys\[email protected]\parquet\parquet.go:708:15: too many errors
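These errors usually mean the resolved github.com/apache/thrift version is newer than what the pinned parquet-go release compiles against: thrift v0.14.0 added context.Context parameters to the protocol methods (ReadStructBegin, Skip, etc.), which matches the "want (context.Context, ...)" messages above. One possible workaround (an assumption, untested here) is to pin thrift to the last pre-context release in go.mod:

```
// go.mod (sketch): pin thrift to the last pre-context API
replace github.com/apache/thrift => github.com/apache/thrift v0.13.0
```

Alternatively, upgrading parquet-go to a release built against the newer thrift API should also resolve the mismatch.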
When I run the example below to read a df from a dummy string, I get these errors:
➜ learn-go git:(main) ✗ go run dev.go
# command-line-arguments
./dev.go:21:13: undefined: dataframe.LoadFromCSV
./dev.go:21:65: undefined: dataframe.CSVLoadOptions
package main
import (
"context"
"fmt"
"strings"
imports "github.com/rocketlaunchr/dataframe-go"
)
func main() {
csvStr := `colA,colB
1,"First"
2,"Second"
3,"Third"
4,"Fourth"`
ctx := context.Background()
df, err := imports.LoadFromCSV(ctx, strings.NewReader(csvStr), imports.CSVLoadOptions{
DictateDataType: map[string]interface{}{
"colA": int64(0),
"colB": "",
},
})
fmt.Println(err)
fmt.Println(df)
}
I'm not really sure what I'm doing wrong here. I was trying to follow this sort of approach: https://github.com/rocketlaunchr/dataframe-go/blob/master/imports/infer_test.go
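For what it's worth, the other snippets in this thread suggest LoadFromCSV and CSVLoadOptions live in the imports subpackage, not in the root dataframe-go package, which would explain the "undefined" errors. The import block would then need to be (a sketch, not verified against your module setup):

```go
import (
	"context"
	"fmt"
	"strings"

	"github.com/rocketlaunchr/dataframe-go/imports"
)
```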
Here is the output of go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/andre/.cache/go-build"
GOENV="/home/andre/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/andre/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/andre/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build868960458=/tmp/go-build -gno-record-gcc-switches"
P.S. I'm new to Go, so it could well be me doing something silly :)
I suspect that the library maintainers prepended "legacy-" to versions before changing the versioning scheme. At the least, this dependency should be updated to legacy-v1.1.1.
Hi there,
I hope someone can help me: how can I achieve a multi-index similar to https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html ?
Any help is much appreciated.
Regards,
Julio
content, err := ioutil.ReadFile(parsedDirectory, file.Name())
if err != nil {
fmt.Println(err)
return
}
df, err := imports.LoadFromCSV(ctx, bytes.NewReader(content))
var writers io.Writer
jsonErr := exports.ExportToJSON(ctx, writers, df)
if jsonErr != nil {
fmt.Println("Json export error")
}
The above script throws the error below:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4e9f01]
goroutine 1 [running]:
encoding/json.(*Encoder).Encode(0xc000045e00, 0x5b83c0, 0xc0000601e0, 0x0, 0xc0002f22e8)
/usr/local/go/src/encoding/json/stream.go:231 +0x1b1
github.com/rocketlaunchr/dataframe-go/exports.ExportToJSON(0x62a220, 0xc000016080, 0x0, 0x0, 0xc00001a080, 0x0, 0x0, 0x0, 0x0, 0x0)
/go/pkg/mod/github.com/rocketlaunchr/[email protected]/exports/jsonl.go:83 +0x395
Examples with ctx arguments don't work. Also, I wanted to know how to get data directly from SQL into a dataframe and serve it as an API. I read your blog too, but it had exactly the same examples.
This is a great package. thanks for working so hard.
To get more feature parity with Pandas, integrate with https://github.com/nfx/go-htmltable.
sks := []dataframe.SortKey{
{Key: "sales", Desc: true},
{Key: "day", Desc: true},
}
df.Sort(ctx, sks)
In this code from the README, ctx is not defined anywhere beforehand. I gather it comes from the context package, but I think we need to initialize ctx before calling df.Sort(ctx, sks). Kindly guide me. Thanks in advance.
I want to use it, but I found a problem: you made the values of SeriesInt64 private. Why?
Would you tell me how to convert a dataframe to a gonum dense matrix? And for LoadFromCSV(ctx, strings.NewReader(csvStr)), which ctx should I use, and how do I define the context.Context?
The library is very useful and goes some way toward imitating Python's pandas library. It would be very beneficial to provide a NewSeriesArray() method to help create a proper parquet file.
Can someone give me an example of how to export a dataframe to a parquet file (I couldn't find one anywhere)? I have no idea how to define the writer inside ExportToParquet. Given a dataframe df, I have this code inside main:
ctx := context.Background()
w, err := os.Create("output.parquet")
exports.ExportToParquet(ctx, writer.NewParquetFromWriter(w, df, 4))
Thanks again!
I wanted to know if there is a way to convert a Series interface back to the concrete series type (Float64/Int64/Mixed) underneath it.
I will describe my use case. After creating a dataframe, I am trying to use gonum to do some analysis, e.g. linear regression on two series from the dataframe. For this I have to iterate over the whole series (using ValuesIterator) to copy each element into a []float64, which is what gonum requires.
ToSeriesFloat64 does not help since it is not implemented by Series.
Is there an easier way to get the whole underlying series as the corresponding concrete slice?
files, err := ioutil.ReadFile("device.json")
if err != nil {
fmt.Println(err)
}
var ctx = context.Background()
df2, _ := imports.LoadFromJSON(ctx, strings.NewReader(string(files)))
fmt.Println(df2.Table())