Giter Site home page Giter Site logo

mds's Introduction

mds

GoDoc CI

This repository defines generic data structures in Go.

Data Structures

Utilities

mds's People

Contributors

creachadair avatar danderson avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

mds's Issues

mdiff: panic (slice bounds out of range) when using Unify()

I was using mdiff to compute a diff of a big file, and hit a panic. It seems to only happen when using Unify(), or possibly Unify() just alters the state to expose the crash, not sure.

Reproduction recipe:

git clone https://gist.github.com/107d080ac77504da650db60afa234c1a.git repro
cd repro
go run .

Result:

panic: runtime error: slice bounds out of range [:-1]

goroutine 1 [running]:
github.com/creachadair/mds/mdiff.UnifyChunks({0xc000120488, 0x69, 0x8f})
	/home/dave/go/pkg/mod/github.com/creachadair/[email protected]/mdiff/mdiff.go:254 +0x67c
github.com/creachadair/mds/mdiff.(*Diff).Unify(...)
	/home/dave/go/pkg/mod/github.com/creachadair/[email protected]/mdiff/mdiff.go:176
main.main()
	/home/dave/hack/psl/tools/fuck/107d080ac77504da650db60afa234c1a/main.go:23 +0x16a
exit status 2

Unfortunately the diff inputs are large, so definitely not a minimal repro. But hopefully is enough to pinpoint the source. I can also go digging, but I don't know mdiff at all and I'm in a hurry right now so thought I'd at least write it down for later :)

Should oset.NewFunc deduplicate equivalent values?

I reached for oset because I wanted a sort+dedup of some slices, and oset looked comparable to python's set(). This bit me, because oset specifically does not define equivalent elements as indistinguishable. That is, I incorrectly assumed the following:

var a, b SomeFancyType = mkFancy(), mkFancy()

// The values are distinct in some way, but their comparison function deems them equivalent.
Assert(a.Compare(b) == 0)

os := oset.NewFunc(SomeFancyType.Compare, a, b)

Assert(os.Len() == 1) // Oops, boom, os.Len() == 2

I'm not sure if this is a bug or not. oset does not say that it deduplicates anything. However, I think that also makes the type not a set, by the strict mathematical definition? You can add 3 to an oset twice, and get both back out when iterating. That makes it... something else, possibly a non-decreasing subsequence, which could probably be abbreviated to oseq.

I'm not sure what to do about this. I started writing a PR to document this behavior, but then started wondering if this going against the assumptions Go programmers have about sets and equivalent elements is intentional, whether it should be changed vs. merely spelled out more explicitly, and then a whole existential crisis about whether it's even a set at all if it allows duplicate members. So, opening this issue to spread the love, I suppose ๐Ÿ˜‚

Options I can think of:

  • Document that oset departs from the de-facto expectation of how sets behave in Go, and that it is strictly speaking not a set
  • Rename package to something else, e.g. oseq, to more precisely capture its "sorted list with more efficient insert than slices"-ish behavior.
  • Change oset to store only distinct elements, with Compare(a, b) != 0 being the definition of distinct.

Genuinely unsure which to pick, I think I ever so slightly lean to changing the semantics to be a proper set, possibly alongside making a separate package for the ordered sequence semantics? WDYT?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.