Comments (31)
I did a web search and it seems except for Scala, the name slice
implies resource sharing. So, here's another possible naming strategy:
- deprecate
slice(start: Int, end: Int)
and replace withsubSequence
orsubseq
(which do people prefer?) - Introduce a
slice(range: Range)
on those collections that can support it. These would include
mutable.Seq
andIndexedView
. Non-view immutable collections should probably not support
aslice
method. You should doc.view.slice
for clarity.
The result of slice
would be collection.Slice
and collection.mutable.Slice
respectively. These types support some subset of all collection operations (essentially those operations that
maintain the element type and do not require re-indexing).
from collection-strawman.
Ah, yes, I did misunderstand. I agree that views should be maximally flexible in that regard; unlike iterators which should promise to be as lazy as possible, views can do whatever is "best" with regards to evaluation of sequential operations.
And I don't think evaluation order should be the same save on Seqs and SortedWhatevers where there is already a natural evaluation order.
But I think the "niche, 0.1% use case" evaluation is only because we're making it so by providing no support for this way of working. Views are a key aspect of, for instance, image processing operations in ImageJ; the conceptual equivalent of views with indexed ranges are used all over the place both in Java (e.g. java.util.Arrays.sort(xs, i0, iN)
) and elsewhere (e.g. it's exceedingly common in Matlab).
One of the weaknesses of the current collections library is that it has relatively poor support for mutating operations. We needn't necessarily improve this greatly (people are used to it not being there), but it doesn't mean that the potential use cases are rare.
For instance, if one has a list of files and one wants to replace all the missing ones by default file names, you would normally do this with immutable collections:
xs.map(f => if (f.exists) f else default)
But although you could reassign everything in place mutably:
xs.mapInPlace(f => if (f.exists) f else default)
in some ways it's more natural to define a filtered view and fill it:
xs.viewWhere(! _.exists).fill(default)
Again, I'm not saying it's really important for us to provide this kind of functionality, but I don't think we should view this as for niche applications even if we have immutable alternatives as the standard now. (Except inasmuch as anything mutable is a niche application.)
from collection-strawman.
I think that mutable views would just be a completely different thing than immutable views, so I don't think they'd impact the hierarchy. We could call them something else also.
from collection-strawman.
For reference, here is an example given by @Ichoran on gitter about fusion:
Just did a quick test and
def jmap(i: Iterator[Int], f: Int => Int, g: Int => Int) = i.map(f andThen g)takes only 60% as long to run as does
def imap(i: Iterator[Int], f: Int => Int, g: Int => Int) = i.map(f).map(g)because boxing is a lot of the overhead, and functions are specialized.
from collection-strawman.
@julienrf, it's not only because of boxing, it's also because of reduction total memory footprint by 2x.
If you try functions on non-primitives(which eliminates difference with boxing), you'll still see a speedup.
from collection-strawman.
Still quoting @Ichoran:
If I use objects and just switch references around to which object I have, the jmap variant takes 75% as long as imap, so still a nice little win.
But using List instead of iterator slows things down by 5x, so you're right that avoiding intermediate collections can be the big win.
from collection-strawman.
Mutable views are nice to have for a variety of divide-and-conquer in-place algorithms. For instance, an in-place sort is easier to express if you don't have to worry about indices.
I don't think that dropping the order of evaluation is particularly useful for either mutable or immutable views. That is a different useful thing, but "I want to act on part of my collection without rebuilding it" is a different desire than "I want to parallelize my operations on this collection".
If we were to support only one of the two, we at least shouldn't call it a "view" because right now (and in most other languages I've seen) "view" means the former.
from collection-strawman.
Mutable views are nice to have for a variety of divide-and-conquer in-place algorithms. For instance, an in-place sort is easier to express if you don't have to worry about indices
I think this is a niche, 0.1% use case and we should drop it and not worry about it.
I don't think that dropping the order of evaluation is particularly useful for either mutable or immutable views. That is a different useful thing, but "I want to act on part of my collection without rebuilding it" is a different desire than "I want to parallelize my operations on this collection".
Not sure, but I think you might be misunderstanding Stefan's suggestion? It isn't about parallelization. He isn't suggesting leaving iteration order unspecified — allowing element 4 to be operated on before element 3, as can happen with parallel collections. Rather, he's suggesting to allow fusion. If you do .map(a).filter(b)
, it would be unspecified whether all the elements get mapped before any of them get filtered.
from collection-strawman.
Also note that evaluation order of views in the current strawman is already different from evaluation order of operations on the underlying collections. If you wanted the same evaluation order you would have to build intermediate results but the whole point of views is to avoid that. I am only suggesting that we do not guarantee the same evaluation order as that of Iterators, either.
from collection-strawman.
I think the "niche, 0.1% use case" evaluation is only because we're making it so by providing no support for this way of working
It isn't so much that I'm opposed to including support for it, the only thing I'm strongly opposed to is letting this class of use cases be something that drives top-level design decisions. If it can be accommodated without distorting the hierarchy+API, great. (Can it?)
from collection-strawman.
I would like to propose to take them one step further and make the evaluation order of view operations unspecified.
I don't think that's necessary. The specified evaluation order of views already demands fusing. An unspecified evaluation order could give you some extra facility, for instance if you want to batch some ops into segments before fusing. But I don't think views are the right place for that.
[EDIT] I might be convinced otherwise by benchmarks that show that there's a big win to be had by automatic batching.
I am also betting that we will have much better techniques to reason about effects in the future, which would let us optimize without compromising semantics.
from collection-strawman.
Regarding mutable views, one question is what operations would produce a view? If we do not want to re-index (and I think we should not), then it seems the only sensical operations are to change the start, the end, or the stride. So, instead of views we could use a general notion of slices. E.g., on mutable (indexed?) Seq[A]:
def slice(r: Range): Slice[A]
val xs: Array[Int] = ...
xs.slice(3 until xs.length)
xs.slice(5 to 1 by -2)
Normal, strict slices are collection transformers and would not be available on mutable collections. There's no danger of silently changing semantics because the new non-strict slice
takes a range as argument instead of two bounds.
WDYT?
from collection-strawman.
I just noted that there is a source of confusion for arrays.
xs.slice(1, 3)
creates a new, strict array, or at least it did so far. But xs.slice(1 until 3)
creates a view. Not sure what to do about that one yet - deprecate old-style slice
on Array maybe?
from collection-strawman.
@odersky Maybe we should suffix operations producing a view with “view” (e.g. xs.sliceView(1 to 5)
), to make it clearer that further operations will be fused.
from collection-strawman.
@julienrf I agree sliceView
would be clearer but it would also be longer and harder to remember. Also, it should not be ...View
since a slice is not a view, but supports different operations. Maybe some other name? But I am not sure which.
from collection-strawman.
I am wary of having too many different derived collection types. So far we have immutable, mutable, iterator, immutable view, mutable view, and now we're going to have slice also? One of the primary reasons that the existing views are not used much--aside from surprising behavior that arose from the lack of a clear distinction between mutable and immutable views and inherent bugs due to LSP issues--was that they were buggy because it was hard to maintain so many different parallel implementations.
I would therefore try quite hard to keep slices and views the same thing. A slice would be one particular way to get a view. If we're not going to do that I would like to have examples of methods that are very important to have on one but must not be on the other. I can't think of any offhand.
from collection-strawman.
It's not clear to me why we would need separate types for slices and views. If we distinguish between mutable and immutable views, slice
on a mutable collection could produce a mutable view and subsequence
on an immutable collection would create an immutable view. Array.slice
is a problem in this model because the operation we want on (mutable) arrays is slice
(with the new meaning), so we'd have to reuse the name with a new meaning (which IMHO makes this a non-starter; a change like this is far too dangerous).
Are there any other opinions on the method names? If they are used in this way in other languages and libraries it probably makes sense to do the same (modulo the Array.slice
problem) but I find them counter-intuitive. I expect slice
to actually slice off the part that I want without referencing the whole collection, which means copying in the case of Array.slice
. But maybe that's just because I've mostly worked with Scala for the last decade.
from collection-strawman.
How about def focus(r: Range): View[A]
?
from collection-strawman.
I would like to take advantage of this thread to ask whether we should keep View
extend Iterable
or not. The argument to not make View
extend Iterable
is that by doing so we have a clear separation between collections types whose transformations (filter
, take
, map
, flatMap
, etc.) are evaluated lazily (View
) or eagerly (Iterable
). Currently, taking an Iterable
as parameter gives us no clue about the way transformation operations will be evaluated.
from collection-strawman.
The same is true for other lazy collections like Stream
or LazyList
, and these do not only extend Iterable
but even Seq
and LinearSeq
. Maybe we should make the distinction between strict and non-strict collection types explicit with a marker trait instead of trying to assign meaning to the classes in the main hierarchy.
from collection-strawman.
I agree with Stefan, and I am also not sure whether we should have a marker trait or not. I would defer all that until we have dealt with parallel collections. The order of evaluation (strict/lazy/batched/parallel) can vary a lot and we should think carefully before we invest types to reflect that.
from collection-strawman.
Yeah, ok, I agree with your arguments :)
from collection-strawman.
Re: choosing names for in-place mutable operations (from the other thread #7).
The one thing I think is important is that we should not try to come up with creative new names for linear versions of immutable operations. My current favorite would be mapInPlace, filterInPlace, but if someone has a better idea, I am happy to change my preference.
What about appending _!
to the names of immutable versions? It does seem a bit alien at first, but after having used that in my code for some time, I find it appropriate: it makes it exceedingly clear what are the operations to "watch out for" because of their effects, similar to how ???
catches the reader's eye. For example:
val arr = ArrayBuffer(1,2,3,4,5)
// clearly, I'm doing imperative transformations here:
arr.map_!(x => x + 1).filter_!( _ % 2 == 0 )
// or:
arr map_! (_ + 1) filter_! (_ % 2 == 0)
There is precedence for this in Ruby, where it's somewhat nicer because there is no need for the _
. It would be nice if the lexer was modified and we could have:
arr.map!(_ + 1).filter!(_ % 2 == 0)
I would probably also push for marking methods not guaranteed to have the expected efficient implementation (such as size
on List
not being O(1)
) with a _$
as in ls.size_$
, marking option-returning methods with _?
as in ls.head_?
, and marking parallel operations with _||
as in val ys = xs map_|| f
, but that would probably be going to far
from collection-strawman.
SeqView
is very dangerous at the moment because it may violate rules for java.lang.Object.equals
.
https://issues.scala-lang.org/browse/SI-10201
from collection-strawman.
view == view
should not force the view.
I expect view0 eq view1
and view0 == view1
are always same.
from collection-strawman.
seq != view
should always be true
, given we always expect seq != iterator
, and view
is simply a wrapper of Iterator
.
from collection-strawman.
I agree. In the strawman we define equality at the level of Seq
and Set
. View
is yet another basic collection type which would never be equal to a Seq
or Set
. Reflexivity is a special case, it should be trivial to fix in either 2.12 or 2.13.
from collection-strawman.
I really expect LazyList or Stream add these powerful and convenient methods :
repeat/cycle, repeatedly/generate, iterate, unfold, resource, reject,
like in Elixir and Clojure: https://hexdocs.pm/elixir/Stream.html#unfold/2
This symbol #:: is too low,strange to verbose,unreadable.
Also, I hope add some xxxIndexed like in Kotlin Sequence: https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/index.html
I think, one of the programming trend is the Lazy Sequences .
ps: post here may right.
from collection-strawman.
My experiences with immutable.Seq/Iterable
vs collection.Seq/Iterable
vs Iterator
from scala-library and scala-reflect so far:
-
Many
Iterator
methods are commonly used but not defined in the strawman. In many cases callers abstract overIterableOnce
(orTraversableOnce
which is now aliased toIterableOnce
). We should at least have a shared implementation trait, possibly add methods back toIterableOnce
. I am defining deprecated extension methods for now. -
Consing for
collection.Seq
is sorely missing. We already have++
and++:
incollection.IterableOps
. We should also have+:
and:+
. These are only defined inimmutable.SeqOps
and can be easily emulated with++
and++:
but this prevents collection implementations from properly optimizing them (e.g.List.+:
).
from collection-strawman.
Many
Iterator
methods are commonly used but not defined in the strawman.
FWIW I added some of them in that PR: https://github.com/scala/collection-strawman/pull/200/files#diff-c0d7086690fd3b617f2eb319d5b832ce
Consing for collection.Seq is sorely missing.
In case you missed it: #205 (and #206).
from collection-strawman.
It looks like we've settled on a View
design by now so I'm closing this ticket.
from collection-strawman.
Related Issues (20)
- TraversableOnce is missing
- Rename `to` to `as` HOT 10
- `seq` operation has been removed HOT 3
- Rewrite `--` operation to `&~` or `diff` HOT 1
- Rewrite `retain` to `filterInPlace`
- ArrayOps does not have a `view` operation anymore HOT 1
- Modularize the migration rewrite rules HOT 1
- Reorganize repositories
- Bring back Iterator.duplicate
- Set.apply / Map.apply don't use SetN / MapN
- We need dev-oriented documentation for M4 HOT 2
- ArrayOps doesn’t provide a sliding operation
- Make scala.Iterable[+A] = scala.collection.immutable.Iterable[A] HOT 8
- Setup collections-compat release infra
- add `()` to `toArray` HOT 1
- resolve status of `scala.collection.immutable.useBaseline` HOT 6
- IterableOnce.scala: def toSet[B >: A]: immutable.Set[B] should be def toSet: immutable.Set[A] HOT 2
- Main criticism HOT 3
- close or transfer all open issues and archive the repository HOT 1
- Delete and archive Gitter room HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from collection-strawman.