Comments (2)
@scovich This also covers the partition filtering path. The "remaining filters" include every data filter, since those still need to be evaluated on the returned data. What they exclude are any partition filters that have already been applied.
from delta.
I'm a bit unclear on how "remaining filters" works -- data skipping isn't perfect (it may return extra rows), so wouldn't we have to apply all predicates against the returned data anyway? Or does this also cover the partition filtering path, which does have perfect pruning?
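The split discussed in this exchange can be sketched roughly as follows. This is only an illustration, not the Delta code: `Predicate`, `split_filters`, and the column names are hypothetical, and it assumes each predicate references exactly one column (real predicates can mix partition and data columns). Partition filters prune exactly, so they are treated as fully applied; data filters only drive best-effort data skipping, so they stay in the remaining set.

```python
from dataclasses import dataclass
from typing import Any, List, Set, Tuple


@dataclass(frozen=True)
class Predicate:
    """Hypothetical single-column predicate, e.g. date = '2024-01-01'."""
    column: str
    op: str
    value: Any


def split_filters(
    predicates: List[Predicate], partition_columns: Set[str]
) -> Tuple[List[Predicate], List[Predicate]]:
    """Split predicates into (applied partition filters, remaining filters).

    Partition filters prune whole files exactly, so they need no re-check.
    Data filters may let extra rows through, so the caller must still
    evaluate them on the returned data.
    """
    applied = [p for p in predicates if p.column in partition_columns]
    remaining = [p for p in predicates if p.column not in partition_columns]
    return applied, remaining


# Hypothetical table partitioned by `date`, with a data column `id`.
predicates = [
    Predicate("date", "=", "2024-01-01"),
    Predicate("id", ">", 100),
]
applied, remaining = split_filters(predicates, partition_columns={"date"})
```

Here `remaining` holds only the `id` predicate, which the reader of the scan would still have to evaluate row by row.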