Comments (7)
Thanks for trying segysak.
This likely means that your il and xl numbers are not in the standard header byte locations. You will need to scan the SEG-Y file, determine which bytes contain the il and xl numbers, and pass these as arguments to the loader.
https://segysak.readthedocs.io/en/latest/examples/example_segy_headers.html#Scanning-the-headers
Additionally, I suggest you scrape the SEG-Y headers into a DataFrame first, so you avoid repeating this long scan of the SEG-Y multiple times. Note that segysak will try to load the entire file into memory if you use the loader without any trace filtering, so you could also convert the volume to NetCDF4 to enable faster lazy loading.
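A minimal sketch of the scrape-once idea: cache the scraped headers on disk so repeated sessions skip the slow scan. The `load_headers_cached` wrapper and the file names are hypothetical conveniences, not part of segysak; `scrape_fn` would typically be `segysak.segy.segy_header_scrape`, which returns a pandas DataFrame per the segysak docs.

```python
from pathlib import Path
import pandas as pd

def load_headers_cached(segy_file, cache_path, scrape_fn):
    """Scrape trace headers once and reuse them from a local cache.

    scrape_fn is any callable taking a SEG-Y path and returning a
    pandas DataFrame of trace headers (e.g. segy_header_scrape).
    This wrapper is a hypothetical convenience, not a segysak API.
    """
    cache = Path(cache_path)
    if cache.exists():
        # Fast path: reuse the previously scraped headers.
        return pd.read_pickle(cache)
    headers = scrape_fn(segy_file)
    headers.to_pickle(cache)
    return headers
```

The first call pays the full scan cost; subsequent calls read the pickle in seconds, which matters when the scan takes ~18 minutes as in the log below.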
from segysak.
Hi @trhallam - thanks for the quick turnaround on this. Setting the il/xl byte locations has progressed the loading; however, I am now getting a uniqueness issue when loading:
In [9]: seisnc_vol = segy_loader(segy_file, iline=189, xline=193, ix_crop=(133058, 133058, 759053, 759548))
100%|...2.18M/2.18M [18:10<00:00, 2.00k traces/s]
...
Loading as 3D
Fast direction is INLINE_3D
Converting SEGY: ...
...
~/anaconda3/lib/python3.9/site-packages/xarray/core/dataset.py in from_dataframe(cls, dataframe, sparse)
5508
5509 if isinstance(idx, pd.MultiIndex) and not idx.is_unique:
-> 5510 raise ValueError(
5511 "cannot convert a DataFrame with a non-unique MultiIndex into xarray"
5512 )
ValueError: cannot convert a DataFrame with a non-unique MultiIndex into xarray.
This also happens when the xline range is fixed for ix_crop.
I will definitely convert to NETCDF going forward.
Thanks again for your help.
Eoghan
This error means that two or more traces in your volume share a common il and xl value pair. Are you trying to load gathers? If your data is not 3D, you will have to specify an extra header byte for each unique dimension.
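The failure can be reproduced with plain pandas, independent of segysak: two traces sharing an (iline, xline) pair make the MultiIndex non-unique, which is exactly what xarray's `Dataset.from_dataframe` rejects. The values below are made up for illustration.

```python
import pandas as pd

# Two traces share the (iline, xline) pair (100, 200) - a gather.
df = pd.DataFrame(
    {"iline": [100, 100, 101],
     "xline": [200, 200, 200],
     "amp":   [0.1, 0.2, 0.3]}
).set_index(["iline", "xline"])

# Non-unique MultiIndex: this is what triggers the ValueError in xarray.
assert not df.index.is_unique

# Locate the offending pairs:
dups = df.index[df.index.duplicated()].unique()
# dups -> [(100, 200)]

# Adding the extra dimension (offset) restores uniqueness:
df["offset"] = [0, 100, 0]
df2 = df.set_index("offset", append=True)
assert df2.index.is_unique
```

Once every trace is labelled by a unique (il, xl, offset) tuple, the DataFrame converts to an xarray Dataset cleanly.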
@trhallam - yes, I am looking to extract shot gathers from a publicly available dataset. I have attached a plot of a few columns from partial scans (10,000, 100,000, and 1,000,000 traces) obtained with segy_header_scrape, including INLINE_3D and CROSSLINE_3D. There are 804 traces/record - based on these plots, I can't see where an il/xl conflict is happening. How do I specify this extra header byte for each unique dimension, to be safe?
Many thanks again!
If you look at the first image, the flat steps on multiple traces for both il and xl indicate that you have duplicate pairs wherever this occurs; these will be the gather groups defined by offset or angle. Normal 3D data would have a saw-tooth appearance for one of these fields and steps for the other.
If you look at the scan of the trace headers, you should see one byte range that contains offset or angle values. Usually this is byte 37. You will then need to pass the offset= keyword argument to the loader, which takes the integer byte position where your offset dimension is stored.
If you have more than one extra dimension to consider (like azimuth and angle for 4D gathers), you will need to use the segy_freeloader function, specifying a byte location and name for each dimension. The decomposition of the geometry into orthogonal dimensions (necessary for Xarray) means that each trace must be labelled by a unique coordinate set (e.g. il/xl for 3D, or il/xl/offset for shot gathers) defined by the n-dimensional volume of traces.
EDIT:
Apologies, I have confused the panels on your images. They do look OK, but hopefully the offset keyword argument advice is still useful. If you plot a smaller number of traces (just one inline, for example), you might see what I described.
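The "which header field is my extra dimension?" question above can also be answered from the scraped header DataFrame: any field that takes more than one value inside an (il, xl) group is a candidate gather dimension. This is a sketch on a toy table; the column names follow segyio/segysak header conventions, but check them against your own scrape output.

```python
import pandas as pd

# Toy trace-header table shaped like a segy_header_scrape result
# (values invented; two ils, one xl, three offsets per gather).
hdr = pd.DataFrame({
    "INLINE_3D":    [100, 100, 100, 101, 101, 101],
    "CROSSLINE_3D": [200, 200, 200, 200, 200, 200],
    "offset":       [0, 100, 200, 0, 100, 200],
})

# Count distinct values of every other field within each (il, xl) group.
nunique = hdr.groupby(["INLINE_3D", "CROSSLINE_3D"]).nunique()

# A field varying within a group is a candidate extra dimension
# (offset, angle, azimuth, ...).
candidates = nunique.columns[(nunique > 1).any()]
# candidates -> ['offset']
```

Whichever field this flags is the one whose header byte position you would pass via offset= (or name in segy_freeloader).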
Hi @trhallam, adding offset at byte loc 37 did indeed do the trick, and now I see (when I print the seisnc_vol variable):
<xarray.Dataset>
Dimensions: (iline: 202, xline: 2, twt: 5001, offset: 906)
Coordinates:
* iline (iline) uint32 133058 133111 133123 133135 ... 141054 141105 141117
* xline (xline) uint32 759053 759055
* twt (twt) float64 0.0 2.0 4.0 6.0 ... 9.996e+03 9.998e+03 1e+04
* offset (offset) int64 -10199 -10198 -10186 -10174 ... -185 -173 -160 -147
Data variables:
data (iline, xline, twt, offset) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
Attributes: (12/13)
ns: None
sample_rate: 2.0
text: C 1 XXXX
measurement_system: ft
d3_domain: None
epsg: None
... ...
corner_points_xy: None
source_file: XXXX.sgy
srd: None
datatype: None
percentiles: [-54.639131383808774, -47.91040736807959, -14.473256...
coord_scalar: -100.0
It looks like it has worked :) - this is only for a very small subset, though, so I guess now is the time to move across to NetCDF to access larger subsets.
Thanks for all your help!
E
Glad you got it sorted!