Comments (7)
Thanks for trying segysak.
This likely means that your il and xl numbers are not in the standard header byte locations. You will need to scan the SEG-Y file, determine which bytes contain the il and xl numbers, and pass these as arguments to the loader.
https://segysak.readthedocs.io/en/latest/examples/example_segy_headers.html#Scanning-the-headers
Additionally, I suggest you scrape the SEG-Y headers into a DataFrame first, so you avoid repeating this long scan of the SEG-Y multiple times. Note that segysak will try to load the entire file into memory if you use the loader without any trace filtering, so you could also convert the volume to NetCDF4 to enable faster lazy loading.
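A minimal sketch of the scrape-once idea: cache the scraped headers on disk so repeated sessions skip the slow scan. The `load_headers_cached` wrapper and the file names are hypothetical conveniences, not part of segysak; `scrape_fn` would typically be `segysak.segy.segy_header_scrape`, which returns a pandas DataFrame per the segysak docs.

```python
from pathlib import Path
import pandas as pd

def load_headers_cached(segy_file, cache_path, scrape_fn):
    """Scrape trace headers once and reuse them from a local cache.

    scrape_fn is any callable taking a SEG-Y path and returning a
    pandas DataFrame of trace headers (e.g. segy_header_scrape).
    This wrapper is a hypothetical convenience, not a segysak API.
    """
    cache = Path(cache_path)
    if cache.exists():
        # Fast path: reuse the previously scraped headers.
        return pd.read_pickle(cache)
    headers = scrape_fn(segy_file)
    headers.to_pickle(cache)
    return headers
```

The first call pays the full scan cost; subsequent calls read the pickle in seconds, which matters when the scan takes ~18 minutes as in the log below.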
from segysak.
Hi @trhallam - thanks for the quick turnaround on this. Setting the il/xl byte locations has progressed the loading; however, I am now getting a uniqueness issue when loading:
In [9]: seisnc_vol = segy_loader(segy_file, iline=189, xline=193, ix_crop=(133058, 133058, 759053, 759548))
100%|...2.18M/2.18M [18:10<00:00, 2.00k traces/s]
...
Loading as 3D
Fast direction is INLINE_3D
Converting SEGY: ...
...
~/anaconda3/lib/python3.9/site-packages/xarray/core/dataset.py in from_dataframe(cls, dataframe, sparse)
5508
5509 if isinstance(idx, pd.MultiIndex) and not idx.is_unique:
-> 5510 raise ValueError(
5511 "cannot convert a DataFrame with a non-unique MultiIndex into xarray"
5512 )
ValueError: cannot convert a DataFrame with a non-unique MultiIndex into xarray.
This also happens when the xline range is fixed for ix_crop.
I will definitely convert to NETCDF going forward.
Thanks again for your help.
Eoghan
This error means that two or more traces in your volume share a common il and xl value pair. Are you trying to load gathers? If your data is not 3D, you will have to specify an extra header byte for each unique dimension.
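The failure can be reproduced with plain pandas, independent of segysak: two traces sharing an (iline, xline) pair make the MultiIndex non-unique, which is exactly what xarray's `Dataset.from_dataframe` rejects. The values below are made up for illustration.

```python
import pandas as pd

# Two traces share the (iline, xline) pair (100, 200) - a gather.
df = pd.DataFrame(
    {"iline": [100, 100, 101],
     "xline": [200, 200, 200],
     "amp":   [0.1, 0.2, 0.3]}
).set_index(["iline", "xline"])

# Non-unique MultiIndex: this is what triggers the ValueError in xarray.
assert not df.index.is_unique

# Locate the offending pairs:
dups = df.index[df.index.duplicated()].unique()
# dups -> [(100, 200)]

# Adding the extra dimension (offset) restores uniqueness:
df["offset"] = [0, 100, 0]
df2 = df.set_index("offset", append=True)
assert df2.index.is_unique
```

Once every trace is labelled by a unique (il, xl, offset) tuple, the DataFrame converts to an xarray Dataset cleanly.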
@trhallam - yes, I am looking to extract shot gathers from a publicly available dataset. I have attached a plot of a few columns from partial scans (10,000, 100,000, and 1,000,000 traces) obtained with segy_header_scrape, including INLINE_3D and CROSSLINE_3D. There are 804 traces/record - based on these plots, I can't see where an il/xl conflict is happening. How do I specify this extra header byte for each unique dimension, to be safe?
Many thanks again!
If you look at the first image, the flat steps on multiple traces for both il and xl indicate that you have duplicate pairs wherever this occurs; these will be the gather groups defined by offset or angle. Normal 3D data would have a saw-tooth appearance for one of these fields and steps for the other.
If you look at the scan of the trace headers, you should see one byte range that contains offset or angle values. Usually this is byte 37. You will then need to pass the offset= keyword argument to the loader, which takes the integer byte position where your offset dimension is stored.
If you have more than one extra dimension to consider (like azimuth and angle for 4D gathers), you will need to use the segy_freeloader function, specifying a byte location and name for each dimension. The decomposition of the geometry into orthogonal dimensions (necessary for Xarray) means that each trace must be labelled by a unique coordinate set (e.g. il/xl for 3D, or il/xl/offset for shot gathers) defined by the n-dimensional volume of traces.
EDIT:
Apologies, I have confused the panels on your images. They do look OK, but hopefully the offset keyword argument advice is still useful. If you plot a smaller number of traces (just one inline, for example), you might see what I described.
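The "which header field is my extra dimension?" question above can also be answered from the scraped header DataFrame: any field that takes more than one value inside an (il, xl) group is a candidate gather dimension. This is a sketch on a toy table; the column names follow segyio/segysak header conventions, but check them against your own scrape output.

```python
import pandas as pd

# Toy trace-header table shaped like a segy_header_scrape result
# (values invented; two ils, one xl, three offsets per gather).
hdr = pd.DataFrame({
    "INLINE_3D":    [100, 100, 100, 101, 101, 101],
    "CROSSLINE_3D": [200, 200, 200, 200, 200, 200],
    "offset":       [0, 100, 200, 0, 100, 200],
})

# Count distinct values of every other field within each (il, xl) group.
nunique = hdr.groupby(["INLINE_3D", "CROSSLINE_3D"]).nunique()

# A field varying within a group is a candidate extra dimension
# (offset, angle, azimuth, ...).
candidates = nunique.columns[(nunique > 1).any()]
# candidates -> ['offset']
```

Whichever field this flags is the one whose header byte position you would pass via offset= (or name in segy_freeloader).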
Hi @trhallam, adding offset at byte loc 37 did indeed do the trick, and now I see (when I print the seisnc_vol variable):
<xarray.Dataset>
Dimensions: (iline: 202, xline: 2, twt: 5001, offset: 906)
Coordinates:
* iline (iline) uint32 133058 133111 133123 133135 ... 141054 141105 141117
* xline (xline) uint32 759053 759055
* twt (twt) float64 0.0 2.0 4.0 6.0 ... 9.996e+03 9.998e+03 1e+04
* offset (offset) int64 -10199 -10198 -10186 -10174 ... -185 -173 -160 -147
Data variables:
data (iline, xline, twt, offset) float32 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
Attributes: (12/13)
ns: None
sample_rate: 2.0
text: C 1 XXXX
measurement_system: ft
d3_domain: None
epsg: None
... ...
corner_points_xy: None
source_file: XXXX.sgy
srd: None
datatype: None
percentiles: [-54.639131383808774, -47.91040736807959, -14.473256...
coord_scalar: -100.0
It looks like it has worked :) - this is only for a very small subset, though, so I guess now is the time to move across to NetCDF to access larger subsets.
Thanks for all your help!
E
Glad you got it sorted!