Comments (21)

niketchhajed commented on August 30, 2024

Currently looking at the files received from @alexgorb and analysing them for migration to Seiscomp3. The goal is to interpret the data files and map their schema to the Seiscomp3 database schema as far as possible. Any missing fields, erroneous data or conflicting fields need to be reported and mitigated.

from hiperseis.

niketchhajed commented on August 30, 2024

We have three types of files: DAT, HDF and OUT. As per discussion with Alexei, the DAT files, which are in the FFB (Fixed Format Bulletin) format, need to be parsed to extract the pick, origin, amplitude, magnitude and event information. The extracted data needs to be converted into the FDSNXML format that can be imported into the SC3 DB. The public IDs need to be created synthetically and added to the resulting FDSNXML file before it is imported into the SC3 DB.
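
As a rough illustration of the synthetic public ID step, the sketch below derives deterministic IDs by hashing a record's fields. The `smi:local.engdahl` prefix, the `make_public_id` helper and the sample values are assumptions made for illustration; none of them come from the FFB files or from SC3.

```python
import hashlib

# Hypothetical prefix, chosen only for this sketch.
ID_PREFIX = "smi:local.engdahl"


def make_public_id(kind, *fields):
    """Build a deterministic public ID for an object of the given kind.

    The same input fields always produce the same ID, so re-running the
    conversion reproduces identical IDs.
    """
    key = "|".join(str(f) for f in fields)
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]
    return f"{ID_PREFIX}/{kind}/{digest}"


# Illustrative values only, not taken from the data files.
event_id = make_public_id("event", "2000-01-01T00:00:00Z", -20.5, 130.25)
origin_id = make_public_id("origin", event_id, "EHB")
print(event_id)
print(origin_id)
```

Hashing the content rather than using a running counter keeps the IDs stable across repeated conversions, at the cost of two records with identical fields sharing one ID.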

The FFB format is described at: http://www.isc.ac.uk/standards/ffb/

Going by the Project Report PDF, the HDF and OUT files appear to be downstream outputs of processing by ENGDAHL.

niketchhajed commented on August 30, 2024

(Please read this comment in edit mode)
Pasting some notes while discussing with Alexei:

origin:

  1. lat lon depth Agency OriginTime Magnitude TP methodID(EHB) earthModelID(ak135)
    9
    7
    9
    7
    3.030084268
    149.6036987
    15.91859436
    4.030264378
    7.434346199

evaluationMode(automatic)
NA
scautoloc@ip-172-31-30-172
2017-10-19T07:14:16.002041Z

(if no residual, it was not used for location but only for association)

pick:

  1. phase Net Sta Channel Res Dis(deg) Az(or backazimuth) Time methodID(default to STA/LTA) evaluationMode(automatic) author(EHB)

basaks commented on August 30, 2024

@niketchhajed As discussed, here is an approach that might work.
We should try to create obspy event objects for each event; obspy can export events to quakeml and also sc3ml.

See how I create pick and amplitude obspy objects in the seismic.pickers.PickerMixin class. We also need to create the rest of the objects required by the event class, e.g., here. Then we can dump SC3ML/quakeml that can be ingested into seiscomp3.

We can pursue a similar approach for Earthmon/OracleDB event transformation and ingestion.

For PhasePApy, @sudhirJain may have to pursue a similar approach.

niketchhajed commented on August 30, 2024

In further discussion with @alexgorb, the relocated data files, .HDF and .OUT, are what need to be considered for import into the SC3 DB.

Given that the .DAT files show 5-second differences in arrival times for the same station between those received by GA stations and those received by ISC from other sources, importing from .DAT is probably not the best idea. The .HDF and .OUT files contain relocated origins and relocated arrival times. These will be imported into SC3.

niketchhajed commented on August 30, 2024

There are still some grey areas in the .HDF and .OUT files, as described below:

  1. In the .OUT files, there are multiple phase labels for the same P arrival, e.g. P, Pn, eP, iP, PP, PnPn, PcP, Pdiff, PKiKP, ePKP, ePKI, PKP, pP, epP. These notations need to be clarified.
  2. In the .HDF file, there are 2 columns: ntot(total number of observations used) and ntel(number of teleseismic observations used - delta > 28 deg). The arrival data in .OUT files needs to be correlated with the values in these 2 columns, to establish which arrivals were used for association and which were used for location. Need more clarification on this.
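
One way to test the correlation in point 2 is to count, per event, all arrivals in the .OUT file and those with delta > 28 degrees, then compare the counts with ntot and ntel from the .HDF file. The sketch below assumes the arrivals are already parsed into dictionaries with a `delta` key; that layout, the function names and the sample values are hypothetical.

```python
TELESEISMIC_MIN_DELTA = 28.0  # degrees, per the ntel column description


def count_observations(arrivals):
    """Return (total, teleseismic) counts for one event's parsed arrivals."""
    total = len(arrivals)
    teleseismic = sum(1 for a in arrivals if a["delta"] > TELESEISMIC_MIN_DELTA)
    return total, teleseismic


def matches_hdf(arrivals, ntot, ntel):
    """True when the .OUT arrival counts agree with the .HDF ntot/ntel values."""
    return count_observations(arrivals) == (ntot, ntel)


# Illustrative values only, not taken from the ENGDAHL files.
sample = [{"delta": 12.4}, {"delta": 28.0}, {"delta": 35.1}, {"delta": 91.7}]
print(count_observations(sample))
print(matches_hdf(sample, ntot=4, ntel=2))
```

A systematic mismatch across events would suggest that the two columns do not simply count the listed arrivals.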

alexgorb commented on August 30, 2024

  1. There are many different phases (http://www.isc.ac.uk/standards/phases/) and all are required for our purposes. However, a leading character such as 'e' or 'i' describes the onset of the arrival: emergent or impulsive. These characters are usually omitted during conversion if they cannot be transferred into a separate field of the new format.
  2. It is not very clear to me. I would say that these numbers are only for model selection purposes and do not have any relation to association or location.
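
The onset handling in point 1 could be sketched as below: the leading 'e' or 'i' is moved into a separate onset value instead of being dropped. The `split_onset` helper is hypothetical, and it assumes that no phase name in the files legitimately starts with a lowercase 'e' or 'i'. The values "emergent" and "impulsive" are the names QuakeML uses for a pick's onset.

```python
ONSET_BY_PREFIX = {"e": "emergent", "i": "impulsive"}


def split_onset(label):
    """Split a raw phase label into (onset, phase).

    A leading 'e' or 'i' is treated as the onset qualifier only when
    something follows it, so 'eP' gives ('emergent', 'P'), while depth
    phases such as 'pP' or 'sP' are left untouched.
    """
    if len(label) > 1 and label[0] in ONSET_BY_PREFIX:
        return ONSET_BY_PREFIX[label[0]], label[1:]
    return None, label


for raw in ["P", "eP", "iP", "epP", "ePKP", "pP", "PKiKP"]:
    print(raw, split_onset(raw))
```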

niketchhajed commented on August 30, 2024

I have listed below some clarifications required from @alexgorb:

  1. There are more than a million unique stations involved in arrivals in the ENGDAHL files. However, the network code information is missing. Is there an easy way to uniquely determine the network code for a given station name?
  2. In the .out file, most of the S arrivals have the station name missing, as shown below:

Do we assume that the station name (and other details) for all S arrivals is the same as for the P arrival immediately before?

  3. How should we process arrivals that have the first (and second) phase missing, e.g. the SCP or PCS arrival above?

  4. One of the desired fields in the arrival information is distance. Is the delta field the same as the required distance field? It would help to get the meaning of all the columns listed below:

delta, dtdd, focal angle, (the missing column names for the 2 phases), scor, wgt (is it a time weight or a backazimuth weight?)

  5. What do the *s after the residual values mean?

niketchhajed commented on August 30, 2024

arrivals

alexgorb commented on August 30, 2024

alexgorb commented on August 30, 2024

Please note: some stations may have only "later" phases such as PKP, because no P wave is present at long distances.

alexgorb commented on August 30, 2024

I checked the list of registered seismic stations. There are fewer than 1600 stations. @niketchhajed how did you calculate more than a million? See attached file.
fdsnsta2013.zip

niketchhajed commented on August 30, 2024

allstations.txt

@alexgorb This is the list of unique stations that are involved in the arrival data of ENGDAHL. If you find that something is not right in this list of stations, let me know. I will investigate.

alexgorb commented on August 30, 2024

niketchhajed commented on August 30, 2024

Some more feedback from @alexgorb:

  1. magnitude integration
  2. distance (and other fields) for an S phase are the same as for the immediately preceding P phase
  3. include the entire phase and not the first character
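
Point 2 amounts to carrying values forward from the preceding arrival. A minimal sketch, assuming the arrivals are parsed into dictionaries in file order and that the keys are named `station`, `delta` and `phase` (hypothetical names, as are the sample rows):

```python
INHERITED_FIELDS = ("station", "delta")  # assumed field names


def fill_from_previous(arrivals):
    """Fill arrivals that lack a station name from the preceding arrival.

    The phase label is never touched, so the entire phase is kept.
    """
    filled = []
    previous = None
    for row in arrivals:
        row = dict(row)  # copy, so the caller's data is not mutated
        if not row.get("station") and previous is not None:
            for field in INHERITED_FIELDS:
                row[field] = previous.get(field)
        filled.append(row)
        previous = row
    return filled


# Illustrative rows only, not taken from the .OUT files.
rows = [
    {"station": "STA1", "phase": "P", "delta": 23.5},
    {"station": "", "phase": "S", "delta": None},
    {"station": "STA2", "phase": "PKP", "delta": 120.2},
]
for row in fill_from_previous(rows):
    print(row)
```

Because each filled row becomes the new "previous" row, several consecutive arrivals without a station name all inherit from the last arrival that had one.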

niketchhajed commented on August 30, 2024

The current state of the data migration was reviewed with @alexgorb and it seems to be in an acceptable condition. The following point still needs work but is not urgent:

  1. Integrate the network codes with the data.

Closing this issue and creating a separate issue for network codes integration.
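
For the network code integration, a minimal sketch of a lookup against a station registry is shown below. It assumes the registry can be read as (network, station) pairs; that layout, the function names and the sample codes are hypothetical. Station codes shared by several networks are returned as unresolved rather than guessed.

```python
from collections import defaultdict


def build_network_index(registry):
    """Map each station code to the set of network codes that use it."""
    index = defaultdict(set)
    for network, station in registry:
        index[station].add(network)
    return index


def resolve_network(station, index):
    """Return the network code if exactly one network has this station.

    Returns None when the station is unknown or when several networks
    share the code, so the caller can flag it for manual review.
    """
    networks = index.get(station, set())
    if len(networks) == 1:
        return next(iter(networks))
    return None


# Illustrative registry only, not taken from the attached station list.
registry = [("N1", "STA1"), ("N1", "STA2"), ("N2", "STA2")]
index = build_network_index(registry)
for code in ["STA1", "STA2", "STA3"]:
    print(code, resolve_network(code, index))
```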

Zephyrpony commented on August 30, 2024

Jira Task PST-215
https://gajira.atlassian.net/browse/PST-215

niketchhajed commented on August 30, 2024

The engdahl events are backed up at s3://pyrobots-backup/niket/engdahl-events/

basaks commented on August 30, 2024

Command to copy the S3 directory using awscli: aws s3 cp s3://pyrobots-backup/niket/engdahl-events/ target_dir/ --recursive

basaks commented on August 30, 2024

All engdahl and isc events from the sc3 bucket are also copied to NCI, with read access for everyone, at: /g/data/ha3/sudipta/event_xmls

niketchhajed commented on August 30, 2024

@basaks, just FYI: these isc events do not have a preferred origin set. The ones with a preferred origin set are currently on an AWS instance. I will replace the copies in S3 with the latest.
