Archive at IRIS SPUD EMTF
With FAIR data principles and requirements that data be public, archiving has become another task for publication. MT transfer functions can be archived at the IRIS SPUD EMTF, which provides a web service to query for available data. Anna Kelbert is the gatekeeper for archiving transfer functions and early on was tasked with developing an XML standard for MT transfer functions. That format (EMTFXML) is published here. XML can be complicated to read and format correctly, especially if there is no schema to validate against. Luckily, mt_metadata has tools to convert transfer functions to the EMTFXML format. Alternatively, you may use Anna Kelbert’s Fortran codes found here.
Below are a few examples of converting transfer functions to the EMTFXML format.
EDI to XML
Probably the most common format for MT transfer functions is the EDI format. It was developed in the 1980s and has been a staple ever since because of its flexibility and readability. However, that flexibility makes EDI files difficult to read in a standard way. MT-Metadata supports various flavors of EDI files, but probably not all of them, so if you find one that doesn’t read in properly, raise an issue here with the label transfer function.
[1]:
from mt_metadata import TF_EDI_RHO_ONLY
from mt_metadata.transfer_functions import TF
Read in the Transfer Function File
First, read in the transfer function file. Here we are reading it into the generic TF object. The TF object contains similar metadata to time series data, with the addition of a few things like how the transfer function was created.
The file we are reading is bare-bones and only contains resistivity and phase data for the off-diagonal components. Therefore, we are going to need to add a lot of metadata to make it compliant with EMTFXML.
You should also note a feature, sometimes a frustrating one: if the transfer function has no elevation data, the function mt_metadata.transfer_functions.io.tools.get_nm_elev is called. This only works for North America, and mostly the US. Until we figure out a better way, you will see an error message for locations outside that coverage. Don’t fret, this just means that the elevation could not be found and it is set to 0. You can skip the lookup by setting the keyword TF.read(get_elevation=False).
[2]:
tf_object = TF(TF_EDI_RHO_ONLY)
tf_object.read(get_elevation=True)
2023-09-27T15:23:34.213438-0700 | ERROR | mt_metadata.transfer_functions.io.tools | get_nm_elev | Input values (latitude=-34.646, longitude=137.006) could not be found on US National Map.
[3]:
tf_object = TF(TF_EDI_RHO_ONLY)
tf_object.read(get_elevation=False)
At this point we can do one of two things:
Fill in metadata into the TF object then write (maybe more intuitive if you have the time series metadata)
Convert to an EMTFXML object and fill in metadata directly (maybe faster if you are just filling in metadata from scratch)
Fill in Metadata
Before you start filling out fields, you should probably know what fields are there and what they mean. Let’s have a look at the fields.
[4]:
xml_object = tf_object.to_emtfxml()
Sections of an EMTFXML
Here are the key sections in an EMTFXML file, in order. Note that the names are written in lower case with underscores for easier typing; in the XML file they are in capitalized camel case, for example product_id -> ProductId.
[5]:
for section in xml_object.element_keys:
    print(section)
description
product_id
sub_type
notes
tags
external_url
primary_data
attachment
provenance
copyright
site
field_notes
processing_info
statistical_estimates
data_types
site_layout
data
period_range
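The naming convention noted above can be sketched with a small helper. This is only an illustration using the standard library: `snake_to_camel` is a hypothetical name, not part of mt_metadata, and it assumes each XML element name is formed simply by capitalizing every underscore-separated word.

```python
def snake_to_camel(name: str) -> str:
    """Convert a lower-case key such as 'product_id' to 'ProductId'."""
    # Capitalize each underscore-separated word, then join them together.
    return "".join(word.capitalize() for word in name.split("_"))


# A few of the section keys printed above.
for key in ["product_id", "sub_type", "external_url", "period_range"]:
    print(f"{key} -> {snake_to_camel(key)}")
```

The results line up with the section headings used in the rest of this page (ProductId, SubType, ExternalUrl, PeriodRange).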
Getting help with attributes of each section
[6]:
print(xml_object.provenance.__doc__)
+----------------------------------------------+-----------------------------------------------+----------------+
| **Metadata Key** | **Description** | **Example** |
+==============================================+===============================================+================+
| **create_time** | date and time the file was created | 2020-02-08T12:2|
| | | 3:40.324600+00:|
| Required: True | | 00 |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: date time | | |
| | | |
| Default: 1980-01-01T00:00:00+00:00 | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **creating_application** | name of the application that created the XML | EMTF File |
| | file | Conversion |
| Required: True | | Utilities 4.0 |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: free form | | |
| | | |
| Default: mt_metadata | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **creator.name** | author name | person name |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: free form | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **creator.email** | email of the contact person | mt.guru@em.org |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: email | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **creator.org** | organization name | mt gurus |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: free form | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **creator.org_url** | URL of organization | https://www.mt_|
| | | gurus.org |
| Required: False | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: url | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **submitter.name** | author name | person name |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: free form | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **submitter.email** | email of the contact person | mt.guru@em.org |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: email | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **submitter.org** | organization name | mt gurus |
| | | |
| Required: True | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: free form | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
| **submitter.org_url** | URL of organization | https://www.mt_|
| | | gurus.org |
| Required: False | | |
| | | |
| Units: None | | |
| | | |
| Type: string | | |
| | | |
| Style: url | | |
| | | |
| Default: None | | |
+----------------------------------------------+-----------------------------------------------+----------------+
Alternatively using print
If you just want to know what the element looks like in XML, you can use print with the to_xml method.
[7]:
print(xml_object.attachment.to_xml(string=True))
<?xml version="1.0" encoding="UTF-8"?>
<Attachment/>
1. Description
Description is a few words about what is included in the file. For MT transfer functions this will almost always be Magnetotelluric Transfer Functions, which is the default value.
[8]:
xml_object.description
[8]:
'Magnetotelluric Transfer Functions'
2. ProductID
The Product ID provides a unique identifier for the transfer function while including the survey, project, and station name. The format should be {project}.{station_id}.{year}; for example, data collected by a group at the USGS for station MT01 in 2020 would be USGS.MT01.2020.
[9]:
xml_object.product_id
[9]:
'Spencer Gulf.s08.2020'
[10]:
xml_object.product_id = "USGS.MT01.2020"
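If you are converting many stations, building the ID by hand gets error prone. The helper below is a minimal sketch of the {project}.{station_id}.{year} pattern; `make_product_id` is a hypothetical function, not part of mt_metadata, and rejecting components that contain a period is an assumption made here so that the ID can be split back apart unambiguously.

```python
def make_product_id(project: str, station_id: str, year: int) -> str:
    """Build a product ID of the form {project}.{station_id}.{year}."""
    parts = [str(project).strip(), str(station_id).strip(), str(int(year))]
    for part in parts:
        # An empty part or a '.' inside a part would make the ID ambiguous.
        if not part or "." in part:
            raise ValueError(f"invalid product ID component: {part!r}")
    return ".".join(parts)


print(make_product_id("USGS", "MT01", 2020))  # USGS.MT01.2020
```

The returned string could then be assigned to xml_object.product_id as in the cell above.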
3. SubType
SubType provides a secondary keyword for what the file includes. This will almost always be MT_TF, which is the default value.
[11]:
xml_object.sub_type = "MT_TF"
4. Notes
Notes allows for any free text notes about the file.
[12]:
xml_object.notes = "This is an example note"
5. Tags
Tags provides tags for what type of transfer functions are included in the file. Options are:
impedance
tipper
Note that tags are automatically updated when write is called, depending on the type of data the EMTFXML object contains.
[13]:
xml_object.tags
[13]:
'impedance'
6. ExternalUrl
ExternalUrl describes a URL outside the IRIS archive that the data is related to. For example, if the time series is archived in a different place, a link to that URL would be placed here.
It contains attributes
description: A description of where and what the URL points to
url: the actual URL link
[14]:
xml_object.external_url
[14]:
{
"external_url": {
"description": null,
"url": null
}
}
[15]:
xml_object.external_url.description = "This is a fake link to non existing time series data."
xml_object.external_url.url = "fake.data.test"
7. Primary Data
Primary data describes an image of the transfer function that is archived alongside the transfer function. It has a single attribute:
filename: file name of the image file; it should be a .png file for easier storage and viewing, and should be named {station_id}.png
[16]:
xml_object.primary_data.filename = "s08.png"
8. Attachment
Describes any attachments archived with the XML file. For example it is good practice to archive the original file along with the XML. In this example it would be the original EDI file. Attachment has two attributes
description: describing what is attached
filename: file name attached
[17]:
xml_object.attachment
[17]:
{
"attachment": {
"description": null,
"filename": null
}
}
[18]:
xml_object.attachment.description = "Original EDI file to produce XML"
xml_object.attachment.filename = TF_EDI_RHO_ONLY.as_posix()
9. Provenance
Provenance describes where the file came from, who created it, and who submitted it.
Note: create_time is updated to the time and date the file is written, and creating_application is filled in on write.
[19]:
xml_object.provenance
[19]:
{
"provenance": {
"create_time": "2020-12-15T00:00:00+00:00",
"creating_application": "DataManager",
"creator.email": null,
"creator.name": null,
"creator.org": null,
"submitter.email": null,
"submitter.name": "DataManager",
"submitter.org": null
}
}
[20]:
xml_object.provenance.creator.name = "me"
xml_object.provenance.creator.email = "my.email@email"
xml_object.provenance.creator.org = "my_organization"
xml_object.provenance.submitter.name = "me"
xml_object.provenance.submitter.email = "my.email@email"
xml_object.provenance.submitter.org = "my_organization"
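Before writing, it can help to confirm that none of the required provenance fields are still empty. The sketch below works on a flat dictionary like the one displayed above; `missing_provenance` is a hypothetical helper, and the list of required keys is taken from the "Required: True" entries in the help table earlier on this page. It assumes a flat dictionary could be obtained with something like xml_object.provenance.to_dict(single=True), as is done for the site section.

```python
# Creator and submitter keys marked "Required: True" in the provenance help table.
REQUIRED_PROVENANCE_KEYS = [
    "creator.name", "creator.email", "creator.org",
    "submitter.name", "submitter.email", "submitter.org",
]


def missing_provenance(flat: dict) -> list:
    """Return the required provenance keys that are absent, None, or empty."""
    return [key for key in REQUIRED_PROVENANCE_KEYS if not flat.get(key)]


# Values copied from the provenance output shown above, before filling in.
before = {
    "create_time": "2020-12-15T00:00:00+00:00",
    "creating_application": "DataManager",
    "creator.email": None,
    "creator.name": None,
    "creator.org": None,
    "submitter.email": None,
    "submitter.name": "DataManager",
    "submitter.org": None,
}

print(missing_provenance(before))
```

An empty list means every required creator and submitter field has a value.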
10. Copyright
Copyright provides details on the accessibility and usage of the transfer function data.
The attributes include:
conditions_of_use: how the data can be used and any disclaimer
acknowledgement: any acknowledgement to how the data was funded, collected, processed.
release_status: [‘Unrestricted Release’ | ‘Restricted Release’ ]
[21]:
xml_object.copyright
[21]:
{
"copyright": {
"conditions_of_use": "All data and metadata for this survey are available free of charge and may be copied freely, duplicated and further distributed provided this data set is cited as the reference. While the author(s) strive to provide data and metadata of best possible quality, neither the author(s) of this data set, not IRIS make any claims, promises, or guarantees about the accuracy, completeness, or adequacy of this information, and expressly disclaim liability for errors and omissions in the contents of this file. Guidelines about the quality or limitations of the data and metadata, as obtained from the author(s), are included for informational purposes only.",
"release_status": "Unrestricted Release"
}
}
[22]:
xml_object.copyright.acknowledgement = "The data collection was funded by someone and land permission was granted by the generous land owner."
11. Site
Site provides information about the site. There are a few tricky attributes.
data_quality_notes: these are meant to provide users with a qualitative first pass at the quality of the transfer function and which periods are useful. This is basically a judgement call and there is no standard way to rate the data, yet.
data_quality_notes.rating:
0: Not rated
1: bad
2: not terrible
3: mediocre
4: good
5: great
name: Should be the closest geographic location to the station. If you are collecting a site near Manhattan in New York City, a good name might be Manhattan, NYC
orientation: is meant to convey the orientation of the transfer function, not how the data were collected. So if you collected the data in an orthogonal coordinate system that was oriented to geomagnetic North, then processed the data and rotated to geographic North, the angle_to_geographic_north would be 0.
orientation.layout: [ ‘orthogonal’ | ‘sitelayout’ ]
[23]:
xml_object.site.to_dict(single=True)
[23]:
OrderedDict([('acquired_by', 'UofAdel,Scripps,GA,GSSA,AuScope'),
('data_quality_notes.rating', 0),
('id', 's08'),
('location.datum', 'WGS84'),
('location.elevation', 0.0),
('location.latitude', -34.646),
('location.longitude', 137.006),
('orientation.layout', 'orthogonal'),
('project', 'Spencer Gulf'),
('run_list', 's08a'),
('start', '2020-10-11T00:00:00+00:00'),
('survey', 'Spencer Gulf'),
('year_collected', 2020)])
[24]:
xml_object.site.data_quality_notes.to_dict(single=True, required=False)
[24]:
OrderedDict([('comments.author', None),
('comments.date', '1980-01-01T00:00:00+00:00'),
('comments.value', None),
('good_from_period', None),
('good_to_period', None),
('rating', 0)])
[25]:
xml_object.site.data_quality_notes.good_from_period = 1E-3 # high frequency / short period
xml_object.site.data_quality_notes.good_to_period = 1E3 # low frequency / long period
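Since the rating and the good period band are judgement calls entered by hand, a quick sanity check can catch slips such as swapped period limits. This is a minimal sketch: `check_quality_notes` and `RATING_LABELS` are hypothetical names, the labels come from the rating list above, and the rule that good_from_period must be positive and smaller than good_to_period is an assumption based on the short period / long period comments in the cell above.

```python
# Rating scale as listed in the Site section above.
RATING_LABELS = {
    0: "not rated",
    1: "bad",
    2: "not terrible",
    3: "mediocre",
    4: "good",
    5: "great",
}


def check_quality_notes(rating, good_from_period=None, good_to_period=None):
    """Return the label for a rating after basic sanity checks."""
    if rating not in RATING_LABELS:
        raise ValueError(f"rating must be one of {sorted(RATING_LABELS)}")
    if good_from_period is not None and good_to_period is not None:
        # The good band should run from a short period to a longer period.
        if not 0 < good_from_period < good_to_period:
            raise ValueError("expected 0 < good_from_period < good_to_period")
    return RATING_LABELS[rating]


print(check_quality_notes(4, 1e-3, 1e3))  # good
```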
12. FieldNotes
FieldNotes is meant to provide information on how the data were collected in the field, including what instruments were used, what the orientations were, and any notes that may be useful. This can include multiple runs if necessary.
13. ProcessingInfo
Provides information on how the transfer function was estimated.
[26]:
xml_object.processing_info.to_dict(single=True, required=False)
[26]:
OrderedDict([('process_date', '2020-12-15'),
('processed_by', None),
('processing_software.author', None),
('processing_software.last_mod', '1980-01-01'),
('processing_software.name', None),
('processing_tag', 's08a'),
('remote_info.site.acquired_by', None),
('remote_info.site.comments.author', None),
('remote_info.site.comments.date', '1980-01-01T00:00:00+00:00'),
('remote_info.site.comments.value', None),
('remote_info.site.country', None),
('remote_info.site.data_quality_notes.comments.author', None),
('remote_info.site.data_quality_notes.comments.date',
'1980-01-01T00:00:00+00:00'),
('remote_info.site.data_quality_notes.comments.value', None),
('remote_info.site.data_quality_notes.good_from_period', None),
('remote_info.site.data_quality_notes.good_to_period', None),
('remote_info.site.data_quality_notes.rating', None),
('remote_info.site.data_quality_warnings.flag', None),
('remote_info.site.end', '1980-01-01T00:00:00+00:00'),
('remote_info.site.id', None),
('remote_info.site.location.datum', None),
('remote_info.site.location.elevation', 0.0),
('remote_info.site.location.elevation_uncertainty', None),
('remote_info.site.location.latitude', 0.0),
('remote_info.site.location.latitude_uncertainty', None),
('remote_info.site.location.longitude', 0.0),
('remote_info.site.location.longitude_uncertainty', None),
('remote_info.site.location.x', None),
('remote_info.site.location.x2', None),
('remote_info.site.location.x_uncertainty', None),
('remote_info.site.location.y', None),
('remote_info.site.location.y2', None),
('remote_info.site.location.y_uncertainty', None),
('remote_info.site.location.z', None),
('remote_info.site.location.z2', None),
('remote_info.site.location.z_uncertainty', None),
('remote_info.site.name', None),
('remote_info.site.orientation.angle_to_geographic_north', 0.0),
('remote_info.site.orientation.layout', 'orthogonal'),
('remote_info.site.project', None),
('remote_info.site.run_list', ''),
('remote_info.site.start', '1980-01-01T00:00:00+00:00'),
('remote_info.site.survey', None),
('remote_info.site.year_collected', 1980),
('remote_ref.type', None),
('sign_convention', None)])
14. StatisticalEstimates
Provides boilerplate information on statistical estimates related to the transfer function in the XML file
[27]:
xml_object.statistical_estimates
[27]:
{
"statistical_estimates": {
"estimates_list": [
{
"estimate": {
"description": "Variance",
"external_url": "http://www.iris.edu/dms/products/emtf/variance.html",
"intention": "error estimate",
"name": "VAR",
"tag": "variance",
"type": "real"
}
}
]
}
}
15. DataTypes
Provides boilerplate information on the transfer function data types in the XML file
[28]:
xml_object.data_types
[28]:
{
"data_types": {
"data_types_list": [
{
"data_type": {
"description": "MT impedance",
"external_url": "http://www.iris.edu/dms/products/emtf/impedance.html",
"input": "H",
"intention": "primary data type",
"name": "Z",
"output": "E",
"tag": "impedance",
"type": "complex",
"units": "[mV/km]/[nT]"
}
}
]
}
}
16. SiteLayout
Provides concise information on how the station was set up and which channels were recorded as “input” and “output” channels.
[29]:
xml_object.site_layout
[29]:
{
"site_layout": {
"input_channels": [
{
"magnetic": {
"name": "Hx",
"orientation": 0.0,
"x": 0.0,
"y": 0.0,
"z": 0.0
}
},
{
"magnetic": {
"name": "Hy",
"orientation": 90.0,
"x": 0.0,
"y": 0.0,
"z": 0.0
}
}
],
"output_channels": [
{
"electric": {
"name": "Ex",
"orientation": 0.0,
"x": -5.0,
"x2": 5.0,
"y": 0.0,
"y2": 0.0,
"z": 0.0,
"z2": 0.0
}
},
{
"electric": {
"name": "Ey",
"orientation": 90.0,
"x": 0.0,
"x2": 0.0,
"y": -5.0,
"y2": 5.0,
"z": 0.0,
"z2": 0.0
}
}
]
}
}
17. Data
Provides the actual data as blocks per period. Filled in automatically from the data attributes of xml_object.data:
Attribute | Description
---|---
z | Impedance values
z_var | Impedance variance
z_invsigcov | Impedance inverse signal covariance
z_residcov | Impedance residual covariance
t | Tipper values
t_var | Tipper variance
t_invsigcov | Tipper inverse signal covariance
t_residcov | Tipper residual covariance
These can also be found in xml_object._transfer_function as an xarray.Dataset.
[30]:
print(xml_object.data.to_xml(string=True)[0:896])
<?xml version="1.0" encoding="UTF-8"?>
<Data count="28">
<Period value="7.939999015440e-03" units="secs">
<Z type="complex" size="2 2" units="[mV/km]/[nT]">
<value name="Zxx" output="Ex" input="Hx">1.000000e+32 1.000000e+32</value>
<value name="Zxy" output="Ex" input="Hy">1.081125e+01 7.785428e+00</value>
<value name="Zyx" output="Ey" input="Hx">-1.022391e+01 -7.619160e+00</value>
<value name="Zyy" output="Ey" input="Hy">1.000000e+32 1.000000e+32</value>
</Z>
<Z.var type="real" size="2 2">
<value name="Zxx" output="Ex" input="Hx">1.000000e+32</value>
<value name="Zxy" output="Ex" input="Hy">5.741604e-05</value>
<value name="Zyx" output="Ey" input="Hx">1.050861e-04</value>
<value name="Zyy" output="Ey" input="Hy">1.000000e+32</value>
</Z.var>
</Period
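To make the block structure concrete, here is a minimal sketch that reads the impedance values back out of a Data element using only the standard library. The sample is the first period from the output above, trimmed to the Z block and closed so that it is well-formed. Two things are assumptions, not statements about the format: that the two numbers in each value are the real and imaginary parts, and that 1.000000e+32 is a fill value for missing components. `read_impedance` is a hypothetical helper, not part of mt_metadata.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Data count="1">
  <Period value="7.939999015440e-03" units="secs">
    <Z type="complex" size="2 2" units="[mV/km]/[nT]">
      <value name="Zxx" output="Ex" input="Hx">1.000000e+32 1.000000e+32</value>
      <value name="Zxy" output="Ex" input="Hy">1.081125e+01 7.785428e+00</value>
      <value name="Zyx" output="Ey" input="Hx">-1.022391e+01 -7.619160e+00</value>
      <value name="Zyy" output="Ey" input="Hy">1.000000e+32 1.000000e+32</value>
    </Z>
  </Period>
</Data>"""

FILL_VALUE = 1e32  # assumption: 1.000000e+32 marks a missing entry


def read_impedance(xml_text):
    """Map each period in seconds to {component name: complex value or None}."""
    result = {}
    for period in ET.fromstring(xml_text).iter("Period"):
        z_block = period.find("Z")
        if z_block is None:
            continue
        components = {}
        for value in z_block.findall("value"):
            # Assumed layout: "<real> <imaginary>" separated by whitespace.
            real, imag = (float(x) for x in value.text.split())
            missing = real >= FILL_VALUE or imag >= FILL_VALUE
            components[value.get("name")] = None if missing else complex(real, imag)
        result[float(period.get("value"))] = components
    return result


for period, components in read_impedance(SAMPLE).items():
    print(period, components)
```

In this file only Zxy and Zyx carry values, which matches the note earlier that the EDI file only contains off-diagonal components.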
18. PeriodRange
Defines the period range. Automatically filled from the data.
[31]:
xml_object.period_range
[31]:
{
"period_range": {
"max": 2730.8332372990308,
"min": 0.007939999015440123
}
}
Write EMTFXML File
Once you are happy with the attributes, you can write the file out. Provide write with the full path of the file you want to write to.
[32]:
xml_object.write(TF_EDI_RHO_ONLY.parent.joinpath("example.xml"))
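Because XML is easy to get subtly wrong, a basic well-formedness check on the written file may be worthwhile before submitting it. The sketch below only confirms that the file parses and lists its top-level elements; it is not a schema validation. `check_well_formed` is a hypothetical helper, and the demo file written here is a tiny stand-in, not a real EMTFXML file.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def check_well_formed(path):
    """Parse an XML file and return (root tag, list of child element tags).

    Raises xml.etree.ElementTree.ParseError if the file is not well-formed.
    """
    root = ET.parse(str(path)).getroot()
    return root.tag, [child.tag for child in root]


# Demonstration on a small stand-in file written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.xml"
    demo.write_text(
        "<Root><ProductId>USGS.MT01.2020</ProductId></Root>", encoding="utf-8"
    )
    print(check_well_formed(demo))
```

For the file written above, the same function could be pointed at the path passed to write.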
Archive at IRIS
Now you have some XML files and you’d like to submit them to IRIS. Currently the way to do that is to email Anna Kelbert. More details to come.