mt_metadata.utils.summarize
Created on Tue Feb 23 11:52:35 2021
- copyright:
Jared Peacock (jpeacock@usgs.gov)
- license:
MIT
This module provides functionality to summarize metadata standards from both legacy BaseDict-based objects and modern Pydantic v2 MetadataBase objects.
The main functions are:
- summarize_timeseries_standards(): Legacy function for BaseDict objects
- summarize_pydantic_standards(): New function for Pydantic v2 MetadataBase objects
- extract_metadata_fields_from_pydantic(): Extract fields from individual Pydantic classes
- summarize_standards(): Unified interface supporting both legacy and Pydantic systems
- Example usage:
# For Pydantic v2 objects (recommended)
>>> df = summarize_standards(module="timeseries")
# Extract fields from individual class
>>> from mt_metadata.timeseries import Survey
>>> fields = extract_metadata_fields_from_pydantic(Survey)
# Get BaseDict-compatible summary
>>> summary = summarize_pydantic_standards()
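Attributes
SUMMARIZE_DTYPE
Functions
extract_metadata_fields_from_pydantic(metadata_class) | Extract field information from a Pydantic v2 MetadataBase class definition.
collect_basemodel_objects(module) | Collect all MetadataBase subclasses from a given module.
summarize_pydantic_standards([module]) | Summarize the standards for metadata using Pydantic v2 MetadataBase classes.
summary_to_array(summary_dict[, dtype]) | Convert a summarized dictionary of standards into a numpy structured array.
summary_to_dataframe(summary_dict) | Convert a summary dictionary to a pandas DataFrame.
summarize_standards([module, csv_fn, output_type, dtype]) | Summarize standards as a pandas DataFrame or a numpy structured array, and write a CSV file if requested.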
Module Contents
- mt_metadata.utils.summarize.SUMMARIZE_DTYPE
- mt_metadata.utils.summarize.extract_metadata_fields_from_pydantic(metadata_class)
Extract field information from a Pydantic v2 MetadataBase class definition and convert it to a format compatible with BaseDict.
- Parameters:
metadata_class (type) – A MetadataBase class (not instance)
- Returns:
Dictionary with field information compatible with BaseDict
- Return type:
dict
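As a rough illustration of the kind of introspection involved, the sketch below reads field metadata from a Pydantic v2 model through its `model_fields` mapping. The `Station` model, the `fields_to_dict` helper, and the output keys (`type`, `required`, `default`, `description`) are hypothetical; the actual function operates on MetadataBase classes and may emit a different set of keys.

```python
from pydantic import BaseModel, Field


class Station(BaseModel):
    """Hypothetical stand-in for a MetadataBase subclass."""

    id: str = Field(description="Station identifier")
    elevation: float = Field(default=0.0, description="Elevation in meters")


def fields_to_dict(model_class):
    """Return a plain dict describing each field of a Pydantic v2 model class."""
    summary = {}
    for name, info in model_class.model_fields.items():
        required = info.is_required()
        summary[name] = {
            "type": getattr(info.annotation, "__name__", str(info.annotation)),
            "required": required,
            # Required fields have no usable default, so report None.
            "default": None if required else info.default,
            "description": info.description,
        }
    return summary


station_fields = fields_to_dict(Station)
```

Note that the class itself is passed, not an instance, which matches the `metadata_class (type)` parameter described above.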
- mt_metadata.utils.summarize.collect_basemodel_objects(module)
Collect all MetadataBase subclasses from a given module.
- Parameters:
module (str) – The module to inspect (e.g., ‘mt_metadata.timeseries’)
- Returns:
Dictionary mapping class objects to their names
- Return type:
dict[type[MetadataBase], str]
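One plausible way to collect subclasses from a module is with the standard library's `importlib` and `inspect`, sketched below. The `Base` class, the `collect_subclasses` helper, and the throwaway `demo_metadata` module are hypothetical stand-ins so the example runs without mt_metadata installed; this is not necessarily how collect_basemodel_objects is implemented.

```python
import importlib
import inspect
import sys
import types


class Base:
    """Hypothetical stand-in for MetadataBase."""


def collect_subclasses(module_name, base=Base):
    """Map each subclass of `base` found in the named module to its class name."""
    module = importlib.import_module(module_name)
    return {
        obj: name
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base) and obj is not base
    }


# Build a throwaway module so the example is self-contained.
demo = types.ModuleType("demo_metadata")
demo.Survey = type("Survey", (Base,), {})
demo.Station = type("Station", (Base,), {})
demo.Helper = type("Helper", (), {})  # not a Base subclass, so it is skipped
sys.modules["demo_metadata"] = demo

found = collect_subclasses("demo_metadata")
```

The result is keyed by class object with the class name as the value, mirroring the documented `dict[type[MetadataBase], str]` return type.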
- mt_metadata.utils.summarize.summarize_pydantic_standards(module='timeseries')
Summarize the standards for metadata using Pydantic v2 MetadataBase classes. Similar to summarize_timeseries_standards but works with the new Pydantic structure.
- Parameters:
module (str, optional) – The module to inspect, by default “timeseries”
- Returns:
BaseDict object containing summarized field information
- Return type:
BaseDict
- mt_metadata.utils.summarize.summary_to_array(summary_dict, dtype=SUMMARIZE_DTYPE)
Convert a summarized dictionary of standards into a numpy structured array.
- Parameters:
summary_dict (dict) – Dictionary of summarized standards
dtype (optional) – dtype of the output array, by default SUMMARIZE_DTYPE
- Returns:
numpy structured array
- Return type:
numpy.ndarray
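To make the idea of a structured array concrete, here is a minimal sketch that packs a small summary dictionary into one. `DEMO_DTYPE`, the layout of `summary`, and the `to_structured_array` helper are assumptions for illustration; the real SUMMARIZE_DTYPE and summary dictionary may use different fields.

```python
import numpy as np

# Hypothetical dtype; the real SUMMARIZE_DTYPE may define other fields.
DEMO_DTYPE = np.dtype(
    [("attribute", "U72"), ("type", "U15"), ("required", np.bool_)]
)

# Hypothetical summary: attribute name -> field information.
summary = {
    "survey.id": {"type": "string", "required": True},
    "station.elevation": {"type": "float", "required": False},
}


def to_structured_array(summary_dict, dtype=DEMO_DTYPE):
    """Pack one row per attribute into a numpy structured array."""
    rows = [
        (name, info["type"], info["required"])
        for name, info in summary_dict.items()
    ]
    return np.array(rows, dtype=dtype)


table = to_structured_array(summary)
```

Each column can then be read by field name, for example `table["attribute"]`.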
- mt_metadata.utils.summarize.summary_to_dataframe(summary_dict)
Convert a summary dictionary to a pandas DataFrame.
- Parameters:
summary_dict (dict) – Dictionary of summarized standards
- Returns:
DataFrame containing the summarized standards
- Return type:
pd.DataFrame
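A comparable conversion to a DataFrame might look like the sketch below, which emits one row per attribute. The layout of `summary` and the `to_dataframe` helper are hypothetical, and the columns produced by summary_to_dataframe may differ.

```python
import pandas as pd

# Hypothetical summary: attribute name -> field information.
summary = {
    "survey.id": {"type": "string", "required": True},
    "station.elevation": {"type": "float", "required": False},
}


def to_dataframe(summary_dict):
    """Build a DataFrame with one row per attribute."""
    records = [
        {"attribute": name, **info} for name, info in summary_dict.items()
    ]
    return pd.DataFrame(records)


frame = to_dataframe(summary)
```

Building the frame from a list of records keeps the row order of the input dictionary and puts the attribute name in an ordinary column rather than the index.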
- mt_metadata.utils.summarize.summarize_standards(module='timeseries', csv_fn=None, output_type='dataframe', dtype=SUMMARIZE_DTYPE)
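Summarize standards as a pandas DataFrame or a numpy structured array, and write a CSV file if csv_fn is given.
- Parameters:
module (str, optional) – Module to summarize, by default “timeseries”
csv_fn (str or Path, optional) – Full path to write a csv file, by default None
output_type (str, optional) – Form of the returned summary, either “array” or “dataframe”, by default “dataframe”
dtype (optional) – dtype of the structured array, by default SUMMARIZE_DTYPE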
- Returns:
If output_type is “array”, returns a numpy structured array. If output_type is “dataframe”, returns a pandas DataFrame.
- Return type:
numpy.ndarray | pd.DataFrame
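The interplay of `csv_fn` and `output_type` can be sketched as follows. The `export_summary` helper and the demo data are hypothetical, and raising `ValueError` for an unknown `output_type` is an assumption rather than documented behavior of summarize_standards.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_summary(frame, csv_fn=None, output_type="dataframe"):
    """Optionally write `frame` to CSV, then return it in the requested form."""
    if output_type not in ("dataframe", "array"):
        raise ValueError(f"unsupported output_type: {output_type!r}")
    if csv_fn is not None:
        frame.to_csv(Path(csv_fn), index=False)
    if output_type == "array":
        # to_records() returns a numpy record array, a structured ndarray subclass.
        return frame.to_records(index=False)
    return frame


demo = pd.DataFrame({"attribute": ["survey.id"], "type": ["string"]})
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "standards.csv"
    records = export_summary(demo, csv_fn=csv_path, output_type="array")
    written = csv_path.read_text().splitlines()
```

Writing the file before converting means the CSV content is the same whichever return form is requested.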