curryer.correction.dataio
Helpers for querying and downloading NetCDF data from AWS S3.
All interactions rely on the boto3 S3 client. Callers may either provide an
explicit client instance (useful for testing) or rely on the default client, in
which case boto3 must be installed and AWS credentials are read from the
standard AWS_* environment variables.
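The client-selection rule described above can be sketched as follows. This is an illustration only, not the module's code: the helper name `resolve_client`, the `StubClient` class, and the choice of `RuntimeError` are assumptions. The only real boto3 call used is `boto3.client("s3")`.

```python
try:
    import boto3  # optional dependency
except ImportError:  # boto3 is not installed
    boto3 = None


def resolve_client(client=None):
    """Return the caller's client, or build a default boto3 S3 client."""
    if client is not None:
        return client
    if boto3 is None:
        # Assumed error type; the real module may raise something else.
        raise RuntimeError("boto3 is required when no S3 client is supplied")
    return boto3.client("s3")


class StubClient:
    """Hypothetical stand-in for a boto3 S3 client, e.g. in unit tests."""


stub = StubClient()
same_client = resolve_client(stub)  # an explicit client is returned unchanged
```

Passing a stub this way keeps tests independent of boto3 and of any AWS credentials.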
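Attributes
boto3
Classes
TelemetryLoader | Protocol for mission-specific telemetry loading functions.
ScienceLoader | Protocol for mission-specific science frame loading functions.
GCPLoader | Protocol for mission-specific GCP (Ground Control Point) loading functions.
S3Configuration | Configuration describing how data is organised within an S3 bucket.
Functions
validate_telemetry_output(df, config) | Validate that telemetry loader output has expected structure.
validate_science_output(df, config) | Validate that science loader output has expected structure.
_require_client(client)
_iter_dates(start, end)
find_netcdf_objects(config, start_date, end_date, *, s3_client=None) | Return S3 object keys for NetCDF files in the given date range.
download_netcdf_objects(config, object_keys, destination, *, s3_client=None) | Download the specified S3 objects to destination.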
Module Contents
- curryer.correction.dataio.boto3 = None
- class curryer.correction.dataio.TelemetryLoader
Bases: Protocol
Protocol for mission-specific telemetry loading functions.
Telemetry loaders are responsible for reading spacecraft state data (position, attitude, timing) from mission-specific formats and returning it in a standard DataFrame format.
- Standard Signature:
def load_telemetry(tlm_key: str, config) -> pd.DataFrame
- Requirements:
  - Accept tlm_key (path or identifier) and config object
  - Return DataFrame with mission-specific telemetry fields
  - Include time fields needed for SPICE kernel creation
  - Include attitude data (quaternions or DCMs)
  - Include position data if creating SPK kernels
Example
    def load_clarreo_telemetry(tlm_key: str, config) -> pd.DataFrame:
        # Load from multiple CSV files
        # Convert formats (DCM to quaternion, etc.)
        # Merge and return
        return telemetry_df
- __call__(tlm_key: str, config) -> pandas.DataFrame
Load telemetry data for a given key.
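A minimal sketch of a function that fits the telemetry loader signature is shown below. Everything specific in it is an assumption made for illustration: the function name `load_example_telemetry`, the single-CSV input, and the column names (`time_gps_us`, `q_w`, ...) are hypothetical and are not required by curryer.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical column names: one time field plus an attitude quaternion.
REQUIRED_COLUMNS = ["time_gps_us", "q_w", "q_x", "q_y", "q_z"]

EXAMPLE_CSV = """time_gps_us,q_w,q_x,q_y,q_z,pos_x_km
1000000,1.0,0.0,0.0,0.0,7000.0
2000000,1.0,0.0,0.0,0.0,6999.9
"""


def load_example_telemetry(tlm_key: str, config) -> pd.DataFrame:
    """Read one CSV file and check that time and attitude columns exist."""
    telemetry_df = pd.read_csv(tlm_key)
    missing = [c for c in REQUIRED_COLUMNS if c not in telemetry_df.columns]
    if missing:
        raise ValueError(f"Telemetry file lacks columns: {missing}")
    return telemetry_df


# Write the sample data to a temporary file and load it through the loader.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "telemetry.csv"
    csv_path.write_text(EXAMPLE_CSV)
    telemetry_df = load_example_telemetry(str(csv_path), config=None)
```

Because the protocol is structural, any callable that accepts `(tlm_key, config)` and returns a DataFrame can serve as a loader; no subclassing is needed.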
- class curryer.correction.dataio.ScienceLoader
Bases: Protocol
Protocol for mission-specific science frame loading functions.
Science loaders provide frame timing and metadata for the instrument observations that will be geolocated.
- Standard Signature:
def load_science(sci_key: str, config) -> pd.DataFrame
- Requirements:
  - Accept sci_key (path or identifier) and config object
  - Return DataFrame with frame timing data
  - Must include time field specified in config.geo.time_field
  - Time values should match expected format (e.g., GPS microseconds)
Example
    def load_clarreo_science(sci_key: str, config) -> pd.DataFrame:
        # Load frame timestamps
        # Convert to required units (e.g., GPS µs)
        return science_df
- __call__(sci_key: str, config) -> pandas.DataFrame
Load science frame timing/metadata.
- class curryer.correction.dataio.GCPLoader
Bases: Protocol
Protocol for mission-specific GCP (Ground Control Point) loading functions.
GCP loaders retrieve reference imagery or coordinates for ground truth comparison.
- Standard Signature:
def load_gcp(gcp_key: str, config) -> Any
Note
This interface is currently a placeholder. The return type and structure will be standardized when GCP loading is fully integrated into the pipeline.
Example
    def load_clarreo_gcp(gcp_key: str, config):
        # Load Landsat reference image
        # Or load GCP coordinate database
        return gcp_data
- __call__(gcp_key: str, config)
Load GCP reference data.
- curryer.correction.dataio.validate_telemetry_output(df: pandas.DataFrame, config) -> None
Validate that telemetry loader output has expected structure.
- Parameters:
df – DataFrame returned by telemetry loader
config – CorrectionConfig object
- Raises:
TypeError – If not a DataFrame
ValueError – If DataFrame is empty
Note
Specific column requirements depend on mission and kernel configs. This performs basic structure checks only.
- curryer.correction.dataio.validate_science_output(df: pandas.DataFrame, config) -> None
Validate that science loader output has expected structure.
- Parameters:
df – DataFrame returned by science loader
config – CorrectionConfig object
- Raises:
TypeError – If not a DataFrame
ValueError – If DataFrame is empty or missing required time field
Example
>>> sci_df = load_science("sci_001", config)
>>> validate_science_output(sci_df, config)
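The documented checks (type, emptiness, required time field) can be sketched as below. This is an assumed re-implementation for illustration, not the library's code: the function name `validate_science_sketch`, the error messages, and the time field name `corrected_timestamp` are hypothetical, and a `SimpleNamespace` stands in for a real CorrectionConfig.

```python
from types import SimpleNamespace

import pandas as pd


def validate_science_sketch(df, config) -> None:
    """Raise TypeError or ValueError following the rules listed above."""
    if not isinstance(df, pd.DataFrame):
        raise TypeError(f"Expected DataFrame, got {type(df).__name__}")
    if df.empty:
        raise ValueError("Science DataFrame is empty")
    time_field = config.geo.time_field
    if time_field not in df.columns:
        raise ValueError(f"Missing required time field: {time_field!r}")


# Stand-in config exposing only the attribute the check reads.
config = SimpleNamespace(geo=SimpleNamespace(time_field="corrected_timestamp"))

good = pd.DataFrame({"corrected_timestamp": [1.0e15, 1.0e15 + 1.0e6]})
validate_science_sketch(good, config)  # valid input passes silently

# Collect the error type raised for each kind of invalid input.
errors = []
for bad in ([1, 2, 3], pd.DataFrame(), pd.DataFrame({"other": [1]})):
    try:
        validate_science_sketch(bad, config)
    except (TypeError, ValueError) as exc:
        errors.append(type(exc).__name__)
```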
- class curryer.correction.dataio.S3Configuration(bucket: str, base_prefix: str)
Configuration describing how data is organised within an S3 bucket.
- bucket
- base_prefix
- date_prefix(date: datetime.date) -> str
Return the S3 prefix for date.
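The page does not state how the date is encoded in the prefix. The sketch below assumes a `<base_prefix>/<YYYYMMDD>/` layout purely for illustration; the class name `S3ConfigurationSketch`, the bucket name, and the prefix are hypothetical, and the real layout may differ.

```python
import datetime
from dataclasses import dataclass


@dataclass(frozen=True)
class S3ConfigurationSketch:
    """Illustrative stand-in for S3Configuration."""

    bucket: str
    base_prefix: str

    def date_prefix(self, date: datetime.date) -> str:
        # Assumed layout: <base_prefix>/<YYYYMMDD>/
        base = self.base_prefix.strip("/")
        day = date.strftime("%Y%m%d")
        return f"{base}/{day}/" if base else f"{day}/"


cfg = S3ConfigurationSketch(bucket="example-bucket", base_prefix="L1a/nadir/")
prefix = cfg.date_prefix(datetime.date(2017, 1, 15))
```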
- curryer.correction.dataio._require_client(client: object | None) -> object
- curryer.correction.dataio._iter_dates(start: datetime.date, end: datetime.date) -> collections.abc.Iterable[datetime.date]
- curryer.correction.dataio.find_netcdf_objects(config: S3Configuration, start_date: datetime.date, end_date: datetime.date, *, s3_client=None) -> list[str]
Return S3 object keys for NetCDF files in the given date range.
- Parameters:
config (S3Configuration) – Describes the bucket and prefix layout.
start_date (datetime.date) – First date (inclusive) of the range to scan for NetCDF files.
end_date (datetime.date) – Last date (inclusive) of the range to scan for NetCDF files.
s3_client (boto3 S3 client, optional) – Client instance to use. If omitted, a default client is created.
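One way such a search could work is sketched below: walk the inclusive date range, list each day's prefix, and keep keys that look like NetCDF files. This is not the module's implementation. The function names, the `.nc` suffix filter, the `data/<YYYYMMDD>/` prefix layout, and the `StubS3` class are assumptions; `list_objects_v2` with `Bucket`, `Prefix`, and `ContinuationToken` is the real boto3 S3 client call that the stub imitates.

```python
import datetime


def iter_dates(start, end):
    """Yield every date from start to end, both inclusive."""
    current = start
    while current <= end:
        yield current
        current += datetime.timedelta(days=1)


def find_netcdf_keys(client, bucket, prefix_for_date, start, end):
    """Collect keys ending in '.nc' under each day's prefix."""
    keys = []
    for day in iter_dates(start, end):
        kwargs = {"Bucket": bucket, "Prefix": prefix_for_date(day)}
        while True:
            page = client.list_objects_v2(**kwargs)
            for item in page.get("Contents", []):
                if item["Key"].endswith(".nc"):  # assumed NetCDF suffix
                    keys.append(item["Key"])
            if not page.get("IsTruncated"):
                break
            # Request the next page of a truncated listing.
            kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return keys


class StubS3:
    """Hypothetical in-memory client exposing only list_objects_v2."""

    def __init__(self, keys):
        self._keys = keys

    def list_objects_v2(self, Bucket, Prefix, **_):
        matches = [{"Key": k} for k in self._keys if k.startswith(Prefix)]
        return {"Contents": matches, "IsTruncated": False}


stub = StubS3([
    "data/20240101/a.nc",
    "data/20240101/readme.txt",
    "data/20240102/b.nc",
    "data/20240104/c.nc",
])
found = find_netcdf_keys(
    stub,
    "example-bucket",
    lambda d: f"data/{d:%Y%m%d}/",
    datetime.date(2024, 1, 1),
    datetime.date(2024, 1, 3),
)
```

The key dated 2024-01-04 is skipped because it falls outside the requested range, and the text file is skipped by the suffix filter.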
- curryer.correction.dataio.download_netcdf_objects(config: S3Configuration, object_keys: collections.abc.Iterable[str], destination: os.PathLike[str] | str, *, s3_client=None) -> list[pathlib.Path]
Download the specified S3 objects to destination.
- Parameters:
config (S3Configuration) – Describes the bucket hosting the objects.
object_keys (iterable of str) – S3 object keys to download.
destination (path-like) – Directory where the files should be stored. It is created if needed.
s3_client (boto3 S3 client, optional) – Client instance to use. If omitted, a default client is created.
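The download step can be sketched in the same spirit. The code below is an assumed illustration rather than the module's implementation: the function name `download_keys`, the choice to store each object under its bare file name, and the `StubS3` class are hypothetical. `download_file(Bucket, Key, Filename)` is the real boto3 S3 client method that the stub imitates.

```python
import tempfile
from pathlib import Path


def download_keys(client, bucket, object_keys, destination):
    """Download each key into destination and return the local paths."""
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    paths = []
    for key in object_keys:
        local = dest / Path(key).name  # assumed: keep only the file name
        client.download_file(bucket, key, str(local))
        paths.append(local)
    return paths


class StubS3:
    """Hypothetical client that writes a small placeholder file."""

    def download_file(self, Bucket, Key, Filename):
        Path(Filename).write_bytes(b"stub:" + Key.encode())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "downloads"  # does not exist yet
    paths = download_keys(
        StubS3(), "example-bucket", ["data/20240101/a.nc"], target
    )
    names = [p.name for p in paths]
    all_exist = all(p.exists() for p in paths)
```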