life_td_data_generation.building module

Combines the data from the individual data providers.

building.assign_source_idref(cat: Table, sources: Table, paras: list[str], provider: str) → Table

Joins source identifiers to parameters in a catalog table.

For each parameter that has a reference column, this function:

  1. Validates and handles existing source ID columns.

  2. Processes null values in reference columns.

  3. Links reference data with source identifiers.

  4. Manages masked parameter values.

Parameters:
  • cat (Table) – Table with empty para_source_id columns.

  • sources (Table) – Table containing reference data.

  • paras (list[str]) – List of parameters to process.

  • provider (str) – Name of the data provider.

Returns:

Catalog table with parameter source IDs added.

Return type:

Table
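
The linking step of assign_source_idref can be pictured with plain Python dictionaries standing in for the astropy Tables. This is a minimal sketch under stated assumptions, not the module's implementation: link_source_ids is a hypothetical helper, and the column names ref, provider_name and source_id, as well as the <para>_ref naming pattern, are assumed for illustration.

```python
def link_source_ids(rows, sources, paras, provider):
    """Fill '<para>_source_id' by looking up '<para>_ref' in the sources list."""
    # Map each reference string of this provider to its source identifier.
    # (Column names 'ref', 'provider_name', 'source_id' are assumptions.)
    lookup = {
        s["ref"]: s["source_id"]
        for s in sources
        if s["provider_name"] == provider
    }
    linked = []
    for row in rows:
        new_row = dict(row)
        for para in paras:
            ref_col = f"{para}_ref"
            if ref_col not in row:
                continue  # parameter has no reference column: nothing to link
            # Unknown or missing references stay unlinked (None).
            new_row[f"{para}_source_id"] = lookup.get(row[ref_col])
        linked.append(new_row)
    return linked


# Placeholder data; 'provider_x', 'ref_a' and 'ref_b' are made-up labels.
sources = [
    {"source_id": 1, "ref": "ref_a", "provider_name": "provider_x"},
    {"source_id": 2, "ref": "ref_b", "provider_name": "provider_x"},
]
rows = [
    {"main_id": "star a", "mass": 1.0, "mass_ref": "ref_b"},
    {"main_id": "star b", "mass": 0.8, "mass_ref": None},
]
linked = link_source_ids(rows, sources, ["mass", "radius"], "provider_x")
```

In this sketch a parameter without a reference column ("radius") is skipped, and a null reference leaves the source ID empty.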

building.assign_type(cat: Table, i: int) → str

Assigns an object type based on two potential type columns.

Priority is given to ‘type_2’; if it is masked or ‘None’, the other type column is used instead.

Parameters:
  • cat (Table) – The table containing type columns.

  • i (int) – Index of the row to process.

Returns:

The assigned type string.

Return type:

str
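
The priority rule of assign_type can be sketched on two plain values. This is an illustration only: pick_type is a hypothetical helper, the name type_1 for the fallback column is an assumption, and a masked entry is represented here by Python's None.

```python
def pick_type(type_1, type_2):
    """Return type_2 unless it is masked (None here) or the string 'None'."""
    if type_2 is None or type_2 == "None":
        return type_1  # fall back to the other type column
    return type_2
```

With this sketch, pick_type("st", "pl") returns "pl", while pick_type("st", None) and pick_type("st", "None") both return "st".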

building.best_para(para: str, mes_table: Table) → Table

Selects the highest quality measurement for each object in the table.

Parameters:
  • para (str) – Parameter name (e.g., ‘mass’, ‘id’).

  • mes_table (Table) – Table containing measurements.

Returns:

Table with highest quality rows for each unique object.

Return type:

Table
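
A per-object selection like the one best_para describes might look as follows on a list of dictionaries. This is a sketch under assumptions: the quality flags 'A' (best) to 'E' plus '?' for unknown, the <para>_qual and object_idref column names, and the helper best_rows are all hypothetical.

```python
# Assumed ordering of quality flags: lower rank means better quality.
QUALITY_RANK = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4, "?": 5}


def best_rows(rows, para):
    """Keep, for each object, the row with the best quality flag."""
    qual_col = f"{para}_qual"
    best = {}
    for row in rows:
        key = row["object_idref"]
        # Flags outside the assumed scale rank below every known flag.
        rank = QUALITY_RANK.get(row[qual_col], len(QUALITY_RANK))
        # On ties the first row encountered is kept.
        if key not in best or rank < best[key][0]:
            best[key] = (rank, row)
    return [row for _, row in best.values()]


measurements = [
    {"object_idref": 1, "mass_value": 1.0, "mass_qual": "C"},
    {"object_idref": 1, "mass_value": 1.1, "mass_qual": "A"},
    {"object_idref": 2, "mass_value": 0.5, "mass_qual": "?"},
]
best = best_rows(measurements, "mass")
```

Here object 1 keeps its 'A' measurement and object 2 keeps its only row.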

building.best_para_id(mes_table: Table) → Table

Selects the best identifier for each object based on reference priority.

Parameters:

mes_table (Table) – Measurement table for identifiers.

Returns:

Table with prioritized identifier rows.

Return type:

Table

building.best_para_membership(mes_table: Table) → Table

Selects the best membership measurement for parent-child pairs.

Chooses the row with the maximum membership value.

Parameters:

mes_table (Table) – Measurement table for membership.

Returns:

Table with best membership entries.

Return type:

Table
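
Choosing the maximum membership value per parent-child pair can be sketched as below. The column names parent_object_idref, child_object_idref and membership, and the helper best_membership, are assumptions made for this example.

```python
def best_membership(rows):
    """Keep, for each (parent, child) pair, the row with the largest membership."""
    best = {}
    for row in rows:
        pair = (row["parent_object_idref"], row["child_object_idref"])
        if pair not in best or row["membership"] > best[pair]["membership"]:
            best[pair] = row
    return list(best.values())


pairs = [
    {"parent_object_idref": 1, "child_object_idref": 2, "membership": 40},
    {"parent_object_idref": 1, "child_object_idref": 2, "membership": 90},
    {"parent_object_idref": 1, "child_object_idref": 3, "membership": 10},
]
kept = best_membership(pairs)
```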

building.best_parameters_ingestion(cat_mes: Table, cat_basic: Table, para: str, columns: list[str] | None = None) → Table

Updates a basic table with the best measurements from a measurement table.

Parameters:
  • cat_mes (Table) – Table containing multiple measurements.

  • cat_basic (Table) – Table to be updated.

  • para (str) – Parameter name.

  • columns (list[str] or None) – Columns to remove from cat_basic before joining.

Returns:

Updated basic table.

Return type:

Table
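
The update performed by best_parameters_ingestion can be pictured as dropping stale columns and then joining in the selected measurements. The sketch below assumes left-join behaviour (objects without a measurement are kept with empty values) and an object_idref key; neither is confirmed by this page, and ingest_best is a hypothetical helper.

```python
def ingest_best(basic_rows, best_rows, columns=None):
    """Replace the given columns of basic_rows with values from best_rows."""
    columns = list(columns or [])
    best_by_object = {row["object_idref"]: row for row in best_rows}
    updated = []
    for row in basic_rows:
        # Drop the stale columns, mirroring the removal before the join.
        new_row = {k: v for k, v in row.items() if k not in columns}
        match = best_by_object.get(row["object_idref"], {})
        for col in columns:
            # Assumed left join: objects without a measurement get None.
            new_row[col] = match.get(col)
        updated.append(new_row)
    return updated


basic = [
    {"object_idref": 1, "mass_value": None},
    {"object_idref": 2, "mass_value": None},
]
selected = [{"object_idref": 1, "mass_value": 1.1}]
result = ingest_best(basic, selected, ["mass_value"])
```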

building.build_objects_table(cat: dict[str, Table], prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Builds the objects table and assigns unique object IDs.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.build_provider_table(cat: dict[str, Table], prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Builds the provider table.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.build_rest_of_tables(cat: dict[str, Table], prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Builds all remaining tables (basic, measurements, etc.) and performs links.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.build_sources_table(prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Initializes the catalog and builds the unique sources table.

Parameters:

prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

Returns:

Dictionary containing the initialized sources table.

Return type:

dict[str, Table]

building.build_tables(prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Orchestrates the building of all database tables.

Parameters:

prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

Returns:

Dictionary of built tables.

Return type:

dict[str, Table]
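
The signatures on this page suggest how the build steps compose: one step creates the cumulative dictionary from prov_tables_dict alone, and the remaining steps each take and return that dictionary. The sketch below only illustrates this threading pattern. The step order, the helper run_build and the stub functions are assumptions; the real build functions are not called.

```python
def run_build(prov_tables_dict, init_step, later_steps):
    """Thread the cumulative dictionary through a sequence of build steps."""
    cat = init_step(prov_tables_dict)      # creates the dictionary
    for step in later_steps:
        cat = step(cat, prov_tables_dict)  # each step adds its tables
    return cat


# Hypothetical stand-ins that only record which table they would build.
def stub_sources(prov_tables_dict):
    return {"sources": []}


def stub_step(table_name):
    def step(cat, prov_tables_dict):
        updated = dict(cat)
        updated[table_name] = []
        return updated
    return step


tables = run_build(
    {"provider_x": {}},
    stub_sources,
    [stub_step("provider"), stub_step("objects"), stub_step("basic")],
)
```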

building.building(prov_tables_dict: dict[str, dict[str, Table]]) → dict[str, Table]

Builds the complete LIFE database from provider tables and saves it.

Parameters:

prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary containing data from providers Simbad, Grant Kennedy, Exo-MerCat, Gaia and WDS.

Returns:

Dictionary of processed tables.

Return type:

dict[str, Table]

building.idsjoin(cat: Table, column_ids1: str, column_ids2: str) → Table

Merges two identifier columns into one, removing duplicates.

  • Merge identifiers across both columns per row.

  • Remove duplicate identifiers within the same row.

  • Return the result row by row, with the identifiers of each row sorted.

Parameters:
  • cat (Table) – Astropy Table containing two identifier columns.

  • column_ids1 (str) – Name of the first identifier column.

  • column_ids2 (str) – Name of the second identifier column.

Returns:

Table with a unified ‘ids’ column containing merged, unique identifiers.

Return type:

Table
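
For a single row, the three steps listed for idsjoin amount to a set union followed by a sort. The sketch below assumes that each identifier column holds a delimiter-separated string with "|" as the separator; that format and the helper merge_ids are assumptions for illustration.

```python
def merge_ids(ids1, ids2, sep="|"):
    """Merge two identifier strings, dropping duplicates and sorting."""
    # Empty strings (from empty cells) are filtered out before merging.
    parts = [p for p in ids1.split(sep) + ids2.split(sep) if p]
    return sep.join(sorted(set(parts)))
```

For example, merge_ids("b|a", "a|c") returns "a|b|c", and merge_ids("", "x") returns "x".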

building.join_different_provider_data(cat: dict[str, Table], o_merging: bool, prov_name: str, prov_tables_dict: dict[str, dict[str, Table]], table_name: str) → dict[str, Table]

Joins data from a specific provider into the main table.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • o_merging (bool) – Whether to perform object-level merging.

  • prov_name (str) – Name of the provider.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

  • table_name (str) – Name of the table to merge.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.matching_parameters(cat: dict[str, Table], prov_name: str, prov_tables_dict: dict[str, dict[str, Table]], table_name: str) → dict[str, Table]

Redefines source reference columns with their corresponding IDs.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • prov_name (str) – Name of the provider.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary of provider tables.

  • table_name (str) – Name of the table to process.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.merge_table(cat1: Table, cat2: Table) → Table

Merges two tables.

If either table is empty, vstack is used instead, because join would fail.

Parameters:
  • cat1 (Table) – First table to merge.

  • cat2 (Table) – Second table to merge.

Returns:

Merged table.

Return type:

Table

building.objectmerging(cat: Table) → Table

Merges the entries that different providers supply for the same object.

Entries that describe the same physical object but come from different providers are combined into a single entry.

Parameters:

cat (Table) – Table containing multiple entries for the same objects.

Returns:

Table with unique object entries.

Return type:

Table
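
Collapsing several provider entries into one entry per object can be sketched as a group-and-union. The grouping key main_id, the "|"-separated ids column and the helper merge_objects are assumptions; the sketch merges identifiers only and leaves out any other columns the function may handle.

```python
def merge_objects(rows, sep="|"):
    """Combine rows sharing a main_id into one row with the union of their ids."""
    merged = {}
    for row in rows:
        entry = merged.setdefault(row["main_id"], set())
        entry.update(p for p in row["ids"].split(sep) if p)
    return [
        {"main_id": main_id, "ids": sep.join(sorted(ids))}
        for main_id, ids in merged.items()
    ]


entries = [
    {"main_id": "star a", "ids": "id 1|id 2"},   # from one provider
    {"main_id": "star a", "ids": "id 2|id 3"},   # same object, other provider
    {"main_id": "star b", "ids": "id 9"},
]
unique = merge_objects(entries)
```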

building.provider_data_merging(cat: dict[str, Table], table_name: str, prov_tables_dict: dict[str, dict[str, Table]], o_merging: bool = False, para_match: bool = False) → dict[str, Table]

Merges the data from the different providers for a specific table.

Parameters:
  • cat (dict[str, Table]) – Dictionary of cumulative tables.

  • table_name (str) – Name of the table to build/merge.

  • prov_tables_dict (dict[str, dict[str, Table]]) – Dictionary mapping providers to their tables.

  • o_merging (bool) – Whether to perform object-level merging (ids and types).

  • para_match (bool) – Whether to perform source ID matching.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]

building.unify_null_values(cat: dict[str, Table]) → dict[str, Table]

Replaces the null markers ‘N’ and ‘N/A’ with ‘?’ in specific tables and columns.

Parameters:

cat (dict[str, Table]) – Dictionary of cumulative tables.

Returns:

Updated dictionary of cumulative tables.

Return type:

dict[str, Table]
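
On a single column, the replacement that unify_null_values describes reduces to a membership test. The sketch below works on a plain list rather than a Table column; unify_nulls is a hypothetical helper, and which tables and columns are affected is not specified here.

```python
NULL_MARKERS = {"N", "N/A"}


def unify_nulls(values):
    """Replace the null markers 'N' and 'N/A' with '?'; keep other values."""
    return ["?" if value in NULL_MARKERS else value for value in values]
```

For example, unify_nulls(["A", "N", "N/A", "?"]) returns ["A", "?", "?", "?"].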