import_hub_main module

import_hub_main.find_closest_name(col_names: list, targets: str) → str

Find the closest column name based on substrings.

Parameters:
  • col_names (list) – List of column names to search through

  • targets (str) – String containing the target column names

Returns:

Closest column name

Return type:

str
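The matching rule is not spelled out above, so the following is only a minimal sketch of one way to pick a column "based on substrings": score each column by the length of its longest common substring with the target text. The name `closest_by_substring` and the scoring rule are assumptions for illustration, not the module's actual implementation.

```python
from difflib import SequenceMatcher


def closest_by_substring(col_names: list, targets: str) -> str:
    """Hypothetical stand-in: return the column sharing the longest substring with targets."""
    target = targets.lower()

    def longest_common(col: str) -> int:
        name = col.lower()
        matcher = SequenceMatcher(None, name, target, autojunk=False)
        # Length of the longest contiguous block found in both strings.
        return matcher.find_longest_match(0, len(name), 0, len(target)).size

    # max() raises ValueError when col_names is empty; ties go to the first column.
    return max(col_names, key=longest_common)


columns = ["customer_id", "order_date", "total_amount"]
best = closest_by_substring(columns, "date, order date")
```

Here `best` is `"order_date"`, because it shares the five-character run `order` with the target text, more than either of the other columns.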

import_hub_main.main()

Main function for the Streamlit app.

import_hub_main.populate_db(df: DataFrame, db_name: str, mappings: dict, config_path: str)

Populate the database with the given dataframe.

Parameters:
  • df (pd.DataFrame) – Dataframe containing the data to be inserted into the database

  • db_name (str) – Name of the database to insert the data into

  • mappings (dict) – Dictionary containing the mappings between the CSV columns and the database tables

  • config_path (str, optional) – Path to the config file. Defaults to ‘config.yaml’.
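The structure of `mappings` is not documented here. As a rough sketch of the idea only, the example below assumes a hypothetical shape in which each CSV column maps to a `(table, column)` pair, and uses the standard-library `sqlite3` module with plain dictionaries standing in for DataFrame rows. The helper `insert_mapped_rows`, the table name, and the sample data are all illustrative assumptions, not part of `import_hub_main`.

```python
import sqlite3

# Hypothetical mapping shape: CSV column -> (table, column).
mappings = {
    "Name": ("customers", "name"),
    "Email": ("customers", "email"),
}

# Plain dicts stand in for DataFrame rows in this sketch.
rows = [
    {"Name": "Ada", "Email": "ada@example.com"},
    {"Name": "Lin", "Email": "lin@example.com"},
]


def insert_mapped_rows(conn, rows, mappings):
    """Insert each row into the tables named in the (assumed) mapping."""
    # Group the CSV columns by destination table.
    by_table = {}
    for csv_col, (table, db_col) in mappings.items():
        by_table.setdefault(table, []).append((csv_col, db_col))

    for table, pairs in by_table.items():
        db_cols = ", ".join(db_col for _, db_col in pairs)
        placeholders = ", ".join("?" for _ in pairs)
        # Identifiers are interpolated directly, so the mapping must be trusted.
        sql = f"INSERT INTO {table} ({db_cols}) VALUES ({placeholders})"
        values = [tuple(row[csv_col] for csv_col, _ in pairs) for row in rows]
        conn.executemany(sql, values)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
insert_mapped_rows(conn, rows, mappings)
inserted = conn.execute("SELECT name, email FROM customers ORDER BY name").fetchall()
```

Values are passed as bound parameters, while table and column names come straight from the mapping, so a mapping built from user input would need validation first.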

import_hub_main.preprocess_string(s: str) → str

Preprocess a string: convert it to lowercase, replace underscores with spaces, tokenize it, and reconstruct it from the tokens without special characters.

Parameters:

s (str) – String to preprocess

Returns:

Preprocessed string

Return type:

str
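The steps described for preprocess_string can be sketched as follows. This is a minimal illustration, assuming that "tokenizing" means splitting into runs of letters and digits and that tokens are rejoined with single spaces; the real function may use a different tokenizer. The name `preprocess_string_sketch` is hypothetical.

```python
import re


def preprocess_string_sketch(s: str) -> str:
    """Hypothetical stand-in for the lowercase / underscore / tokenize steps."""
    # Lowercase and treat underscores as word separators.
    lowered = s.lower().replace("_", " ")
    # Keep only runs of ASCII letters and digits; this drops special characters.
    tokens = re.findall(r"[a-z0-9]+", lowered)
    # Rebuild the string with single spaces between tokens.
    return " ".join(tokens)


cleaned = preprocess_string_sketch("First_Name (ID#)")
```

With this input, `cleaned` is `"first name id"`. Non-ASCII letters are discarded by the pattern used here, which may or may not match the module's behavior.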