oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope
Module Contents
Classes
IrusFulcrumRelease | Create an IrusFulcrumRelease instance.
IrusFulcrumTelescope | The Fulcrum Telescope
Functions
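download_fulcrum_month_data(download_month, requestor_id, num_retries=3) | Download Fulcrum data for the release month
transform_fulcrum_data(totals_data, country_data, publishers=None) | Transforms Fulcrum downloaded "totals" and "country" data.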
Attributes
- oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope.IRUS_FULCRUM_ENDPOINT_TEMPLATE = 'https://irus.jisc.ac.uk/api/v3/irus/reports/irus_ir/?platform=235&requestor_id={requestor_id}&beg...'[source]
- class oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope.IrusFulcrumRelease(dag_id, run_id, data_interval_start, data_interval_end, partition_date)[source]
Bases:
observatory.platform.workflows.workflow.PartitionRelease
Create an IrusFulcrumRelease instance.
- Parameters:
dag_id (str) – The ID of the DAG
run_id (str) – The airflow run ID
data_interval_start (pendulum.DateTime) – The beginning of the data interval
data_interval_end (pendulum.DateTime) – The end of the data interval
partition_date (pendulum.DateTime) – The release/partition date
- class oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope.IrusFulcrumTelescope(dag_id, cloud_workspace, publishers, data_partner='irus_fulcrum', bq_dataset_description='IRUS dataset', bq_table_description=None, api_dataset_id='fulcrum', observatory_api_conn_id=AirflowConns.OBSERVATORY_API, irus_oapen_api_conn_id='irus_api', catchup=True, schedule='0 0 4 * *', start_date=pendulum.datetime(2022, 4, 1))[source]
Bases:
observatory.platform.workflows.workflow.Workflow
The Fulcrum Telescope
- Parameters:
dag_id (str) – The ID of the DAG
cloud_workspace (observatory.platform.observatory_config.CloudWorkspace) – The CloudWorkspace object for this DAG
publishers (List[str]) – The publishers pertaining to this DAG instance (as listed in Fulcrum)
data_partner (Union[str, oaebu_workflows.oaebu_partners.OaebuPartner]) – The name of the data partner
bq_dataset_description (str) – Description for the BigQuery dataset
bq_table_description (str) – Description for the BigQuery table
api_dataset_id (str) – The ID to store the dataset release in the API
observatory_api_conn_id (str) – Airflow connection ID for the Observatory API
irus_oapen_api_conn_id (str) – Airflow connection ID for OAPEN IRUS UK (COUNTER 5)
catchup (bool) – Whether to catch up the DAG or not
schedule (str) – The schedule interval of the DAG
start_date (pendulum.DateTime) – The start date of the DAG
- make_release(**kwargs)[source]
Create an IrusFulcrumRelease instance. Dates are best explained with an example. Say the DAG is scheduled to run on 2022-04-07:
interval_start will be 2022-03-01
interval_end will be 2022-04-01
partition_date will be 2022-03-31
- Return type:
IrusFulcrumRelease
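The date example for make_release can be reproduced with a short sketch. This is an illustration only, written with the standard library rather than pendulum, and the helper name `release_dates` is hypothetical; it is not the telescope's own code. It assumes the interval always covers the calendar month before the run date, as the example suggests.

```python
from datetime import date, timedelta
from typing import Tuple


def release_dates(run_date: date) -> Tuple[date, date, date]:
    """Return (interval_start, interval_end, partition_date) for a run date.

    Hypothetical helper: assumes the release covers the calendar month
    preceding the month in which the DAG runs.
    """
    # interval_end: first day of the month in which the DAG runs
    interval_end = run_date.replace(day=1)
    # partition_date: last day of the previous month
    partition_date = interval_end - timedelta(days=1)
    # interval_start: first day of the previous month
    interval_start = partition_date.replace(day=1)
    return interval_start, interval_end, partition_date


start, end, partition = release_dates(date(2022, 4, 7))
print(start, end, partition)  # 2022-03-01 2022-04-01 2022-03-31
```

Deriving all three dates from the first day of the run month keeps the sketch correct across year boundaries, for example a January run date yields a December partition date.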
- download(release, **kwargs)[source]
Task to download the Fulcrum data for a release
- Parameters:
release (IrusFulcrumRelease) – The IrusFulcrumRelease instance
- upload_downloaded(release, **kwargs)[source]
Upload the downloaded Fulcrum data to the Google Cloud download bucket
- Parameters:
release (IrusFulcrumRelease) –
- transform(release, **kwargs)[source]
Task to transform the Fulcrum data
- Parameters:
release (IrusFulcrumRelease) –
- upload_transformed(release, **kwargs)[source]
Upload the transformed Fulcrum data to the Google Cloud transform bucket
- Parameters:
release (IrusFulcrumRelease) –
- bq_load(release, **kwargs)[source]
Load the transformed data into BigQuery
- Parameters:
release (IrusFulcrumRelease) –
- Return type:
None
- add_new_dataset_releases(release, **kwargs)[source]
Adds release information to API.
- Parameters:
release (IrusFulcrumRelease) –
- Return type:
None
- cleanup(release, **kwargs)[source]
Delete all files and folders associated with this release.
- Parameters:
release (IrusFulcrumRelease) –
- Return type:
None
- oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope.download_fulcrum_month_data(download_month, requestor_id, num_retries=3)[source]
Download Fulcrum data for the release month
- Parameters:
download_month (pendulum.DateTime) – The month to download usage data from
requestor_id (str) – The requestor ID, used to access the IRUS platform
num_retries (str) – Number of attempts to make for the URL
- Return type:
Tuple[List[dict], List[dict]]
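To illustrate what a bounded number of attempts might look like, here is a minimal sketch of a retry wrapper. Everything in it is an assumption made for illustration: `fetch_with_retries`, `flaky_fetch`, and the record fields are hypothetical, and the real function may handle retries quite differently. The stub stands in for a network call and returns a pair of lists in the same shape as the documented return type.

```python
from typing import Callable, List, Tuple

Records = List[dict]


def fetch_with_retries(
    fetch: Callable[[], Tuple[Records, Records]], num_retries: int = 3
) -> Tuple[Records, Records]:
    """Call fetch() up to num_retries times and return the first successful result."""
    last_error = None
    for _ in range(num_retries):
        try:
            return fetch()
        except ConnectionError as error:
            # Remember the failure and try again until attempts run out
            last_error = error
    raise RuntimeError(f"Download failed after {num_retries} attempts") from last_error


# Hypothetical stub that fails twice before succeeding
calls = {"count": 0}


def flaky_fetch() -> Tuple[Records, Records]:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    totals = [{"title": "Example Book"}]
    country = [{"title": "Example Book", "country": "NZ"}]
    return totals, country


totals_data, country_data = fetch_with_retries(flaky_fetch, num_retries=3)
print(calls["count"])  # 3
```

Passing the fetch step in as a callable keeps the sketch runnable without network access and makes the attempt limit easy to test.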
- oaebu_workflows.irus_fulcrum_telescope.irus_fulcrum_telescope.transform_fulcrum_data(totals_data, country_data, publishers=None)[source]
Transforms Fulcrum downloaded “totals” and “country” data.
- Parameters:
totals_data (List[dict]) – Fulcrum usage data aggregated over all countries
country_data (List[dict]) – Fulcrum usage data split by country
publishers (List[str]) – Fulcrum publishers to retain. If None, use all publishers
- Return type:
List[dict]
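The behaviour of the publishers argument (retain the listed publishers, or all of them when it is None) can be sketched as a simple filter. This is a hedged illustration, not the library's transform: the helper `filter_by_publisher` and the "publisher" and "title" field names are hypothetical, since the actual record layout is not described here.

```python
from typing import List, Optional


def filter_by_publisher(rows: List[dict], publishers: Optional[List[str]] = None) -> List[dict]:
    """Keep rows whose (hypothetical) "publisher" field is listed; keep all rows if None."""
    if publishers is None:
        # No filter requested: return every row
        return list(rows)
    wanted = set(publishers)
    return [row for row in rows if row.get("publisher") in wanted]


rows = [
    {"title": "Book A", "publisher": "Press One"},
    {"title": "Book B", "publisher": "Press Two"},
]
kept = filter_by_publisher(rows, ["Press One"])
print([row["title"] for row in kept])  # ['Book A']
```

Treating None as "no filter" rather than "empty filter" mirrors the documented default, so callers that omit the argument keep every publisher.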