=== Data Ingestion Methods ===
* <q>'''ETL Pipeline (Semi-automatic – MinIO)'''</q>
# '''Apache HOP Integration''': [https://unitapedia.univ-unita.eu/hop/ Apache HOP] serves as the primary '''Extract-Transform-Load (ETL)''' tool. Using dedicated jobs, it periodically checks all the [https://unitapedia.univ-unita.eu/minio/ MinIO] buckets configured for the indicators, where UNITA partners upload CSV or other structured files.
# '''Data Transformation''': Once [https://unitapedia.univ-unita.eu/hop/ Apache HOP] detects new files in [https://unitapedia.univ-unita.eu/minio/ MinIO], it cleanses and transforms the data according to predefined mappings and rules (e.g., converting date formats, normalizing institution names).
# '''Loading into Data Warehouse''': After validation, the transformed data is loaded into the [https://unitapedia.univ-unita.eu/pga/ PostgreSQL] data warehouse, ensuring consistent schemas and reliable storage. Any errors or exceptions (e.g., missing columns, incorrect data types) are logged and reported back to the relevant partners.
* <q>'''ETL Pipeline (Manual – Strapi forms)'''</q>
# '''User Submission''': For data that cannot be generated automatically, UNITA offices fill out [https://unitapedia.univ-unita.eu/strapi/ Strapi] forms for each indicator.
# '''Validation and Approval''': Basic validation rules (e.g., mandatory fields, numeric ranges) are applied at form submission. Where needed, designated coordinators such as Task Leaders or Project Managers can review and approve entries before they are transferred to [https://unitapedia.univ-unita.eu/hop/ Apache HOP] for transformation and integration.
# '''Data Transformation''': Once [https://unitapedia.univ-unita.eu/hop/ Apache HOP] detects new entries in the [https://unitapedia.univ-unita.eu/strapi/ Strapi] database on [https://unitapedia.univ-unita.eu/pga/ PostgreSQL], it cleanses and transforms the data according to predefined mappings and rules (e.g., converting date formats, normalizing institution names).
# '''Loading into Data Warehouse''': After validation, the transformed data is loaded into the [https://unitapedia.univ-unita.eu/pga/ PostgreSQL] data warehouse, ensuring consistent schemas and reliable storage. Any errors or exceptions (e.g., missing columns, incorrect data types) are logged and reported back to the relevant partners.
* <q>'''Batch vs. Near real-time Ingestion'''</q>
# '''Batch Frequency''': In most cases, ingestion jobs run on a schedule, daily or weekly, depending on data volume and the nature of the indicators. For example, monthly metrics on student mobility may only require a weekly refresh.
# '''On-Demand Updates''': When critical data (e.g., newly completed deliverables, urgent progress metrics) must be reflected quickly in dashboards, users can trigger an on-demand ETL job via the [https://unitapedia.univ-unita.eu/hop/ Apache HOP] server.
[[File:Architecture Data.jpg|thumb|930px|center|Logical Architecture Data Input]]
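The transformation and validation rules described above (date conversion, institution-name normalization, reporting of missing columns and bad data types) can be illustrated with a minimal Python sketch. This is not the Apache HOP pipeline itself, which is designed in HOP's own tooling. The column names, accepted date formats, and alias table below are all hypothetical and serve only to show the kind of rules involved.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names; the real indicator files may differ.
REQUIRED_COLUMNS = ("institution", "date", "value")

# Hypothetical alias table used to normalize institution names.
INSTITUTION_ALIASES = {
    "univ. example": "Example University",
    "example univ": "Example University",
}

# Assumed input date formats, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD) or raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")


def normalize_institution(raw):
    """Collapse whitespace, then map known aliases to one canonical name."""
    cleaned = " ".join(raw.split())
    return INSTITUTION_ALIASES.get(cleaned.lower(), cleaned)


def transform(csv_text):
    """Return (clean_rows, errors) for the content of one uploaded CSV file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        # A structural problem rejects the whole file.
        return [], [f"missing columns: {', '.join(missing)}"]

    rows, errors = [], []
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        try:
            rows.append({
                "institution": normalize_institution(row["institution"] or ""),
                "date": normalize_date(row["date"] or ""),
                "value": float(row["value"] or ""),
            })
        except ValueError as exc:
            # Bad rows are collected so they can be reported to the partner.
            errors.append(f"line {line_no}: {exc}")
    return rows, errors
```

In this sketch, a file with a missing column is rejected as a whole, while a row with an invalid date or non-numeric value is skipped and recorded in the error list. Whether the actual pipeline rejects files or individual rows is not stated in this section, so that split is an assumption.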