Getty

The main objective of this project is to develop an ETL pipeline that meets both existing and future data modeling requirements for the STARDATA database and ensures a smooth, reliable transfer of data into Arches.

Industry: Cultural Heritage & Preservation

Time: 3 years

Type: Semantics & Software Development

Project Idea

The main objective of this project is to develop an ETL pipeline that meets both existing and future data modeling requirements for the STARDATA database and ensures a smooth, reliable transfer of data into Arches. The ultimate aim is to let users navigate and explore these datasets in a structured and semantically rich manner through the Arches platform, unlocking their full potential for research, analysis, and knowledge sharing.

Team

  • Senior Software Engineer
  • Software Engineer
  • AI Engineer
  • Data Engineer
  • Operations & Communication Manager


Activities

  • Support new developments after the following triggers:
    • model change
    • newly modeled raw field
    • generic bug fix
    • static data changes
  • Detailed reporting using Jira
  • ETL workflow updates
  • Support for loading and querying on Arches
  • Data normalization & transformation
  • Arches customizations
  • Administration panel with AI functionalities
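
As a rough illustration of the normalization and transformation activity listed above, the following is a minimal sketch of an extract-transform-load flow. Everything in it is an assumption made for illustration: the field names (`id`, `owner`, `year`), the `ProvenanceRecord` shape, and the in-memory list that stands in for the load target are hypothetical and do not reflect the actual STARDATA schema or any Arches API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    # Hypothetical target shape; not the real STARDATA or Arches model.
    record_id: str
    owner: str
    year: Optional[int]


def extract(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield raw rows, skipping any row without an identifier."""
    for row in rows:
        if row.get("id"):
            yield row


def transform(row: dict) -> ProvenanceRecord:
    """Collapse whitespace and parse the year only when it is a plain integer."""
    owner = " ".join(str(row.get("owner", "")).split())
    raw_year = str(row.get("year", "")).strip()
    year = int(raw_year) if raw_year.isdigit() else None
    return ProvenanceRecord(str(row["id"]).strip(), owner, year)


def load(records: Iterable[ProvenanceRecord], target: List[ProvenanceRecord]) -> int:
    """Append records to a stand-in target and return how many were loaded."""
    count = 0
    for record in records:
        target.append(record)
        count += 1
    return count


def run_pipeline(rows: Iterable[dict], target: List[ProvenanceRecord]) -> int:
    """Chain the three stages lazily so rows are processed one at a time."""
    return load((transform(r) for r in extract(rows)), target)
```

Keeping each stage as a separate function means that a model change or a newly modeled raw field would, in a design like this, only touch `transform`, while the extract and load steps stay unchanged.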

Goal

The goal is the successful launch of the Getty Art Provenance Index, which contains over 12 million records and helps researchers understand the ownership history of artworks, that is, how these objects have moved across the world.

Technologies and Tools

To ensure seamless software development, we use the most suitable tech stack for your project.

Python

React

Java

Node.js

Arches