Automated HRIS Document Extraction & Decoding

Aug 2024

Problem / Purpose

During the HRIS migration, the legacy system offered no way to bulk-download stored documents (PDFs and images). Manual extraction did not scale, and the documents were held as encoded data that had to be decoded before they could be stored.

Solution

Developed an automated solution to extract encoded employee and client documents via the HRIS API, storing metadata and file references in Snowflake. Built a Python-based decoding pipeline to handle large, encoded data chunks, manage invalid file types, and output files into a structured shared drive using standardized naming and folder conventions. This enabled clean handoff to the new HRIS and allowed document access for clients when needed.
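
A minimal sketch of the decoding step described above, not the production pipeline. It assumes the documents arrive as consecutive slices of a single Base64 string (the actual encoding is not stated here), that file types can be recognized by their leading magic bytes, and that files are filed under a client/employee folder. All function names, the folder layout, and the naming pattern are hypothetical.

```python
import base64
import binascii
import re
from pathlib import Path

# Assumed magic-byte signatures for the file types mentioned (PDFs and images).
SIGNATURES = {
    b"%PDF-": ".pdf",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
}


def decode_chunks(chunks):
    """Join Base64 text chunks and decode them; return None if the payload is invalid."""
    # Assumption: the chunks are consecutive slices of one Base64 string.
    payload = re.sub(r"\s+", "", "".join(chunks))
    try:
        return base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None


def detect_extension(data):
    """Return a file extension based on the leading bytes, or None if unrecognized."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return None


def safe_name(text):
    """Reduce an identifier to characters that are safe in file and folder names."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", text).strip("_")


def save_document(root, client_id, employee_id, doc_id, chunks):
    """Decode one document and write it under root/client/employee.

    Returns the written path, or None when the payload is not valid Base64
    or the decoded bytes are not a recognized file type.
    """
    data = decode_chunks(chunks)
    if data is None:
        return None
    ext = detect_extension(data)
    if ext is None:
        return None
    client, employee, doc = safe_name(client_id), safe_name(employee_id), safe_name(doc_id)
    folder = Path(root) / client / employee
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{client}_{employee}_{doc}{ext}"
    target.write_bytes(data)
    return target
```

Returning None for undecodable payloads or unrecognized file types lets the caller log the skipped document ID alongside its metadata instead of writing an unusable file to the shared drive.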

Key Achievements / Impact

Eliminated the need for manual document extraction across thousands of client records. Delivered a scalable, organized archive of HRIS documents that could be imported into the new system and referenced reliably. Improved traceability and operational efficiency during the HRIS migration.

Key Technologies / Tools Used

Python, API, Snowflake, File Decoding, Process Automation, Document Management, HRIS

Role

Data Scientist

ProService Hawaii