Automated HRIS Document Extraction & Decoding
Aug 2024
Problem / Purpose
During the HRIS migration, the legacy system offered no way to bulk-download stored documents (PDFs and images). Manual extraction did not scale, and the documents were held as encoded data that had to be decoded before they could be stored as usable files.
Solution
Automated the extraction of encoded employee and client documents through the HRIS API, storing metadata and file references in Snowflake. Built a Python decoding pipeline that handles large encoded data chunks, manages invalid file types, and writes files to a structured shared drive using standardized naming and folder conventions. This enabled a clean handoff to the new HRIS and gave clients access to their documents when needed.
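The decoding step described above might look roughly like the sketch below. It is illustrative only and not the production pipeline. It assumes each document arrives as one Base64 string split into text chunks (the write-up does not name the encoding), that valid files can be recognized by their leading magic bytes, and that the folder layout is client / employee. The names `SIGNATURES`, `decode_chunks`, `detect_extension`, `safe_name`, and `save_document` are hypothetical.

```python
import base64
import re
from pathlib import Path

# Magic-byte prefixes for the accepted file types (an assumed list).
SIGNATURES = {
    b"%PDF-": ".pdf",
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
}


def decode_chunks(chunks):
    """Join Base64 text chunks (assumed encoding) and decode to raw bytes."""
    # Drop any whitespace so line-wrapped chunks still validate.
    joined = "".join("".join(chunk.split()) for chunk in chunks)
    return base64.b64decode(joined, validate=True)


def detect_extension(data):
    """Return a file extension based on leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return None


def safe_name(text):
    """Reduce an identifier to letters, digits, and underscores."""
    return re.sub(r"[^A-Za-z0-9]+", "_", str(text)).strip("_")


def save_document(root, client_id, employee_id, doc_type, doc_id, chunks):
    """Decode one document and write it under root/client/employee/.

    Returns the written path, or None when the payload cannot be decoded
    or is not a recognized file type.
    """
    try:
        data = decode_chunks(chunks)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None

    ext = detect_extension(data)
    if ext is None:
        return None

    folder = Path(root) / safe_name(client_id) / safe_name(employee_id)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{safe_name(doc_type)}_{safe_name(doc_id)}{ext}"
    target.write_bytes(data)
    return target
```

Returning None instead of raising lets a batch run continue past a bad record; the caller can record the skipped document ID alongside the metadata so invalid files remain traceable.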
Key Achievements / Impact
Eliminated the need for manual document extraction across thousands of client records. Delivered a scalable, organized archive of HRIS documents that could be imported into the new system and referenced reliably. Improved traceability and operational efficiency during the HRIS migration.



Key Technologies / Tools Used
Python, API, Snowflake, File Decoding, Process Automation, Document Management, HRIS
Role
Data Scientist
ProService Hawaii