Data Engineer
Jiaru Liu

I notice patterns, follow half-formed questions, and build through the mess until scattered ideas become something real.

Hi there, I'm Jiaru.

Also Claire

5+ years designing infrastructure for analytics, product, and AI teams - real-time streaming, cloud migration, Medallion architecture, governance, MLOps, and now productionizing LLM pipelines.

I build pipelines that move, transform, and unlock data at scale. From raw event streams to production ML systems, I care about infrastructure that's fast, trusted, and ready for whatever comes next.

AT THE HEART OF DATA IS AN OPPORTUNITY TO PROBLEM SOLVE

Experience

Where I've been

A timeline of the roles that shaped my engineering and analytical thinking.

GoGuardian

Los Angeles, CA

Data Engineer II

Feb 2023 – Present
  • Led platform-wide migration from AWS ETL to Databricks Lakehouse, delivering ~$400K in annual cost savings
  • Designed scalable browsing data pipeline with Spark and Python, processing 60M–800M records/day
  • Built streaming pipeline with AWS Kinesis, S3, and Spark to bring app event click data into the Lakehouse
  • Built reverse ETL integrations with HubSpot and Salesforce APIs to enable reliable data syncs and power automated business workflows
  • Drove data quality via dbt Medallion architecture models, reducing duplicate reporting by 40%
  • Productionized 5+ ML/LLM pipelines and implemented PII governance with Unity Catalog
  • Mentored junior engineers and led cross-team data platform initiatives
Databricks · Spark Streaming · Delta Lake · dbt · MLOps · Unity Catalog · Kinesis Firehose · MongoDB

Data Engineer I

Apr 2021 – Feb 2023
  • Built ETL pipelines ingesting data from 30+ sources into AWS S3 data lake and Redshift warehouse
  • Established Airflow from scratch - custom operators, hooks, and DAGs across the full stack
  • Automated infrastructure provisioning with Terraform across multiple AWS environments
  • Designed customer usage reporting models, cutting CSM query time by 75–80%
AWS Batch · AWS Glue · S3 · Redshift · Airflow · Terraform

Kroll Bond Rating Agency

New York, NY

Data Engineer Intern

Oct 2020 – Jan 2021
  • Built CNN models in TensorFlow for financial time series anomaly detection on 50+ GiB datasets
  • Developed Python packages with full GitLab CI/CD - unit tests, static checks, and automated publishing
  • Containerized workloads with Docker and deployed across multiple Terraformed environments
  • Presented findings to 50+ engineers; recognized by leadership for pioneering engineering solutions
Python · TensorFlow · Pandas · Docker · GitLab CI/CD · Terraform

Regatta Craft Mixers

New York, NY

Student Consultant

Jun – Jul 2020
  • Researched 8 major competitors and market trends in the craft mixer space
  • Cleaned a full year of Facebook social media data in Python and visualized patterns in Tableau
  • Delivered a tiered marketing strategy for grocery store market entry
Python · Tableau

Emerson

Saint Louis, MO

Student Consultant

Jan – May 2020
  • Assessed data utilization across Emerson's $6B+ Commercial and Residential Solutions business unit
  • Standardized and prioritized marketing KPIs through interviews with Marketing and IT leaders
  • Designed data gap frameworks that reduced recurring reporting work by ~20%
  • Delivered a concrete implementation roadmap for the marketing team
Data Analysis · KPI Design

Education

Academic Background

2019 – 2021

M.S. Business Analytics

Washington University in Saint Louis

Focused on data analytics, statistical modeling, and business intelligence. Coursework spanned machine learning, data visualization, SQL, and applied analytics for business decision-making.

Saint Louis, MO

2015 – 2019

B.M. E-commerce

Dalian University of Technology

Studied e-commerce systems, management information systems, and digital marketing. Built a foundation in business analytics, programming, and data-driven strategy.

Dalian, China

Certifications

Credentials & Courses
Cloud & Data Engineering
Databricks

Databricks Certified Data Engineer Associate

Jun 2024
Amazon Web Services

AWS Certified Cloud Practitioner

Apr 2026
Udemy

The Complete Hands-On Introduction to Apache Airflow

Jun 2021
AI & Machine Learning
DAIR.AI

Advanced AI Agents

Feb 2026
DAIR.AI

Introduction to RAG

Feb 2026
DAIR.AI

Prompt Engineering For Developers

Jan 2026
Coursera

Introduction to TensorFlow for AI, ML, and Deep Learning

May 2020
Analytics & Visualization
Udemy

Hands-On Tableau Training For Data Science

May 2020

Let's Connect

Say hello.

Always happy to talk about data engineering challenges, architecture decisions, AI systems, or just life.

Send me an email · Connect on LinkedIn · GitHub · Download Resume