Data Engineer, TrustFoundry.ai
AI is transforming everything, yet it still fails at the most fundamental requirement: understanding and respecting the law. TrustFoundry is fixing that. We’re building the legal intelligence engine that will power trustworthy AI — a system that delivers accurate, traceable legal search across federal, state, and case law, and exposes every citation, passage, and lineage for machine consumption. Our mission is to eliminate hallucinated legal answers and make the world’s laws accessible to both humans and intelligent agents. If you want to help define how AI interacts with the real world — and prevent the real-world consequences of bad legal reasoning — this is the place to do it.
We are seeking a Data Engineer to own and evolve our Python-based data platform. You will design and operate ETL/ELT pipelines across GCP, working in a Dockerized, Terraform-managed environment, and integrating messy, unique legal and regulatory data sources into reliable, production-grade datasets. This role is ideal for someone who likes to go deep, untangle complexity, and use modern data engineering tools and AI-assisted workflows to move quickly and thoughtfully.
What You'll Do
Design, implement, and maintain robust ETL/ELT pipelines in Python running on GCP.
Work within Docker-based development and runtime environments; troubleshoot multi-container setups and cross-service dependencies.
Use Terraform to provision and manage cloud infrastructure (GCP services, networking, IAM, storage, etc.) supporting the data platform.
Acquire, normalize, and model messy, unstructured, and domain-specific data sources into well-defined schemas and data products.
Implement and uphold data engineering best practices, including data quality checks, lineage, observability, monitoring, and documentation.
Collaborate with application engineers and product stakeholders to ensure data models and pipelines align with product, analytics, and ML needs.
Leverage AI tools to understand existing codebases, accelerate onboarding, assist in refactoring, and debug complex data flows.
What We're Looking For
Strong experience with Python in production data engineering contexts (ETL/ELT, batch/stream, workflow orchestration).
Hands-on experience with Docker (multi-service dev environments, debugging environment and dependency issues).
Practical experience with GCP (e.g., Cloud Storage, Pub/Sub, Cloud Run/Functions, BigQuery or similar), and Terraform for infrastructure as code.
Familiarity with modern data engineering tooling and concepts (e.g., Airflow/Prefect/Dagster, dbt or similar modeling, schema design, CI/CD for data). Experience with Spark is a bonus.
Demonstrated ability to work with unstructured or complex domain data and turn it into stable, consumable datasets.
Ability to work independently, take ownership of problems, and drive solutions with minimal oversight.
Experience using AI-assisted development or analysis tools to navigate large codebases and systems is a plus.
2 to 5 years of relevant experience overall.
Curious and investigative: enjoys reverse-engineering systems and data sources.
Systematic and disciplined: cares about correctness, reliability, and repeatability.
Self-directed and accountable: comfortable owning pipelines end-to-end in a fast-moving environment.
Join us in making legal information searchable, verifiable, and accessible to all. Apply today.