Development

Data Engineer

Remote, Pakistan · Full-time

The Role

At BearPlex, models and analytics are only as trustworthy as the data feeding them. Clean, well-governed, lineage-tracked data is the difference between an AI system a regulated enterprise can actually deploy and a demo that never leaves the lab.

This position sits at the center of that work. You will design and operate the pipelines that move data from messy source systems into the warehouses and feature stores that our AI, RAG, and analytics teams depend on, with quality gates and audit trails built in from the first commit, not bolted on later.

What You Will Do

  • Build production pipelines. Design and operate batch and streaming ETL and ELT pipelines in Python and SQL that ingest, transform, and load data reliably at enterprise scale.

  • Model the warehouse. Own dimensional and analytical data models in dbt, with versioned transformations, tested assumptions, and documented lineage that survives audits.

  • Enforce data quality. Implement quality gates, validation checks, and freshness and anomaly monitoring so bad data is caught and quarantined before it reaches a model or a dashboard.

  • Feed the AI systems. Prepare and serve curated, deduplicated, and labeled datasets for RAG indexes, fine-tuning runs, and retrieval pipelines, with provenance preserved end to end.

  • Engineer for streaming. Stand up real-time ingestion using tools such as Kafka or Kinesis, and reconcile streaming and batch sources into a single coherent view.

  • Optimize cost and performance. Tune queries, partitioning, and storage on PostgreSQL and cloud warehouses so pipelines stay fast and predictable as volume grows.

  • Document and harden. Treat pipelines as assets: write clear documentation, add observability, and make every job reproducible and recoverable.

What We Are Looking For

  • Production track record. You have shipped real data pipelines that ran in production and that other teams relied on, not coursework or one-off scripts.

  • Strong Python and SQL. You write clean, performant Python and advanced SQL, and you reason confidently about query plans and data modeling.

  • dbt and warehousing depth. You have built and maintained dbt projects and modern warehouse schemas, with testing and documentation as a habit.

  • Orchestration experience. You have run pipelines on an orchestrator such as Airflow, Dagster, or Prefect and understand scheduling, retries, and backfills.

  • Cloud fluency. You are comfortable on AWS, GCP, or Azure and know the relevant storage, compute, and managed data services.

  • A quality mindset. You design for correctness, lineage, and auditability first, because our clients are regulated enterprises that demand proof.

Nice to Have

  • Streaming systems. Hands-on experience with Kafka, Kinesis, or Flink for real-time data processing.

  • AI data preparation. Experience building datasets for fine-tuning, embeddings, or RAG, including labeling and deduplication workflows.

  • Infrastructure as code. Familiarity with Docker, Kubernetes, and Terraform for deploying and scaling data infrastructure.

  • Governance exposure. Experience with data cataloging, PII handling, or compliance-driven data controls.

Why BearPlex

  • Senior peers. You will work alongside engineers who have shipped real production systems and who hold a high bar for architecture and craft.

  • Real production work. Your pipelines power AI and analytics for regulated enterprises, not internal demos that never reach users.

  • Remote across Pakistan. This is a fully remote role open to candidates anywhere in Pakistan, with collaboration anchored at our Lahore headquarters.

  • Learning budget. We invest in your growth with a dedicated budget for courses, certifications, and conferences.

  • Clear growth. A defined path from senior individual contributor toward data architecture and platform leadership.

What You Bring
Python, SQL, dbt, ETL/ELT, Airflow, Kafka, Data Warehousing, PostgreSQL, AWS
Apply

Send us your work.

Start by dropping your resume. We read it and fill in the form for you, so you only complete what we could not. Our team reviews every application personally; you will hear back either way.

Resumes are accepted as PDF, up to 10 MB.
