
Senior Data Engineer
Global-Talent-Exchange
Required Skills:
Python
Pandas
PySpark
Scrapy
Beautiful Soup
Selenium
SQL
Google Cloud Platform
AWS
Azure
Scikit-learn
TensorFlow
PyTorch
Apache Airflow
Prefect
Dagster
About the Role
We are seeking a versatile Data Engineer to build the foundational data systems that power our AI platform. In this role, you will be responsible for designing, constructing, and maintaining robust data pipelines that ingest, process, and organize massive volumes of structured and unstructured data from diverse sources. Your work will directly feed our predictive models and LLM-driven analytics, enabling us to generate unique insights into global property markets.
Responsibilities
- Design and Build Scalable Data Pipelines: Architect, develop, and manage reliable ETL/ELT processes to handle large-scale real-time and batch data.
- Lead Large-Scale Data Acquisition: Develop and maintain advanced scraping and data ingestion systems to collect vast amounts of public and proprietary real estate data from web sources, APIs, and databases.
- Support ML & LLM Initiatives: Build and optimize data infrastructure to facilitate efficient data labeling, feature engineering, model training, and evaluation for our Machine Learning and Large Language Model projects.
- Ensure Data Quality and Reliability: Implement processes for data validation, cleansing, and monitoring to ensure the integrity and availability of our data assets.
- Collaborate with AI/ML Engineers: Work closely with data scientists and ML engineers to understand data requirements, provide clean, structured datasets, and operationalize data-driven features.
- Own Data Infrastructure: Manage and optimize data storage solutions (data warehouses, data lakes) and processing frameworks for performance and cost-effectiveness.
Requirements
- Proven experience (5+ years) as a Data Engineer or in a similar role, with a strong portfolio of building and maintaining data pipelines.
- Expertise in Python and core data libraries (e.g., Pandas, PySpark).
- Hands-on experience with large-scale web scraping frameworks (e.g., Scrapy, Beautiful Soup, Selenium/Playwright) and managing associated challenges (e.g., anti-bot measures, rate limiting).
- Solid understanding of data modeling, data warehousing concepts, and SQL. Experience with cloud data platforms (Google Cloud Platform: BigQuery, Dataflow; AWS: Redshift, Glue; or Azure equivalents).
- Demonstrable experience in supporting ML projects: building training datasets, feature stores, and working with ML frameworks (e.g., Scikit-learn, TensorFlow, PyTorch).
- Familiarity with the full lifecycle of LLM projects, including data collection for pre-training, fine-tuning, and RAG (Retrieval-Augmented Generation) pipeline construction.
- Experience with workflow orchestration tools (e.g., Apache Airflow, Prefect, Dagster).
Bonus Points (Nice-to-Have)
- Direct experience in PropTech, FinTech, or a data-intensive real estate/financial domain.
- Experience with vector databases (e.g., Pinecone, Weaviate, Chroma) and implementing RAG systems.
- Knowledge of MLOps principles and tools for model deployment and monitoring.
- Experience with real-time data processing (e.g., Apache Kafka, Apache Flink).