Data Engineer with AIML Expertise

Job Type

Contract / Full-Time Employment (FTE)

Experience

3+ years

Location

India - Remote

Job Description

We are seeking an experienced Data Engineer with 3+ years of expertise in AI/ML integration. The role involves developing scalable data pipelines, implementing ETL processes, and deploying machine learning models in production environments. Candidates must be proficient in Python, cloud platforms, and modern data engineering tools. This is a remote position with an immediate start available.

Key Responsibilities

  • Data Pipeline Development: Design, implement, and maintain scalable and efficient data pipelines for processing large datasets. Work with structured and unstructured data from multiple sources, ensuring that data is clean, reliable, and available for analysis.

  • Machine Learning Model Integration: Collaborate with data scientists to deploy machine learning models into production environments. Support model training, testing, and inference at scale, ensuring models are integrated seamlessly into the data pipeline.

  • ETL Process Optimization: Design and optimize ETL (Extract, Transform, Load) processes to ensure fast, reliable, and cost-effective data workflows. Automate data extraction and transformation processes using tools like Apache Airflow, Python, or other scripting frameworks.

  • Data Infrastructure Management: Develop and manage cloud-based data storage solutions (e.g., AWS S3, Azure Blob Storage, Google Cloud Storage) and databases (e.g., AWS Redshift, Google BigQuery). Build and maintain data warehouses or data lakes, ensuring high data availability and scalability.

  • Collaboration with Cross-functional Teams: Work closely with data scientists, business analysts, and other stakeholders to understand their data requirements and deliver data solutions that support business needs. Translate business requirements into scalable data systems and processes.

  • Performance Monitoring & Troubleshooting: Monitor the performance of data pipelines and resolve data-related issues quickly to ensure smooth processing and minimal downtime. Troubleshoot issues related to data quality, pipeline failures, and integration problems with AI/ML models.

  • Automation and Scaling: Automate repetitive data engineering tasks using scripting languages (Python, Bash) and tools (Apache Airflow). Ensure data pipelines and models are scalable and can handle increasing amounts of data over time.

  • Documentation & Knowledge Sharing: Document data engineering processes, workflows, and technical solutions to ensure knowledge sharing across teams. Provide support and mentoring to junior data engineers and other team members.

Qualifications

  • Data Engineering: Strong experience in designing and building data pipelines using tools like Apache Kafka, Apache Spark, Apache Airflow, or similar technologies.

  • Programming: Proficiency in Python and/or SQL for data processing and manipulation.

  • Databases: Experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra).

  • Cloud Platforms: Hands-on experience with cloud platforms like AWS, Google Cloud, or Azure and their respective data services (e.g., AWS Redshift, Google BigQuery).

  • AI/ML Tools: Experience with AI/ML libraries and frameworks like TensorFlow, PyTorch, scikit-learn, and model deployment tools (e.g., MLflow, Docker, Kubernetes).

  • Version Control: Proficient in using version control systems like Git for collaborative development.

Good to Have

  • Professional data engineering experience with a strong focus on AI/ML integration.

  • Experience in implementing ETL pipelines, data warehouses, and cloud-based data solutions.

  • Demonstrated experience working with machine learning models, deploying them into production, and ensuring their performance at scale.
