Lead Engineer - Backend
Short Summary:
We are building a Machine Learning Platform that provides MLOps capabilities to help data scientists and ML engineers at Target implement ML solutions at scale. The work encompasses building the feature store, model ops, experimentation, iteration, monitoring, explainability, and continuous improvement of the machine learning lifecycle. You will be part of a team building scalable applications by leveraging the latest technologies. Connect with us if you want to join us on this exciting journey.
Roles and responsibilities:
Build and maintain machine learning infrastructure that is scalable, reliable and efficient.
Be familiar with Google Cloud infrastructure and MLOps.
Write highly scalable APIs. Deploy and maintain machine learning models, pipelines and workflows in production environments.
Collaborate with data scientists and software engineers to design and implement machine learning workflows.
Implement monitoring and logging tools to ensure that machine learning models are performing optimally.
Continuously improve the performance, scalability and reliability of machine learning systems.
Work with teams to deploy and manage infrastructure for machine learning services.
Create and maintain technical documentation for machine learning infrastructure and workflows.
Stay up to date with the latest developments in technologies.
Tech stack: GCP cloud skills, GCP Machine Learning Engineer skills, GCP Vertex AI skills, Python, Microservices, API development, Cassandra, Elasticsearch, Postgres, Kafka, Docker, CI/CD, optional (Java + Spring Boot)
Required Skills:
Bachelor's or Master's degree in computer science, engineering or related field.
9+ years of experience in software development and machine learning engineering.
A Lead Machine Learning Engineer specializing in Google Cloud (GCP) needs a deep understanding of machine learning (ML) principles, cloud infrastructure and MLOps.
Hands-on experience with Vertex AI to manage the ML platform for feature engineering, model training and model deployment.
Vertex AI skills needed: BigQuery ML, automating ML workflows using Kubeflow Pipelines (KFP) or Cloud Composer, AI APIs, endpoints for real-time inference, Model Monitoring, Cloud Logging & Monitoring, Cloud Dataflow for stream processing, Cloud Dataproc (Spark & Hadoop) for distributed ML workloads.
Deep experience with Python, API development and microservices, including creating ML-powered REST APIs using FastAPI, Flask or Cloud Functions.
Java (Optional, but useful for production ML systems)
Expert in building high-performance APIs.
Experience with DevOps practices, containerization and tools such as Kubernetes, Docker, Jenkins, Git.
Good understanding of machine learning concepts and frameworks, deep learning, LLMs, etc.
Good to have experience in deploying machine learning models in a production environment.
Good to have experience with data streaming technologies such as Kafka, Dataflow, Kinesis, Pub/Sub, etc.
Strong analytical and problem-solving skills.
Good to have GCP certification - Professional Machine Learning Engineer

