Data Architect (Azure and Databricks)

Rackspace
On-site
India - Gurgaon

Overview 

We are seeking an experienced Data Architect with extensive expertise in designing and implementing modern data architectures. This role requires strong software engineering principles, hands-on coding abilities, and experience building data engineering frameworks. The ideal candidate will have a proven track record of implementing Databricks-based solutions in the healthcare industry, with expertise in data catalog implementation and governance frameworks. 

 

About the Role 

As a Data Architect, you will be responsible for designing and implementing scalable, secure, and efficient data architectures on the Databricks platform. You will lead the technical design of data migration initiatives from legacy systems to modern Lakehouse architecture, ensuring alignment with business requirements, industry best practices, and regulatory compliance. 

 

Key Responsibilities 


Design and implement modern data architectures using Databricks Lakehouse platform 

Lead the technical design of Data Warehouse/Data Lake migration initiatives from legacy systems 

Develop data engineering frameworks and reusable components to accelerate delivery 

Establish CI/CD pipelines and infrastructure-as-code practices for data solutions 

Implement data catalog solutions and governance frameworks 

Create technical specifications and architecture documentation 

Provide technical leadership to data engineering teams 

Collaborate with cross-functional teams to ensure alignment of data solutions 

Evaluate and recommend technologies, tools, and approaches for data initiatives 

Ensure data architectures meet security, compliance, and performance requirements 

Mentor junior team members on data architecture best practices 

Stay current with emerging technologies and industry trends 


Qualifications 

Extensive experience in data architecture design and implementation 

Strong software engineering background with expertise in Python or Scala 

Proven experience building data engineering frameworks and reusable components 

Experience implementing CI/CD pipelines for data solutions 

Expertise in infrastructure-as-code and automation 

Experience implementing data catalog solutions and governance frameworks 

Deep understanding of Databricks platform and Lakehouse architecture 

Experience migrating workloads from legacy systems to modern data platforms 

Strong knowledge of healthcare data requirements and regulations 

Experience with cloud platforms (AWS, Azure, GCP) and their data services 

Bachelor's degree in Computer Science, Information Systems, or related field; advanced degree preferred 


Technical Skills 

Programming languages: Python and/or Scala (required) 

Data processing frameworks: Apache Spark, Delta Lake 

CI/CD tools: Jenkins, GitHub Actions, Azure DevOps 

Infrastructure-as-code (optional): Terraform, CloudFormation, Pulumi 

Data catalog tools: Databricks Unity Catalog, Collibra, Alation 

Data governance frameworks and methodologies 

Data modeling and design patterns 

API design and development 

Cloud platforms: AWS, Azure, GCP 

Container technologies: Docker, Kubernetes 

Version control systems: Git 

SQL and NoSQL databases 

Data quality and testing frameworks 


Optional - Healthcare Industry Knowledge 

Healthcare data standards (HL7, FHIR, etc.) 

Clinical and operational data models 

Healthcare interoperability requirements 

Healthcare analytics use cases 
