PySpark Jobs in Mumbai

Apply to 10+ PySpark Jobs in Mumbai on CutShort.io. Explore the latest PySpark Job opportunities across top companies like Google, Amazon & Adobe.

ZeMoSo Technologies

11 recruiters
Agency job
via TIGI HR Solution Pvt. Ltd. by Vaidehi Sarkar
Mumbai, Bengaluru (Bangalore), Hyderabad, Chennai, Pune
4 - 8 yrs
₹10L - ₹15L / yr
Data engineering
Python
SQL
Data Warehouse (DWH)
Amazon Web Services (AWS)
+3 more

Work Mode: Hybrid


B.Tech, BE, M.Tech, or ME candidates only (mandatory).



Must-Have Skills:

● Educational qualification: B.Tech, BE, M.Tech, or ME in any field.

● Minimum of 3 years of proven experience as a Data Engineer.

● Strong proficiency in the Python programming language and SQL.

● Experience with Databricks and with setting up and managing data pipelines and data warehouses/lakes (see the sketch after this list).

● Good comprehension and critical thinking skills.
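As a rough illustration of the pipeline work this role describes, here is a minimal PySpark sketch for Databricks; the paths, table, and column names are hypothetical placeholders, not part of the job description:

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession named `spark` already exists; getOrCreate() reuses it.
spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

# Hypothetical raw CSV files landed in cloud storage.
orders = spark.read.option("header", True).csv("/mnt/raw/orders/")

# Basic cleansing plus a daily revenue aggregate.
daily_revenue = (
    orders
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount").isNotNull())
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Write the result to the warehouse/lake layer as a managed table.
daily_revenue.write.mode("overwrite").saveAsTable("analytics.daily_revenue")
```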


● Note: the salary bracket will vary according to the candidate's experience:

- 4 to 6 years of experience: up to ₹22 LPA

- 5 to 8 years of experience: up to ₹30 LPA

- More than 8 years of experience: up to ₹40 LPA

Deqode
1 recruiter
Posted by Alisha Das
Bengaluru (Bangalore), Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Mumbai, Pune, Hyderabad, Indore, Jaipur, Kolkata
4 - 5 yrs
₹2L - ₹18L / yr
Python
PySpark

We are looking for skilled and passionate Data Engineers with a strong foundation in Python programming and hands-on experience with APIs, AWS cloud services, and modern development practices. The ideal candidate will have a keen interest in building scalable backend systems and in working with big data tools like PySpark.

Key Responsibilities:

  • Write clean, scalable, and efficient Python code.
  • Work with Python frameworks such as PySpark for data processing (see the sketch after this list).
  • Design, develop, update, and maintain RESTful APIs.
  • Deploy and manage code using GitHub CI/CD pipelines.
  • Collaborate with cross-functional teams to define, design, and ship new features.
  • Work with AWS cloud services for application deployment and infrastructure.
  • Design basic database schemas and interact with MySQL or DynamoDB.
  • Debug and troubleshoot application issues and performance bottlenecks.
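For illustration, a minimal sketch of pulling records from a REST API and handing them to PySpark for processing; the endpoint URL and field names are hypothetical and authentication is omitted:

```python
import requests
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("api_ingest").getOrCreate()

# Hypothetical endpoint; in practice the URL and credentials come from configuration.
resp = requests.get("https://api.example.com/v1/orders", timeout=30)
resp.raise_for_status()
records = resp.json()  # expected to be a list of JSON objects

# Convert the payload into a DataFrame for downstream PySpark processing.
df = spark.createDataFrame([Row(**r) for r in records])
df.groupBy("status").count().show()
```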

Required Skills & Qualifications:

  • 4+ years of hands-on experience with Python development.
  • Proficient in Python basics with a strong problem-solving approach.
  • Experience with AWS Cloud services (EC2, Lambda, S3, etc.).
  • Good understanding of API development and integration.
  • Knowledge of GitHub and CI/CD workflows.
  • Experience in working with PySpark or similar big data frameworks.
  • Basic knowledge of MySQL or DynamoDB.
  • Excellent communication skills and a team-oriented mindset.

Nice to Have:

  • Experience in containerization (Docker/Kubernetes).
  • Familiarity with Agile/Scrum methodologies.


IT Service company

Agency job
via Vinprotoday by Vikas Gaur
Mumbai
4 - 10 yrs
₹8L - ₹30L / yr
Google Cloud Platform (GCP)
Workflow
TensorFlow
Deployment management
PySpark
+1 more

Key Responsibilities:

Design, develop, and optimize scalable data pipelines and ETL processes.

Work with large datasets using GCP services like BigQuery, Dataflow, and Cloud Storage (see the sketch after this list).

Implement real-time data streaming and processing solutions using Pub/Sub and Dataproc.

Collaborate with cross-functional teams to ensure data quality and governance.
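As a minimal sketch only, assuming the spark-bigquery connector and the Cloud Storage connector are available on the cluster (both ship with Dataproc); project, dataset, table, and bucket names are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq_daily_counts").getOrCreate()

# Read a BigQuery table through the spark-bigquery connector (assumed installed).
events = (
    spark.read.format("bigquery")
    .option("table", "my-project.analytics.events")
    .load()
)

# Simple aggregation before staging results back to Cloud Storage.
daily_counts = events.groupBy("event_date", "event_type").count()
daily_counts.write.mode("overwrite").parquet("gs://my-bucket/curated/daily_counts/")
```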


Technical Requirements:

4+ years of experience in Data Engineering.

Strong expertise in GCP services such as Workflows, Dataproc, and Cloud Storage, as well as TensorFlow.

Proficiency in SQL and programming languages such as Python or Java.

Experience in designing and implementing data pipelines and working with real-time data processing.

Familiarity with CI/CD pipelines and cloud security best practices.

ProtoGene Consulting Private Limited
Mumbai
3 - 8 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more

Data Engineer + Integration Engineer + Support Specialist (Experience: 5-8 years)

Necessary Skills:

· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing

· Data modelling

Job Description:

· Experience implementing and delivering data solutions and pipelines on the AWS cloud platform; design, implement, and maintain the data architecture for all AWS data services (a rough Glue job sketch follows this role's description)

· A strong understanding of data modelling, data structures, databases (Redshift), and ETL processes

· Work with stakeholders to identify business needs and requirements for data-related projects

· Strong SQL and/or Python or PySpark knowledge

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Monitor and troubleshoot data pipelines

· Collaborate with software engineers to design and implement data-driven features

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Identifying opportunities to improve performance by improving database structure or indexing methods

· Maintaining existing applications by updating existing code or adding new features to meet new requirements

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Recommending infrastructure changes to improve capacity or performance

· Experience in the process industry
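As a rough sketch of the kind of AWS Glue pipeline described above, assuming the standard Glue PySpark job environment; the database, table, and bucket names are placeholders:

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a catalogued source table (placeholder names).
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)

# Transform with plain Spark SQL.
dyf.toDF().createOrReplaceTempView("orders")
curated = spark.sql(
    "SELECT customer_id, SUM(amount) AS total_amount FROM orders GROUP BY customer_id"
)

# Stage the curated data, e.g. for a Redshift COPY or further modelling.
curated.write.mode("overwrite").parquet("s3://my-bucket/curated/customer_totals/")

job.commit()
```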

Data Engineer + Integration Engineer + Support Specialist (Experience: 3-5 years)

Necessary Skills:

· SQL & Python / PySpark

· AWS Services: Glue, Appflow, Redshift

· Data warehousing basics

· Data modelling basics

Job Description:

· Experience implementing and delivering data solutions and pipelines on the AWS cloud platform.

· A strong understanding of data modelling, data structures, databases (Redshift)

· Strong SQL and/or Python or PySpark knowledge

· Design and implement ETL processes to load data into the data warehouse

· Creating data models that can be used to extract information from various sources & store it in a usable format

· Optimize data models for performance and efficiency

· Write SQL queries to support data analysis and reporting

· Collaborate with team to design and implement data-driven features

· Monitor and troubleshoot data pipelines

· Perform root cause analysis on data issues

· Maintain documentation of the data architecture and ETL processes

· Maintaining existing applications by updating existing code or adding new features to meet new requirements

· Designing and implementing security measures to protect data from unauthorized access or misuse

· Identifying opportunities to improve performance by improving database structure or indexing methods

· Recommending infrastructure changes to improve capacity or performance


Bengaluru (Bangalore), Mumbai, Delhi, Gurugram, Pune, Hyderabad, Ahmedabad, Chennai
3 - 7 yrs
₹8L - ₹15L / yr
AWS Lambda
Amazon S3
Amazon VPC
Amazon EC2
Amazon Redshift
+3 more

Technical Skills:


  • Ability to understand and translate business requirements into design.
  • Proficient in AWS infrastructure components such as S3, IAM, VPC, EC2, and Redshift.
  • Experience in creating ETL jobs using Python/PySpark.
  • Proficiency in creating AWS Lambda functions for event-based jobs.
  • Knowledge of automating ETL processes using AWS Step Functions.
  • Competence in building data warehouses and loading data into them.


Responsibilities:


  • Understand business requirements and translate them into design.
  • Assess AWS infrastructure needs for development work.
  • Develop ETL jobs using Python/PySpark to meet requirements.
  • Implement AWS Lambda for event-based tasks (see the sketch after this list).
  • Automate ETL processes using AWS Step Functions.
  • Build data warehouses and manage data loading.
  • Engage with customers and stakeholders to articulate the benefits of proposed solutions and frameworks.
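For illustration, a minimal Python Lambda handler for an event-based job; the S3-trigger flow and the Step Functions state machine it starts are hypothetical, and the ARN is supplied via configuration:

```python
import json
import os

import boto3

# Step Functions client; the state machine ARN comes from an environment variable.
sfn = boto3.client("stepfunctions")
STATE_MACHINE_ARN = os.environ["STATE_MACHINE_ARN"]


def handler(event, context):
    """Triggered by an S3 object-created event; starts the ETL state machine."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Hand the new object off to a Step Functions execution that runs the ETL.
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({"bucket": bucket, "key": key}),
        )

    return {"statusCode": 200}
```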
Numantra Technologies

2 recruiters
Posted by Vandana Saxena
Mumbai, Navi Mumbai
2 - 8 yrs
₹5L - ₹12L / yr
Microsoft Windows Azure
ADF
NumPy
PySpark
Databricks
+1 more
- Experience and expertise in using Azure cloud services; Azure certification will be a plus.

- Experience and expertise in Python development and its libraries such as PySpark, pandas, and NumPy.

- Expertise in ADF and Databricks (see the sketch after this listing).

- Creating and maintaining data interfaces across a number of different protocols (file, API).

- Creating and maintaining internal business process solutions to keep our corporate system data in sync and reduce manual processes where appropriate.

- Creating and maintaining monitoring and alerting workflows to improve system transparency.

- Facilitate the development of our Azure cloud infrastructure relative to Data and Application systems.

- Design and lead development of our data infrastructure including data warehouses, data marts, and operational data stores.

- Experience in using Azure services such as ADLS Gen 2, Azure Functions, Azure messaging services, Azure SQL Server, Azure KeyVault, Azure Cognitive services etc.
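A minimal sketch, assuming a Databricks cluster already configured with access to ADLS Gen 2 (for example via a service principal whose secret sits in Azure Key Vault); the storage account, containers, and column names are placeholders:

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks `spark` is provided; getOrCreate() simply reuses it.
spark = SparkSession.builder.appName("adls_monthly_sales").getOrCreate()

# Placeholder ADLS Gen 2 locations.
source_path = "abfss://raw@mystorageaccount.dfs.core.windows.net/sales/"
target_path = "abfss://curated@mystorageaccount.dfs.core.windows.net/sales_monthly/"

sales = spark.read.parquet(source_path)

# Example transformation that an ADF-triggered job might run.
monthly = (
    sales.withColumn("month", F.date_trunc("month", F.col("order_ts")))
    .groupBy("month", "region")
    .agg(F.sum("amount").alias("total_amount"))
)

monthly.write.mode("overwrite").format("delta").save(target_path)
```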
Navi Mumbai
3 - 5 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more
  • Proficiency in shell scripting
  • Proficiency in automation of tasks
  • Proficiency in PySpark/Python
  • Proficiency in writing and understanding Sqoop
  • Understanding of Cloudera Manager
  • Good understanding of RDBMS
  • Good understanding of Excel

 

Virtusa

2 recruiters
Agency job
via Response Informatics by Anupama Lavanya Uppala
Chennai, Bengaluru (Bangalore), Mumbai, Hyderabad, Pune
3 - 10 yrs
₹10L - ₹25L / yr
PySpark
Python
  • Minimum of 1 year of relevant experience in PySpark (mandatory)
  • Hands-on experience developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus
  • Ability to play a lead role and independently manage a 3-5 member PySpark development team
  • EMR, Python, and PySpark are mandatory (a short EMR/PySpark sketch follows this list)
  • Knowledge of and experience working with AWS cloud technologies such as Apache Spark, Glue, Kafka, Kinesis, and Lambda alongside S3, Redshift, and RDS
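A short EMR/PySpark sketch as promised above; bucket names and fields are placeholders, and the job is assumed to run via spark-submit on an EMR cluster (which provides the S3 connector):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("emr_page_views").getOrCreate()

# Read raw JSON events from S3 (placeholder bucket and prefix).
clicks = spark.read.json("s3://my-raw-bucket/clickstream/2024/")

# Aggregate and stage the result back to S3, e.g. for a Redshift COPY.
page_views = clicks.groupBy("page").agg(F.count("*").alias("views"))
page_views.write.mode("overwrite").parquet("s3://my-curated-bucket/page_views/")
```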
Numantra Technologies

2 recruiters
Posted by nisha mattas
Remote, Mumbai, Powai
2 - 12 yrs
₹8L - ₹18L / yr
ADF
PySpark
Jupyter Notebook
Big Data
Windows Azure
+3 more
      • Data pre-processing, data transformation, data analysis, and feature engineering
      • Performance optimization of scripts and productionizing of code (SQL, pandas, Python or PySpark, etc.)
    • Required skills:
      • Bachelor's degree in Computer Science, Data Science, Computer Engineering, IT, or equivalent
      • Fluency in Python (pandas), PySpark, SQL, or similar
      • Azure Data Factory experience (minimum 12 months)
      • Able to write efficient code using traditional and OO concepts and modular programming, following the SDLC process
      • Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
      • Ability to work independently, with demonstrated experience in project or program management
      • Azure experience; ability to translate data scientists' Python code into efficient, production-ready code for cloud deployment (see the sketch below)
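As a small illustration of translating exploratory pandas code into PySpark for production, with hypothetical file paths and column names:

```python
import pandas as pd
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pandas_to_pyspark").getOrCreate()

# Exploratory pandas version (fine for a small sample file).
pdf = pd.read_csv("sample_orders.csv")
pandas_result = pdf.groupby("customer_id")["amount"].sum().reset_index()

# Equivalent PySpark version that scales to the full dataset.
sdf = spark.read.csv("/data/orders/", header=True, inferSchema=True)
spark_result = sdf.groupBy("customer_id").agg(F.sum("amount").alias("amount"))
spark_result.show(5)
```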
 
Fragma Data Systems

8 recruiters
Posted by Evelyn Charles
Remote, Bengaluru (Bangalore), Hyderabad, Chennai, Mumbai, Pune
8 - 15 yrs
₹16L - ₹28L / yr
PySpark
SQL Azure
azure synapse
Windows Azure
Azure Data Engineer
+3 more
Technology Skills:
  • Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hub/IoT Hub.
  • Experience in migrating on-premises data warehouses to data platforms on the Azure cloud.
  • Designing and implementing data engineering, ingestion, and transformation functions
Good to Have: 
  • Experience with Azure Analysis Services
  • Experience in Power BI
  • Experience with third-party solutions like Attunity/StreamSets, Informatica
  • Experience with pre-sales activities (responding to RFPs, executing quick POCs)
  • Capacity Planning and Performance Tuning on Azure Stack and Spark.