What you'll learn:
Spark SQL Syntax, Component Architecture in Apache Spark
Datasets, DataFrames, RDDs
Advanced features of Spark SQL's interaction with other Spark components
Using data from various sources like MS Excel, RDBMS, AWS S3, and the NoSQL database MongoDB
Using different file formats like Parquet, Avro, and JSON
Table partitioning and bucketing
Introduction to Big Data ecosystem
SQL basics
This course is designed for everyone from complete beginners to experienced professionals who want to enhance their Spark SQL skills. Hands-on sessions cover the end-to-end setup of a Spark cluster on AWS and on local systems.
COURSE UPDATED PERIODICALLY SINCE LAUNCH: Last updated: December
What students are saying:
5 stars, "This is classic. Spark-related concepts are clearly explained with real-life examples." – Temitayo Joseph
In a data pipeline, whether the incoming data is structured or unstructured, the extracted data at the final stage is structured, and that is what we ultimately work with. SQL is the most popular query language for analyzing structured data.

Apache Spark facilitates distributed in-memory computing. Spark ships with a built-in module called Spark SQL for structured data processing. Users can mix SQL queries with Spark programs, and Spark SQL integrates seamlessly with the other constructs of Spark.

Spark SQL can load and write data from various sources like RDBMSs, NoSQL databases, and cloud storage such as S3, and it easily handles different data formats like Parquet, Avro, JSON, and many more.
Spark provides two types of APIs:
Low Level API – RDD
High Level API – Dataframes and Datasets
Spark SQL integrates very well with other components of Spark like Spark Streaming, Spark Core, and GraphX, since its high-level and low-level APIs interoperate cleanly.

The initial part of the course introduces the Lambda Architecture and the Big Data ecosystem. The remaining sections concentrate on reading and writing data between Spark and various data sources.

DataFrames and Datasets are the basic building blocks of Spark SQL. We will learn how to work with transformations and actions on RDDs, DataFrames, and Datasets.
We also cover optimizing tables with partitioning and bucketing.
To aid understanding of data processing, a use case covering the complete data flow has been included.
Who this course is for:
Beginners who want to start with Spark SQL on Apache Spark
Data analysts and big data analysts
Those who want to leverage in-memory computing on structured data
Course Size Details:
4.5 hours on-demand video
22 downloadable resources
Full lifetime access
Access on mobile and TV
Certificate of completion