Get Apache Spark for Data Science Cookbook PDF

By Padma Priya Chitturi

Key Features

  • Use Apache Spark for data processing with these hands-on recipes
  • Implement end-to-end, large-scale data analysis better than ever before
  • Work with powerful libraries such as MLlib, SciPy, NumPy, and Pandas to gain insights from your data

Book Description

Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark's selling point is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. It lets you tackle the complexities that come with raw unstructured data sets with ease.

This guide will get you comfortable and confident performing data science tasks with Spark. You will learn about implementations including distributed deep learning, numerical computing, and scalable machine learning. You will be shown effective solutions to problematic concepts in data science using Spark's data science libraries such as MLlib, Pandas, NumPy, SciPy, and more. These simple and efficient recipes will show you how to implement algorithms and optimize your work.

What you will learn

  • Explore the topics of data mining, text mining, Natural Language Processing, information retrieval, and machine learning.
  • Solve real-world analytical problems with large data sets.
  • Address data science challenges with analytical tools on a distributed system like Spark (apt for iterative algorithms), which offers in-memory processing and more flexibility for data analysis at scale.
  • Get hands-on experience with algorithms like classification, regression, and recommendation on real datasets using the Spark MLlib package.
  • Learn about numerical and scientific computing using NumPy and SciPy on Spark.
  • Use Predictive Model Markup Language (PMML) in Spark for statistical data mining models.

About the Author

Padma Priya Chitturi is Analytics Lead at Fractal Analytics Pvt Ltd and has over five years of experience in big data processing. Currently, she is part of capability development at Fractal and is responsible for solution development for analytical problems across multiple business domains at large scale. Prior to this, she worked for an airlines product on a real-time processing platform serving one million user requests/sec at Amadeus Software Labs. She has worked on realizing large-scale deep networks (Jeffrey Dean's work at Google Brain) for image classification on the big data platform Spark. She works closely with big data technologies such as Spark, Storm, Cassandra, and Hadoop. She was an open source contributor to Apache Storm.

Table of Contents

  1. Big Data Analytics with Spark
  2. Tricky Statistics with Spark
  3. Data Analysis with Spark
  4. Clustering, Classification, and Regression
  5. Working with Spark MLlib
  6. NLP with Spark
  7. Working with Sparkling Water - H2O
  8. Data Visualization with Spark
  9. Deep Learning on Spark
  10. Working with SparkR

Best data modeling & design books

Data Quality: The Accuracy Dimension (The Morgan Kaufmann by Jack E. Olson PDF

Data Quality: The Accuracy Dimension is about assessing the quality of corporate data and improving its accuracy using the data profiling method. Corporate data is increasingly important as companies continue to find new ways to use it. Likewise, improving the accuracy of data in information systems is fast becoming a major goal as companies realize how much it affects their bottom line.

Read e-book online Complete Maya Programming Volume II: An In-depth Guide to 3D PDF

David Gould's acclaimed first book, Complete Maya Programming: An Extensive Guide to MEL and the C++ API, provides artists and programmers with a deep understanding of the way Maya works and how it can be enhanced and customized through programming. In his new book David offers a gentle, intuitive introduction to the core ideas of computer graphics.

Designing Sorting Networks: A New Paradigm by Sherenaz W. Al-Haj Baddar,Kenneth E. Batcher PDF

Designing Sorting Networks: A New Paradigm provides an in-depth guide to maximizing the efficiency of sorting networks, and uses 0/1 cases, partially ordered sets, and Hasse diagrams to closely analyze their behavior in an easy, intuitive manner. This book also outlines new ideas and techniques for designing faster sorting networks using Sortnet, and illustrates how these techniques were used to design faster 12-key and 18-key sorting networks through a series of case studies.

Read e-book online Algorithms, Probability, Networks, and Games: Scientific PDF

This Festschrift volume is published in honor of Professor Paul G. Spirakis on the occasion of his 60th birthday. It celebrates his significant contributions to computer science as an eminent, talented, and influential researcher and most visionary thought leader, with a great talent in inspiring and guiding young researchers.
