Field Guide to Hadoop: An Introduction to Hadoop, Its Ecosystem, and Aligned Technologies, by Kevin Sitto, Marshall Presser

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You’ll quickly understand how Hadoop’s projects, subprojects, and related technologies work together.
Each chapter introduces a different topic—such as core technologies or data transfer—and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you’ll have a good grasp of the playing field.
Topics include:
- Core technologies—Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
- Database and data management—Cassandra, HBase, MongoDB, and Hive
- Serialization—Avro, JSON, and Parquet
- Management and monitoring—Puppet, Chef, ZooKeeper, and Oozie
- Analytic helpers—Pig, Mahout, and MLlib
- Data transfer—Sqoop, Flume, distcp, and Storm
- Security, access control, auditing—Sentry, Kerberos, and Knox
- Cloud computing and virtualization—Serengeti, Docker, and Whirr
- Amazon Sales Rank: #975523 in Books
- Brand: Sitto, Kevin/ Presser, Marshall
- Published on: 2015-03-23
- Released on: 2015-03-13
- Original language: English
- Number of items: 1
- Dimensions: 9.00" h x .30" w x 6.00" l, .0 pounds
- Binding: Paperback
- 132 pages
About the Author
Kevin Sitto is a Field Solutions Engineer with Pivotal Software, providing consulting services to help folks understand and address their big data needs. He lives in Maryland with his wife and two kids and enjoys making homebrew beer when he's not writing books about big data.
Marshall Presser is a Field Chief Technology Officer for Pivotal and is based in McLean, VA. In addition to helping customers solve complex analytic problems with the Greenplum Database, he leads the Hadoop Virtual Field Team, working on issues of integrating Hadoop with relational databases. Prior to coming to Pivotal (formerly Greenplum), he spent 12 years at Oracle, specializing in High Availability, Business Continuity, Clustering, Parallel Database Technology, Disaster Recovery and Large Scale Database Systems. Marshall has also worked for a number of hardware vendors implementing clusters and other parallel architectures. His background includes parallel computation, operating system and compiler development as well as private consulting for organizations in health care, financial services, and federal and state governments. Marshall holds a B.A. in Mathematics and an M.A. in Economics and Statistics from the University of Pennsylvania and an M.Sc. in Computing from Imperial College, London.

Most helpful customer reviews
6 of 6 people found the following review helpful.
Don't buy this book
By Alexis C.

Whether or not you are familiar with Hadoop, you will learn nothing. Either you already know (hopefully more than) what is presented in the book, or you won't understand anything, since it doesn't even explain the basics.

The book makes many excuses, such as:
- "Although an explanation is beyond the scope of this book ..."
- "Writing Mapreduce can be fairly complicated and is beyond the scope of this book"
- "The truth is that writing applications in YARN is still very involved and too deep for this book..."
- "Good example code is a bit long and complex to include here but ..."

Anyway, in the end you wonder whether the authors themselves know what they are talking about.

Check out the Wikipedia pages for the technologies touched on by this book; you'll save money (I can't believe I spent $36 on this) and time.

4 of 4 people found the following review helpful.
Short but useful introduction to most components of Hadoop ecosystem
By Pouria Amirian

Big data is an over-hyped topic, and Hadoop is tightly coupled with big data. The Hadoop ecosystem is composed of many components, and many of them require good programming knowledge (in Java, Python, SQL, or Scala). There are a few good books about most components, but no book gives a thirty-thousand-foot overview of many components of the Hadoop ecosystem in a short and approachable way.

As its preface clearly states, this book is a short but useful introduction to most components of the Hadoop ecosystem. It is useful for programmers, data scientists, and managers new to the Hadoop world.

The book explains many of the most important components of the Hadoop ecosystem in about 100 pages. I recommend it to newcomers to the Hadoop world. There is some sample code in the book, but I don't think the authors expect readers to use those snippets. However, the code samples give readers a feel for the coding model of most components.

Managers who want to know the technical details briefly will benefit a lot from this book. It is important to know that Hadoop is open source, but in order to use Hadoop in production environments most (read: all) organizations need commercial support; that is why using open source software doesn't prevent vendor lock-in!

I was surprised that the authors didn't include some important components like Impala but covered HIPI and SpatialHadoop! It would also be a good idea to provide details about Hadoop distributions and cloud-based deployment. In summary, the authors did a great job!

2 of 2 people found the following review helpful.
Get the current state of Hadoop's components
By Ian Stirk

Hi,

I have written a detailed chapter-by-chapter review of this book on www DOT i-programmer DOT info; the first and last parts of this review are given here. For my review of all chapters, search i-programmer DOT info for STIRK together with the book's title.

This slim book sets out to provide an up-to-date overview of Hadoop and its various components, which seems a worthwhile aim.

Hadoop is the most common platform for storing and analysing big data. This book aims to be a short introduction to Hadoop and its various components. The authors compare this to a field guide for birds or trees, so it is broad in scope and shallow in depth. Each chapter briefly covers an area of Hadoop technology, and outlines the major players. The book is not a tutorial, but a high-level overview, consisting of 132 pages in 8 chapters.

For each component, details are listed for:
- License – much is open source but there may be some conditions
- Activity – how much development work is being done on the product
- Purpose – what the technology does
- Official Page – home page of the technology
- Hadoop Integration – the technology’s level of integration with Hadoop

Below is a chapter-by-chapter exploration of the topics covered.

Chapter 1 Core Technologies

The chapter opens with a bit of history. The origins of Hadoop can be traced back to a project called Nutch, which stored large amounts of data, together with 2 seminal papers from Google – one relating to the Google File System, and the other about a distributed programming model called MapReduce. The ideas in the papers were incorporated into the Nutch project, and Hadoop was born. Yahoo! began using Hadoop for its search engine, and now Hadoop is the premier platform for processing big data.

Hadoop consists of 3 primary resources:
- Hadoop Distributed File System (HDFS) – where you store data. This is optimized for high performance, is read-intensive, and provides resilience by holding multiple copies of the data on different machines. A large block size optimizes data movement.
- MapReduce – involves 2 components: mappers that analyze chunks of data, and reducers which aggregate the results of the mappers.
- Hadoop’s tools – other components, as described in this book

This was an interesting chapter, laying the groundwork for the rest of the book, identifying what Hadoop is, its major components, and how they work. Helpful links to tutorial information are provided, together with outline code examples (as they are throughout the book).

Perhaps some emphasis could have been given to describing the attributes of big data (i.e. volume, velocity and variety) that require a system like Hadoop to process it. I’m not sure why Spark was included in this core section.

...

Conclusion

This book is very broad in scope, and by necessity (since it’s a field guide), shallow in depth. It provides up-to-date but limited detail on the major components of the Hadoop big data system. Helpful links are provided for further information.

The book is mostly easy to read, with a consistent layout of content (i.e. License, Activity, Purpose, Official Page, Hadoop Integration, description, tutorial link, and simple example code). Useful comparisons between tools are occasionally provided.

This book should prove helpful to managers, developers, and architects who are new to big data and want a quick overview of the major components of Hadoop.

Most Hadoop books discuss some of the components listed here, but this book contains a much wider range of components than other books. That said, there are omissions, including:
- Hue – a popular web-based tool providing centralised access to many underlying Hadoop tools (e.g. Sqoop, Hive, Pig, Oozie, HBase, ZooKeeper, Impala, HDFS etc)
- Impala – a fast parallel processing SQL query engine for Hadoop

The authors intend to update this book regularly (every year or two), which is ideal if you want to know about the current popular components, and especially good if you have access to Safari online (but bad if you need to keep buying the updated book).

Where should you go next after reading this book? I would suggest gaining some detail by reading Big Data Made Easy, which I recently reviewed.

If you’re new to big data and Hadoop, and you want to quickly review what it is, and the current state of its major components, I highly recommend this small book.
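
For readers unfamiliar with the mapper and reducer split mentioned in the review above, here is a minimal sketch of the idea in plain Python. It is not taken from the book and does not use Hadoop itself; the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the word-count task are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def mapper(chunk):
    """Analyze one chunk of text: emit a (word, 1) pair per word."""
    for word in chunk.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group mapper output by key (a stand-in for the framework's shuffle step)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Aggregate every value the mappers emitted for one key."""
    return key, sum(values)


def word_count(chunks):
    """Run the map, shuffle, and reduce phases in-process over all chunks."""
    pairs = (pair for chunk in chunks for pair in mapper(chunk))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Two hypothetical "chunks", as if one file were split across two machines.
    print(word_count(["big data big", "data tools"]))
```

On a real cluster each mapper would run on the machine holding its chunk and the framework would route grouped keys to reducers; this sketch only mirrors that flow in a single process.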