Distributed Systems
Hadoop: The Definitive Guide Douban
Author: Tom White O'Reilly Media 2015 - 4
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service
The Kubernetes Book Douban
Author: Nigel Poulton Independently published 2017 - 7
Containers are here and resistance is futile! Now that people are getting their heads around Docker, they need an orchestration platform to help them manage their containerized apps. Kubernetes has emerged as one of the hottest and most important container orchestration platforms in the world. This book gets you up to speed fast!
Kubernetes in Action Douban Goodreads
9.6 (5 ratings) Author: Marko Luksa Manning Publications 2017 - 8
Kubernetes in Action teaches you to use Kubernetes to deploy container-based distributed applications. You'll start with an overview of Docker and Kubernetes before building your first Kubernetes cluster. You'll gradually expand your initial application, adding features and deepening your knowledge of Kubernetes architecture and operation. As you navigate this comprehensive guide, you'll explore high-value topics like monitoring, tuning, and scaling.
Kubernetes is Greek for "helmsman," your guide through unknown waters. The Kubernetes container orchestration system safely manages the structure and flow of a distributed application, organizing containers and services for maximum efficiency. Kubernetes serves as an operating system for your clusters, eliminating the need to factor the underlying network and server infrastructure into your designs.
Using OpenMP Douban
Author: Barbara Chapman / Gabriele Jost The MIT Press 2007 - 10
OpenMP, a portable programming interface for shared memory parallel computers, was adopted as an informal standard in 1997 by computer scientists who wanted a unified model on which to base programs for shared memory systems. OpenMP is now used by many software developers; it offers significant advantages over both hand-threading and MPI. Using OpenMP offers a comprehensive introduction to parallel programming concepts and a detailed overview of OpenMP.
Using OpenMP discusses hardware developments, describes where OpenMP is applicable, and compares OpenMP to other programming interfaces for shared and distributed memory parallel architectures. It introduces the individual features of OpenMP, provides many source code examples that demonstrate the use and functionality of the language constructs, and offers tips on writing an efficient OpenMP program. It describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, discussing several case studies in detail, and offers in-depth troubleshooting advice. It explains how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.
Finally, Using OpenMP considers trends likely to influence OpenMP development, offering a glimpse of the possibilities of a future OpenMP 3.0 from the vantage point of the current OpenMP 2.5. With multicore computer use increasing, the need for a comprehensive introduction and overview of the standard interface is clear. Using OpenMP provides an essential reference not only for students at both undergraduate and graduate levels but also for professionals who intend to parallelize existing codes or develop new parallel programs for shared memory computer architectures.
Barbara Chapman is Professor of Computer Science at the University of Houston. Gabriele Jost is Principal Member of Technical Staff, Application Server Performance Engineering, at Oracle, Inc. Ruud van der Pas is Senior Staff Engineer at Sun Microsystems, Menlo Park.
Scalability Patterns: Best Practices for Designing High Volume Websites Douban
Apress 2018 - 7
With the proliferation of countless electronic devices and the ever growing number of Internet users, the scalability of websites has become an increasingly important challenge. Scalability, even though highly coveted, may not be so easy to achieve. Think that you can't attain responsiveness along with scalability? Chander Dhall will demonstrate that, in fact, they go hand in hand.
What You'll Learn
Architect and develop applications so that they are easy to scale.
Learn different scaling and partitioning options and the combinations.
Learn techniques to speed up responsiveness.
Deep dive into caching, column-family databases, document databases, search engines and RDBMS.
Learn scalability and responsiveness concepts that are usually ignored.
Effectively balance scalability, performance, responsiveness, and availability while minimizing downtime.
Who This Book Is For
Executives (CXOs), software architects, developers, and IT Pros
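Spark内核设计的艺术 (The Art of Spark Kernel Design) Douban
Author: 耿嘉安 (Geng Jia'an) 2018 - 1
Jointly recommended by multiple experts and written by a big data expert at 360, this book dissects the essence of Spark's architecture and implementation based on Spark 2.1.0. It drills down to the method level, distills the material into numerous flowcharts, and gives a multidimensional view of seven core designs: architecture, environment, scheduling, storage, computation, deployment, and API. The book has 10 chapters, organized into the following parts.
Preparation (Chapters 1-2): A brief introduction to setting up a Spark environment and to Spark's basic principles. Through detailed description, this part lowers the barrier to entering the world of Spark and gives readers a high-level view of Spark's background and overall design.
Foundations (Chapters 3-5): Covers Spark's infrastructure (including configuration, RPC, and metrics), the initialization of SparkContext, and the environment Spark needs in order to execute. After studying this part, readers will have a deep understanding of the design of the RPC framework and the functions of the execution environment, which is a prerequisite for understanding the core content.
Core (Chapters 6-9): The most central part of Spark, including the storage system, the scheduling system, the compute engine, and deployment modes. After studying this part, readers will fully understand the details of Spark's data processing system and be able to extend Spark's core functionality, optimize performance, and precisely troubleshoot production issues.
API (Chapter 10): This part mainly compares Spark's old and new APIs and briefly introduces the new API.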
ZeroMQ Douban
Author: Pieter Hintjens O'Reilly Media 2013 - 3
Dive into ØMQ (aka ZeroMQ), the smart socket library that gives you fast, easy, message-based concurrency for your applications. With this quick-paced guide, you’ll learn hands-on how to use this scalable, lightweight, and highly flexible networking tool for exchanging messages among clusters, the cloud, and other multi-system environments.
ØMQ maintainer Pieter Hintjens takes you on a tour of real-world applications, using extended examples in C to help you work with ØMQ’s API, sockets, and patterns. Learn how to use specific ØMQ programming techniques, build multithreaded applications, and create your own messaging architectures. You’ll discover how ØMQ works with several programming languages and most operating systems—with little or no cost.
Learn ØMQ’s main patterns: request-reply, publish-subscribe, and pipeline
Work with ØMQ sockets and patterns by building several small applications
Explore advanced uses of ØMQ’s request-reply pattern through working examples
Build reliable request-reply patterns that keep working when code or hardware fails
Extend ØMQ’s core pub-sub patterns for performance, reliability, state distribution, and monitoring
Learn techniques for building a distributed architecture with ØMQ
Discover what’s required to build a general-purpose framework for distributed applications
Streaming Systems Douban
Author: Tyler Akidau / Slava Chernyak O'Reilly Media 2017 - 10
Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You’ll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.
You’ll explore:
How streaming and batch data processing patterns compare
The core principles and concepts behind robust out-of-order data processing
How watermarks track progress and completeness in infinite datasets
How exactly-once data processing techniques ensure correctness
How the concepts of streams and tables form the foundations of both batch and streaming data processing
The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
Distributed Algorithms: An Intuitive Approach Douban
Author: Wan Fokkink MIT Press 2018 - 2
The new edition of a guide to distributed algorithms that emphasizes examples and exercises rather than the intricacies of mathematical models.
This book offers students and researchers a guide to distributed algorithms that emphasizes examples and exercises rather than the intricacies of mathematical models. It avoids mathematical argumentation, often a stumbling block for students, teaching algorithmic thought rather than proofs and logic. This approach allows the student to learn a large number of algorithms within a relatively short span of time. Algorithms are explained through brief, informal descriptions, illuminating examples, and practical exercises. The examples and exercises allow readers to understand algorithms intuitively and from different perspectives. Proof sketches, arguing the correctness of an algorithm or explaining the idea behind fundamental results, are also included. The algorithms presented in the book are for the most part "classics," selected because they shed light on the algorithmic design of distributed systems or on key issues in distributed computing and concurrent programming.
This second edition has been substantially revised. A new chapter on distributed transactions offers up-to-date treatment of database transactions and the important evolving area of transactional memory. A new chapter on security discusses two exciting new topics: blockchains and quantum cryptography. Sections have been added that cover such subjects as rollback recovery, fault-tolerant termination detection, and consensus for shared memory. An appendix offers pseudocode descriptions of many algorithms. Solutions and slides are available for instructors.
Distributed Algorithms can be used in courses for upper-level undergraduates or graduate students in computer science, or as a reference for researchers in the field.
Principles of Transaction Processing for the Systems Professional Douban
Author: Philip A. Bernstein / Eric Newcomer Morgan Kaufmann 1997 - 1
Principles of Transaction Processing is a clear, concise guide for anyone involved in developing applications, evaluating products, designing systems, or engineering products. This book provides an understanding of the internals of transaction processing systems, describing how they work and how best to use them. It includes the architecture of transaction processing monitors, transactional communications paradigms, and mechanisms for recovering from transaction and system failures.
Use of transaction processing systems in business, industry, and government is increasing rapidly; the emergence of electronic commerce on the Internet is creating new demands. As a result, many developers are encountering transaction processing applications for the first time and need a practical explanation of techniques. Software engineers who build and market operating systems, communications systems, programming tools, and other products used in transaction processing applications will also benefit from this thorough presentation of principles. Rich with examples, it describes commercial transaction processing systems, transactional aspects of database servers, messaging systems, Internet servers, and object-oriented systems, as well as each of their subsystems.
* Easy-to-read descriptions of fundamentals.
* Real world examples illustrating key points.
* Focuses on practical issues faced by developers.
* Explains most major products and standards, including IBM's CICS, IMS, and MQSeries; X/Open's XA, STDL, and TX; BEA Systems' TUXEDO; Digital's ACMS; Transarc's Encina; AT&T/NCR's TOP END; Tandem's Pathway/TS; OMG's OTS; and Microsoft's Microsoft Transaction Server.
Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services Douban
Author: Brendan Burns O'Reilly Media 2017 - 10
Developing reliable, scalable distributed systems today is often more black art than science. Building these systems is complicated and, because few formally established patterns are available for designing them, most of these systems end up looking very unique. This practical guide shows you how to use existing software design patterns for designing and building reliable distributed applications. Although patterns such as those developed more than 20 years ago by the Gang of Four were largely restricted to running on single machines, author Brendan Burns, a Partner Architect in Microsoft Azure, demonstrates how you can reuse several of them in modern distributed applications. Systems engineers and application developers will learn how these patterns provide a common language and framework for dramatically increasing the quality of your system.
Mastering Bitcoin Douban
8.3 (7 ratings) Author: Andreas M. Antonopoulos O'Reilly Media 2014
Mastering Bitcoin tells you everything you need to know about joining one of the most exciting revolutions since the invention of the web: digital money. Bitcoin is the first successful digital currency. It's instant, global, frictionless and it is changing money forever. Bitcoin is still in its infancy, and yet it has already spawned an economy valued at nearly $2 billion that is growing exponentially. Established companies like PayPal are considering adding bitcoin as a payment method, and investors are funding a flurry of new startups aiming to stake claims in a new industry that may rival the Internet in terms of scale and impact on daily life.
If you're interested in learning more about the technical operation of bitcoin, or if you're building the next great bitcoin killer app or business, you will find this book essential reading. From the basic use of a bitcoin wallet to buy a cup of coffee, to running a bitcoin marketplace with hundreds of thousands of transactions, or collaboratively building new financial innovations that will transform our understanding of currency and credit, this book will help you engineer money. You're about to unlock the API to a new economy. This book is your key.
Systems Performance Douban Goodreads
Author: Brendan Gregg Prentice Hall 2013 - 10
The accelerating deployment of large-scale web, cloud, Big Data, and virtualized computing systems has introduced serious new challenges in performance optimization. Until now, however, little reliable, practical information has been available to IT professionals who are responsible for running these systems efficiently and cost-effectively.
Systems Performance: Enterprise and the Cloud is the solution. Internationally renowned performance optimization expert Brendan Gregg brings together state-of-the-art techniques and tools for analysis and tuning of large-scale web/cloud computing environments.
Gregg focuses on Linux/Unix/Solaris performance issues, while offering proven methodologies and discussing key issues that apply to all enterprise operating systems. Coverage includes:
Modern performance analysis and capacity planning, including key issues such as latency and dynamic tracing
New performance and reliability challenges associated with cloud computing
Methodology, concepts, terminology, tools, and metrics
Key tradeoffs, including problems of load vs. architecture
Tuning operating systems, CPUs, memory, file systems, disks, networks, and busses
Tuning virtualized systems
Programming language issues related to performance — including application profiling for C, C++, Java, and node.js
Benchmarking strategies and pitfalls, including custom microbenchmarking
Database System Concepts Douban
8.4 (5 ratings) Author: Abraham Silberschatz / Henry F. Korth McGraw-Hill Education 2010 - 3
"Database System Concepts" by Silberschatz, Korth and Sudarshan is now in its 6th edition and is one of the cornerstone texts of database education. It presents the fundamental concepts of database management in an intuitive manner geared toward allowing students to begin working with databases as quickly as possible. The text is designed for a first course in databases at the junior/senior undergraduate level or the first year graduate level. It also contains additional material that can be used as supplements or as introductory material for an advanced course. Because the authors present concepts as intuitive descriptions, a familiarity with basic data structures, computer organization, and a high-level programming language are the only prerequisites. Important theoretical results are covered, but formal proofs are omitted. In place of proofs, figures and examples are used to suggest why a result is true.
Distributed Algorithms Douban
Author: Nancy A. Lynch Morgan Kaufmann 1996 - 3
In "Distributed Algorithms", Nancy Lynch provides a blueprint for designing, implementing, and analyzing distributed algorithms. She directs her book at a wide audience, including students, programmers, system designers, and researchers. "Distributed Algorithms" contains the most significant algorithms and impossibility results in the area, all in a simple automata-theoretic setting. The algorithms are proved correct, and their complexity is analyzed according to precisely defined complexity measures. The problems covered include resource allocation, communication, consensus among distributed processes, data consistency, deadlock detection, leader election, global snapshots, and many others. The material is organized according to the system model, first by the timing model and then by the interprocess communication mechanism. The material on system models is isolated in separate chapters for easy reference. The presentation is completely rigorous, yet is intuitive enough for immediate comprehension. This book familiarizes readers with important problems, algorithms, and impossibility results in the area: readers can then recognize the problems when they arise in practice, apply the algorithms to solve them, and use the impossibility results to determine whether problems are unsolvable. The book also provides readers with the basic mathematical tools for designing new algorithms and proving new impossibility results. In addition, it teaches readers how to reason carefully about distributed algorithms: to model them formally, devise precise specifications for their required behavior, prove their correctness, and evaluate their performance with realistic measures.