Big Data
Flink内核原理与实现 (Flink Kernel Principles and Implementation) Douban
Authors: Feng Fei (冯飞) / Cui Pengyun (崔鹏云) 2020 - 9
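Flink Kernel Principles and Implementation, published by China Machine Press, is written by three big data experts: Feng Fei, Cui Pengyun, and Chen Guanhua. Taking a whole-system view, it covers the basics (getting started with Flink, installation, an introduction to stream-processing development, monitoring and operations) as well as in-depth topics: Flink's notions of time; how Window is implemented, with code walkthroughs; the principles of Flink's fault-tolerance mechanism, its key design decisions, and an analysis of the implementation; the full path of a job from source code to execution; and the key designs and code behind job scheduling strategies, resource management, the type and serialization system, memory management, data exchange, and the RPC communication framework.
The book is suited to big data developers and operations staff interested in real-time computing, and is also helpful to machine learning engineers.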
Stream Processing with Apache Spark Douban
Authors: Gerard Maas / Francois Garillot O'Reilly Media 2018 - 7
To build analytics tools that provide faster insights, knowing how to process data in real time is a must, and moving from batch processing to stream processing is absolutely required. Fortunately, the Spark in-memory framework/platform for processing data has added an extension devoted to fault-tolerant stream processing: Spark Streaming.
If you're familiar with Apache Spark and want to learn how to implement it for streaming jobs, this practical book is a must.
Understand how Spark Streaming fits in the big picture
Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream
Discover how to create a robust deployment
Dive into streaming algorithmics
Learn how to tune, measure, and monitor Spark Streaming
Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications Douban
Authors: Fabian Hueske / Vasiliki Kalavri O'Reilly Media 2018 - 7
Get started with Apache Flink, the open source framework that enables you to process streaming data (such as user interactions, sensor data, and machine logs) as it arrives. With this practical guide, you’ll learn how to use Apache Flink’s stream processing APIs to implement, continuously run, and maintain real-world applications.
Authors Fabian Hueske, one of Flink’s creators, and Vasia Kalavri, a core contributor to Flink’s graph processing API (Gelly), explain the fundamental concepts of parallel stream processing and show you how streaming analytics differs from traditional batch data analysis. Software engineers, data engineers, and system administrators will learn the basics of Flink’s DataStream API, including the structure and components of a common Flink streaming application.
Solve real-world problems with Apache Flink’s DataStream API
Set up an environment for developing stream processing applications for Flink
Design streaming applications and migrate periodic batch workloads to continuous streaming workloads
Learn about windowed operations that process groups of records
Ingest data streams into a DataStream application and emit a result stream into different storage systems
Implement stateful and custom operators common in stream processing applications
Operate, maintain, and update continuously running Flink streaming applications
Explore several deployment options, including the setup of highly available installations
AI极简经济学 Douban
Prediction Machines
8.2 (6 ratings) Authors: Ajay Agrawal / Joshua Gans Translator: Lü Jia (闾佳) 博集天卷 | Hunan Science and Technology Press 2018
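◆ What does AI mean for your job and your business? Read this book and you will understand. -- Hal Varian, Chief Economist, Google
◆ From a leading lab in AI commercialization, the book goes straight to the pain points of AI and answers, from an economic perspective, the questions "What is AI, what is it good for, and what should we do about it?" It cuts through the complexity to explain in accessible terms how AI affects our work and lives.
◆ Hailed as a "stroke of genius" by Kevin Kelly, author of Out of Control and The Inevitable; praised by economics professors at Harvard, MIT, and Stanford and by AI executives at Apple, Google, and Microsoft; covered by The Economist, the Financial Times, and McKinsey Quarterly.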
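[Synopsis]
Artificial intelligence is sweeping the globe with unstoppable momentum. Whether it is the iPhone's Neural Engine, AlphaGo's Go algorithms, self-driving cars, or deep learning, AI is unquestionably reshaping industries. Like the personal computer, the internet, and big data before it, technological innovation is once again profoundly changing how we work and live.
So how should we think about AI? In Prediction Machines (AI极简经济学), three economists with deep experience in AI and decision-making give a clear answer. Drawing on solid economic theory to analyze the trends and grasp the essentials, they reduce the shifting surface of the AI field to one thing: the steadily improving predictive power of machines. Whether you are an entrepreneur who has to make decisions, an ordinary person still planning a career, or a thinker confronting dramatic social change, you will find important insights in this book.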
[Endorsements]
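Prediction Machines treats AI as a new kind of cheap commodity, namely prediction, which makes AI far easier to understand. It is a stroke of genius. I found the book surprisingly useful.
-- Kevin Kelly, author of Out of Control and The Inevitable, founding executive editor of Wired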
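Whether you are a business leader, policymaker, economist, or strategist, or simply someone who wants to know what AI really means, Prediction Machines is a must-read. From setting corporate strategy and making decisions to understanding how AI will affect our society, you will find the answers in this book.
-- Ruslan Salakhutdinov, Director of AI Research at Apple, professor at Carnegie Mellon University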
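AI is the most revolutionary technology of our era. The three authors understand the essence of this technology, and through this book they convey their deep understanding of AI's economic implications and trade-offs. If you want to cut through the fog around AI and see the opportunities and challenges this technology will bring, this is the book to read first.
-- Erik Brynjolfsson, professor at MIT, author of The Second Machine Age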
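AI may change your life, and this book will change how you think about AI. We cannot be sure whether AI is the greatest technological advance, but we can be sure that this is the best book on the subject so far.
-- Lawrence Summers, former president of Harvard University, former chief economist of the World Bank, Director of the National Economic Council in the Obama administration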
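Prediction Machines is a groundbreaking book that focuses on what strategists and managers really need to know about the AI revolution. It takes a realistic view of the technology and, starting from the principles of economics and business strategy, explains how firms, industries, and management will be transformed by AI.
-- Susan Athey, professor at Stanford University, former Microsoft researcher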
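Prediction Machines achieves a feat: it dispels the fog that surrounds AI and makes it seem remote from reality, and renders it easy to understand. The book offers unprecedented insights whose implications senior executives and policymakers can readily grasp. Every leader should read it.
-- Dominic Barton, Global Managing Partner, McKinsey & Company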
Kafka Douban
Authors: Neha Narkhede / Gwen Shapira O'Reilly Media 2017 - 10
Learn how to take full advantage of Apache Kafka, the distributed, publish-subscribe queue for handling real-time data feeds. With this comprehensive book, you'll understand how Kafka works and how it's designed. Authors Neha Narkhede, Gwen Shapira, and Todd Palino show you how to deploy production Kafka clusters; secure, tune, and monitor them; write rock-solid applications that use Kafka; and build scalable stream-processing applications.
Hadoop: The Definitive Guide Douban
Author: Tom White O'Reilly Media 2015 - 4
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service