What Is Apache Hadoop?

http://hadoop.apache.org/


The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
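
To make "handle failures at the application layer" concrete, here is a minimal pure-Python sketch. It is not Hadoop code: the `Node` class, the `read_block` function, and the block and node names are all hypothetical, chosen only to illustrate the idea that software can detect a failed machine and fall back to another copy of the data.

```python
class NodeDown(Exception):
    """Raised when a (simulated) node cannot serve a request."""


class Node:
    """Hypothetical stand-in for one machine holding copies of data blocks."""

    def __init__(self, name, blocks, alive=True):
        self.name = name
        self.blocks = blocks  # block id -> bytes stored on this node
        self.alive = alive

    def read(self, block_id):
        if not self.alive:
            raise NodeDown(self.name)
        return self.blocks[block_id]


def read_block(block_id, replicas):
    """Try each replica in turn; return the data and the nodes that failed."""
    failed = []
    for node in replicas:
        try:
            return node.read(block_id), failed
        except NodeDown:
            failed.append(node.name)  # failure detected, try the next copy
    raise IOError(f"all replicas failed for block {block_id}: {failed}")


replicas = [
    Node("node-1", {"blk_1": b"hello"}, alive=False),  # simulated crash
    Node("node-2", {"blk_1": b"hello"}),
    Node("node-3", {"blk_1": b"hello"}),
]
data, failed = read_block("blk_1", replicas)
print(data, failed)
```

The caller still gets its data even though one machine is down, which is the sense in which availability comes from the software rather than from the hardware.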

The project includes these modules:

• Hadoop Common: The common utilities that support the other Hadoop modules.
• Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
• Hadoop YARN: A framework for job scheduling and cluster resource management.
• Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
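
The MapReduce programming model named in the list above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps, not the Hadoop MapReduce API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: combine all counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["Hadoop stores data", "Hadoop processes data"])
print(counts)
```

In a real cluster the mappers and reducers would run in parallel on many machines; the sketch keeps only the data flow, which is why each phase is a separate function with no shared state.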

Other Hadoop-related projects at Apache include:

• Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig, and Sqoop. Ambari also provides a dashboard for viewing cluster health, such as heatmaps, and the ability to view MapReduce, Pig, and Hive applications visually, along with features to diagnose their performance characteristics in a user-friendly manner.
• Avro™: A data serialization system.
• Cassandra™: A scalable multi-master database with no single points of failure.
• Chukwa™: A data collection system for managing large distributed systems.
• HBase™: A scalable, distributed database that supports structured data storage for large tables.
• Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
• Mahout™: A scalable machine learning and data mining library.
• Pig™: A high-level data-flow language and execution framework for parallel computation.
• Spark™: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation.
• Tez™: A generalized data-flow programming framework, built on Hadoop YARN, which provides a powerful and flexible engine to execute an arbitrary DAG of tasks to process data for both batch and interactive use cases. Tez is being adopted by Hive™, Pig™, and other frameworks in the Hadoop ecosystem, and also by other commercial software (e.g. ETL tools), to replace Hadoop® MapReduce as the underlying execution engine.
• ZooKeeper™: A high-performance coordination service for distributed applications.

Copyright © 2014 The Apache Software Foundation. All rights reserved.
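
The "arbitrary DAG of tasks" in the Tez entry above can be pictured with a short sketch. It uses Python's standard-library `graphlib` (Python 3.9 or later) rather than anything from Tez; the task names and the `run_dag` helper are hypothetical, and the tasks run one after another where a real engine would run ready tasks in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each task maps to the set of tasks it depends on.
dag = {
    "extract": set(),
    "clean": {"extract"},
    "join": {"clean"},
    "aggregate": {"join"},
    "report": {"aggregate", "clean"},
}


def run_dag(graph, run_task):
    """Run every task after all of its dependencies; return the order used."""
    order = []
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    while sorter.is_active():
        for task in sorter.get_ready():  # tasks whose inputs are complete
            run_task(task)
            order.append(task)
            sorter.done(task)  # unblocks the tasks that depend on this one
    return order


order = run_dag(dag, lambda task: None)
print(order)
```

Expressing a job as a graph instead of a fixed map-then-reduce pair is what lets one engine serve both multi-stage batch pipelines and shorter interactive queries.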


Original post: http://www.cnblogs.com/xgqfrms/p/5014053.html


