Hadoop's "mapred.ReduceTask:java.net.ConnectExceptio
2020-11-09 07:49:10 Editor: 小采

Node 91 in the cluster ran into a fault, and the following appeared in the log:

[plain]

2013-11-08 08:32:13,908 WARN org.apache.hadoop.mapred.ReduceTask: attempt_201311061017_102_r_000000_0 copy failed: attempt_201311061017_102_m_000003_0 from node-192
2013-11-08 08:32:13,921 WARN org.apache.hadoop.mapred.ReduceTask: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.openServer(Unknown Source)
    at sun.net.www.http.HttpClient.<init>(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.http.HttpClient.New(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1631)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1588)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1488)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1399)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1331)

Looking at the Hadoop code that decides which hostname the TaskTracker uses for itself:

[java]

localFs = FileSystem.getLocal(fConf);
if (fConf.get("slave.host.name") != null) {
    this.localHostname = fConf.get("slave.host.name");
}
if (localHostname == null) {
    this.localHostname =
        DNS.getDefaultHost(
            fConf.get("mapred.tasktracker.dns.interface", "default"),
            fConf.get("mapred.tasktracker.dns.nameserver", "default"));
}
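The selection order in the snippet above can be restated as a small standalone sketch. This is not Hadoop code: `HostnameChoice`, `chooseHostname`, and the `Supplier` standing in for `DNS.getDefaultHost` are hypothetical names, a plain `Map` stands in for the job configuration, and `node-191.example` is a made-up value. It only illustrates that an explicit `slave.host.name` takes precedence and that the DNS-derived default is used otherwise.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class HostnameChoice {
    // Same order as the snippet: an explicit slave.host.name wins,
    // otherwise fall back to a DNS-derived default (hypothetical stand-in).
    static String chooseHostname(Map<String, String> conf, Supplier<String> dnsDefault) {
        String configured = conf.get("slave.host.name");
        if (configured != null) {
            return configured;
        }
        return dnsDefault.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // No override configured: the DNS-derived name is used.
        System.out.println(chooseHostname(conf, () -> "node-191"));

        // Override configured: the DNS stand-in is never consulted.
        conf.put("slave.host.name", "node-191.example");
        System.out.println(chooseHostname(conf, () -> {
            throw new IllegalStateException("DNS lookup not expected");
        }));
    }
}
```

Either way, the name chosen here still has to resolve to the right address on the other nodes, which is what the next step checks.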

Ping that hostname from the node:

[plain]

ping node-191
PING node-128-191.localhost (220.250..228) 56(84) bytes of data.
bytes from 220.250..228: icmp_seq=1 ttl=247 time=14.8 ms
bytes from 220.250..228: icmp_seq=2 ttl=247 time=14.3 ms
bytes from 220.250..228: icmp_seq=3 ttl=247 time=14.4 ms

The address it resolves to is not 191's IP at all.
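The same check can be done from Java, which is closer to what the JVM inside the reduce task actually sees. The following is a minimal sketch under stated assumptions: `ResolveCheck`, `Resolver`, and `resolvesTo` are hypothetical names, and the expected IP is something you supply yourself. Only `InetAddress.getByName` and `getHostAddress` are standard JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveCheck {
    // Small seam so the comparison logic can be exercised without real DNS.
    interface Resolver {
        String resolve(String host) throws UnknownHostException;
    }

    // Resolver backed by the JVM's normal lookup (hosts file, then DNS,
    // depending on the system's resolver configuration).
    static final Resolver SYSTEM = host -> InetAddress.getByName(host).getHostAddress();

    // True only if the hostname resolves and the address matches the expected one.
    static boolean resolvesTo(String host, String expectedIp, Resolver resolver) {
        try {
            return expectedIp.equals(resolver.resolve(host));
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ResolveCheck <hostname> <expected-ip>");
            return;
        }
        System.out.println(resolvesTo(args[0], args[1], SYSTEM) ? "OK" : "MISMATCH");
    }
}
```

Running it with `node-191` and the address you expect for that node would print `MISMATCH` in the situation described here.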

Checking the hosts file on that node shows that 191's hostname is not configured there either.

That explains the problem.

Add 191's hostname to the hosts file on every node in the cluster, then restart the TaskTracker. That fixes it.
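Before restarting, it can help to confirm on each node that the entry really is in place. The following is a rough sketch, not part of Hadoop: `HostsEntryCheck` and `hasEntry` are hypothetical names, and the default path `/etc/hosts` and default hostname `node-191` are assumptions you may need to change.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class HostsEntryCheck {
    // Returns true if any non-comment line of a hosts file lists the hostname
    // as a name or alias (fields after the leading address).
    static boolean hasEntry(List<String> hostsLines, String hostname) {
        for (String raw : hostsLines) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split("\\s+");
            // fields[0] is the address; the rest are names and aliases.
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equalsIgnoreCase(hostname)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "/etc/hosts";
        String hostname = args.length > 1 ? args[1] : "node-191";
        List<String> lines = Files.readAllLines(Paths.get(path));
        System.out.println(hostname
                + (hasEntry(lines, hostname) ? " present in " : " missing from ")
                + path);
    }
}
```

This only checks that an entry exists; whether the address in that entry is the right one still has to be verified separately.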
