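A Simple Nomad Cluster: Technical Guide

Learning to use Nomad

With Consul set up as described in the previous post, you can follow this post to deploy Nomad. Nomad automatically finds the Consul agent on local port 8500 and registers its services there.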

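Starting the server

The first step is to create a configuration file for the server. Either download the file from GitHub, or paste the following into a file named server.hcl: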

    $ vim server.hcl

    # Increase log verbosity
    log_level = "DEBUG"

    # Setup data dir
    data_dir = "/tmp/server1"

    # Enable the server
    server {
      enabled = true

      # Self-elect, should be 3 or 5 for production
      bootstrap_expect = 1
    }

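This is a fairly minimal server configuration file, but it is enough to start an agent in server-only mode and have it elected as leader. The main change needed for production is to run more than one server and to set bootstrap_expect to match. Once the file is created, start the agent in a new tab: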

    $ sudo nomad agent -config server.hcl
    ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
    ==> Starting Nomad agent...
    ==> Nomad agent configuration:

           Client: false
        Log Level: DEBUG
           Region: global (DC: dc1)
           Server: true
          Version: 0.6.0

    ==> Nomad agent started! Log data will stream in below:

    [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
    [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
    [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
    [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
    [WARN] raft: Heartbeat timeout reached, starting election
    [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
    [DEBUG] raft: Votes needed: 1
    [DEBUG] raft: Vote granted. Tally: 1
    [INFO] raft: Election won. Tally: 1
    [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
    [INFO] nomad: cluster leadership acquired
    [INFO] raft: Disabling EnableSingleNode (bootstrap)
    [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]

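We can see that client mode is disabled and that the agent is running only as a server. This means the server will manage state and make scheduling decisions, but it will not run any tasks. Now we need some agents to run tasks!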
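The production change mentioned above (several servers, with bootstrap_expect raised to match) can be sketched as a small edit to the demo file. The helper name set_bootstrap_expect and the output file name server3.hcl are assumptions made for illustration; neither comes from Nomad or from the original guide.

```shell
#!/bin/sh
# Sketch: copy a server config, replacing the bootstrap_expect value.
# set_bootstrap_expect is a hypothetical helper, not part of Nomad.
#   $1 = source file, $2 = destination file, $3 = number of servers
set_bootstrap_expect() {
  sed -e "s/bootstrap_expect *= *[0-9][0-9]*/bootstrap_expect = $3/" "$1" > "$2"
}

# Demo in a scratch directory, using the server block from this guide.
workdir="$(mktemp -d)"
cat > "$workdir/server.hcl" <<'EOF'
log_level = "DEBUG"
data_dir  = "/tmp/server1"

server {
  enabled = true

  # Self-elect, should be 3 or 5 for production
  bootstrap_expect = 1
}
EOF

# server3.hcl is an assumed file name for the three-server variant.
set_bootstrap_expect "$workdir/server.hcl" "$workdir/server3.hcl" 3
cat "$workdir/server3.hcl"
```

This only covers the bootstrap_expect value. In a real multi-server setup each server would also need its own data_dir and a way to find the other servers, which this sketch does not attempt.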

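Starting the clients

As with the server, we must configure the clients first. Either download the client1 and client2 configurations from GitHub, or paste the following into client1.hcl: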

    # Increase log verbosity
    log_level = "DEBUG"

    # Setup data dir
    data_dir = "/tmp/client1"

    # Enable the client
    client {
      enabled = true

      # For demo assume we are talking to server1. For production,
      # this should be like "nomad.service.consul:4647" and a system
      # like Consul used for service discovery.
      servers = ["127.0.0.1:4647"]
    }

    # Modify our port to avoid a collision with server1
    ports {
      http = 5656
    }

Copy that file to client2.hcl, change data_dir to "/tmp/client2", and change the HTTP port to 5657. Once both client1.hcl and client2.hcl exist, open a tab for each and start the agents:

    $ sudo nomad agent -config client1.hcl
    ==> Starting Nomad agent...
    ==> Nomad agent configuration:

           Client: true
        Log Level: DEBUG
           Region: global (DC: dc1)
           Server: false
          Version: 0.6.0

    ==> Nomad agent started! Log data will stream in below:

    [DEBUG] client: applied fingerprints [host memory storage arch cpu]
    [DEBUG] client: available drivers [docker exec]
    [DEBUG] client: node registration complete
    ...
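The second client's file differs from the first only in its data directory and HTTP port, so it can be derived mechanically instead of edited by hand. The sketch below assumes the second data directory is /tmp/client2 (following the /tmp/client1 pattern); make_client_config is a hypothetical helper, not a Nomad command.

```shell
#!/bin/sh
# Sketch: derive another client config from an existing one.
# make_client_config is a hypothetical helper, not part of Nomad.
#   $1 = source file, $2 = destination file, $3 = client name, $4 = HTTP port
make_client_config() {
  sed -e "s|data_dir *= *\"[^\"]*\"|data_dir = \"/tmp/$3\"|" \
      -e "s|http *= *[0-9][0-9]*|http = $4|" "$1" > "$2"
}

# Demo in a scratch directory, using the client1.hcl content from this guide.
clientdir="$(mktemp -d)"
cat > "$clientdir/client1.hcl" <<'EOF'
log_level = "DEBUG"
data_dir = "/tmp/client1"

client {
  enabled = true
  servers = ["127.0.0.1:4647"]
}

ports {
  http = 5656
}
EOF

# Assumed values: data dir /tmp/client2, HTTP port 5657.
make_client_config "$clientdir/client1.hcl" "$clientdir/client2.hcl" client2 5657
cat "$clientdir/client2.hcl"
```

The servers line is left untouched, so both clients keep pointing at the same server on 127.0.0.1:4647.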

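In the output we can see that the agent is running in client mode only. This agent is available to run tasks, but it does not take part in managing the cluster or making scheduling decisions. Using the node-status command, we should see both nodes in the ready state: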

    $ nomad node-status
    ID        Datacenter  Name   Class   Drain  Status
    fca62612  dc1         nomad  <none>  false  ready
    c887deef  dc1         nomad  <none>  false  ready
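To check the same thing from a script rather than by eye, the ready rows can be counted with awk. This is a minimal sketch that assumes the column layout shown in the listing above, with Status as the last column; count_ready_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count nodes whose last column is "ready".
# Reads `nomad node-status` style output on stdin; the first row is the header.
count_ready_nodes() {
  awk 'NR > 1 && $NF == "ready" { n++ } END { print n + 0 }'
}

# Sample rows copied from this guide. On a live cluster one would run:
#   nomad node-status | count_ready_nodes
sample='ID        Datacenter  Name   Class   Drain  Status
fca62612  dc1         nomad  <none>  false  ready
c887deef  dc1         nomad  <none>  false  ready'

printf '%s\n' "$sample" | count_ready_nodes   # prints 2
```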

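We now have a simple three-node cluster running. The only difference between this demo and a full production cluster is that we are running a single server instead of three or five.

Submitting a job

Now that we have a simple cluster, we can use it to schedule a job. We should still have the example.nomad job file from before (nomad init generates it if it is missing); confirm that count is set to 3. Then submit the job with the run command: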

    $ nomad init
    $ nomad run example.nomad
    ==> Monitoring evaluation "8e0a7cf9"
        Evaluation triggered by job "example"
        Evaluation within deployment: "0917b771"
        Allocation "501154ac" created: node "c887deef", group "cache"
        Allocation "7e2b3900" created: node "fca62612", group "cache"
        Allocation "9c66fcaf" created: node "c887deef", group "cache"
        Evaluation status changed: "pending" -> "complete"
    ==> Evaluation "8e0a7cf9" finished with status "complete"

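We can see in the output that the scheduler placed two of the tasks on one client node and the remaining task on the second client. We can verify this again with the status command: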

    $ nomad status example
    ID            = example
    Name          = example
    Submit Date   = 07/26/17 16:34:58 UTC
    Type          = service
    Priority      = 50
    Datacenters   = dc1
    Status        = running
    Periodic      = false
    Parameterized = false

    Summary
    Task Group  Queued  Starting  Running  Failed  Complete  Lost
    cache       0       0         3        0       0         0

    Latest Deployment
    ID          = fc49bd6c
    Status      = running
    Description = Deployment is running

    Deployed
    Task Group  Desired  Placed  Healthy  Unhealthy
    cache       3        3       0        0

    Allocations
    ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
    501154ac  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT
    7e2b3900  8e0a7cf9  fca62612  cache       run      running  08/08/16 21:03:19 CDT
    9c66fcaf  8e0a7cf9  c887deef  cache       run      running  08/08/16 21:03:19 CDT
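The placement described above (two allocations on one client, one on the other) can also be tallied from the nomad run output instead of read off by hand. The sketch below assumes the Allocation line format shown in this guide; allocs_per_node is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count allocations per node from `nomad run` output on stdin.
# Splitting on double quotes makes field 4 the node ID on lines such as:
#   Allocation "501154ac" created: node "c887deef", group "cache"
allocs_per_node() {
  awk -F'"' '/Allocation ".*" created: node/ { count[$4]++ }
             END { for (node in count) print node, count[node] }' | sort
}

# Sample lines copied from this guide. On a live cluster one would run:
#   nomad run example.nomad | allocs_per_node
run_output='==> Monitoring evaluation "8e0a7cf9"
    Evaluation triggered by job "example"
    Allocation "501154ac" created: node "c887deef", group "cache"
    Allocation "7e2b3900" created: node "fca62612", group "cache"
    Allocation "9c66fcaf" created: node "c887deef", group "cache"
==> Evaluation "8e0a7cf9" finished with status "complete"'

printf '%s\n' "$run_output" | allocs_per_node
```

With the sample lines this prints one row per node ID followed by its allocation count, which matches the Node ID column in the Allocations table.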

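We can see that all of our tasks have been allocated and are running. Once we are satisfied with the job, we can tear it down with nomad stop.

Using the Nomad UI

Going by the official Nomad documentation, the Nomad UI does not seem to be well implemented yet. The documentation says the UI is integrated from version 0.7 onward, but in my local environment, opening nomadIP:4646 in a browser returned a 404 error. Since I never resolved that 404, I found and used a UI built by a developer on GitHub: https://github.com/jippi/hashi-ui.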

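UI update

Today I updated to Nomad 0.7. After downloading Nomad from GitHub, the official UI can be used.
The ui directory of the Nomad project on GitHub is here: https://github.com/hashicorp/nomad/tree/master/ui
Follow the README to install the dependencies and that is all. To reach the UI against a custom address, use this command: ember serve --proxy http://10.30.0.52:4646 (replace 10.30.0.52 with your external IP and 4646 with your custom port).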

Published on January 2, 2018, 16:32
(c) Note: this article is reposted from https://my.oschina.net/u/3762054/blog/1600656.
