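### Notes:
1. Solr ships with a built-in Jetty server (default port 8983), so the web admin UI works out of the box and no separate Tomcat installation is needed.
2. After installation, Chrome is the recommended browser for the admin UI; some other browsers may report errors.
3. Open the ports ZooKeeper (zk) uses, or turn the firewall off.
4. Solr bundles its own ZooKeeper, but it is generally not used; a separately installed ensemble is preferred.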

I. Environment preparation

II. Install the JDK

Omitted here.

III. Install ZooKeeper

  • Upload zookeeper-3.4.8.tar.gz to the /home directory on 60.35
  • Extract the archive under /home
[root@app4 home]# tar -zxvf zookeeper-3.4.8.tar.gz
  • Configure ZooKeeper

    Rename the sample configuration file to zoo.cfg

    [root@app1 conf]# pwd
    /home/zookeeper-3.4.8/conf
    [root@app1 conf]# mv zoo_sample.cfg zoo.cfg

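    Edit zoo.cfg as follows

    # Directory where data snapshots are stored
    dataDir=/home/zookeeper-3.4.8/data/
    # Directory where transaction logs are stored
    dataLogDir=/home/zookeeper-3.4.8/log/
    # Ensemble members: the numbers 1, 2 and 3 must match each machine's myid.
    # Format: server.<myid>=<host>:<quorum port>:<leader election port>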
    server.1=192.168.60.35:28881:3881
    server.2=192.168.60.38:28881:3881
    server.3=192.168.60.41:28881:3881
  • Copy the ZooKeeper directory to the other two machines with scp

    [root@app4 home]# scp -r zookeeper-3.4.8 hbadmin@192.168.60.38:/home/
    [root@app4 home]# scp -r zookeeper-3.4.8 hbadmin@192.168.60.41:/home/
  • Create myid
    In the dataDir directory on each machine, create a file named myid whose content is the server number configured in zoo.cfg: 1 on 60.35, 2 on 60.38, and 3 on 60.41

    [root@app1 data]# pwd
    /home/zookeeper-3.4.8/data
    [root@app1 data]# vi myid
  • Start ZooKeeper on each machine

    [root@app4 zookeeper-3.4.8]# cd bin
    [root@app4 bin]# ./zkServer.sh start
    ZooKeeper JMX enabled by default
    Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
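  • Check that ZooKeeper started: inspect the ports and the status

    If startup fails, check the log file zookeeper.out in ZooKeeper's bin directory; changing the ports configured in zoo.cfg may help, for example if they are already in use

60.35 reports leader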

[root@app1 bin]# netstat -anp|grep 28881
tcp        0      0 ::ffff:192.168.60.35:28881  :::*                        LISTEN      29970/java          
tcp        0      0 ::ffff:192.168.60.35:28881  ::ffff:192.168.60.41:33793  ESTABLISHED 29970/java          
tcp        0      0 ::ffff:192.168.60.35:28881  ::ffff:192.168.60.38:41405  ESTABLISHED 29970/java          
[root@app1 bin]# netstat -anp|grep 3881
tcp        0      0 ::ffff:192.168.60.35:3881   :::*                        LISTEN      29970/java          
tcp        0      0 ::ffff:192.168.60.35:3881   ::ffff:192.168.60.41:33566  ESTABLISHED 29970/java          
tcp        0      0 ::ffff:192.168.60.35:3881   ::ffff:192.168.60.38:46208  ESTABLISHED 29970/java          
[root@app1 bin]# 
[root@app1 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: leader

60.38 reports follower

[root@app4 bin]# netstat -anp|grep 3881
tcp        0      0 ::ffff:192.168.60.38:3881   :::*                        LISTEN      21101/java          
tcp        0      0 ::ffff:192.168.60.38:3881   ::ffff:192.168.60.41:41058  ESTABLISHED 21101/java          
tcp        0      0 ::ffff:192.168.60.38:46208  ::ffff:192.168.60.35:3881   ESTABLISHED 21101/java          
[root@app4 bin]# netstat -anp|grep 28881
tcp        0      0 ::ffff:192.168.60.38:41405  ::ffff:192.168.60.35:28881  ESTABLISHED 21101/java 
[root@app4 bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

60.41 reports follower

[root@localhost bin]# netstat -anp|grep 3881
tcp        0      0 ::ffff:192.168.60.41:3881   :::*                        LISTEN      22719/java          
tcp        0      0 ::ffff:192.168.60.41:33566  ::ffff:192.168.60.35:3881   ESTABLISHED 22719/java          
tcp        0      0 ::ffff:192.168.60.41:41058  ::ffff:192.168.60.38:3881   ESTABLISHED 22719/java          
[root@localhost bin]# netstat -anp|grep 28881
tcp        0      0 ::ffff:192.168.60.41:33793  ::ffff:192.168.60.35:28881  ESTABLISHED 22719/java          
[root@localhost bin]# ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
[root@localhost bin]# 
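  • Common ZooKeeper commands

| Action | Command |
| --- | --- |
| Start the ZooKeeper service | bin/zkServer.sh start |
| Show the ZooKeeper service status | bin/zkServer.sh status |
| Stop the ZooKeeper service | bin/zkServer.sh stop |
| Restart the ZooKeeper service | bin/zkServer.sh restart |
| Connect to a server | zkCli.sh -server 192.168.60.35:2181 |
| List the root node | ls / |
| Create the node testnode holding the string "zz" (the parent node /zk must already exist) | create /zk/testnode "zz" |
| Show a node's content | get /zk/testnode |
| Set a node's content | set /zk/testnode abc |
| Delete a node | delete /zk/testnode |

For more commands, run help after connecting to ZooKeeper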
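
The myid value for each machine can also be derived from zoo.cfg rather than typed by hand. The following is a minimal sketch, assuming zoo.cfg uses the server.N=IP:port:port form shown earlier; the function name myid_for_ip is hypothetical and not part of ZooKeeper.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZooKeeper): print the myid that zoo.cfg
# assigns to a given IP address, based on its server.N=IP:port:port lines.
myid_for_ip() {
    cfg="$1"   # path to zoo.cfg
    ip="$2"    # this machine's address exactly as written in zoo.cfg
    # Split each line on '=' and ':' so that $1 is "server.N" and $2 is the host.
    awk -F'[=:]' -v ip="$ip" '
        $1 ~ /^server\.[0-9]+$/ && $2 == ip {
            sub(/^server\./, "", $1)   # keep only N
            print $1
            exit
        }' "$cfg"
}

# Example usage on 192.168.60.38:
#   myid_for_ip /home/zookeeper-3.4.8/conf/zoo.cfg 192.168.60.38 \
#       > /home/zookeeper-3.4.8/data/myid
```

If the address is not listed in zoo.cfg the function prints nothing, so it is worth checking that the resulting myid file is non-empty before starting ZooKeeper.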

IV. Install Solr

  • First extract the installation script from the archive
tar xzf solr-5.5.1.tgz solr-5.5.1/bin/install_solr_service.sh --strip-components=2
  • Run the installation script
./install_solr_service.sh solr-5.5.1.tgz -i /opt -d /var/solr -u root -s solr -p 8983
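| Option | Value | Meaning |
| --- | --- | --- |
| -i | /opt | Directory to install Solr into; defaults to /opt (the installer also creates the symbolic link /opt/solr pointing at the installation directory) |
| -d | /var/solr | Directory for writable files, including indexes, logs and environment settings; defaults to /var/solr |
| -u | root | User that owns the Solr files and runs the process; defaults to solr (the install script creates the solr account automatically) |
| -s | solr | Name of the Solr service; defaults to solr |
| -p | 8983 | Port the Solr service listens on; defaults to 8983 |

The script prints the installation details as it runs

| Name | Path |
| --- | --- |
| Configuration file | /etc/default/solr.in.sh |
| Installation directory | /opt/solr |
| Data directory | /var/solr/data |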
[root@app1 home]# ./install_solr_service.sh solr-5.5.1.tgz -i /opt -d /var/solr -u root -s solr -p 8983

Extracting solr-5.5.1.tgz to /opt


Installing symlink /opt/solr -> /opt/solr-5.5.1 ...


Installing /etc/init.d/solr script ...


Installing /etc/default/solr.in.sh ...
...
2016-10-14 05:33:58.549 INFO  (main) [   ] o.e.j.s.Server Started @2276ms                                  
Found 1 Solr nodes: 

Solr process 15538 running on port 8983
{
  "solr_home":"/var/solr/data",
  "version":"5.5.1 c08f17bca0d9cbf516874d13d221ab100e5b7d58 - anshum - 2016-04-30 13:28:18",
  "startTime":"2016-10-14T05:33:56.273Z",
  "uptime":"0 days, 0 hours, 0 minutes, 37 seconds",
  "memory":"50.2 MB (%10.2) of 490.7 MB"}

Service solr installed.
  • Edit Solr's configuration file solr.in.sh so that Solr uses the ZooKeeper ensemble installed earlier
ZK_HOST="192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181"
  • Restart the Solr service (from any directory) so the ZooKeeper setting takes effect
[root@app4 solr]# service solr restart
  • Check the status of the Solr service

    The output shows the Solr port, the ZooKeeper connection string, the number of live nodes, the number of collections, and so on

[root@solr3 opt]# service solr status

Found 1 Solr nodes: 

Solr process 23970 running on port 8983
{
  "solr_home":"/var/solr/data",
  "version":"5.5.1 c08f17bca0d9cbf516874d13d221ab100e5b7d58 - anshum - 2016-04-30 13:28:18",
  "startTime":"2016-10-16T01:39:57.381Z",
  "uptime":"0 days, 0 hours, 5 minutes, 37 seconds",
  "memory":"81.7 MB (%16.7) of 490.7 MB",
  "cloud":{
    "ZooKeeper":"192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181",
    "liveNodes":"3",
    "collections":"0"}}

[root@solr3 opt]# 
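  • Create a collection in Solr

    1. Parameters

    | Option | Meaning |
    | --- | --- |
    | -s | Number of shards |
    | -rf | Number of replicas of each shard |
    | -n | Name under which the configuration is stored in ZooKeeper |
    | -d | Path of the configuration directory |

    2. While running, the command prints the three operations it actually performs

    a) Connect to ZooKeeper
    b) Upload the Solr configuration files
    c) Create the collection

    The run looks like this:

[root@app4 solr]# pwd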
/opt/solr
[root@app4 solr]# bin/solr create -c testcollection -d data_driven_schema_configs -s 3 -rf 2 -n myconf

Connecting to ZooKeeper at 192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181 ...
Uploading /opt/solr/server/solr/configsets/data_driven_schema_configs/conf for config myconf to ZooKeeper at 192.168.60.35:2181,192.168.60.41:2181,192.168.60.38:2181

Creating new collection 'testcollection' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=testcollection&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=myconf

{
  "responseHeader":{
    "status":0,
    "QTime":26737},
  "success":{
    "192.168.60.38:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":17222},
      "core":"testcollection_shard1_replica2"},
    "192.168.60.35:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":17663},
      "core":"testcollection_shard2_replica1"},
    "192.168.60.41:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":18110},
      "core":"testcollection_shard1_replica1"}}}
[root@app4 solr]# 
  • Verify in ZooKeeper that the nodes were created

    myconf should appear under the configs node and testcollection under the collections node

[root@app1 bin]# pwd
/home/zookeeper-3.4.8/bin
[root@app1 bin]# ./zkCli.sh
Connecting to localhost:2181
2016-10-16 09:35:55,384 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT
2016-10-16 09:35:55,389 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost.localdomain
2016-10-16 09:35:55,389 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.7.0_79
2016-10-16 09:35:55,393 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2016-10-16 09:35:55,393 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.7.0_79/jre
2016-10-16 09:35:55,393 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/home/zookeeper-3.4.8/bin/../build/classes:/home/zookeeper-3.4.8/bin/../build/lib/*.jar:/home/zookeeper-3.4.8/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/zookeeper-3.4.8/bin/../lib/slf4j-api-1.6.1.jar:/home/zookeeper-3.4.8/bin/../lib/netty-3.7.0.Final.jar:/home/zookeeper-3.4.8/bin/../lib/log4j-1.2.16.jar:/home/zookeeper-3.4.8/bin/../lib/jline-0.9.94.jar:/home/zookeeper-3.4.8/bin/../zookeeper-3.4.8.jar:/home/zookeeper-3.4.8/bin/../src/java/lib/*.jar:/home/zookeeper-3.4.8/bin/../conf:.:/usr/java/jdk1.7.0_79/lib:/usr/java/jdk1.7.0_79/jre/lib
2016-10-16 09:35:55,393 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-10-16 09:35:55,394 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2016-10-16 09:35:55,394 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2016-10-16 09:35:55,394 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2016-10-16 09:35:55,394 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2016-10-16 09:35:55,394 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=2.6.18-308.el5
2016-10-16 09:35:55,395 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2016-10-16 09:35:55,395 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2016-10-16 09:35:55,395 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/home/zookeeper-3.4.8/bin
2016-10-16 09:35:55,397 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@514b9eeb
Welcome to ZooKeeper!
2016-10-16 09:35:55,438 [myid:] - INFO  [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-10-16 09:35:55,446 [myid:] - INFO  [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost.localdomain/127.0.0.1:2181, initiating session
JLine support is enabled
2016-10-16 09:35:55,459 [myid:] - INFO  [main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost.localdomain/127.0.0.1:2181, sessionid = 0x157caf5fb680004, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[configs, security.json, zookeeper, clusterstate.json, aliases.json, live_nodes, overseer, overseer_elect, collections]
[zk: localhost:2181(CONNECTED) 1] ls /collections
[testcollection]
[zk: localhost:2181(CONNECTED) 1] ls /configs
[myconf]
[zk: localhost:2181(CONNECTED) 2] ls /configs/myconf
[currency.xml, protwords.txt, synonyms.txt, elevate.xml, params.json, solrconfig.xml, lang, stopwords.txt, managed-schema]
[zk: localhost:2181(CONNECTED) 3] 
  • Check the Solr logs
[root@localhost logs]# pwd
/var/solr/logs
[root@localhost logs]# ll
total 156
-rw-r--r-- 1 root root 12518 Oct  9 16:21 solr-8983-console.log
-rw-r--r-- 1 root root  2290 Oct  9 16:20 solr_gc.log
-rw-r--r-- 1 root root 62869 Oct  9 16:20 solr_gc_log_20161009_1620
-rw-r--r-- 1 root root 13825 Oct  9 16:21 solr.log
-rw-r--r-- 1 root root 30771 Oct  9 16:20 solr_log_20161009_1620
[root@localhost logs]# 
  • Check that port 8983 is listening
[root@app4 bin]# netstat -nplt|grep 8983
tcp        0      0 :::8983                     :::*                        LISTEN      23034/java          


Installation complete!
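
The Collections API request that bin/solr create printed during collection creation can be rebuilt by hand, which shows how -s, -rf and -n map onto URL parameters. This is a sketch only: the function name is hypothetical, and the maxShardsPerNode formula (shards times replicas divided by live nodes, rounded up) is an assumption chosen because it reproduces the logged value of 2 for 3 shards, 2 replicas and 3 live nodes.

```shell
#!/bin/sh
# Hypothetical helper: print a Collections API CREATE URL equivalent to the
# one logged by `bin/solr create` earlier in this article.
create_collection_url() {
    host="$1"; name="$2"; shards="$3"; replicas="$4"; nodes="$5"; conf="$6"
    # Assumed formula: integer ceiling of (shards * replicas) / live nodes.
    per_node=$(( (shards * replicas + nodes - 1) / nodes ))
    printf 'http://%s/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s&maxShardsPerNode=%s&collection.configName=%s\n' \
        "$host" "$name" "$shards" "$replicas" "$per_node" "$conf"
}

# Example usage (the configuration myconf must already exist in ZooKeeper):
#   curl "$(create_collection_url localhost:8983 testcollection 3 2 3 myconf)"
```

Calling the API directly skips the configuration upload that bin/solr create performs, so it only suits cases where the configuration is already in ZooKeeper.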

V. Notes on Solr commands

  1. Create a collection

    [root@app4 solr]# bin/solr create -c testcollection -d data_driven_schema_configs -s 3 -rf 2 -n myconf
  2. Delete a collection

    curl 'http://192.168.60.35:8983/solr/admin/collections?action=DELETE&name=testcollection'
  3. Apply changes after editing the schema

    • Upload the whole configuration directory to ZooKeeper
    zkcli.sh -zkhost 192.168.60.35:2181 -cmd upconfig -collection testcollection \
        -confdir /opt/solr/server/solr/configsets/data_driven_schema_configs/conf -confname myconf
    • Reload the collection
    curl 'http://192.168.60.35:8983/solr/admin/collections?action=RELOAD&name=testcollection'
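  4. Field attributes in Solr's schema configuration file

| Attribute | Meaning |
| --- | --- |
| type | Data type of the indexed field. I set every type to string here so that malformed values would not make indexing fail; normally the type should follow the actual field type (for example int for integer fields), which benefits both indexing and searching. |
| indexed | Whether the field is indexed. Set it according to actual needs; fields that never take part in filter conditions should all be set to false. |
| stored | Whether the field's value is stored. Set it to true only for fields whose values need to be returned, to avoid wasting storage. In our scenario only rowkey has to be returned, so only rowkey is set to true and every other field is set to false. |
| required | Whether the field is mandatory. If a field in the data source may be empty, this must be set to false; otherwise Solr throws an exception. |
| multiValued | Whether the field may hold multiple values. Usually false; set it to true when actually needed. |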
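
The attributes above can also be set through Solr's Schema API instead of editing the schema file. The sketch below only builds the JSON payload for an add-field request; the function name and the attribute values chosen for rowkey are illustrative assumptions, and the approach presumes the collection uses a managed schema, as data_driven_schema_configs does.

```shell
#!/bin/sh
# Hypothetical helper: print a Schema API "add-field" payload built from the
# five attributes described above. Boolean arguments are passed as true/false.
add_field_json() {
    name="$1"; type="$2"; indexed="$3"; stored="$4"; required="$5"; multi="$6"
    printf '{"add-field":{"name":"%s","type":"%s","indexed":%s,"stored":%s,"required":%s,"multiValued":%s}}\n' \
        "$name" "$type" "$indexed" "$stored" "$required" "$multi"
}

# Example usage (values for rowkey are illustrative):
#   add_field_json rowkey string true true true false |
#       curl -X POST -H 'Content-type: application/json' --data-binary @- \
#            'http://192.168.60.35:8983/solr/testcollection/schema'
```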

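VI. Notes on Solr concepts

Collection: a complete index in the logical sense within a SolrCloud cluster. It is usually divided into one or more Shards, all of which use the same Config Set. With more than one Shard it is a distributed index; SolrCloud lets you refer to it by Collection name without worrying about the Shard-related parameters that distributed search would otherwise need.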

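Core: that is, a Solr Core. One Solr instance contains one or more Solr Cores, each of which can index and serve queries independently and corresponds to one index or to one Shard of a Collection (more precisely, one replica of a Shard). Solr Cores were introduced to make management more flexible and to share resources. The difference in SolrCloud is that a core's configuration lives in ZooKeeper, whereas a traditional Solr core keeps its configuration files in a configuration directory on disk.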

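Leader: the Shard replica that wins the election. Each Shard has several Replicas, which hold an election to decide on a Leader. Elections can happen at any time, but they are normally triggered only when a Solr instance fails. When documents are indexed, SolrCloud routes them to the leader of the corresponding Shard, and the leader then distributes them to all replicas of that Shard.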

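Replica: one copy of a Shard. Each Replica lives in one Solr Core. For example, a collection named "test" created with numShards=1 and replicationFactor=2 has 2 replicas and therefore 2 Cores, each on a different machine or Solr instance. One is named test_shard1_replica1 and the other test_shard1_replica2, and one of them is elected Leader.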
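
The naming pattern in the Replica example can be enumerated for any shard and replica count. This is a small sketch assuming the collection_shardN_replicaM pattern seen in the Solr 5.5.1 output in this article; other Solr versions may name cores differently, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the core names expected for a collection,
# one per line, following <collection>_shard<N>_replica<M>.
core_names() {
    collection="$1"; shards="$2"; replicas="$3"
    s=1
    while [ "$s" -le "$shards" ]; do
        r=1
        while [ "$r" -le "$replicas" ]; do
            echo "${collection}_shard${s}_replica${r}"
            r=$((r + 1))
        done
        s=$((s + 1))
    done
}

# Example usage: core_names test 1 2
```

For the testcollection created earlier (3 shards, 2 replicas each) the same call yields six names, which can be compared against the cores each node actually reports.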

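Shard: a logical slice of a Collection. Each Shard consists of one or more replicas, one of which is elected Leader. When documents are indexed, a hash algorithm decides which shard each one is stored on.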

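ZooKeeper: provides distributed locking and is required by SolrCloud; it handles Leader election. Solr can run with an embedded ZooKeeper, but a standalone ensemble is recommended, preferably with three or more hosts.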
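
The preference for three or more hosts follows from majority quorum: a ZooKeeper ensemble keeps serving only while more than half of its servers are up. The sketch below works through that arithmetic; the function names are hypothetical.

```shell
#!/bin/sh
# Majority quorum arithmetic for a ZooKeeper ensemble of n servers.
quorum_size() {
    echo $(( $1 / 2 + 1 ))          # smallest strict majority of n
}
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))   # servers that may fail while a majority remains
}

# Print the figures for small ensembles.
for n in 1 2 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated=$(tolerated_failures "$n")"
done
```

Three servers are the smallest ensemble that survives the loss of one machine, and a fourth server adds no extra tolerance, which is why odd sizes are the usual choice.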

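VII. Architecture diagrams

  1. Solr index logical structure
  2. Creating an index in Solr
  3. Querying an index in Solr
  4. Shard splitting in Solr

That's all. If you run into problems during installation, please leave a comment!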
