Spring Boot and ZooKeeper integration: troubleshooting notes
Environment
springboot - 2.7.8
dubbo - 3.1.11
dubbo-dependencies-zookeeper-curator5 - 3.1.11
To simulate a real environment, I moved ZooKeeper from Windows to a Docker container running in a Linux virtual machine.
failed to connect to zookeeper server
After the move to Linux, the application suddenly could not connect to ZooKeeper, and Spring Boot reported:
Caused by: java.lang.IllegalStateException: failed to connect to zookeeper server
	at org.apache.dubbo.registry.zookeeper.util.CuratorFrameworkUtils.buildCuratorFramework(CuratorFrameworkUtils.java:100)
	at org.apache.dubbo.registry.zookeeper.ZookeeperServiceDiscovery.<init>(ZookeeperServiceDiscovery.java:82)
	... 74 more
Guess 1
My first suspicion was the ZooKeeper instance running in Docker on Linux, so I used ZooKeeper's zkCli.cmd on the host machine to connect to it:
zkCli.cmd -server 192.168.x.x
The client connected to the server successfully, so the ZooKeeper side was fine.
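The same reachability check can also be scripted. Below is a minimal sketch using only the JDK; the class name ZkPortCheck and the 3-second timeout are my own assumptions, and the default host and port are placeholders. It only proves that the TCP port accepts connections, not that a full ZooKeeper session can be established, so it is a weaker check than zkCli.cmd. It does print how long the connect took, which is useful when the link to the VM is slow.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ZkPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, or timed out: treat all of them as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;

        long start = System.nanoTime();
        boolean reachable = canConnect(host, port, 3000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(host + ":" + port + " reachable=" + reachable + " in " + elapsedMs + " ms");
    }
}
```

Run it as `java ZkPortCheck.java 192.168.x.x 2181`, substituting the real address.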
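Guess 2
So the problem had to be in the configuration. Everything had worked while ZooKeeper ran on the local machine, and when I connected from the host to the ZooKeeper instance in Docker with the client, the response was a little slow. I therefore increased the timeout in the Dubbo configuration, expecting that to settle it: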
dubbo:
  registry:
    address: zookeeper://${zookeeper.address:192.168.61.80}:2181
    timeout: 60000 # increase this timeout
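It made no difference.
Guess 3
Experience-based guessing had not solved it, so the only option left was to follow the stack trace to the place where the exception is thrown: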
CuratorFramework curatorFramework = builder.build(); // build
curatorFramework.start(); // start
curatorFramework.blockUntilConnected(
        BLOCK_UNTIL_CONNECTED_WAIT.getParameterValue(connectionURL),
        BLOCK_UNTIL_CONNECTED_UNIT.getParameterValue(connectionURL)); // block until connected
if (!curatorFramework.getState().equals(CuratorFrameworkState.STARTED)) {
    throw new IllegalStateException("zookeeper client initialization failed");
}
if (!curatorFramework.getZookeeperClient().isConnected()) {
    throw new IllegalStateException("failed to connect to zookeeper server");
}
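One of these three steps (build the CuratorFramework, start it, block until connected) had to be at fault, and the blocking step was the most suspicious. So the next question was where the value of BLOCK_UNTIL_CONNECTED_WAIT comes from: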
/**
 * The enumeration for the parameters of {@link CuratorFramework}
 *
 * @see CuratorFramework
 * @since 2.7.5
 */
public enum CuratorFrameworkParams {
    ...

    /**
     * Wait time to block on connection to Zookeeper.
     */
    BLOCK_UNTIL_CONNECTED_WAIT("blockUntilConnectedWait", 10, Integer::valueOf),

    /**
     * The unit of time related to blocking on connection to Zookeeper.
     */
    BLOCK_UNTIL_CONNECTED_UNIT("blockUntilConnectedUnit", TimeUnit.SECONDS, TimeUnit::valueOf),
    ;
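Each enum constant pairs a parameter name with a default value and a converter, and the calling code passes the connection URL to getParameterValue. A plausible reading is that the value is looked up by name among the URL's parameters and the default is used when the name is absent. The sketch below illustrates that pattern with a plain Map standing in for the URL. ParamLookupSketch and Param are hypothetical names of mine, and this is not the Dubbo source.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class ParamLookupSketch {

    /** Simplified stand-in for one enum constant: a name, a default, and a converter. */
    static final class Param<T> {
        final String name;
        final T defaultValue;
        final Function<String, T> converter;

        Param(String name, T defaultValue, Function<String, T> converter) {
            this.name = name;
            this.defaultValue = defaultValue;
            this.converter = converter;
        }

        /** Returns the converted value found in params, or the default when the key is absent. */
        T getValue(Map<String, String> params) {
            String raw = params.get(name);
            return raw == null ? defaultValue : converter.apply(raw);
        }
    }

    static final Param<Integer> WAIT =
            new Param<>("blockUntilConnectedWait", 10, Integer::valueOf);
    static final Param<TimeUnit> UNIT =
            new Param<>("blockUntilConnectedUnit", TimeUnit.SECONDS, TimeUnit::valueOf);

    public static void main(String[] args) {
        // The Map plays the role of the registry URL's query parameters (an assumption).
        Map<String, String> params = new HashMap<>();
        params.put("timeout", "60000"); // a different key, so it does not affect the wait
        System.out.println(WAIT.getValue(params) + " " + UNIT.getValue(params)); // 10 SECONDS

        params.put("blockUntilConnectedWait", "60");
        System.out.println(WAIT.getValue(params) + " " + UNIT.getValue(params)); // 60 SECONDS
    }
}
```

The point of the sketch is that a setting stored under a different key, such as timeout, never reaches this lookup, so the default of 10 stays in force until the exact parameter name is supplied.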
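Clearly, the blocking wait is 10 seconds. My configured timeout was 60 seconds, yet the connection was reported as failed after only 10. Is this value configurable, or is it hard-coded? A global search for the keyword blockUntilConnectedWait found no other usages, so I checked whether the starter exposed it. The dependencies are: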
- dubbo-spring-boot-starter
- dubbo-spring-boot-autoconfigure
- dubbo-spring-boot-autoconfigure-compatible
- dubbo-spring-boot-autoconfigure
Following the usual conventions of Spring Boot starters, I found the configuration class org.apache.dubbo.spring.boot.autoconfigure.DubboConfigurationProperties:
@ConfigurationProperties("dubbo")
public class DubboConfigurationProperties {

    @NestedConfigurationProperty
    private Config config = new Config();

    @NestedConfigurationProperty
    private Scan scan = new Scan();

    @NestedConfigurationProperty
    private ApplicationConfig application = new ApplicationConfig();

    @NestedConfigurationProperty
    private ModuleConfig module = new ModuleConfig();

    @NestedConfigurationProperty
    private RegistryConfig registry = new RegistryConfig();

    ...
}
ZooKeeper belongs to the registry part, so the next class to look at was org.apache.dubbo.config.RegistryConfig:
/**
 * RegistryConfig
 *
 * @export
 */
public class RegistryConfig extends AbstractConfig {

    public static final String NO_AVAILABLE = "N/A";
    private static final long serialVersionUID = 5508512956753757169L;

    /**
     * Register center address
     */
    private String address;

    /**
     * Username to login register center
     */
    private String username;

    /**
     * Password to login register center
     */
    private String password;

    /**
     * Default port for register center
     */
    private Integer port;

    /**
     * Protocol for register center
     */
    private String protocol;

    /**
     * Network transmission type
     */
    private String transporter;

    private String server;

    private String client;

    /**
     * Affects how traffic distributes among registries, useful when subscribing multiple registries, available options:
     * 1. zone-aware, a certain type of traffic always goes to one Registry according to where the traffic is originated.
     */
    private String cluster;

    /**
     * The region where the registry belongs, usually used to isolate traffics
     */
    private String zone;

    /**
     * The group that services registry in
     */
    private String group;

    private String version;

    /**
     * Connect timeout in milliseconds for register center
     */
    private Integer timeout;

    /**
     * Session timeout in milliseconds for register center
     */
    private Integer session;

    /**
     * File for saving register center dynamic list
     */
    private String file;

    /**
     * Wait time before stop
     */
    private Integer wait;

    /**
     * Whether to check if register center is available when boot up
     */
    private Boolean check;

    /**
     * Whether to allow dynamic service to register on the register center
     */
    private Boolean dynamic;

    /**
     * Whether to allow exporting service on the register center
     */
    private Boolean register;

    /**
     * Whether to allow subscribing service on the register center
     */
    private Boolean subscribe;

    /**
     * The customized parameters
     */
    private Map<String, String> parameters;

    /**
     * Simple the registry. both useful for provider and consumer
     *
     * @since 2.7.0
     */
    private Boolean simplified;

    /**
     * After simplify the registry, should add some parameter individually. just for provider.
     * <p>
     * such as: extra-keys = A,b,c,d
     *
     * @since 2.7.0
     */
    private String extraKeys;

    /**
     * the address work as config center or not
     */
    private Boolean useAsConfigCenter;

    /**
     * the address work as remote metadata center or not
     */
    private Boolean useAsMetadataCenter;

    /**
     * list of rpc protocols accepted by this registry, for example, "dubbo,rest"
     */
    private String accepts;

    /**
     * Always use this registry first if set to true, useful when subscribe to multiple registries
     */
    private Boolean preferred;

    /**
     * Affects traffic distribution among registries, useful when subscribe to multiple registries
     * Take effect only when no preferred registry is specified.
     */
    private Integer weight;

    private String registerMode;

    private Boolean enableEmptyProtection;

    ...
}
The target, blockUntilConnectedWait, was still nowhere to be found. However, there is this field:
/**
 * The customized parameters
 */
private Map<String, String> parameters;
According to its comment, this field holds custom parameters, so I tried changing the configuration:
dubbo:
  registry:
    address: zookeeper://${zookeeper.address:192.168.61.80}:2181
    timeout: 60000 # increase this timeout
    parameters:
      blockUntilConnectedWait: 60 # in seconds; this can be set too
At last, the project started successfully.
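Why the longer wait helps can be shown with a small simulation. The sketch below does not use Curator at all. It models "the client becomes connected after some delay" with a CountDownLatch and "block until connected" with a bounded await, using scaled-down millisecond values. The class and method names are hypothetical, and the delays are made up for illustration rather than measured from my environment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BlockUntilConnectedSketch {

    /**
     * Simulates a client that becomes connected after connectDelayMs while the
     * caller waits at most maxWaitMs. Returns true if the connection arrived in time.
     */
    public static boolean connectsWithin(long connectDelayMs, long maxWaitMs) {
        CountDownLatch connected = new CountDownLatch(1);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        Runnable markConnected = connected::countDown;
        try {
            // Background "connection" that completes after the given delay.
            scheduler.schedule(markConnected, connectDelayMs, TimeUnit.MILLISECONDS);
            // Bounded wait, analogous in spirit to blocking until connected.
            return connected.await(maxWaitMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Made-up numbers: the connection needs 300 ms to come up.
        System.out.println(connectsWithin(300, 100)); // false: the wait expires first
        System.out.println(connectsWithin(300, 600)); // true: the wait is long enough
    }
}
```

In this model the connection itself is never broken. It simply needs longer than the caller is willing to wait, which matches what I saw: zkCli.cmd could connect, only slowly.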
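Summary
Running into this problem was quite discouraging at first. The answers Baidu turned up were of the "version mismatch" kind, but my setup had clearly worked before, so I had to dig into it myself. It took a little over 6 hours to finally solve, which is why I wanted to write down how I got there.
Curator is a wrapper layer over ZooKeeper; you can think of their relationship as similar to that between MyBatis and MySQL.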