The vendor has published an official solution for this problem: https://discuss.pivotal.io/hc/en-us/articles/221826748-Pivotal-HDB-state-indicates-the-database-is-down-but-Ambari-shows-all-Pivotal-HDB-services-as-being-up
Environment
Product | Version |
Pivotal HDB (HAWQ) | 2.x |
Symptom
Pivotal HDB shows as "up" in Ambari.
However, the following issues are seen:
- HAWQ state fails with the following:
[gpadmin@hdm1 ~]$ hawq state
Failed to connect to database, this script can only be run when the database is up.
[gpadmin@hdm1 ~]$
- Stopping HAWQ via command line or Ambari fails with the following:
20160630:05:45:42:039364 hawq_stop:hdm1:gpadmin-[ERROR]:-Failed to connect to the running database, please check master status
Error Message:
The HAWQ master log will show errors similar to this:
[gpadmin@hdm1 pg_log]$ grep -i hba hawq-2016-06-30_044655.csv
2016-06-30 05:44:03.818472 PDT,"gpadmin","template1",p38928,th1711609984,"172.28.21.118","51844",2016-06-30 05:44:03 PDT,0,,,seg-10000,,,,,"FATAL","28000","no pg_hba.conf entry for host ""172.28.21.118"", user ""gpadmin"", database ""template1"", SSL off",,,,,,,0,,"auth.c",603,
2016-06-30 05:45:42.491576 PDT,"gpadmin","template1",p39401,th1711609984,"172.28.21.118","51874",2016-06-30 05:45:42 PDT,0,,,seg-10000,,,,,"FATAL","28000","no pg_hba.conf entry for host ""172.28.21.118"", user ""gpadmin"", database ""template1"", SSL off",,,,,,,0,,"auth.c",603,
[gpadmin@hdm1 pg_log]$
Cause
pg_hba.conf is set up incorrectly, so gpadmin does not have access to template1 from the HAWQ master host.
Resolution
Review pg_hba.conf against the Pivotal HDB documentation "Allowing connections to HAWQ", and add an entry that allows gpadmin to access database "template1" from the HAWQ master.
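As a rough aid for the review above, the sketch below reads HAWQ master log lines on standard input and prints one candidate pg_hba.conf line per distinct rejected host/user/database combination. The function name suggest_hba_entries is hypothetical, the default auth method "trust" is only an assumption (pick whatever your site's policy requires), and the /32 mask assumes an IPv4 client address. The output is meant to be reviewed by hand, not appended to pg_hba.conf blindly.

```shell
# Hypothetical helper: turn "no pg_hba.conf entry" log lines into candidate entries.
# $1 (optional) is the auth method to print; "trust" is an assumed default.
suggest_hba_entries() {
    method="${1:-trust}"
    # The CSV log doubles the quotes inside the message field, hence the "" pairs.
    # Fields are reordered into pg_hba.conf order: type, database, user, address, method.
    sed -n 's/.*no pg_hba\.conf entry for host ""\([^"]*\)"", user ""\([^"]*\)"", database ""\([^"]*\)"".*/host  \3  \2  \1\/32  '"$method"'/p' \
        | sort -u   # collapse repeated failures from the same client
}
```

For example, running `grep -i hba hawq-2016-06-30_044655.csv | suggest_hba_entries` against the log excerpt shown earlier would propose a single line for host 172.28.21.118, user gpadmin, database template1.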
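Personal resolution:
1. The file /data/hawq/master/pg_hba.conf had been modified by someone, and two duplicate entries had been added. After commenting them out, the master restarted successfully.
2. When all services were restarted through Ambari, the stop command failed on some segments. Running the following manually succeeded:
hawq stop segment -M immediate
After that, the following also succeeded:
hawq start segment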
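For step 1, repeated entries in pg_hba.conf can be spotted without reading the file line by line. The following is a minimal sketch; the function name find_duplicate_hba_entries is hypothetical, and it assumes no entry contains a literal "#" inside a quoted value. It strips comments, collapses whitespace, and prints each entry that appears more than once.

```shell
# Hypothetical helper: list pg_hba.conf entries that occur more than once.
# $1 is the path to the file, e.g. /data/hawq/master/pg_hba.conf
find_duplicate_hba_entries() {
    sed -e 's/#.*//' \
        -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' \
        -e 's/ $//' "$1" \
        | grep -v '^$' \
        | sort \
        | uniq -d   # print one copy of every line that is repeated
}
```

Called as `find_duplicate_hba_entries /data/hawq/master/pg_hba.conf`, it prints nothing when every entry is unique. Entries that differ only in spacing or trailing comments are treated as the same entry.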