Hi,
Some segment instances fail to start when I run make cluster from gpAux/gpdemo (1 of 6 failed in the run below):
20151110:00:51:23:031307 gpstart:centos7:kotti-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
............
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Process results...
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-DBID:2 FAILED host:'centos7' datadir:'/home/kotti/projects/gpdb/gpAux/gpdemo/datadirs/dbfast1/demoDataDir0' with reason:'Start failed; check segment logfile. "peer shut down connection before response was fully received Retrying no 1 failure: OtherTransitionInProgress failure: OtherTransitionInProgress"'
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:- Successful segment starts = 5
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Failed segment starts = 1 <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Successfully started 5 of 6 segment instances <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Segment instance startup failures reported
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Failed start 1 of 6 segment instances <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Review /home/kotti/gpAdminLogs/gpstart_20151110.log
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-For more details on segment startup failure(s)
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Run gpstate -s to review current segment instance status
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Starting Master instance centos7 directory /home/kotti/projects/gpdb/gpAux/gpdemo/datadirs/qddir/demoDataDir-1
20151110:00:51:37:031307 gpstart:centos7:kotti-[INFO]:-Command pg_ctl reports Master centos7 instance active
20151110:00:53:16:031307 gpstart:centos7:kotti-[WARNING]:-FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1603)
20151110:00:53:16:031307 gpstart:centos7:kotti-[INFO]:-No standby master configured. skipping...
20151110:00:53:16:031307 gpstart:centos7:kotti-[WARNING]:-Number of segments which failed to start: 1
20151110:00:53:19:013057 gpinitsystem:centos7:kotti-[WARN]:
20151110:00:53:19:013057 gpinitsystem:centos7:kotti-[WARN]:-Failed to start Greenplum instance; review gpstart output to
/home/kotti/gpAdminLogs/gpstart_20151110.log has the following details:
20151110:00:51:20:031307 gpstart:centos7:kotti-[INFO]:-Starting gpstart with args: -a -d /home/kotti/projects/gpdb/gpAux/gpdemo/datadirs/qddir/demoDataDir-1
20151110:00:51:20:031307 gpstart:centos7:kotti-[INFO]:-Gathering information and validating the environment...
20151110:00:51:20:031307 gpstart:centos7:kotti-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.99.00 build dev'
20151110:00:51:20:031307 gpstart:centos7:kotti-[INFO]:-Greenplum Catalog Version: '201310150'
20151110:00:51:20:031307 gpstart:centos7:kotti-[INFO]:-Starting Master instance in admin mode
20151110:00:51:21:031307 gpstart:centos7:kotti-[INFO]:-Obtaining Greenplum Master catalog information
20151110:00:51:21:031307 gpstart:centos7:kotti-[INFO]:-Obtaining Segment details from master...
20151110:00:51:22:031307 gpstart:centos7:kotti-[INFO]:-Setting new master era
20151110:00:51:22:031307 gpstart:centos7:kotti-[INFO]:-Master Started...
20151110:00:51:22:031307 gpstart:centos7:kotti-[INFO]:-Shutting down master
20151110:00:51:23:031307 gpstart:centos7:kotti-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Process results...
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-DBID:2 FAILED host:'centos7' datadir:'/home/kotti/projects/gpdb/gpAux/gpdemo/datadirs/dbfast1/demoDataDir0' with reason:'Start failed; check segment logfile. "peer shut down connection before response was fully received Retrying no 1 failure: OtherTransitionInProgress failure: OtherTransitionInProgress"'
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:- Successful segment starts = 5
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Failed segment starts = 1 <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Successfully started 5 of 6 segment instances <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Segment instance startup failures reported
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Failed start 1 of 6 segment instances <<<<<<<<
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Review /home/kotti/gpAdminLogs/gpstart_20151110.log
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-For more details on segment startup failure(s)
20151110:00:51:35:031307 gpstart:centos7:kotti-[WARNING]:-Run gpstate -s to review current segment instance status
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-----------------------------------------------------
20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-Starting Master instance centos7 directory /home/kotti/projects/gpdb/gpAux/gpdemo/datadirs/qddir/demoDataDir-1
20151110:00:51:37:031307 gpstart:centos7:kotti-[INFO]:-Command pg_ctl reports Master centos7 instance active
20151110:00:53:16:031307 gpstart:centos7:kotti-[WARNING]:-FATAL: DTM initialization: failure during startup recovery, retry failed, check segment status (cdbtm.c:1603)
20151110:00:53:16:031307 gpstart:centos7:kotti-[INFO]:-No standby master configured. skipping...
20151110:00:53:16:031307 gpstart:centos7:kotti-[WARNING]:-Number of segments which failed to start: 1
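To pick the failing segments out of a long gpstart log, a small script can help. This is only a sketch: failed_segments and FAILED_RE are names made up here, not part of the Greenplum tooling, and it assumes each log entry sits on a single line in the "DBID:n FAILED host:... datadir:... with reason:..." form shown above.

```python
import re

# Assumed format, taken from the gpstart output above:
#   ...-[INFO]:-DBID:2 FAILED host:'centos7' datadir:'/path' with reason:'...'
FAILED_RE = re.compile(
    r"DBID:(?P<dbid>\d+)\s+FAILED\s+host:'(?P<host>[^']*)'\s+"
    r"datadir:'(?P<datadir>[^']*)'\s+with reason:'(?P<reason>.*)'\s*$"
)


def failed_segments(log_text):
    """Return one dict per 'DBID:n FAILED' line found in gpstart output."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            entry = match.groupdict()
            entry["dbid"] = int(entry["dbid"])
            results.append(entry)
    return results


if __name__ == "__main__":
    # One line copied from the log above, split only for readability.
    sample = (
        "20151110:00:51:35:031307 gpstart:centos7:kotti-[INFO]:-DBID:2 FAILED "
        "host:'centos7' datadir:'/home/kotti/projects/gpdb/gpAux/gpdemo/"
        "datadirs/dbfast1/demoDataDir0' with reason:'Start failed; check "
        "segment logfile. \"peer shut down connection before response was "
        "fully received Retrying no 1 failure: OtherTransitionInProgress "
        "failure: OtherTransitionInProgress\"'"
    )
    for seg in failed_segments(sample):
        print(seg["dbid"], seg["host"], seg["datadir"])
```

The datadir it reports is where the segment's pg_log directory lives, which is the next place to look.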
The pg_log from the failed segment shows the following:
2015-11-10 00:53:24.004474 CST,,,p15205,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","mirror transition, primary address(port) 'centos7(25438)' mirror address(port) 'centos7(25441)'",,,,,"mirroring role 'primary role' mirroring state 'sync' segment state 'not initialized' process name(pid) 'filerep main process(15205)' filerep state 'not initialized' ",,0,,"cdbfilerep.c",3472,
2015-11-10 00:53:24.035837 CST,,,p15210,th0,,,,0,,,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,,,,,0,,,,"1 0x8c65a3 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x163
2 0x7f346dc88130 libpthread.so.0 + 0x6dc88130
3 0x978893 postgres FileRep_CalculateCrc + 0x143
4 0x97ca9b postgres + 0x97ca9b
5 0x97d4e6 postgres FileRepPrimary_MirrorOpen + 0x386
6 0x9baf5e postgres + 0x9baf5e
7 0x9bb895 postgres MirroredBufferPool_Open + 0x185
8 0x7cc531 postgres + 0x7cc531
9 0x7cdb90 postgres mdnblocks + 0xf0
10 0x7d0470 postgres smgrnblocks + 0x10
11 0x4ae9ee postgres heap_beginscan + 0x7e
12 0x4b7c89 postgres systable_beginscan + 0x79
13 0x8d4610 postgres FindMyDatabase + 0x60
14 0x97b679 postgres + 0x97b679
15 0x97bec5 postgres FileRepSubProcess_Main + 0x295
16 0x9756a6 postgres + 0x9756a6
17 0x97aff3 postgres FileRep_Main + 0xba3
18 0x53c8a8 postgres AuxiliaryProcessMain + 0x508
19 0x775980 postgres + 0x775980
20 0x780b0c postgres StartFilerepProcesses + 0x3c
21 0x7834a7 postgres doRequestedPrimaryMirrorModeTransitions + 0x8b7
22 0x77d412 postgres + 0x77d412
23 0x77f829 postgres PostmasterMain + 0x789
24 0x485fab postgres main + 0x3bb
25 0x7f346d5d7af5 libc.so.6 __libc_start_main + 0xf5
26 0x4860c9 postgres + 0x4860c9
"
2015-11-10 00:53:24.050735 CST,,,p15209,th0,,,,0,,,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,,,,,0,,,,"1 0x8c65a3 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x163
2 0x7f346dc88130 libpthread.so.0 + 0x6dc88130
3 0x978893 postgres FileRep_CalculateCrc + 0x143
4 0x97ca9b postgres + 0x97ca9b
5 0x97d4e6 postgres FileRepPrimary_MirrorOpen + 0x386
6 0x9baf5e postgres + 0x9baf5e
7 0x9bb895 postgres MirroredBufferPool_Open + 0x185
8 0x7cc531 postgres + 0x7cc531
9 0x7cdb90 postgres mdnblocks + 0xf0
10 0x7d0470 postgres smgrnblocks + 0x10
11 0x4ae9ee postgres heap_beginscan + 0x7e
12 0x4b7c89 postgres systable_beginscan + 0x79
13 0x8d4610 postgres FindMyDatabase + 0x60
14 0x97b679 postgres + 0x97b679
15 0x97bf05 postgres FileRepSubProcess_Main + 0x2d5
16 0x9756a6 postgres + 0x9756a6
17 0x97afe1 postgres FileRep_Main + 0xb91
18 0x53c8a8 postgres AuxiliaryProcessMain + 0x508
19 0x775980 postgres + 0x775980
20 0x780b0c postgres StartFilerepProcesses + 0x3c
21 0x7834a7 postgres doRequestedPrimaryMirrorModeTransitions + 0x8b7
22 0x77d412 postgres + 0x77d412
23 0x77f829 postgres PostmasterMain + 0x789
24 0x485fab postgres main + 0x3bb
25 0x7f346d5d7af5 libc.so.6 __libc_start_main + 0xf5
26 0x4860c9 postgres + 0x4860c9
"
2015-11-10 00:53:24.050809 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","terminating any other active server processes",,,,,,,0,,"postmaster.c",5563,
2015-11-10 00:53:24.093364 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","filerep main process (PID 15205) exited with exit code 2",,,,,,,0,,"postmaster.c",5854,
2015-11-10 00:53:24.093395 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","BeginResetOfPostmasterAfterChildrenAreShutDown: counter 234",,,,,,,0,,"postmaster.c",2177,
2015-11-10 00:53:24.093405 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","gp_session_id high-water mark is 0",,,,,,,0,,"postmaster.c",2203,
2015-11-10 00:53:24.158949 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","resetting shared memory",,,,,,,0,,"postmaster.c",4249,
2015-11-10 00:53:24.158994 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","temporary files using default filespace",,,,,,,0,,"primary_mirror_mode.c",2569,
2015-11-10 00:53:24.159003 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","transaction files using default pg_system filespace",,,,,,,0,,"primary_mirror_mode.c",2629,
2015-11-10 00:53:24.237936 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","PrimaryMirrorMode: Processing postmaster reset with recent mode of 4",,,,,,,0,,"primary_mirror_mode.c",1124,
2015-11-10 00:53:24.237973 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","PrimaryMirrorMode: Processing postmaster reset to non-fault state",,,,,,,0,,"primary_mirror_mode.c",1141,
2015-11-10 00:53:24.237987 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","removing all temporary files",,,,,,,0,,"fd.c",1883,
2015-11-10 00:53:24.237994 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","BeginResetOfPostmasterAfterChildrenAreShutDown: should restart peer",,,,,,,0,,"postmaster.c",2226,
2015-11-10 00:53:24.238000 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","all server processes terminated; requested filerep peer reset",,,,,,,0,,"postmaster.c",2229,
2015-11-10 00:53:24.238006 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","peer reset process pid is 15236",,,,,,,0,,"postmaster.c",2231,
2015-11-10 00:53:24.419306 CST,,,p31434,th1857087552,,,,0,,,seg-1,,,,,"LOG","00000","received immediate shutdown request",,,,,,,0,,"postmaster.c",4409,
2015-11-10 00:53:24.421245 CST,,,p15236,th1857087552,,,,0,,,seg-1,,,,,"WARNING","01000","during reset, unable to contact primary/mirror peer to coordinate reset; will transition to fault state. Error code 16 and message 'failure: interrupted","'",,,,,,0,,"cdbfilerepresetpeerprocess.c",158,
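The two SIGSEGV backtraces above are long, so a short script that keeps only the named frames makes them easier to compare. This is a sketch under assumptions: backtraces, named_frames and FRAME_RE are hypothetical helper names, and the frame layout ("number address module [symbol] + offset") is inferred from the pg_log excerpt above rather than from any Greenplum specification.

```python
import re

# One stack frame, e.g. "3 0x978893 postgres FileRep_CalculateCrc + 0x143".
# The symbol is optional: "4 0x97ca9b postgres + 0x97ca9b" has none.
FRAME_RE = re.compile(
    r"(?P<num>\d+)\s+(?P<addr>0x[0-9a-fA-F]+)\s+(?P<module>\S+)\s+"
    r"(?:(?P<symbol>[A-Za-z_]\w*)\s+)?\+\s+0x[0-9a-fA-F]+\s*$"
)


def backtraces(log_text):
    """Group frames into traces; a frame numbered 1 starts a new trace."""
    traces = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        num = int(match.group("num"))
        if num == 1 or not traces:
            traces.append([])
        traces[-1].append((num, match.group("module"), match.group("symbol")))
    return traces


def named_frames(trace):
    """Return only the symbol names of a trace, innermost frame first."""
    return [symbol for _, _, symbol in trace if symbol]


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < segment_pg_log.csv
    for index, trace in enumerate(backtraces(sys.stdin.read()), start=1):
        print("trace", index, "->", " < ".join(named_frames(trace)))
```

In both traces above the first named postgres frame below the signal handler is FileRep_CalculateCrc, called from FileRepPrimary_MirrorOpen.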
Please advise if I am missing something here.