Database failover
A clustered database, such as Oracle RAC, can couple one or more separate database servers into one database instance to gain resilience and scalability. A KCML process can connect to any of the nodes in the database using KI_CONNECT and nominate another secondary node to be used in the event of a failure of the primary server.
If KCML detects that the database server has failed it can switch to the secondary server and carry on as KCML has full knowledge of the application state. The application and the user will be unaware of the failure.
If the program is executing inside a transaction then it will restart the WHILE TRANS at the top. The old transaction will not have been committed on the failed server so this is safe provided the application is written to allow WHILE TRANS blocks to be restarted.
If outside a transaction a failed random read can just be repeated but a failed sequential read will need to reissue the START and skip forward to the same position in the result set. While this is done transparently, it may take some time.
If the program was executing any sort of complex DDL statement that involves several committing SQL statements then it may error and produce a PANIC.
Oracle errors treated as failover
The following Oracle error codes are treated as a cause of failover. If other error Oracle errors need to be added to this list they can be done at run-time. Use KI_CONNECT_GET_ATTR_STR to fetch the existing list as a comma separated list of decimal numbers, add new errors and set using KI_CONNECT_SET_ATTR_STR, using the constant _KDB_CONN_FAILOVER_ERRORS. The list is automatically sorted and de-duplicated.
1012 | not logged on to Oracle |
1033 | Oracle initialization or shutdown in progress |
1034 | ORACLE not available |
1089 | immediate shutdown in progress, no operations are permitted |
1092 | ORACLE instance terminated. Disconnection forced (since 07.18.01.24198) |
3113 | end-of-file on communication channel |
3114 | not connected to ORACLE |
3135 | connection lost contact |
12203 | TNS: unable to connect to destination |
12500 | TNS: listener failed to start a dedicated server process |
12514 | TNS: listener could not resolve SERVICE_NAME given in connect descriptor |
12528 | TNS: listener: all appropriate instances are blocking new connection (since 07.18.01.24198) |
12541 | TNS: no listener |
12543 | TNS: destination host unreachable (since 07.18.01.24198) |
12560 | TNS: protocol adaptor error |
12571 | TNS: packet writer failure |
65535 | failed to get Oracle error |