NoSQL: Exiting inner Replica loop with exception com.sleepycat.je.rep.RollbackProhibitedException


 

If a NoSQL Storage Node has been down for a while, it may complain after startup:

2016-11-17 13:22:34.934 UTC WARNING [admin1] JE: Exiting inner Replica loop with exception com.sleepycat.je.rep.RollbackProhibitedException: (JE 6.4.15) 1(1):/opt/ece/data/nosql/storage2/kvroot/ECEStore/sn1/admin1/env Node 1(1):/opt/ece/data/nosql/storage2/kvroot/ECEStore/sn1/admin1/env must rollback 41 commits to the earliest point indicated by transaction id=-392 time=2016-11-17 14:18:53.969 vlsn=884 lsn=0x0/0x6e4be1 in order to rejoin the replication group, but the transaction rollback limit of 10 prohibits this. Either increase the property je.rep.txnRollbackLimit to a value larger than 10 to permit automatic rollback, or manually remove the problematic transactions. To do a manual removal, truncate the log to file 00000000.jdb, offset 0x6e4944, vlsn 881 using the directions in com.sleepycat.je.util.DbTruncateLog.  ROLLBACK_PROHIBITED: Node would like to roll back past committed transactions, but would exceed the limit specified by je.rep.txnRollbackLimit. Manual intervention required. Environment is invalid and must be closed.

After examining com.sleepycat.je.util.DbTruncateLog, the suggested way to fix the problem turned out to be:

  • stop the SN
  • execute
java  -jar $KVHOME/lib/je.jar DbTruncateLog -h /opt/ece/data/nosql/storage2/kvroot/ECEStore/sn1/admin1/env -f 00000000 -o 0x6e4944

(Obviously, the exact parameters depend on the situation: take the env path, file and offset from the exception message in your own log.)
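Picking the three values out of the message by hand is error-prone, so here is a minimal sketch that builds the command from the message text instead. It assumes the wording of the JE 6.4.15 message quoted above ("Node ...:<env> must rollback" and "truncate the log to file <file>.jdb, offset <offset>"); other JE versions may phrase it differently. The function name build_truncate_cmd is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print one DbTruncateLog
# command per line that contains a RollbackProhibitedException message.
# Assumption: the message wording matches the JE 6.4.15 sample in this post.
build_truncate_cmd() {
  # \1 = environment directory (the path after "Node <name>:")
  # \2 = log file number without the .jdb suffix
  # \3 = hex offset to truncate to
  # $KVHOME stays literal in the output because the sed script is single-quoted.
  sed -n 's|.*Node [^ :]*:\([^ ]*\) must rollback.*truncate the log to file \([0-9a-fA-F]*\)\.jdb, offset \(0x[0-9a-fA-F]*\).*|java -jar $KVHOME/lib/je.jar DbTruncateLog -h \1 -f \2 -o \3|p'
}
```

Feed it a log file on stdin (for example build_truncate_cmd < some_node.log, where the file name is a placeholder) and review the printed command before running it with the SN stopped. Lines that do not match the pattern produce no output.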

Edit

A quick and dirty snippet to get the commands out of the log:

$ cd /opt/ece/data/nosql/storage2/kvroot/ECEStore/log
$ for i in rg*-rn*_0.log; do tail -200 "$i" | grep "Originally thrown by HA thread" | awk '{ print "java  -jar $KVHOME/lib/je.jar DbTruncateLog -h " $3 " -f " $69 " -o " $71 }' | sed -e 's/\.jdb,//g' -e 's/,//g' -e 's/rg.*-rn.*(.*)://g'; done | sort -u
java  -jar $KVHOME/lib/je.jar DbTruncateLog -h /opt/ece/data/nosql/storage2/u01/rg2-rn5/env -f 00000000 -o 0x3bab9f7
java  -jar $KVHOME/lib/je.jar DbTruncateLog -h /opt/ece/data/nosql/storage2/u02/rg1-rn5/env -f 00000000 -o 0x307779c

 
