- Add a "retry count" to the object (which implements Runnable). Increment the counter on failure and if the counter is less than N, then place the Runnable back in the queue. After N retries, log a failure and give up.
- Set up a separate thread and, upon failure, queue the object to run again after a period of time. Continue to retry every M seconds up to N times, then give up.
- Similar to above; initially sleep for a long period of time (an hour), then try N times over M minutes before giving up.
- Increase the wait interval, M, between retries; keep trying until the interval reaches a defined maximum.
- ..etc..
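To make the variations above concrete, here is a minimal sketch that combines the retry counter with a growing wait interval. The class name `RetryingTask`, its constructor parameters, and the doubling policy are all assumptions chosen for illustration; it is one possible shape, not an established framework.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a Runnable that re-queues itself after a failure. */
class RetryingTask implements Runnable {
    private final Runnable work;
    private final ScheduledExecutorService scheduler;
    private final int maxRetries;          // N: retries allowed after the first attempt
    private final long maxDelayMillis;     // upper bound for the wait interval M
    private final CountDownLatch finished = new CountDownLatch(1);
    private long delayMillis;              // current wait interval M
    private int retryCount;
    private volatile boolean succeeded;

    RetryingTask(Runnable work, ScheduledExecutorService scheduler,
                 int maxRetries, long initialDelayMillis, long maxDelayMillis) {
        this.work = work;
        this.scheduler = scheduler;
        this.maxRetries = maxRetries;
        this.delayMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    @Override
    public void run() {
        try {
            work.run();
            succeeded = true;
            finished.countDown();
        } catch (RuntimeException ex) {
            if (retryCount < maxRetries) {
                retryCount++;
                // Place this same object back in the queue, to run after the current delay.
                scheduler.schedule(this, delayMillis, TimeUnit.MILLISECONDS);
                // Assumed policy: double the interval, but never exceed the maximum.
                delayMillis = Math.min(delayMillis * 2, maxDelayMillis);
            } else {
                System.err.println("Giving up after " + retryCount + " retries: " + ex);
                finished.countDown();
            }
        }
    }

    /** Waits for the task to succeed or give up; returns true only on success. */
    boolean awaitResult(long timeoutMillis) {
        try {
            return finished.await(timeoutMillis, TimeUnit.MILLISECONDS) && succeeded;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int retries() {
        return retryCount;
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        int[] calls = {0};
        // Simulated work that fails on its first two attempts.
        RetryingTask task = new RetryingTask(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("simulated failure");
        }, scheduler, 5, 100, 1000);
        scheduler.execute(task);
        System.out.println("succeeded: " + task.awaitResult(5000));
        scheduler.shutdown();
    }
}
```

Nothing here survives a JVM restart, and only unchecked exceptions count as failures; a real framework would need to decide both points.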
As far as I could determine, there isn't one. Hasn't everyone at one point had to write some sort of retry logic? Why isn't there a framework of some sort for that?
4 comments:
The problem with a retry framework is that there is no real de facto technique for 'doing something later' in Java yet.
If you had such a technique, then writing a retry framework would be a sensible step.
Candidates for the technique to do something later are:
- Quartz Jobs
- WebLogic 'send later' JMS messages
- ...and other persistent events
... as an afterthought
JMX Timers are J2SE based, and if you could find a sensible way to persist them such that they survived a JVM restart then a retry framework is only a day's coding away.
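As a rough illustration of the JMX Timer idea, the sketch below uses `javax.management.timer.Timer` to fire a notification every M milliseconds, up to N times, and runs the attempt inside the listener. The class name `JmxTimerRetry`, the `retry` method, and its parameters are hypothetical, and the sketch does not address the persistence problem mentioned above: the timer lives only in memory.

```java
import java.util.Date;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.NotificationListener;
import javax.management.timer.Timer;

/** Hypothetical sketch: retry driven by a (non-persistent) JMX Timer. */
class JmxTimerRetry {

    /**
     * Calls 'attempt' every periodMillis until it returns true or has been
     * tried maxAttempts times. Returns true if an attempt succeeded.
     */
    static boolean retry(Callable<Boolean> attempt, long periodMillis,
                         long maxAttempts, long timeoutMillis) {
        Timer timer = new Timer();
        AtomicBoolean succeeded = new AtomicBoolean(false);
        AtomicLong attemptsMade = new AtomicLong();
        CountDownLatch done = new CountDownLatch(1);

        NotificationListener listener = (notification, handback) -> {
            if (done.getCount() == 0) {
                return; // already finished; ignore any late notification
            }
            boolean ok;
            try {
                ok = attempt.call();
            } catch (Exception ex) {
                ok = false; // an exception counts as a failed attempt
            }
            long made = attemptsMade.incrementAndGet();
            if (ok) {
                succeeded.set(true);
                done.countDown();
            } else if (made >= maxAttempts) {
                done.countDown(); // give up
            }
        };

        timer.addNotificationListener(listener, null, null);
        Date first = new Date(System.currentTimeMillis() + periodMillis);
        // Emit the notification maxAttempts times, periodMillis apart.
        timer.addNotification("retry", "retry attempt", null, first, periodMillis, maxAttempts);
        timer.start();
        try {
            done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        } finally {
            timer.stop(); // releases the timer's background thread
        }
        return succeeded.get();
    }
}
```

For example, `JmxTimerRetry.retry(() -> pingServer(), 1000, 5, 10000)` would try a hypothetical `pingServer()` once a second, at most five times.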
There isn't a de facto way of doing something later, you're right, but there are a few common-sense ways, like Runnables, and a stab at a retry framework could promote a common way of writing logic that a developer wants to keep attempting until success...
I'm currently working on a project with a few JBoss clusters and Tangosol caches on the server side, and a Swing client for the users.
To handle connection issues and other remote explosions I decided to place all tasks that interact with a remote resource within a Callable object, and pass this object to a task service.
The components:
Manager
* facade to control access to remote resources
ManagerTaskService
* uses backing ExecutorService for async tasks
TaskFailedHandler
* controls reconnect/retry logic
ManagerConnection
* Maintains connection state safely in concurrent environments. Uses notify/wait for responsiveness, and immutable state objects.
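The ManagerConnection idea of wait/notify plus immutable state objects could look roughly like the sketch below. The names `ConnectionSketch`, `State`, `setConnected`, and the timeout parameter on `waitFor` are assumptions for illustration; the original class is not shown, so this is only one plausible reading of the description.

```java
/** Hypothetical sketch of connection state guarded by wait/notify. */
class ConnectionSketch {

    /** Immutable snapshot of the connection state; replaced, never modified. */
    static final class State {
        final boolean connected;
        final int generation; // number of state changes so far

        State(boolean connected, int generation) {
            this.connected = connected;
            this.generation = generation;
        }
    }

    private final Object lock = new Object();
    private State state = new State(false, 0); // guarded by lock

    /** Returns the current snapshot; callers can read it without holding the lock. */
    State current() {
        synchronized (lock) {
            return state;
        }
    }

    /** Publishes a new state object and wakes every waiting thread. */
    void setConnected(boolean connected) {
        synchronized (lock) {
            state = new State(connected, state.generation + 1);
            lock.notifyAll();
        }
    }

    /** Blocks until connected or until the timeout expires; returns true if connected. */
    boolean waitFor(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            // Loop to guard against spurious wakeups and state changes to "disconnected".
            while (!state.connected) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                try {
                    lock.wait(remaining);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

Because waiters are notified as soon as the state changes, a task blocked in `waitFor` resumes immediately after a reconnect instead of polling.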
Synchronous tasks are executed using ManagerTaskService.invokeAndWait, which handles waiting for the manager to be ready, deciding whether a failed task should be retried, and whether the manager needs to reconnect.
Asynchronous tasks are wrapped by a Callable that runs invokeAndWait, and then added to the ExecutorService.
Here's the code I used for invokeAndWait:
    public <CVal> CVal invokeAndWait(final String name, final Callable<CVal> task) {
        // A fresh handler per invocation, so retry state is not shared between tasks.
        TaskFailedHandler failureHandler = failureHandlerFactory.newInstance();
        for (;;) {
            try {
                // Block until the manager is ready, then run the task.
                manager.waitFor();
                return task.call();
            }
            catch (Exception ex) {
                failureHandler.exceptionThrown(ex);
                if (failureHandler.shouldReconnect(ex)) {
                    manager.requestReconnect();
                }
                if (!failureHandler.shouldRetry(ex)) {
                    log.error("invokeAndWait - not retrying due to TaskFailedHandler policy - " + name);
                    return null;
                }
                // Otherwise fall through and loop for another attempt.
            }
        }
    }
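For readers who want to experiment, here is a self-contained sketch of how the pieces around invokeAndWait might fit together. The `TaskFailedHandler` interface is inferred only from the three calls in the code above; `LimitedRetryHandler`, its "reconnect on IOException" rule, `invokeLater`, and the class name `TaskServiceSketch` are assumptions, and the manager and reconnect handling are left out.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of a task service with a pluggable retry policy. */
class TaskServiceSketch {

    /** Shape inferred from the calls made in invokeAndWait. */
    interface TaskFailedHandler {
        void exceptionThrown(Exception ex);
        boolean shouldReconnect(Exception ex);
        boolean shouldRetry(Exception ex);
    }

    /** Assumed policy: allow a fixed number of retries per task. */
    static final class LimitedRetryHandler implements TaskFailedHandler {
        private final int maxRetries;
        private int failures;

        LimitedRetryHandler(int maxRetries) {
            this.maxRetries = maxRetries;
        }

        public void exceptionThrown(Exception ex) {
            failures++;
        }

        public boolean shouldReconnect(Exception ex) {
            return ex instanceof IOException; // assumption: I/O errors mean a lost connection
        }

        public boolean shouldRetry(Exception ex) {
            return failures <= maxRetries;
        }
    }

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final int maxRetries;

    TaskServiceSketch(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Runs the task on the calling thread, retrying according to the handler. */
    <T> T invokeAndWait(String name, Callable<T> task) {
        TaskFailedHandler handler = new LimitedRetryHandler(maxRetries);
        for (;;) {
            try {
                return task.call();
            } catch (Exception ex) {
                handler.exceptionThrown(ex);
                // A real service would request a reconnect here if shouldReconnect(ex) is true.
                if (!handler.shouldRetry(ex)) {
                    System.err.println("invokeAndWait - giving up on " + name + ": " + ex);
                    return null;
                }
            }
        }
    }

    /** Asynchronous variant: wraps invokeAndWait in a Callable and queues it. */
    <T> Future<T> invokeLater(String name, Callable<T> task) {
        Callable<T> wrapped = () -> invokeAndWait(name, task);
        return executor.submit(wrapped);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Returning null on final failure mirrors the code above; callers that need to tell "failed" from "returned null" would want a different signal, such as a checked exception.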