Ostrich is a library that enables SOA architectures to be built quickly and easily.
Currently Ostrich requires that service consumers be JVM based. This will change in the future.
SOA Library
License: Apache License 2.0
The TeamCity build randomly fails when no code has been changed (it's currently running hourly). We're currently experiencing 1-2 random failures per day.
I believe this is an issue between Curator's TestingServer and Apache's ZooKeeper library. I've written a very simple test case that exposes this: https://gist.github.com/2891890. Given enough iterations, this test will eventually get into an infinite loop. I believe this same problem is what is affecting our TeamCity build.
... to be consistent with ServiceEndPoint.
When closed it should terminate any background health check threads.
Currently in the ServicePool, every time we execute a callback we ask the ServiceFactory<S> to create a new instance of a service. Depending on the implementation of the ServiceFactory, this could be an expensive operation (it may need to establish a new connection to the remote server, etc.). We should offer the ability to cache these instances so that they don't have to be recreated every time.
This functionality should probably be something that individual service providers should control, not service consumers.
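A minimal sketch of what such a cache could look like, assuming end points can be identified by a string ID. The class name ServiceInstanceCache and its methods are hypothetical and are not part of Ostrich; the factory function stands in for whatever the service provider's ServiceFactory does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch (not Ostrich API): keeps one service instance per end
// point ID so the possibly expensive factory call runs once per end point.
final class ServiceInstanceCache<S> {
    private final Map<String, S> instances = new ConcurrentHashMap<>();
    private final Function<String, S> factory;

    ServiceInstanceCache(Function<String, S> factory) {
        this.factory = factory;
    }

    // Returns the cached instance for the end point, creating it on first use.
    S get(String endPointId) {
        return instances.computeIfAbsent(endPointId, factory);
    }

    // Drops the cached instance, e.g. when the end point is marked unhealthy.
    // Returns the removed instance, or null if nothing was cached.
    S evict(String endPointId) {
        return instances.remove(endPointId);
    }
}
```

Because the provider supplies the factory function, the provider also decides whether instances are safe to reuse at all, which matches the idea that this is a provider-side choice.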
Use this in various places where we allow retries. This will make it so classes don't have to throw magic ServiceException instances to indicate that their exceptions are retryable.
We need several pieces of documentation:
Docs need to be updated for the new project dependencies (i.e., use of curator directly instead of through ZooKeeperConnection).
Right now ServiceException is thrown when most things go wrong. We should be more specific than that, and have subclasses that represent different things. The following cases are useful to users:
We don't want to bombard a server that comes back up with a ton of health checks. It would be nice to space these out by waiting before the first check by some random amount of time.
Maybe some type of backoff strategy should be employed.
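As a rough illustration of both ideas, the sketch below computes a random delay before the first health check and an exponentially growing, capped delay for later attempts. The class HealthCheckDelays and its parameters are hypothetical; nothing here reflects how Ostrich actually schedules health checks.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper (not Ostrich API) for spacing out health checks.
final class HealthCheckDelays {
    private HealthCheckDelays() {}

    // Random delay in [0, maxJitterMillis) before the first check.
    // Assumes maxJitterMillis > 0; nextLong rejects non-positive bounds.
    static long initialDelayMillis(long maxJitterMillis) {
        return ThreadLocalRandom.current().nextLong(maxJitterMillis);
    }

    // Exponential backoff: the base delay doubles once per failed attempt
    // and is capped at maxMillis. Assumes realistic (small) millisecond values.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }
}
```

The returned values could be passed as the delay argument of ScheduledExecutorService.schedule when queuing the next check for an end point.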
Objects.toStringHelper was deprecated in Guava v18, and was removed in v21 (~June 2016). Objects.toStringHelper was replaced with MoreObjects.toStringHelper.
We should be in an application/library specific path instead of the very generic, and non-identifiable /services.
It's highly likely that services will be partitioned (e.g. one node will only be able to service queries for a specific range of data). Ostrich needs to support people in authoring and consuming these types of services.
At the same time, a lot of services will only have a single partition. For these services complexity of using Ostrich shouldn't increase in a noticeable way.
At a high level this change will require:
ServicePool to receive a partition key. This should probably be opaque from the perspective of Ostrich.
LoadBalanceAlgorithm to support receiving a partition key, so that the load balancer can choose a suitable service end point to use.
Should ServiceFactory know the partition key? I'm not sure this is useful.
Currently when a user creates a ServicePool the code looks something like this:
ZooKeeperConnection connection = new ZooKeeperConfiguration()
        .setConnectString(connectString)
        .setRetryNTimes(new com.bazaarvoice.soa.zookeeper.RetryNTimes(3, 100))
        .connect();
ThreadFactory daemonThreadFactory = new ThreadFactoryBuilder()
        .setDaemon(true)
        .build();
ServicePool<CalculatorService> pool = new ServicePoolBuilder<CalculatorService>()
        .withHostDiscovery(new ZooKeeperHostDiscovery(connection, "calculator"))
        .withServiceFactory(new CalculatorServiceFactory())
        .withHealthCheckExecutor(Executors.newScheduledThreadPool(1, daemonThreadFactory))
        .build();
There are a few things that I consider wrong with this picture:
To create the ZooKeeperHostDiscovery instance, the user needs to know where in ZooKeeper the registration nodes are being stored (e.g. the "calculator" parameter). The CalculatorServiceFactory object actually has that knowledge inside of it, so we shouldn't leak that information to the user of the service.
I would like to see the above code be rewritten to something like:
ZooKeeperConnection connection = new ZooKeeperConfiguration()
        .setConnectString(connectString)
        .setRetryNTimes(new com.bazaarvoice.soa.zookeeper.RetryNTimes(3, 100))
        .connect();
ServicePool<CalculatorService> pool = new ServicePoolBuilder<CalculatorService>()
        .withZooKeeperHostDiscovery(connection)
        .withServiceFactory(new CalculatorServiceFactory())
        .build();
Exceptions thrown during a service pool execute method can result in that endpoint being marked unhealthy. When no more endpoints are available an OnlyBadHostsException (OBHE) is thrown. It would be useful for debugging to include the underlying exceptions in the OBHE so that the root cause of the failing services can be determined.
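One possible way to carry the underlying failures is Java's suppressed-exception mechanism (Throwable.addSuppressed, available since Java 7). The class below is a hypothetical stand-in that only borrows the name OnlyBadHostsException for illustration; it is not the actual Ostrich class, and the constructor signature is an assumption.

```java
import java.util.List;

// Hypothetical stand-in for Ostrich's OnlyBadHostsException, showing how the
// per-end-point failures could be attached for debugging.
class OnlyBadHostsException extends RuntimeException {
    OnlyBadHostsException(List<? extends Throwable> endPointFailures) {
        super("No healthy end points; " + endPointFailures.size() + " underlying failure(s)");
        // Each failure shows up under "Suppressed:" in the printed stack trace.
        for (Throwable failure : endPointFailures) {
            addSuppressed(failure);
        }
    }
}
```

A caller can then inspect getSuppressed() or simply log the exception; the stack trace lists every attached failure.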
EmoDB has a handful of classes that make it easier to use Ostrich and Dropwizard together. They are useful to other projects that, right now, pull these classes from EmoDB.
See the code here:
ConfiguredFixedHostDiscoverySource and ConfiguredPayload make it easier to configure fixed end points in YAML config files. The interface is a bit awkward, though, because you must create a service-specific subclass (example).
Payload and PayloadBuilder remove some of the tedious work required to create and parse ServiceEndPoint payloads.
ManagedRegistration ties host discovery registration and unregistration to Dropwizard lifecycle events.
ResourceRegistry uses the ServiceName annotation and Jersey Path annotation to build ServiceEndPoint objects and register a resource with both Jersey and host discovery.
I don't expect you to take these classes as-is. Pick and choose and refactor as you see fit.
We should probably support a load balancing strategy other than random. The most logical one would be something like least loaded. This could be a measure of number of local or global connections to a service, or something like the load on the remote server.
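A minimal sketch of the "number of local connections" variant, assuming end points are identified by string IDs and that the caller reports when a request starts and finishes. LeastLoadedBalancer and its methods are hypothetical names and do not correspond to Ostrich's LoadBalanceAlgorithm interface.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch (not Ostrich API): picks the end point with the fewest
// requests currently in flight from this process.
final class LeastLoadedBalancer {
    private final Map<String, AtomicInteger> inFlight = new ConcurrentHashMap<>();

    // Chooses the end point with the lowest local in-flight count.
    String choose(List<String> endPoints) {
        if (endPoints.isEmpty()) {
            throw new IllegalArgumentException("no end points to choose from");
        }
        return endPoints.stream().min(Comparator.comparingInt(this::load)).get();
    }

    // Call when a request to the end point starts.
    void acquire(String endPoint) {
        counter(endPoint).incrementAndGet();
    }

    // Call when a request to the end point finishes, successfully or not.
    void release(String endPoint) {
        counter(endPoint).decrementAndGet();
    }

    // Current in-flight count; end points never seen before count as zero.
    int load(String endPoint) {
        AtomicInteger count = inFlight.get(endPoint);
        return count == null ? 0 : count.get();
    }

    private AtomicInteger counter(String endPoint) {
        return inFlight.computeIfAbsent(endPoint, key -> new AtomicInteger());
    }
}
```

Measuring global connections or remote server load would need data from outside the process, so it is left out of this sketch.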
The data team had an issue earlier today where an instance seemingly lost connection to ZooKeeper. There wasn't a good way to diagnose this at runtime, so adding some metrics to Ostrich to show connection states and things that are happening with the connection may be useful.
Action:
We have to make sure that this is needed.
If we wanted to support non-JVM language service providers it could probably be done really easily without having to write separate code for each language. We could write a simple dropwizard service that receives a POST with service endpoint info inside of it. The service would take the endpoint info and create an ephemeral node in ZooKeeper on behalf of the caller. After creating the node it would NOT close the HTTP connection. Instead it would monitor the connection, and if the caller ever closes it, the service would delete the ephemeral node. So in this model having an open connection to a webserver is a proxy for a service being alive. If that connection closes then the server is assumed to not be alive anymore. When HTTP timeouts happen the client will have to reestablish the connection if it still wants to be available.
Given that pretty much all modern languages have the ability to make an HTTP POST request, this should enable a service written in any language to be made available through Ostrich. Of course the service provider would have to write a client library in all languages that they have users in.
If a created service instance allocates resources, there is no way today to reclaim them after the instance is evicted from the cache (in timeout cases, for example).
A hook method should be exposed that implementers can override and that is called when an instance is evicted.
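The idea can be sketched with a small LRU cache that invokes a callback whenever it evicts an instance. EvictingServiceCache and the onEvict callback are hypothetical names chosen for illustration; this is not how Ostrich's cache is implemented, and it only covers size-based eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch (not Ostrich API): an LRU cache that hands every evicted
// service instance to a callback so its resources can be released.
final class EvictingServiceCache<S> {
    private final Map<String, S> cache;

    EvictingServiceCache(final int maxSize, final Consumer<S> onEvict) {
        // accessOrder = true makes the eldest entry the least recently used one.
        this.cache = new LinkedHashMap<String, S>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
                if (size() > maxSize) {
                    onEvict.accept(eldest.getValue()); // e.g. close a connection
                    return true;
                }
                return false;
            }
        };
    }

    synchronized void put(String endPointId, S service) {
        cache.put(endPointId, service);
    }

    synchronized S get(String endPointId) {
        return cache.get(endPointId);
    }
}
```

A timeout-based eviction path would call the same callback before dropping the instance.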
The com.bazaarvoice.soa.Service interface presents no real use and requires modification of the service itself (assuming the consumer is following the Dropwizard project-api, project-client, project-service project structure).
com.bazaarvoice.soa.ServiceCallback requires some return type from a service method invocation.
In the case of a "void" return type on a service method (like Databus.subscribe), it would be nice to not have to "return null".
databusServicePool.execute(new RetryNTimes(3, 100, TimeUnit.MILLISECONDS), new ServiceCallback<Databus, Object>() {
    @Override
    public Object call(Databus service) throws ServiceException {
        service.subscribe(DATABUS_SUBSCRIPTION_NAME, 86400, 86400);
        return null;
    }
});
It would be nice to have the following instead:
databusServicePool.execute(new RetryNTimes(3, 100, TimeUnit.MILLISECONDS), new ServiceCallbackWithoutResult() {
    @Override
    public void call(Databus service) throws ServiceException {
        service.subscribe(DATABUS_SUBSCRIPTION_NAME, 86400, 86400);
    }
});
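One way this could be built on top of the existing callback type is an adapter class that returns null on the caller's behalf. In the sketch below, ServiceCallback is a simplified, hypothetical stand-in for com.bazaarvoice.soa.ServiceCallback (no ServiceException), and the hook is named callWithoutResult because Java does not allow a void method to override a value-returning call method.

```java
// Simplified, hypothetical stand-in for com.bazaarvoice.soa.ServiceCallback.
interface ServiceCallback<S, R> {
    R call(S service);
}

// Hypothetical adapter: subclasses implement a void hook and never have to
// write "return null" themselves.
abstract class ServiceCallbackWithoutResult<S> implements ServiceCallback<S, Void> {
    @Override
    public final Void call(S service) {
        callWithoutResult(service);
        return null;
    }

    // Performs the service invocation; no result is produced.
    protected abstract void callWithoutResult(S service);
}
```

Since the adapter still implements ServiceCallback, an existing execute method could accept it without changes, assuming its signature is generic in the result type.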
We should have a method to perform long running stability testing (hours if not days) for Ostrich. We need to make sure that we correctly handle all sorts of error conditions such as:
There are a few things wrong with ServiceEndpoint:
It should be named ServiceEndPoint with a capital P. This is consistent naming with other projects out there.
The Netflix Curator library used by soa (curator-recipes 1.1.12) needs more from the curator-client than version 1.1.2 included with curator-framework 1.1.2 included with Bouncer.
Client applications that use a ServicePool should integrate into their own health check a verification that their dependencies are also healthy.
The implementation may be as simple as checking that there's at least one end point that's not marked bad. It would be better to actually ping through to at least one end point as part of computing isHealthy().
We should find a way to expose this for proxies that wrap a ServicePool, too. For example, assuming Dropwizard:
MyService service = ServicePool.create(MyService.class)...buildProxy(retryPolicy);
environment.addHealthCheck(new HealthCheck("my-service") {
    @Override
    protected Result check() {
        // TODO: it's nice to include a string w/the name of the live endpoint + timing info
        // like "localhost 493us"
        return ServiceProxies.isHealthy(service) ? Result.healthy() : Result.unhealthy();
    }
});
environment.addManaged(new Managed() {
    @Override
    public void start() {}
    @Override
    public void stop() {
        ServiceProxies.closeQuietly(service);
    }
});
Not completely sure how this should be done exactly, but we have a goal of being able to collect several metrics about the codebase: https://bits.bazaarvoice.com/confluence/display/DEV/SOA+Metrics