Server/Client

Server

The AllQServer is a standalone process that knows nothing about other AllQServers or AllQClients. It maintains the state of its own tubes and jobs, and it responds to encrypted requests from AllQClients.

Client

The AllQClient can run on each worker machine or on a standalone machine that multiple workers talk to.

The AllQClient knows which AllQServers are available. If you have many AllQServers, each AllQClient must know about every AllQServer you want it to communicate with.

The AllQClient has a built-in REST API server that your workers can use to manage jobs on the server.

Architectural Goals

AllQ uses a federated client/server architecture to achieve the following goals:

  • Performance

  • Scalability

  • Availability

  • Failover

  • Low Complexity

Non-Goals

  • Durability

  • Redundancy

On the server side, there is no sharing of data. AllQServers do not talk to other AllQServers, and no data is distributed or duplicated between them. Each server is a silo of information, and catastrophic failure of a silo could lead to loss of its data. There are mitigations, such as using a shared-drive persistence layer to help recovery, but AllQ itself does not offer them; they are an installation/configuration design consideration. When persistence is turned on, data survives a machine reboot, but not something like a catastrophic hardware or disk failure.

Why did we design it this way?

For a modern job queue, it is more common to have no data (or very little) stored in the queue than to have a large, backed-up queue. If a queue is always backed up, you probably need more workers. Ideally, when a queue starts to fill, you add workers until the queue starts to decline. In modern cloud architectures, this might mean adding 1,000+ workers. AllQ was built for 1,000+ workers.
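As a rough illustration of "add workers until the queue starts to decline": a queue shrinks only when the combined throughput of the workers exceeds the rate at which jobs arrive. The sketch below is plain arithmetic, not part of AllQ; the function name and the rates are hypothetical and chosen only for illustration.

```python
import math


def workers_needed(arrival_rate: float, per_worker_rate: float) -> int:
    """Smallest worker count whose combined throughput exceeds the arrival rate.

    Both rates are in jobs per second. These are illustrative inputs,
    not values reported by AllQ.
    """
    if arrival_rate < 0 or per_worker_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and per_worker_rate must be > 0")
    # The queue declines only when n * per_worker_rate > arrival_rate,
    # so take the next whole number above the ratio.
    return math.floor(arrival_rate / per_worker_rate) + 1


# A spike of 5,000 jobs/s with workers that each finish 5 jobs/s
# needs more than 1,000 workers before the queue starts to drain.
print(workers_needed(5000, 5))
```

With fewer workers than this, the queue keeps growing for as long as the spike lasts, which is the situation the design is meant to absorb.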

If a queue is usually empty or nearly empty, data durability becomes less important, and the ability to survive spikes of 1,000+ workers becomes more important. In our opinion, data loss from unexpectedly large usage spikes is a bigger threat than data loss from hardware failure.

The more AllQServers you have, the bigger the spikes you can handle. During these spikes, if you lose a machine, there are many others to take over for it, and the impact of its loss is mitigated.

Additionally, since the AllQServers don't need to talk to each other, configuration and failover become much easier and are handled by the AllQClient.
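To make the idea of client-side failover concrete, here is a minimal sketch of a client that holds an explicit list of independent servers and skips any that are unreachable. It is not AllQ's actual implementation: the `FailoverClient` class, the `send` callable, the `ServerUnavailable` exception, and the round-robin rotation are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class ServerUnavailable(Exception):
    """Raised by the (hypothetical) transport when a server cannot be reached."""


@dataclass
class FailoverClient:
    # Explicit list of server addresses; the servers never learn about each other.
    servers: List[str]
    # Hypothetical transport: delivers one job to one server,
    # raising ServerUnavailable if that server is down.
    send: Callable[[str, dict], None]
    _next: int = field(default=0, init=False)

    def put(self, job: dict) -> str:
        """Try each known server once, starting from a rotating offset.

        Returns the address of the server that accepted the job.
        """
        count = len(self.servers)
        for attempt in range(count):
            server = self.servers[(self._next + attempt) % count]
            try:
                self.send(server, job)
            except ServerUnavailable:
                continue  # this silo is down; try the next one
            # Start the next job on the server after the one that accepted this job.
            self._next = (self._next + attempt + 1) % count
            return server
        raise ServerUnavailable("no configured server accepted the job")
```

Because all routing state lives in the client, adding or removing a server in this sketch only means changing the client's list; no server-to-server coordination is involved.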

Architecture Diagram
