Phatso · 8mo ago

Websockets API with async processing (is there a better way?)

My Deno app has a websockets layer that receives "requests" from the websocket clients. The clients ask it to update data in the database, retrieve data, etc., similar to a REST API. There will be 5-10 clients sending a very large volume of requests.

In my initial setup, I created this websockets layer so that it offloads the actual work for each request to a pool of Workers, and the response is then sent back through the Worker pool. I'm hitting some kind of bottleneck, and despite my best efforts I can't identify it. So I have a few questions:

1. Do you know of an obvious bottleneck with this setup that I'm not realizing?
2. Is there a better way to go about this? I would really like to keep this app simple, and it doesn't feel simple right now.

A couple of bits of context:

1. I simply need to use websockets. In the ecosystem I'm developing for, HTTP calls are not a good option.
2. My main goal with the Worker Pool was to make sure my websockets layer wasn't being blocked or delayed while each request is processed, but I'm starting to second-guess this design.

Some code, in case it helps paint a better picture. The websockets layer:
import { WebSocketClient } from "websocket/mod.ts"

import type { WebSocketServer } from "websocket/mod.ts"
import type { WorkerPool } from "./worker_pool.ts"

export const handleSocket = (wss: WebSocketServer, workerPool: WorkerPool) => {
  wss.on("connection", function (ws: WebSocketClient) {
    // Callback the worker pool uses to send the result back to this client
    const returnMessage = (message: string) => {
      ws.send(message)
    }

    ws.on("message", function (message: string) {
      const command = JSON.parse(message)
      const { type } = command

      // Offload "request" commands to the worker pool
      if (type === "request") {
        workerPool.handleMessage(command, returnMessage)
      }
    })
  })
}
The Worker Pool is attached as a file to this message. I'm feeling worried about my decisions here, and I need to come up with a good solution soon. Any help at all is greatly appreciated 🙏
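(The attached file isn't reproduced here; the following is only a rough sketch of that kind of pool, matching the handleMessage(command, returnMessage) call used above. Every name and detail is invented for illustration, not my actual code.)

// worker_pool.ts (illustrative sketch only, not the attached file)
type ResponseCallback = (response: string) => void

export class WorkerPool {
  private workers: Worker[] = []
  private callbacks = new Map<number, ResponseCallback>()
  private nextWorker = 0
  private nextId = 0

  constructor(size: number) {
    for (let i = 0; i < size; i++) {
      const worker = new Worker(new URL("./worker.ts", import.meta.url).href, { type: "module" })
      // When a worker finishes a job, route the result back to the right client
      worker.onmessage = (e: MessageEvent) => {
        const { id, response } = e.data
        this.callbacks.get(id)?.(response)
        this.callbacks.delete(id)
      }
      this.workers.push(worker)
    }
  }

  handleMessage(command: unknown, returnMessage: ResponseCallback) {
    const id = this.nextId++
    this.callbacks.set(id, returnMessage)
    // Round-robin dispatch; postMessage structured-clones the command
    this.workers[this.nextWorker].postMessage({ id, command })
    this.nextWorker = (this.nextWorker + 1) % this.workers.length
  }
}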
7 Replies
Leokuma · 8mo ago
I can't help much, but I suggest you limit the number of workers to the number of CPUs. You can get the number of CPU cores with navigator.hardwareConcurrency
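Something like this when you create the pool (just a sketch, assuming your pool takes its size as a plain number):

import { WorkerPool } from "./worker_pool.ts"

// One worker per logical CPU core; navigator.hardwareConcurrency is available in Deno
const poolSize = navigator.hardwareConcurrency
const workerPool = new WorkerPool(poolSize)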
Phatso (OP) · 8mo ago
Thanks!
raunioroo · 8mo ago
Just a thought: could each worker run a standalone websocket server on its own port? Clients would connect directly to their worker's port, assigned in some way that balances the load across cores. This would remove the need for every connection to go through the main process/thread, and it would also remove the redundant message serialization and deserialization between workers and the main process (the JSON.parse, plus passing messages to workers involves some semi-expensive IPC-like serialization under the hood). See the sketch below.

Also, not sure if it applies to a websockets app like yours, but I do some heavy image processing in a thread pool of sorts, and I found the optimal number of processing threads to be number_of_cores - 1. I assume it's because if I saturate ALL the cores completely with image processing, the various other system threads and processes get bogged down and everything becomes a bit unresponsive. But if I always leave one semi-idle core, the system always has somewhere to run background stuff, which results in smoother overall performance. Even if I get slightly fewer "images per second", the performance "per image" is more consistent and predictable.
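Roughly like this, just a sketch: the file names, ports and handler logic are all made up, but it shows one Deno.serve per worker, with each worker doing its own Deno.upgradeWebSocket and JSON parsing.

// ws_worker.ts (illustrative only): each worker owns its own websocket server
self.onmessage = (e: MessageEvent<{ port: number }>) => {
  const { port } = e.data

  Deno.serve({ port }, (req) => {
    if (req.headers.get("upgrade") !== "websocket") {
      return new Response("websocket connections only", { status: 400 })
    }
    const { socket, response } = Deno.upgradeWebSocket(req)
    socket.onmessage = (ev) => {
      const command = JSON.parse(ev.data)
      // ...do the real work here (DB queries etc.), entirely off the main thread...
      socket.send(JSON.stringify({ ok: true, received: command.type }))
    }
    return response
  })
}

// main.ts (illustrative only): spawn one server-owning worker per core, minus one
const workerCount = Math.max(1, navigator.hardwareConcurrency - 1)
for (let i = 0; i < workerCount; i++) {
  const worker = new Worker(new URL("./ws_worker.ts", import.meta.url).href, { type: "module" })
  worker.postMessage({ port: 8000 + i }) // each client gets assigned one of these ports
}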
jcayzac · 8mo ago
Are you sending requests to your workers using JSON? It must be pretty slow. You should be able to pass anything that's Transferable as-is. For anything you need to serialize, personally I rely on the V8 serializer, but it's in Deno[Deno.internal].core, so I guess it's a bad habit 🙂 (I really wish it was public and marked "stable")
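For example, a large binary payload can be handed to a worker without a copy by listing it in the transfer list (just a sketch, names made up):

const worker = new Worker(new URL("./worker.ts", import.meta.url).href, { type: "module" })

// Imagine this is a big blob of binary data (file contents, image bytes, ...)
const payload = new Uint8Array(1024 * 1024)

// The second argument is the transfer list: the ArrayBuffer is moved to the
// worker (zero-copy) and becomes detached/unusable on this side afterwards.
worker.postMessage({ id: 1, payload: payload.buffer }, [payload.buffer])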
Leokuma · 8mo ago
The bottleneck could be the DB too. Some operations can lock tables, especially updates. And then other queries have to wait until they can be executed
Phatso (OP) · 8mo ago
The websocket layer is receiving the message, deserializing it, and then sending it over to the Worker. Is that better or worse than sending the raw string? In my tests I was just getting data, but the DB could still be holding things up, I guess.
raunioroo · 8mo ago
Someone correct me if I'm wrong, but I think worker comms involve some internal serialization under the hood (some relative of structuredClone?). That would be faster than JSON, but slower than not reserializing at all. So I'd think it's probably a little faster to just send the string to the worker and parse the JSON there, something like the sketch below. That also makes sense in that if you are using multiple workers to distribute load, you want to move as much processing as possible into the workers and keep the main process as lightweight as possible: since every request goes through that single-threaded "load balancer", you want to avoid it becoming a bottleneck.
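I.e. roughly this (sketch only; it reuses the names from the code earlier in the thread, and the exact message shape between the main thread and the worker is up to you):

// Main thread: forward the raw string untouched, no JSON.parse here
ws.on("message", function (message: string) {
  workerPool.handleMessage(message, returnMessage)
})

// Worker: do the parsing (and all the real work) off the main thread
self.onmessage = (e: MessageEvent) => {
  const { id, command } = e.data // command is still the raw JSON string
  const parsed = JSON.parse(command)
  // ...handle the request, then post the serialized response back...
  self.postMessage({ id, response: JSON.stringify({ ok: true, type: parsed.type }) })
}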