Investigating memory leak after adding new app dependency
Hi, I added an OpenTelemetry metrics reporting loop to a Deno program and found that Deno began leaking memory (300MB+, until it hits OOM and gets killed).
This program was previously stable at 40MB RAM usage.
To begin, I graphed Deno.memoryUsage(), and it seems like the leaked memory is all external.
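For readers who want to reproduce this kind of graph, here is a minimal sketch of comparing memory samples over time. It assumes samples shaped like the object Deno.memoryUsage() returns (rss, heapTotal, heapUsed, external, all in bytes). The helper name memoryGrowthMiB is hypothetical and not part of Deno or of the original program.

```typescript
// Fields read below; Deno.memoryUsage() reports these values in bytes.
interface MemorySample {
  rss: number;
  heapTotal: number;
  heapUsed: number;
  external: number;
}

// Hypothetical helper: growth of each field between the first and last
// sample, converted to MiB. A leak confined to `external` shows up as a
// large `external` delta while `heapUsed` stays roughly flat.
function memoryGrowthMiB(samples: MemorySample[]): MemorySample {
  if (samples.length < 2) {
    throw new Error("need at least two samples to compute growth");
  }
  const first = samples[0];
  const last = samples[samples.length - 1];
  const toMiB = (bytes: number): number => bytes / (1024 * 1024);
  return {
    rss: toMiB(last.rss - first.rss),
    heapTotal: toMiB(last.heapTotal - first.heapTotal),
    heapUsed: toMiB(last.heapUsed - first.heapUsed),
    external: toMiB(last.external - first.external),
  };
}
```

In a Deno program, the samples could be collected with something like setInterval(() => samples.push(Deno.memoryUsage()), 20_000), and memoryGrowthMiB(samples) logged every so often. The 20-second interval here just mirrors the reporting loop described in this thread.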
How else can I investigate the issue? Is there anything else I can add to my TypeScript to observe this?
@danopia any chance you could say what the script does? There was another "memory leak" issue opened just yesterday and we're investigating
I added an OpenTelemetry-JS metric reporting loop which ultimately does a fetch() every 20 seconds. But there's a bunch of code inside all the OpenTelemetry libraries that I haven't seen, so I don't know what else it does before the fetch() call 😦
I also checked Deno.resources() which looks flat, and I have a list of dispatched ops in a 30-minute window via Deno.metrics().
I can try putting together a sharable reproduction later today, but getting it minimal might be tricky. I was hoping for some way to self-diagnose what all the external memory is.
ok, I'll keep investigating later in the day and see if I can isolate which layer of my code is triggering it 🙂
@.bartlomieju Oh, do you mean #18369 memory leak for >1.31.0?
I already verified my behavior is the same on 1.30.0 so what I'm looking at is not a regression
I'm trying to get a minimal repro of my issue put together now, and it seems like there's some irresponsible Response cloning here: https://github.com/open-telemetry/opentelemetry-js/blob/0c37daa0b1fe59a04810514ab4f995070ed84825/experimental/packages/opentelemetry-instrumentation-fetch/src/fetch.ts#L342-L344
I don't see anything cancelling the body of resClone4Hook.
Yea, that opentelemetry-js code looks like it was my memory leak ^^
The fetch response bodies must be sitting in external
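To illustrate the pattern being described, here is a minimal sketch of inspecting a cloned Response and then cancelling the clone's body if nothing read it. This is not the opentelemetry-js code or its eventual fix. The function name inspectClone and the hook parameter are hypothetical, and the sketch relies only on the standard Response.clone(), Response.bodyUsed, and ReadableStream.cancel() APIs. The assumption, in line with the diagnosis above, is that an unread clone keeps its share of the body buffered until it is consumed or cancelled.

```typescript
// Hypothetical wrapper: lets `hook` look at a clone of the response, then
// releases the clone's body if the hook left it unread.
async function inspectClone(
  response: Response,
  hook: (clone: Response) => void | Promise<void>,
): Promise<void> {
  const clone = response.clone();
  try {
    await hook(clone);
  } finally {
    // An untouched clone still has a body stream that nobody will read.
    // Cancelling it tells the runtime the data is no longer wanted.
    // If the hook already consumed the body, there is nothing to cancel.
    if (clone.body && !clone.bodyUsed) {
      await clone.body.cancel();
    }
  }
}
```

Cancelling the clone does not affect the original: the caller can still read response.text() or response.json() afterwards, since the two bodies are separate branches of the same stream.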
Yes, that's what I meant
Alright! Glad that you found it. If you have any suggestions for how we could improve visibility into things like that, I'm very open to feedback.
On a related note, we thought about adding more first-class support for visibility/observability. Node.js has "diagnostics channel", but it doesn't seem like a silver bullet. So again, any feedback and requests here are much appreciated