I'm trying to find a way to produce a single HTTP stream (HLS) that combines the views from 2-3 cameras (the Android IP Webcam app running on old mobile phones). Ideally I would like to minimize latency; the streaming happens only on my local network.

So far, I've looked into streaming software (OBS Studio, Streamlabs, ...), surveillance software (Shinobi, Blue Iris, Agent DVR, ...) and streaming servers (MistServer, SRS, Restreamer, ...), but with each one I run into something that does not work.

I know I can do it with OBS Studio or Streamlabs: feed the cameras into it over RTSP, then send the output to Restreamer, which creates the HLS stream I want. This would be perfect, since I can select scenes (which cameras, in which layout), but the latency is very high.
With surveillance software (e.g. Agent DVR) I get much lower latency, but it does not output HLS.
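
For reference, the pipeline I am after (several camera inputs composited into one frame and published as HLS) could be sketched roughly as below. This is an untested sketch built on assumptions, not something from my current setup: it assumes ffmpeg with libx264 is installed, that each phone exposes its video at a URL such as `http://<phone-ip>:8080/video`, and that a fixed side-by-side layout is acceptable. The function name `build_ffmpeg_command`, the addresses, and the output path are hypothetical. The script only builds and prints the command; it does not start ffmpeg.

```python
import shlex
from typing import List


def build_ffmpeg_command(camera_urls: List[str], playlist_path: str,
                         height: int = 480) -> List[str]:
    """Build an ffmpeg command that places the cameras side by side
    and writes a short-segment HLS playlist (hypothetical helper)."""
    if len(camera_urls) < 2:
        raise ValueError("hstack needs at least two inputs")

    cmd = ["ffmpeg"]
    for url in camera_urls:
        cmd += ["-i", url]

    n = len(camera_urls)
    # Scale every input to the same height, because hstack requires equal heights.
    scaled = [f"[{i}:v]scale=-2:{height}[v{i}]" for i in range(n)]
    # Stack the scaled streams horizontally into one output labelled [out].
    stacked = "".join(f"[v{i}]" for i in range(n)) + f"hstack=inputs={n}[out]"
    filter_graph = ";".join(scaled + [stacked])

    cmd += [
        "-filter_complex", filter_graph,
        "-map", "[out]",
        # Encoder settings that favour low delay over compression efficiency.
        "-c:v", "libx264", "-preset", "ultrafast", "-tune", "zerolatency",
        # Keyframe every 30 frames (assumes roughly 30 fps) so 1 s segments can be cut.
        "-g", "30",
        # Short segments and a short playlist to keep HLS delay down.
        "-f", "hls", "-hls_time", "1", "-hls_list_size", "3",
        "-hls_flags", "delete_segments",
        playlist_path,
    ]
    return cmd


if __name__ == "__main__":
    # Hypothetical camera addresses; replace with the real ones.
    urls = ["http://192.168.1.21:8080/video", "http://192.168.1.22:8080/video"]
    print(shlex.join(build_ffmpeg_command(urls, "printer.m3u8")))
```

Even with one-second segments, HLS players usually buffer a few segments, so I would still expect some seconds of delay. The resulting playlist and segment files would also need to be served by some web server, which is not shown here.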

Any ideas?
(The purpose is to monitor a 3D printer using multiple cameras, but the web interface of my printer only allows one webcam. For now I use multiple tabs with the different camera views, but it would be nice to see everything on one screen together with the printer controls. It is not such a big deal, but I somehow got focused on it. The solution should run on either Windows or Docker.)