Internal Workings of Large Language Models (LLMs)
Many real-time applications, such as ChatGPT, send requests and receive responses over event streams, especially when responses are streamed.

How Requests Are Sent to LLMs like ChatGPT

1. Event Stream (Server-Sent Events / Streaming API)

When you chat with ChatGPT, especially in real-time apps, the request is sent once, and the response comes back gradually as a stream of text. This is known as event streaming, or streamed responses, and it is often handled using:

- Server-Sent Events (SSE)
- WebSockets (less common for the OpenAI API, more for custom LLM apps)
- HTTP streaming (chunked responses)
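
To make the "one request, gradual response" idea concrete, here is a minimal sketch of how a client might reassemble text from a Server-Sent Events stream. It only parses lines that have already been received, so it makes no network call. The JSON shape `{"text": ...}`, the `[DONE]` end marker, and the names `iter_sse_data`, `collect_text`, and `SAMPLE_STREAM` are assumptions made for illustration; a real provider's payload will likely differ.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event found in an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line marks the end of the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data_parts.append(value)
    # Lenient choice for this sketch: flush a final event that has no trailing blank line.
    if data_parts:
        yield "\n".join(data_parts)


def collect_text(lines: Iterable[str], done_marker: str = "[DONE]") -> str:
    """Join the text fragments carried by each event until the end marker appears."""
    pieces = []
    for payload in iter_sse_data(lines):
        if payload == done_marker:  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        pieces.append(chunk["text"])  # hypothetical payload field
    return "".join(pieces)


# Hypothetical stream, as it might arrive line by line over one HTTP response.
SAMPLE_STREAM = [
    'data: {"text": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": "lo, "}\n',
    "\n",
    'data: {"text": "world"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

if __name__ == "__main__":
    print(collect_text(SAMPLE_STREAM))
```

In a real client the lines would come from an open HTTP response rather than a list, and each fragment would typically be shown to the user as soon as it arrives instead of being joined at the end.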