Internal Workings of Large Language Models (LLMs)
In many real-time applications such as ChatGPT, the client sends a request and receives the response over an event stream, especially when the response is streamed.
How Requests Are Sent to LLMs like ChatGPT
🧵 1. Event Stream (Server-Sent Events / Streaming API)
When you chat with ChatGPT, especially in real-time apps, the request is sent once, and the response comes back gradually as a stream of text.
This is known as event streaming or streamed responses, and it's often handled using:
Server-Sent Events (SSE)
WebSockets (less common for the OpenAI API, more common in custom LLM apps)
HTTP Streaming (chunked responses)
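To make the "sent once, streamed back" idea concrete, here is a minimal sketch of how a client might read a Server-Sent Events stream and stitch the text back together. It works on a hard-coded list of lines rather than a real network connection. The `delta` field name, the `[DONE]` end marker, and the helper names `iter_sse_data` and `collect_text` are assumptions made for illustration, not the exact format of any particular API.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line marks the end of the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


def collect_text(lines: Iterable[str]) -> str:
    """Join the streamed text pieces into the full response."""
    parts: List[str] = []
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        parts.append(chunk.get("delta", ""))  # "delta" is a hypothetical field
    return "".join(parts)


# Simulated stream: in a real app these lines would arrive over HTTP.
sample_stream = [
    'data: {"delta": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"delta": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

print(collect_text(sample_stream))  # prints "Hello"
```

In a real app, the same loop would run over the lines of a streaming HTTP response, and each piece would be shown to the user as it arrives instead of being collected first.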