Internal Workings of Large Language Models (LLMs)

 

Many real-time applications such as ChatGPT deliver responses over event streams, so text appears as it is generated rather than after the full reply is complete.





How Requests Are Sent to LLMs like ChatGPT

🧵 1. Event Stream (Server-Sent Events / Streaming API)

When you chat with ChatGPT, the client sends the request once, and the response comes back incrementally as a stream of small text chunks.

This is known as event streaming or streamed responses, and it's often handled using:

  • Server-Sent Events (SSE)

  • WebSockets (less common for the OpenAI API, more common for custom LLM apps)

  • HTTP Streaming (chunked responses)
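
To make the SSE option above concrete, here is a minimal sketch of how a client might reassemble a streamed reply. It parses the standard SSE wire format (`data:` fields, with a blank line ending each event) from any iterable of text lines. The JSON shape `{"text": ...}`, the `[DONE]` end marker default, and the names `iter_sse_data`, `collect_text`, and `SAMPLE_STREAM` are assumptions made for illustration, not any particular provider's API; a real client would read the lines from an open HTTP response instead of a list.

```python
import json
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line marks the end of the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if field == "data":
            # SSE strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is incomplete and is dropped.


def collect_text(lines: Iterable[str], done_marker: str = "[DONE]") -> str:
    """Join the text chunks of a stream until the (assumed) end marker."""
    parts: List[str] = []
    for payload in iter_sse_data(lines):
        if payload == done_marker:
            break
        chunk = json.loads(payload)   # hypothetical payload: {"text": "..."}
        parts.append(chunk["text"])
    return "".join(parts)


# Simulated stream; in practice these lines arrive over the network.
SAMPLE_STREAM = [
    'data: {"text": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

if __name__ == "__main__":
    print(collect_text(SAMPLE_STREAM))
```

A chat UI would typically render each chunk as soon as it is parsed instead of joining them at the end; the joining here just keeps the sketch easy to check.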


 


