Model Context Protocol

There is a new kid on the block called MCP, which stands for Model Context Protocol. Before we dive deep into what MCP is, we should understand how an LLM gets its knowledge in the first place so that it can spit out a response to the user.

How LLMs get their knowledge

LLMs are basically trained on data from the internet. So you can imagine that websites, images, videos, etc. are all used by the companies building the models to train them.

LLMs have two training phases: pre-training and post-training. In the pre-training phase, all that internet data is fed into the model to train it. In the post-training phase, the model is trained on conversations so that it becomes helpful to humans, which in turn helps it develop a persona.

Now, the pre-training phase is very expensive, like expensive expensive, so a model can’t be retrained every time the world changes. Hence, AI engineers came up with an approach called tools. Using tools, you can fetch real-time information, like the price of Bitcoin, and provide it to the LLM. This way we don’t need to redo the pre-training phase every week.
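To make the idea concrete, here is a minimal sketch of a tool in Python. It is not how any particular LLM vendor wires tools in; the function name `get_bitcoin_price`, the `TOOLS` registry, and `build_prompt` are all made up for illustration, and the price is a hard-coded placeholder so the sketch runs offline.

```python
from typing import Callable, Dict


def get_bitcoin_price() -> str:
    """Hypothetical tool. A real one would call a price API here."""
    return "Bitcoin price: 65000 USD (placeholder value, not real data)"


# Registry mapping tool names to the functions that implement them.
TOOLS: Dict[str, Callable[[], str]] = {"get_bitcoin_price": get_bitcoin_price}


def build_prompt(question: str, tool_name: str) -> str:
    """Run the named tool and place its result in front of the user's question."""
    tool_result = TOOLS[tool_name]()
    return (
        f"Context from tool '{tool_name}': {tool_result}\n\n"
        f"User question: {question}"
    )


prompt = build_prompt("Should I buy Bitcoin today?", "get_bitcoin_price")
print(prompt)
```

The model itself never changes. The fresh information simply arrives as part of the text it is asked to respond to.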

Now that you understand how real-time knowledge can be fed into LLMs, let’s discuss the problem that MCP is trying to solve. Before proceeding, please read my previous article, where I explained tools in a bit more detail.

Problem to be solved

Now that we have tools that can feed real-time information to LLMs, the next problem to solve is creating those tools. Imagine you want to create a web application and there are NO libraries out there. It’s just you and JavaScript. Yikes, right! Writing every tool by hand for every agent feels much the same.

Now imagine there are open source MCP servers for pretty much anything you can think of, like Notion, AWS, Jira, Postgres, etc., which you can integrate into your AI agents. Boom! Your agents are now capable of fetching information through the MCP server you just added. Maybe the diagram below will help show how mind-boggling that is.

Official Introduction of MCP

MCP is an open protocol that standardizes how applications provide context to LLMs. Anthropic offers the best analogy: think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.

MCP Components

MCP Hosts: The MCP Host is at the core of the system. It can be an application like a chat assistant, an IDE-integrated code assistant, or any AI-powered tool requiring access to external data. The host includes one or more MCP clients.
MCP Clients: The MCP Client operates within the MCP Host, acting as an intermediary between the host application and the MCP Server. It facilitates communication and ensures that relevant tools and data sources are requested correctly.
MCP Servers: The MCP Server serves as a bridge between the MCP Client and external data sources.
Local Data Sources: Your computer’s files, databases, and services that MCP servers can securely access.
Remote Services: External systems available over the internet (e.g., through APIs) that MCP servers can connect to.
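The client and server roles above can be sketched in a few lines of Python. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are the methods a client uses to discover and invoke tools. Everything else here is an assumption made for illustration: the `echo` tool is hypothetical, the "server" is just a function in the same process, and the sketch skips the initialization handshake and the real transports.

```python
import json

# Toy in-process "MCP server" exposing one hypothetical tool.
TOOLS = {
    "echo": {
        "description": "Return the text it was given.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }
}


def handle_request(raw: str) -> str:
    """Server side: answer a JSON-RPC 2.0 request string with a response string."""
    request = json.loads(raw)
    method = request["method"]
    if method == "tools/list":
        result = {"tools": [{"name": name, **spec} for name, spec in TOOLS.items()]}
    elif method == "tools/call" and request["params"]["name"] == "echo":
        text = request["params"]["arguments"]["text"]
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # This toy treats anything else as an unknown method.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


def client_call(request_id: int, method: str, params: dict) -> dict:
    """Client side: build the request, 'send' it, and decode the response."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.loads(handle_request(json.dumps(request)))


listed = client_call(1, "tools/list", {})
called = client_call(2, "tools/call", {"name": "echo", "arguments": {"text": "hello"}})
print(listed["result"]["tools"][0]["name"])
print(called["result"]["content"][0]["text"])
```

In a real setup the host owns the client, the server runs as a separate process, and the two exchange these same kinds of messages over a transport instead of a direct function call.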

Current Limitations

At the time of writing, MCP doesn’t officially support remote MCP servers. A product called Composio offers remotely hosted MCP servers, but it is not an official implementation of remote hosting from Anthropic (the company behind the Claude LLM), which originally developed MCP.

So, for now, the only way to take advantage of MCP is to run an MCP server locally. Tools like Claude chat, Cursor, etc. act as both MCP Host and Client and connect to the server running on your machine.
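As a rough sketch of what "running a server locally" looks like in practice: hosts such as Claude Desktop read a JSON config with an `mcpServers` section that tells them which command to launch for each server. The Python below only builds and prints such a config. The entry name `filesystem` and the folder path are placeholders, and where the file lives depends on the host, so check your host’s documentation before copying it.

```python
import json

# Placeholder directory; replace it with a folder you want the server to read.
ALLOWED_DIR = "/path/to/allowed/folder"

config = {
    "mcpServers": {
        # "filesystem" is just a label chosen for this example.
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", ALLOWED_DIR],
        }
    }
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

When the host starts, it launches the command as a local process and talks to it as an MCP client.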
