AI agents are autonomous systems that execute complex tasks on behalf of a user. They retrieve additional information, recall past interactions and programmatically invoke external tools to take action, plan and decide what to do next. An AI agent can:
- Observe: Perceives its environment (data, messages, user queries, sensor values).
- Reason: Analyzes the situation and plans what to do next based on constraints or heuristics.
- Act: Interacts with the environment, answers a question or calls a function.
- Learn (optional): Learns from its own mistakes, producing better output over time.
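The observe, reason and act steps can be sketched as a small loop. The example below is only an illustration: the `ThermostatAgent` class, its thresholds and the `room` dictionary are hypothetical, a hand-written rule stands in for the LLM, and the optional learning step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Toy agent (hypothetical) that keeps a room near a target temperature."""
    target: float = 21.0
    history: list = field(default_factory=list)  # past (reading, action) pairs

    def observe(self, environment: dict) -> float:
        # Observe: read a sensor value from the environment.
        return environment["temperature"]

    def reason(self, reading: float) -> str:
        # Reason: choose the next action with a simple heuristic.
        if reading < self.target - 1:
            return "heat"
        if reading > self.target + 1:
            return "cool"
        return "idle"

    def act(self, environment: dict, action: str) -> None:
        # Act: change the environment according to the chosen action.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta

    def step(self, environment: dict) -> str:
        # One full observe -> reason -> act cycle.
        reading = self.observe(environment)
        action = self.reason(reading)
        self.act(environment, action)
        self.history.append((reading, action))
        return action


room = {"temperature": 18.0}
agent = ThermostatAgent()
actions = [agent.step(room) for _ in range(4)]
print(actions, room)
```

The agent heats twice and then idles once the reading is within one degree of the target.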

Workflows vs. Agents
The following table highlights the key differences between workflows and agents.
| Criteria | Workflows | Agents |
|---|---|---|
| Definition | Predefined, rule-based sequence of steps | Autonomous systems that decide the steps |
| Control | High human control | Control shared between the human and the system |
| Flexibility | Low: fixed execution path | High: complex branches and loops |
| Best suited for | Repeatable, deterministic processes | Open-ended, complex problem solving |
| Examples | ETL jobs, data validation | AI coding, research agents |
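
The difference in control flow can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `validate` and `total` steps, the row format and the rule-based policy inside `run_agent`, which stands in for a decision an LLM would normally make.

```python
def validate(rows):
    # Drop rows whose amount is missing.
    return [r for r in rows if r.get("amount") is not None]


def total(rows):
    return sum(r["amount"] for r in rows)


def run_workflow(rows):
    # Workflow: the same two steps always run in the same order.
    return total(validate(rows))


def run_agent(rows, max_steps=5):
    # Agent: a policy inspects the current state and chooses the next step.
    state = {"rows": rows, "validated": False, "result": None}
    trace = []
    for _ in range(max_steps):
        if state["result"] is not None:
            break
        needs_cleaning = any(r.get("amount") is None for r in state["rows"])
        if needs_cleaning and not state["validated"]:
            state["rows"] = validate(state["rows"])
            state["validated"] = True
            trace.append("validate")
        else:
            state["result"] = total(state["rows"])
            trace.append("total")
    return state["result"], trace


dirty = [{"amount": 10}, {"amount": None}, {"amount": 5}]
clean = [{"amount": 1}]
print(run_workflow(dirty))
print(run_agent(dirty))
print(run_agent(clean))
```

The workflow always validates first, while the agent skips validation when the data is already clean, so its execution path depends on what it observes.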
Key components of an Agent
1. LLM (Large Language model)
An agent requires an LLM to function. The LLM can be thought of as the brain of the agent: it analyzes, plans and decides the next action to take. A stronger LLM generally leads to better outcomes, but not always.
- A larger LLM usually gives better outputs at the cost of higher latency.
- In some cases, a smaller language model can outperform a larger one on niche tasks.
- Popular open-weight LLMs include Llama 8B, GPT-OSS 20B and Qwen 2.5.
2. Working Memory
Working memory, or contextual memory, stores information about the steps already taken. It acts as the agent's short-term memory, helping it keep track of context and give accurate answers. For example, if you ask "What are my current sales in 2025?" and then follow up with "give top 10", the model can infer that you mean the top 10 sales in 2025.
- Context retention : Stores information from previous steps, messages or actions so the agent can maintain continuity across a task or conversation.
- State tracking : Helps the agent keep track of what has already been done, what data is available and what needs to happen next.
- Improved reasoning : Enables follow-up questions and implicit references (e.g., understanding that “give top 10” refers to sales in 2025), leading to more accurate and relevant responses.
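One simple way to picture working memory is a bounded buffer of recent turns that is prepended to each new prompt. The sketch below assumes a hypothetical `WorkingMemory` class and `build_prompt` helper; real frameworks provide their own memory abstractions.

```python
class WorkingMemory:
    """Keeps the most recent conversation turns (hypothetical helper)."""

    def __init__(self, max_turns: int = 10):
        self.max_turns = max_turns
        self.turns = []  # list of (role, text) tuples

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        # Drop the oldest turns once the buffer is full.
        self.turns = self.turns[-self.max_turns:]

    def as_context(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def build_prompt(memory: WorkingMemory, question: str) -> str:
    # Prepend earlier turns so a follow-up can be resolved against them.
    return f"{memory.as_context()}\nuser: {question}\nassistant:"


memory = WorkingMemory(max_turns=4)
memory.add("user", "What are my current sales in 2025?")
memory.add("assistant", "Here is the 2025 sales summary.")
prompt = build_prompt(memory, "give top 10")
print(prompt)
```

Because the earlier question is part of the prompt, the model has what it needs to read "give top 10" as a request about 2025 sales.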
3. Retrieval
Retrieval allows an agent to access information beyond what is stored inside the language model, enabling accurate and up-to-date responses.
- External knowledge access : Fetches relevant data from documents, databases, APIs or search systems when needed.
- Contextual relevance : Retrieves only the most relevant information to the task, reducing noise and improving efficiency.
- Grounded outputs : Ensures responses are based on real, verifiable data rather than assumptions or hallucinations.
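A minimal sketch of retrieval, assuming a tiny in-memory document list and a word-overlap score. Production systems typically use embeddings and a vector store instead; `DOCUMENTS`, `retrieve` and `grounded_prompt` are illustrative names only.

```python
import re

# Hypothetical knowledge base.
DOCUMENTS = [
    "Refunds are processed within 5 business days.",
    "Premium plans include priority support.",
    "Passwords can be reset from the account settings page.",
]


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list, k: int = 1) -> list:
    # Score each document by the number of words it shares with the query.
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Keep only documents with some overlap, highest score first.
    relevant = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in relevant[:k]]


def grounded_prompt(query: str, documents: list) -> str:
    # Put the retrieved text in the prompt so the answer can be grounded in it.
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


top = retrieve("How long do refunds take?", DOCUMENTS)
print(top)
print(grounded_prompt("How long do refunds take?", DOCUMENTS))
```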
4. Tools
Tools enable an agent to take actions and interact with external systems, extending its capabilities beyond reasoning and text generation.
- Action execution: Allows the agent to perform tasks such as calling APIs, running code, querying databases or triggering workflows.
- System interaction: Enables integration with external services like CRMs, analytics platforms, browsers or operating systems.
- Task completion: Helps the agent move from planning to execution, making it capable of completing real-world tasks rather than only providing suggestions.
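Tool use often comes down to a registry of callable functions plus a dispatcher that runs whichever call the model requests. The sketch below assumes the model emits a dictionary with `name` and `arguments` keys; that format, the `TOOLS` registry and both tools are hypothetical.

```python
def get_weather(city: str) -> str:
    # Hypothetical tool: a real one would call a weather API.
    return f"Sunny in {city}"


def add(a: float, b: float) -> float:
    return a + b


# Registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather, "add": add}


def execute_tool_call(call: dict):
    """Run a tool call of the form {"name": ..., "arguments": {...}}."""
    name = call["name"]
    if name not in TOOLS:
        # Return the error as text so the agent can recover from it.
        return f"Unknown tool: {name}"
    return TOOLS[name](**call["arguments"])


result = execute_tool_call({"name": "add", "arguments": {"a": 2, "b": 3}})
print(result)
print(execute_tool_call({"name": "get_weather", "arguments": {"city": "Paris"}}))
```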

Single vs. multi-agent AI pattern
The single vs. multi-agent distinction describes how intelligence is organized within a system. A single-agent AI consists of one autonomous decision-maker that perceives its environment, reasons and acts independently to optimize a defined objective. In contrast, multi-agent AI involves multiple autonomous agents operating within a shared environment, where overall system behavior emerges from their interactions.
Single-agent AI
- Centralized reasoning and control, which simplifies design, training and debugging.
- Best suited for well-defined tasks with limited interaction dynamics (e.g., single-player games, standalone optimization).
- Limited adaptability in highly dynamic or adversarial environments.
Multi-agent AI

- Decentralized decision-making with agents that may coordinate, negotiate or compete.
- Effective for complex systems requiring scalability, robustness or modeling of social/strategic interactions (e.g., traffic systems, markets, swarm robotics).
- Introduces parallelism: agents can work simultaneously, which can shorten the time to a solution when subtasks are independent.
Architecture Patterns
1. Prompt Chaining
Prompt chaining is a technique where a complex task is broken into multiple smaller prompts and the output of one prompt becomes the input to the next. Instead of asking the LLM to do everything at once, you guide it step by step.
- Improves accuracy by handling one reasoning step at a time.
- Increases control and transparency over the model’s thinking.
- Works well for multi-step tasks like analysis, planning and generation.
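Prompt chaining can be sketched as a loop that feeds each step's output into the next prompt template. In this sketch `fake_llm` is a hypothetical stand-in that returns canned text, so the example runs without a model; `run_chain` and the templates are likewise illustrative.

```python
def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a model call; returns canned text per step.
    if prompt.startswith("Extract"):
        return "latency, cost"
    if prompt.startswith("Summarize"):
        return "Summary of: " + prompt.split(": ", 1)[1]
    return ""


def run_chain(templates, user_input, llm):
    # Feed each step's output into the next step's template.
    value = user_input
    for template in templates:
        value = llm(template.format(input=value))
    return value


templates = [
    "Extract the key topics from: {input}",
    "Summarize these topics: {input}",
]
final = run_chain(templates, "Bigger models are slower and more expensive.", fake_llm)
print(final)
```

Each intermediate value can be logged or validated before the next step runs, which is where the added control comes from.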

2. Routing pattern
Routing is a pattern where an input is analyzed first and then directed to the most appropriate prompt, tool or agent instead of using a single fixed response path.
- Selects the best handler based on intent, type or complexity.
- Improves efficiency by avoiding unnecessary steps or tools.
- Common in agent systems, customer support bots and workflows.
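A minimal routing sketch: classify the input first, then dispatch to a matching handler. The keyword classifier, the labels and the handlers are all hypothetical; in practice an LLM or a trained classifier would usually pick the route.

```python
def classify(query: str) -> str:
    # Hypothetical keyword classifier; an LLM could replace it.
    text = query.lower()
    if any(word in text for word in ("refund", "invoice", "charge")):
        return "billing"
    if any(word in text for word in ("error", "crash", "bug")):
        return "technical"
    return "general"


# Each route has its own handler (here just a tagged echo).
HANDLERS = {
    "billing": lambda q: f"[billing] {q}",
    "technical": lambda q: f"[technical] {q}",
    "general": lambda q: f"[general] {q}",
}


def route(query: str) -> str:
    label = classify(query)
    return HANDLERS[label](query)


print(route("I need a refund"))
print(route("The app shows an error"))
print(route("Hello there"))
```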

3. Parallelization
Parallelization is a pattern where multiple independent tasks or prompts are run at the same time and their results are later combined to produce a final output.
- Reduces latency by processing independent steps simultaneously.
- Improves coverage by exploring multiple approaches at once.
- Common in evaluation, retrieval and multi-agent systems.
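The pattern can be sketched with Python's standard `concurrent.futures` module. The three task functions below are hypothetical placeholders for independent model or tool calls; threads are a reasonable choice when such calls are I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(text: str) -> str:
    # Placeholder task: return the first sentence.
    return text.split(".")[0].strip()


def word_count(text: str) -> int:
    return len(text.split())


def find_numbers(text: str) -> list:
    return [t for t in text.replace(".", " ").split() if t.isdigit()]


def run_parallel(text: str, tasks: dict) -> dict:
    # Submit every task at once, then gather the results by name.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


report = run_parallel(
    "Sales rose 12 percent. Costs fell 3 percent.",
    {"summary": summarize, "words": word_count, "numbers": find_numbers},
)
print(report)
```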

4. Orchestrator-worker pattern
The orchestrator-worker pattern uses a central controller (the orchestrator) to plan, coordinate and manage multiple tasks, tools or worker agents to achieve a larger goal efficiently.
- Breaks complex problems into manageable subtasks.
- Controls execution order, dependencies and data flow.
- Common in agent systems, workflows and enterprise automation.
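A minimal sketch of the orchestrator-worker idea, assuming a fixed two-step plan. The `plan` function, the worker names and the shared `outputs` dictionary are hypothetical; in a real system an LLM would usually generate the plan.

```python
def research_worker(task: str, context: dict) -> str:
    return f"notes on {task}"


def writing_worker(task: str, context: dict) -> str:
    # Depends on the research worker's output.
    return f"draft about {task} using {context['research']}"


WORKERS = {"research": research_worker, "write": writing_worker}


def plan(goal: str) -> list:
    # Hypothetical fixed plan; an LLM would normally produce the subtasks.
    return [("research", goal), ("write", goal)]


def orchestrate(goal: str) -> dict:
    # Run subtasks in order and pass earlier outputs to later workers.
    outputs = {}
    for worker_name, task in plan(goal):
        outputs[worker_name] = WORKERS[worker_name](task, outputs)
    return outputs


results = orchestrate("solar power")
print(results["write"])
```

The orchestrator owns the execution order and the data flow, while each worker only handles its own subtask.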

5. Reflection Pattern
The reflection pattern allows a model or agent to review its own outputs, evaluate quality and make improvements before producing the final response.
- Identifies errors, gaps or inconsistencies in reasoning.
- Improves reliability through self-correction loops.
- Common in agent systems, long-form generation and planning tasks.
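The reflection loop can be sketched as generate, critique and revise, bounded by a maximum number of rounds. Both `generate` and `critique` are hypothetical stand-ins here: the checks are simple string rules, whereas a real agent would typically ask an LLM to do both.

```python
def generate(task: str, feedback=None) -> str:
    # Hypothetical generator: the first draft is unpolished,
    # and any feedback triggers a corrected revision.
    draft = "agents plan and act"
    if feedback:
        draft = draft.capitalize() + "."
    return draft


def critique(draft: str) -> list:
    # Return a list of problems; an empty list means the draft passes.
    problems = []
    if not draft[0].isupper():
        problems.append("start with a capital letter")
    if not draft.endswith("."):
        problems.append("end with a period")
    return problems


def reflect(task: str, max_rounds: int = 3) -> str:
    draft = generate(task)
    for _ in range(max_rounds):
        problems = critique(draft)
        if not problems:
            break
        draft = generate(task, feedback=problems)
    return draft


answer = reflect("Describe agents in one sentence")
print(answer)
```

Capping the number of rounds keeps the loop from running indefinitely if the critique never passes.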

Implementation
We can implement a simple workflow using LangChain. For more complex workflows, LangGraph is the usual choice.
Step 1: Download & Import the necessary libraries
We will begin by installing and importing the packages required for our implementation.
!pip install langchain transformers torch accelerate langchain_community
import torch
from transformers import pipeline
from langchain_community.llms import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
Step 2: Initializing an LLM
We will create a Hugging Face text-generation pipeline and wrap it in a LangChain LLM for easier access.
pipe = pipeline(
    "text2text-generation",
    model="google/flan-t5-base",
    max_new_tokens=128,
)
llm = HuggingFacePipeline(pipeline=pipe)
Step 3: Building Prompt and Creating a chain
We will build a prompt using PromptTemplate and use LangChain Expression Language (LCEL) to compose the prompt and the LLM into a chain.
prompt = PromptTemplate.from_template(
    "Explain this question clearly: {question}"
)
chain = prompt | llm
Step 4: Invoke the LLM
We will invoke the LLM with a custom query to test its output.
print(chain.invoke({"question": "What is computer science?"}))
Output:
Computer science is the study of how computers compute, including algorithms, systems and information processing.