In this tutorial, we build an advanced agentic AI system that goes beyond basic planner-and-executor loops. The agent can choose between fast and deep reasoning, and a Zettelkasten-style agentic memory stores atomic knowledge notes and automatically links them to related experiences. By combining structured state management, memory-aware retrieval, controlled tool invocation, and reflexive learning, we show how today's agentic systems can act, reason, and learn rather than just respond. Check out the FULL CODES here.
pip install -qU langgraph langchain langchain-openai langchain-core numpy networkx requests
import os, getpass, json, time, operator
from typing import Dict, List, Any, Optional, Literal
from typing_extensions import TypedDict, Annotated
import numpy as np
import networkx as nx
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.messages import SystemMessage, HumanMessage, ToolMessage, AnyMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import InMemorySaver
We set up the execution environment by installing the required libraries and importing all core modules. LangGraph provides orchestration, LangChain supplies the model and tool abstractions, and supporting libraries such as NetworkX power the memory graph.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter OPENAI_API_KEY: ")
MODEL = os.environ.get("OPENAI_MODEL", "gpt-4o-mini")
EMB_MODEL = os.environ.get("OPENAI_EMBED_MODEL", "text-embedding-3-small")
llm_fast = ChatOpenAI(model=MODEL, temperature=0)
llm_deep = ChatOpenAI(model=MODEL, temperature=0)
llm_reflect = ChatOpenAI(model=MODEL, temperature=0)
emb = OpenAIEmbeddings(model=EMB_MODEL)
We load the OpenAI API key securely at runtime and configure the language models used for fast and deep reasoning, along with the embedding model that drives semantic similarity in memory. This lets us switch between reasoning levels while maintaining a shared memory representation.
class Note(BaseModel):
    note_id: str
    title: str
    content: str
    tags: List[str] = Field(default_factory=list)
    created_at_unix: float
    context: Dict[str, Any] = Field(default_factory=dict)

class MemoryGraph:
    def __init__(self):
        self.g = nx.Graph()
        self.note_vectors = {}
    def _cos(self, a, b):
        return float(np.dot(a, b) / ((np.linalg.norm(a) + 1e-9) * (np.linalg.norm(b) + 1e-9)))
    def add_note(self, note, vec):
        self.g.add_node(note.note_id, **note.model_dump())
        self.note_vectors[note.note_id] = vec
    def topk_related(self, vec, k=5):
        scored = [(nid, self._cos(vec, v)) for nid, v in self.note_vectors.items()]
        scored.sort(key=lambda x: x[1], reverse=True)
        return [{"note_id": n, "score": s, "title": self.g.nodes[n]["title"]} for n, s in scored[:k]]
    def link_note(self, a, b, w, r):
        if a != b:
            self.g.add_edge(a, b, weight=w, reason=r)
    def evolve_links(self, nid, vec):
        for r in self.topk_related(vec, 8):
            if r["score"] >= 0.78:
                self.link_note(nid, r["note_id"], r["score"], "evolve")

MEM = MemoryGraph()
We construct the agentic memory network using the Zettelkasten method, in which each interaction is stored as an atomic note. Each note is automatically connected to semantically similar notes using cosine-similarity scores.
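To see the similarity-based linking in isolation, here is a minimal sketch of the cosine scoring and top-k ranking that the MemoryGraph relies on, with toy 3-dimensional vectors standing in for real embeddings (the note IDs are hypothetical):

```python
import numpy as np

def cos(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero
    return float(np.dot(a, b) / ((np.linalg.norm(a) + 1e-9) * (np.linalg.norm(b) + 1e-9)))

# Toy stand-ins for embedding vectors of three stored notes
vectors = {
    "note-a": np.array([1.0, 0.0, 0.0]),
    "note-b": np.array([0.9, 0.1, 0.0]),
    "note-c": np.array([0.0, 1.0, 0.0]),
}
query = np.array([1.0, 0.0, 0.0])

# Rank notes by similarity to the query, most similar first
scored = sorted(((nid, cos(query, v)) for nid, v in vectors.items()),
                key=lambda x: x[1], reverse=True)
top = scored[0]
print(top[0])  # note-a is the closest match
```

In the full system, notes whose score clears a threshold (0.78 here) are linked as graph edges, so retrieval can later walk from one experience to related ones.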
@tool
def web_get(url: str) -> str:
    """Fetch up to 25,000 bytes of a web page as text."""
    import urllib.request
    with urllib.request.urlopen(url, timeout=15) as r:
        return r.read(25000).decode("utf-8", errors="ignore")

@tool
def memory_search(query: str, k: int = 5) -> str:
    """Search the memory graph for the k notes most similar to the query."""
    qv = np.array(emb.embed_query(query))
    hits = MEM.topk_related(qv, k)
    return json.dumps(hits, ensure_ascii=False)

@tool
def memory_neighbors(note_id: str) -> str:
    """Return the linked neighbors of a note in the memory graph."""
    if note_id not in MEM.g:
        return "[]"
    return json.dumps([
        {"note_id": n, "weight": MEM.g[note_id][n]["weight"]}
        for n in MEM.g.neighbors(note_id)
    ])

TOOLS = [web_get, memory_search, memory_neighbors]
TOOLS_BY_NAME = {t.name: t for t in TOOLS}
We define the external tools the agent may use, such as web access and memory retrieval. These tools are exposed in a structured way so that the agent can fetch information or query its previous experiences.
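The dispatch pattern behind the tool registry can be sketched without LangChain at all: a tool call arrives as a name plus arguments, and a lookup table maps the name to a callable. This is a plain-Python stand-in (the stub function and its canned result are hypothetical), not LangChain's actual tool machinery:

```python
import json

def memory_search_stub(query: str, k: int = 5) -> str:
    # Hypothetical canned result standing in for a real embedding search
    hits = [{"note_id": "n1", "score": 0.91, "title": "demo note"}]
    return json.dumps(hits[:k])

# Stand-in for the TOOLS_BY_NAME table; real LangChain tools expose
# .name and .invoke(args) and are dispatched the same way
tools_by_name = {"memory_search": memory_search_stub}

# A model tool call is essentially {"name": ..., "args": ...}
call = {"name": "memory_search", "args": {"query": "agent memory", "k": 1}}
observation = tools_by_name[call["name"]](**call["args"])
print(observation)
```

Keeping tool results as JSON strings, as the real tools do, makes them easy to pass back to the model as ToolMessage content.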
class DeliberationDecision(BaseModel):
    mode: Literal["fast", "deep"]
    reason: str
    steps: List[str]

class RunSpec(BaseModel):
    goal: str
    constraints: List[str]
    deliverable_format: str
    must_use_memory: bool
    max_tool_calls: int

class Reflection(BaseModel):
    note_title: str
    note_tags: List[str]
    new_rules: List[str]
    what_worked: List[str]
    what_failed: List[str]

class AgentState(TypedDict, total=False):
    run_spec: Dict[str, Any]
    messages: Annotated[List[AnyMessage], operator.add]
    decision: Dict[str, Any]
    final: str
    budget_calls_remaining: int
    tool_calls_used: int
    max_tool_calls: int
    last_note_id: str
DECIDER_SYS = "Decide fast vs deep."
AGENT_FAST = "Operate fast."
AGENT_DEEP = "Operate deep."
REFLECT_SYS = "Reflect and store learnings."
We formalize the agent's internal representations with structured schemas covering deliberation, execution goals, reflection, and global state, and we define the system prompts used in both fast and deep modes. This keeps the agent's decisions consistent and machine-checkable.
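What the schemas buy us is validation before model output enters the state. As a rough stand-in for what with_structured_output enforces, here is the check reduced to plain Python (no Pydantic, purely illustrative):

```python
import json

ALLOWED_MODES = {"fast", "deep"}

def validate_decision(raw: str) -> dict:
    # Parse the model's JSON and enforce the DeliberationDecision contract
    d = json.loads(raw)
    assert d["mode"] in ALLOWED_MODES, "mode must be 'fast' or 'deep'"
    assert isinstance(d["reason"], str), "reason must be a string"
    assert all(isinstance(s, str) for s in d["steps"]), "steps must be strings"
    return d

decision = validate_decision(
    '{"mode": "deep", "reason": "multi-step goal", '
    '"steps": ["search memory", "draft answer"]}'
)
print(decision["mode"])  # deep
```

Malformed output fails here instead of silently corrupting the graph state, which is exactly the failure mode structured output is meant to prevent.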
def deliberate(st):
    spec = RunSpec.model_validate(st["run_spec"])
    d = llm_fast.with_structured_output(DeliberationDecision).invoke([
        SystemMessage(content=DECIDER_SYS),
        HumanMessage(content=json.dumps(spec.model_dump()))
    ])
    return {"decision": d.model_dump(), "budget_calls_remaining": st["budget_calls_remaining"] - 1}
def agent(st):
    spec = RunSpec.model_validate(st["run_spec"])
    d = DeliberationDecision.model_validate(st["decision"])
    llm = llm_deep if d.mode == "deep" else llm_fast
    sys = AGENT_DEEP if d.mode == "deep" else AGENT_FAST
    out = llm.bind_tools(TOOLS).invoke([
        SystemMessage(content=sys),
        *st.get("messages", []),
        HumanMessage(content=json.dumps(spec.model_dump()))
    ])
    return {"messages": [out], "budget_calls_remaining": st["budget_calls_remaining"] - 1}
def route(st):
    return "tools" if st["messages"][-1].tool_calls else "finalize"
def tools_node(st):
    msgs = []
    used = st.get("tool_calls_used", 0)
    for c in st["messages"][-1].tool_calls:
        obs = TOOLS_BY_NAME[c["name"]].invoke(c["args"])
        msgs.append(ToolMessage(content=str(obs), tool_call_id=c["id"]))
        used += 1
    return {"messages": msgs, "tool_calls_used": used}
def finalize(st):
    out = llm_deep.invoke(st["messages"] + [HumanMessage(content="Return final output")])
    return {"final": out.content}
def reflect(st):
    r = llm_reflect.with_structured_output(Reflection).invoke([
        SystemMessage(content=REFLECT_SYS),
        HumanMessage(content=st["final"])
    ])
    note = Note(
        note_id=str(time.time()),
        title=r.note_title,
        content=st["final"],
        tags=r.note_tags,
        created_at_unix=time.time()
    )
    vec = np.array(emb.embed_query(note.title + note.content))
    MEM.add_note(note, vec)
    MEM.evolve_links(note.note_id, vec)
    return {"last_note_id": note.note_id}
We implement the core agentic behaviors as LangGraph nodes: deliberation, action, tool execution, finalization, and reflection. These functions orchestrate how information flows between stages and how the deliberation decision shapes execution.
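The branch that keeps the agent looping through tools can be exercised in isolation with a stub message object, since route only inspects the last message's tool_calls attribute:

```python
# Minimal stand-in for an LLM message that may carry tool calls,
# used to exercise the same branch logic as the graph's route function
class StubMessage:
    def __init__(self, tool_calls):
        self.tool_calls = tool_calls

def route(st):
    # Loop back to the tools node while the last message requests tools,
    # otherwise hand off to finalization
    return "tools" if st["messages"][-1].tool_calls else "finalize"

print(route({"messages": [StubMessage([{"name": "memory_search"}])]}))  # tools
print(route({"messages": [StubMessage([])]}))  # finalize
```

An empty tool_calls list is falsy, which is what eventually breaks the agent-tools cycle and sends the run to finalize.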
g = StateGraph(AgentState)
g.add_node("deliberate", deliberate)
g.add_node("agent", agent)
g.add_node("tools", tools_node)
g.add_node("finalize", finalize)
g.add_node("reflect", reflect)
g.add_edge(START, "deliberate")
g.add_edge("deliberate", "agent")
g.add_conditional_edges("agent", route, ["tools", "finalize"])
g.add_edge("tools", "agent")
g.add_edge("finalize", "reflect")
g.add_edge("reflect", END)
graph = g.compile(checkpointer=InMemorySaver())
def run_agent(goal, constraints=None, thread_id="demo"):
    if constraints is None:
        constraints = []
    spec = RunSpec(
        goal=goal,
        constraints=constraints,
        deliverable_format="markdown",
        must_use_memory=True,
        max_tool_calls=6
    ).model_dump()
    return graph.invoke({
        "run_spec": spec,
        "messages": [],
        "budget_calls_remaining": 10,
        "tool_calls_used": 0,
        "max_tool_calls": 6
    }, config={"configurable": {"thread_id": thread_id}})
We assemble all nodes into a LangGraph workflow and compile it with checkpointed state management. The runner function can be reused to invoke the agent repeatedly while preserving memory across runs.
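The initial state the runner passes to graph.invoke is just a dictionary, so its shape and the budget accounting can be sanity-checked without any API calls. This sketch mirrors the keys used above; the goal string is a made-up example:

```python
spec = {
    "goal": "Summarize recent notes on agent memory",  # hypothetical goal
    "constraints": [],
    "deliverable_format": "markdown",
    "must_use_memory": True,
    "max_tool_calls": 6,
}
initial_state = {
    "run_spec": spec,
    "messages": [],
    "budget_calls_remaining": 10,
    "tool_calls_used": 0,
    "max_tool_calls": 6,
}
# The deliberate and agent nodes each decrement the call budget by one,
# so one deliberation plus one agent step leaves 8 calls
after_two_nodes = initial_state["budget_calls_remaining"] - 2
print(after_two_nodes)  # 8
```

Because messages is annotated with operator.add in AgentState, each node's returned messages are appended to the running list rather than replacing it.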
In conclusion, we demonstrated how an agent can continuously improve its behavior through memory and reflection rather than relying solely on hard-coded logic. LangGraph orchestrates deliberation, execution, tool governance, and reflection in a single cohesive graph, while the OpenAI models provide reasoning and synthesis at each step. This approach shows how agentic AI systems can move toward autonomy by adapting their reasoning depth, reusing existing knowledge, and encoding learnings as persistent memories.
Asif Razzaq is the CEO of Marktechpost Media Inc. A visionary engineer and entrepreneur, he is dedicated to leveraging the power of Artificial Intelligence for social good. His most recent venture is Marktechpost, an Artificial Intelligence media platform that is both technically rigorous and understandable to a broad audience, with over 2 million monthly views.

