In this tutorial, we explore how memory can be used to create agentic systems capable of thinking beyond a single encounter. We design an episodic memory that stores past experiences and a semantic memory that captures long-term patterns, allowing the agent to adapt its behavior across sessions. By implementing the system as a loop of planning, acting, reviewing, and reflecting, we can watch it adapt gradually and autonomously. By the end of this tutorial, we will understand how memory-driven reasoning helps us build agents that become more consistent, intelligent, and contextual with each interaction. Visit the FULL CODES here.
import numpy as np
from collections import defaultdict
import json
from datetime import datetime
import pickle
class EpisodicMemory:
    def __init__(self, capacity=100):
        self.capacity = capacity
        self.episodes = []

    def store(self, state, action, outcome, timestamp=None):
        if timestamp is None:
            timestamp = datetime.now().isoformat()
        episode = {
            'state': state,
            'action': action,
            'outcome': outcome,
            'timestamp': timestamp,
            'embedding': self._embed(state, action, outcome)
        }
        self.episodes.append(episode)
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)

    def _embed(self, state, action, outcome):
        text = f"{state} {action} {outcome}".lower()
        return hash(text) % 10000

    def retrieve_similar(self, query_state, k=3):
        if not self.episodes:
            return []
        query_emb = self._embed(query_state, "", "")
        scores = [(abs(ep['embedding'] - query_emb), ep) for ep in self.episodes]
        scores.sort(key=lambda x: x[0])
        return [ep for _, ep in scores[:k]]

    def get_recent(self, n=5):
        return self.episodes[-n:]
class SemanticMemory:
    def __init__(self):
        self.preferences = defaultdict(float)
        self.patterns = defaultdict(list)
        self.success_rates = defaultdict(lambda: {'success': 0, 'total': 0})

    def update_preference(self, key, value, weight=1.0):
        self.preferences[key] = 0.9 * self.preferences[key] + 0.1 * weight * value

    def record_pattern(self, context, action, success):
        pattern_key = f"{context}_{action}"
        self.patterns[context].append((action, success))
        self.success_rates[pattern_key]['total'] += 1
        if success:
            self.success_rates[pattern_key]['success'] += 1

    def get_best_action(self, context):
        if context not in self.patterns:
            return None
        action_scores = defaultdict(lambda: {'success': 0, 'total': 0})
        for action, success in self.patterns[context]:
            action_scores[action]['total'] += 1
            if success:
                action_scores[action]['success'] += 1
        best_action = max(action_scores.items(), key=lambda x: x[1]['success'] / max(x[1]['total'], 1))
        return best_action[0] if best_action[1]['total'] > 0 else None

    def get_preference(self, key):
        return self.preferences.get(key, 0.0)
We define the memory architectures the agent uses: an episodic memory that captures individual experiences and a semantic memory that generalizes patterns across them. With these foundations in place, the agent is prepared to learn in a manner similar to humans. Visit the FULL CODES here.
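To see why repeated signals accumulate in semantic memory, the exponential moving average used by `update_preference` can be run in isolation. This is a minimal sketch using the same 0.9/0.1 blend:

```python
def update(pref, value, weight=1.0):
    # Same exponential moving average as SemanticMemory.update_preference:
    # old evidence decays by a factor of 0.9, new evidence enters at 0.1 * weight.
    return 0.9 * pref + 0.1 * weight * value

p = 0.0
for _ in range(30):          # the user expresses the same preference 30 times
    p = update(p, 1.0)
print(round(p, 3))           # approaches the fixed point weight * value = 1.0
```

After n updates the value is exactly 1 - 0.9**n, so a preference mentioned once stays weak (0.1) while a repeatedly confirmed one converges toward 1.0.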
class MemoryAgent:
    def __init__(self):
        self.episodic_memory = EpisodicMemory(capacity=50)
        self.semantic_memory = SemanticMemory()
        self.current_plan = []
        self.session_count = 0

    def perceive(self, user_input):
        user_input = user_input.lower()
        if any(word in user_input for word in ['recommend', 'suggest', 'what should']):
            intent = 'recommendation'
        elif any(word in user_input for word in ['remember', 'prefer', 'like', 'favorite']):
            intent = 'preference_update'
        elif any(word in user_input for word in ['do', 'complete', 'finish', 'task']):
            intent = 'task_execution'
        else:
            intent = 'conversation'
        return {'intent': intent, 'raw': user_input}

    def plan(self, state):
        intent = state['intent']
        user_input = state['raw']
        similar_episodes = self.episodic_memory.retrieve_similar(user_input, k=3)
        plan = []
        if intent == 'recommendation':
            genre_prefs = {k: v for k, v in self.semantic_memory.preferences.items() if 'genre_' in k}
            if genre_prefs:
                best_genre = max(genre_prefs.items(), key=lambda x: x[1])[0]
                plan.append(('recommend', best_genre.replace('genre_', '')))
            else:
                plan.append(('recommend', 'general'))
        elif intent == 'preference_update':
            genres = ['sci-fi', 'fantasy', 'mystery', 'romance', 'thriller']
            detected_genre = next((g for g in genres if g in user_input), None)
            if detected_genre:
                plan.append(('update_preference', detected_genre))
        elif intent == 'task_execution':
            best_action = self.semantic_memory.get_best_action('task')
            if best_action:
                plan.append(('execute', best_action))
            else:
                plan.append(('execute', 'default'))
        self.current_plan = plan
        return plan
We construct the agent’s perception and planning systems. Using the memory built previously, we process user input, detect intent, and create plans. As we shape the way our agent thinks and makes decisions, we can start defining its actions. Take a look at the FULL CODES here.
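The keyword-based intent detection inside `perceive` can be exercised on its own. This standalone sketch mirrors the same keyword lists:

```python
def detect_intent(user_input):
    text = user_input.lower()
    # Check the most specific cues first; fall through to plain conversation.
    if any(w in text for w in ['recommend', 'suggest', 'what should']):
        return 'recommendation'
    if any(w in text for w in ['remember', 'prefer', 'like', 'favorite']):
        return 'preference_update'
    if any(w in text for w in ['do', 'complete', 'finish', 'task']):
        return 'task_execution'
    return 'conversation'

print(detect_intent("Can you recommend something?"))  # recommendation
print(detect_intent("I really like sci-fi books"))    # preference_update
print(detect_intent("Nice weather today"))            # conversation
```

Note that this is simple substring matching, so words like "down" would trigger the 'do' cue; a production system would tokenize first or use a learned classifier.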
"self" act:
Action_type param = action
If action_type is'recommend,' then:
If param == "general":
Return f"Let me learn your preferences first! What genres do you enjoy?"
Return f"Based on your preferences, I recommend exploring {param}!"
If action_type is 'update_preference,' then the following code will be executed:
self.semantic_memory.update_preference(f'genre_{param}', 1.0, weight=1.0)
Return f"Got it! I'll remember you enjoy {param}."
Elif action_type >= "execute":
Return f"Executing task with strategy: {param}"
Return to the Homepage "Action completed"
def revise_plan(self, feedback):
Lower if you get a 'no.'() Or 'wrong in the feedback'.():
If self.current_plan
"action_type" param = current_plan[0]
If action_type is'recommend,' then:
genre_prefs = sorted(
[(k, v) for k, v in self.semantic_memory.preferences.items() if 'genre_' in k],
Key=lambda: x[1],
reverse=True
)
if len(genre_prefs) > 1:
new_genre = genre_prefs[1][0].replace('genre_', '')
self.current_plan = [('recommend', new_genre)]
Return True
Return False
def reflect(self, state, action, outcome, success):
self.episodic_memory.store(state['raw'], str(action), outcome)
self.semantic_memory.record_pattern(state['intent'], str(action), success)
We define how the agent executes its actions, revises decisions when feedback does not match expectations, and stores experiences. By learning from each experience, the agent continually improves its behavior; this loop makes the system adaptable and self-correcting. See the FULL CODES here.
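The fallback behavior in `revise_plan` — rank genres by learned preference, then drop the top pick in favor of the runner-up — can be isolated in a few lines. The preference values here are invented for illustration:

```python
# Hypothetical learned preferences (strongest first after sorting)
prefs = {'genre_sci-fi': 0.8, 'genre_fantasy': 0.5, 'genre_mystery': 0.2}
ranked = sorted(prefs.items(), key=lambda kv: kv[1], reverse=True)

# Negative feedback on the top pick -> fall back to the runner-up genre
fallback = None
if len(ranked) > 1:
    fallback = ranked[1][0].replace('genre_', '')
print(fallback)  # fantasy
```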
    def run_session(self, user_inputs):
        self.session_count += 1
        print(f"\n{'='*60}")
        print(f"SESSION {self.session_count}")
        print(f"{'='*60}\n")
        results = []
        for i, user_input in enumerate(user_inputs, 1):
            print(f"Turn {i}")
            print(f"User: {user_input}")
            state = self.perceive(user_input)
            plan = self.plan(state)
            if not plan:
                print("Agent: I'm not sure what to do with that.\n")
                continue
            response = self.act(plan[0])
            print(f"Agent: {response}\n")
            # A recommendation only counts as successful once it is personalized
            success = plan[0][0] != 'recommend' or plan[0][1] != 'general'
            self.reflect(state, plan[0], response, success)
            results.append({
                'turn': i,
                'input': user_input,
                'intent': state['intent'],
                'action': plan[0],
                'response': response
            })
        return results
We simulate realistic interactions in which the agent processes multiple user inputs within a single session, watching the perceive → plan → act → reflect cycle unfold repeatedly. The agent becomes increasingly intelligent as sessions progress. Take a look at the FULL CODES here.
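Stripped of memory, the turn loop reduces to four stubbed steps. This hypothetical skeleton shows only the control flow; every function here is a placeholder, not the tutorial's implementation:

```python
log = []  # stands in for reflection / memory writes

def perceive(text):
    return {'raw': text.lower()}

def plan(state):
    return [('echo', state['raw'])]   # trivial one-step plan

def act(step):
    return f"Acting on: {step[1]}"

def reflect(state, step, outcome):
    log.append((state['raw'], step, outcome))

for user_input in ["Hello", "Recommend a book"]:
    state = perceive(user_input)       # perceive
    step = plan(state)[0]              # plan
    outcome = act(step)                # act
    reflect(state, step, outcome)      # reflect

print(len(log))  # one reflection per turn -> 2
```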
def evaluate_memory_usage(agent):
    print("\n" + "="*60)
    print("MEMORY ANALYSIS")
    print("="*60 + "\n")
    print("Episodic Memory:")
    print(f"  Total episodes stored: {len(agent.episodic_memory.episodes)}")
    if agent.episodic_memory.episodes:
        print(f"  Oldest episode: {agent.episodic_memory.episodes[0]['timestamp']}")
        print(f"  Latest episode: {agent.episodic_memory.episodes[-1]['timestamp']}")
    print("\nSemantic Memory:")
    print(f"  Learned preferences: {len(agent.semantic_memory.preferences)}")
    for pref, value in sorted(agent.semantic_memory.preferences.items(), key=lambda x: x[1], reverse=True)[:5]:
        print(f"    {pref}: {value:.3f}")
    print(f"\n  Action patterns learned: {len(agent.semantic_memory.patterns)}")
    print("\n  Success rates by context-action:")
    for key, stats in list(agent.semantic_memory.success_rates.items())[:5]:
        if stats['total'] > 0:
            rate = stats['success'] / stats['total']
            print(f"    {key}: {rate:.2%} ({stats['success']}/{stats['total']})")

def compare_sessions(results_history):
    print("\n" + "="*60)
    print("CROSS-SESSION ANALYSIS")
    print("="*60 + "\n")
    for i, results in enumerate(results_history, 1):
        recommendation_quality = sum(1 for r in results if 'preferences' in r['response'].lower())
        print(f"Session {i}:")
        print(f"  Turns: {len(results)}")
        print(f"  Personalized responses: {recommendation_quality}")
We analyze the agent’s memory to determine how well it works, inspecting its stored episodes, learned preferences, and success patterns to evaluate how the agent evolves. Check out the FULL CODES here.
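The per-context success bookkeeping that this analysis prints can be reproduced standalone. The interaction tuples below are invented sample data:

```python
from collections import defaultdict

success_rates = defaultdict(lambda: {'success': 0, 'total': 0})

# (context, action, succeeded) tuples — hypothetical sample history
history = [('task', 'a', True), ('task', 'a', False), ('task', 'b', True)]
for context, action, ok in history:
    key = f"{context}_{action}"
    success_rates[key]['total'] += 1
    if ok:
        success_rates[key]['success'] += 1

for key, s in sorted(success_rates.items()):
    print(f"{key}: {s['success']}/{s['total']}")  # task_a: 1/2, task_b: 1/1
```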
def run_demo():
    agent = MemoryAgent()
    print("\n📚 SCENARIO: Agent learns user preferences over multiple sessions")
    session1_inputs = [
        "Hi, I'm looking for something to read",
        "I really like sci-fi books",
        "Can you recommend something?",
    ]
    results1 = agent.run_session(session1_inputs)
    session2_inputs = [
        "I'm bored, what should I read?",
        "Actually, I also enjoy fantasy novels",
        "Give me a recommendation",
    ]
    results2 = agent.run_session(session2_inputs)
    session3_inputs = [
        "What do you suggest for tonight?",
        "I'm in the mood for mystery too",
        "Recommend something based on what you know about me",
    ]
    results3 = agent.run_session(session3_inputs)
    evaluate_memory_usage(agent)
    compare_sessions([results1, results2, results3])
    print("\n" + "="*60)
    print("EPISODIC MEMORY RETRIEVAL TEST")
    print("="*60 + "\n")
    query = "recommend sci-fi"
    similar = agent.episodic_memory.retrieve_similar(query, k=3)
    print(f"Query: '{query}'")
    print(f"Retrieved {len(similar)} similar episodes:\n")
    for ep in similar:
        print(f"  State: {ep['state']}")
        print(f"  Action: {ep['action']}")
        print(f"  Outcome: {ep['outcome'][:50]}...")
        print()

if __name__ == "__main__":
    print("="*60)
    print("MEMORY & LONG-TERM AUTONOMY IN AGENTIC SYSTEMS")
    print("="*60)
    run_demo()
    print("\n✅ Tutorial complete! Key takeaways:")
    print("  • Episodic memory stores specific experiences")
    print("  • Semantic memory generalizes patterns")
    print("  • Agents improve recommendations over sessions")
    print("  • Memory retrieval guides future decisions")
We bring everything together by testing memory retrieval across multiple sessions. As the agent learns from interactions, it refines its recommendations, and we observe how it improves. The complete demo illustrates how long-term autonomy emerges naturally from our memory systems.
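The script imports `pickle` without using it; one natural extension is persisting memories between Python runs so the agent truly remembers across restarts. A minimal sketch, using an in-memory buffer in place of a real file on disk:

```python
import io
import pickle

episodes = [{'state': 'hi', 'action': 'greet', 'outcome': 'ok'}]

buf = io.BytesIO()            # stand-in for open('memory.pkl', 'wb')
pickle.dump(episodes, buf)    # serialize the episode list
buf.seek(0)
restored = pickle.load(buf)   # reload it in a "new" session

print(restored == episodes)   # True
```

In practice you would dump `agent.episodic_memory.episodes` (and the semantic dictionaries) to a file at shutdown and load them at startup; note that `hash()`-based embeddings are salted per process, so embeddings would need recomputing after a reload.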
In conclusion, we see that combining episodic and semantic memory allows us to build agents that continuously learn and improve over time. The agent keeps refining its recommendations, adapting its plans, and retrieving previous experiences in order to respond better. These mechanisms show how simple but effective memory structures can lead to long-term autonomy.
Asif Razzaq, CEO of Marktechpost Media Inc., is a visionary engineer and entrepreneur dedicated to using Artificial Intelligence (AI) for the greater good. His latest venture, Marktechpost, is a media platform focused on Artificial Intelligence, known for in-depth coverage of machine learning, deep learning, and related topics. The content is technically accurate yet easy for an audience of all backgrounds to understand, and the platform's more than 2 million monthly views are a testament to its popularity.

