In this tutorial, we explore how a neural memory agent can learn continuously without forgetting past experiences. Using PyTorch, we demonstrate how content-based memory addressing and prioritized experience replay allow the model to maintain performance across different learning tasks and overcome catastrophic forgetting. Check out the FULL CODES here.
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from collections import deque
import matplotlib.pyplot as plt
from dataclasses import dataclass

@dataclass
class MemoryConfig:
    memory_size: int = 128
    memory_dim: int = 64
    num_read_heads: int = 4
    num_write_heads: int = 1
We start by importing the essential libraries and defining a configuration class for our neural memory. We set the parameters for the memory size, the dimensionality, and the number of read/write heads, which together determine how the memory is accessed. This setup forms the base of our memory-augmented learning architecture. Check out the FULL CODES here.
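Before the full module, here is a minimal standalone sketch of the content-based addressing the memory bank relies on: cosine similarity between a query key and every memory slot, sharpened by a strength `beta` and normalized with softmax. The helper name `content_weights` is ours, for illustration only.

```python
import torch
import torch.nn.functional as F

def content_weights(memory, key, beta):
    # Cosine similarity between the query key and every memory slot,
    # sharpened by beta and normalized into soft read weights.
    key_norm = F.normalize(key, dim=-1)
    mem_norm = F.normalize(memory, dim=-1)
    similarity = torch.matmul(mem_norm, key_norm)
    return F.softmax(beta * similarity, dim=-1)

torch.manual_seed(0)
memory = torch.randn(8, 4)   # 8 slots, 4 dimensions each
key = memory[3].clone()      # a query that matches slot 3 exactly
weights = content_weights(memory, key, beta=10.0)
```

Because the query is identical to slot 3, that slot's cosine similarity is 1.0 and the softmax concentrates most of the weight there; a larger `beta` makes the addressing sharper.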
class NeuralMemoryBank(nn.Module):
    def __init__(self, config: MemoryConfig):
        super().__init__()
        self.memory_size = config.memory_size
        self.memory_dim = config.memory_dim
        self.num_read_heads = config.num_read_heads
        self.register_buffer('memory', torch.zeros(config.memory_size, config.memory_dim))
        self.register_buffer('usage', torch.zeros(config.memory_size))

    def content_addressing(self, key, beta):
        key_norm = F.normalize(key, dim=-1)
        mem_norm = F.normalize(self.memory, dim=-1)
        similarity = torch.matmul(key_norm, mem_norm.t())
        return F.softmax(beta * similarity, dim=-1)

    def write(self, write_key, write_vector, erase_vector, write_strength):
        write_weights = self.content_addressing(write_key, write_strength)
        erase = torch.outer(write_weights.squeeze(), erase_vector.squeeze())
        self.memory = (self.memory * (1 - erase)).detach()
        add = torch.outer(write_weights.squeeze(), write_vector.squeeze())
        self.memory = (self.memory + add).detach()
        self.usage = (0.99 * self.usage + write_weights.squeeze()).detach()

    def read(self, read_keys, read_strengths):
        reads = []
        for i in range(self.num_read_heads):
            weights = self.content_addressing(read_keys[i], read_strengths[i])
            read_vector = torch.matmul(weights, self.memory)
            reads.append(read_vector)
        return torch.cat(reads, dim=-1)
class MemoryController(nn.Module):
    def __init__(self, input_dim, hidden_dim, memory_config: MemoryConfig):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.memory_config = memory_config
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        total_read_dim = memory_config.num_read_heads * memory_config.memory_dim
        self.read_keys = nn.Linear(hidden_dim, memory_config.num_read_heads * memory_config.memory_dim)
        self.read_strengths = nn.Linear(hidden_dim, memory_config.num_read_heads)
        self.write_key = nn.Linear(hidden_dim, memory_config.memory_dim)
        self.write_vector = nn.Linear(hidden_dim, memory_config.memory_dim)
        self.erase_vector = nn.Linear(hidden_dim, memory_config.memory_dim)
        self.write_strength = nn.Linear(hidden_dim, 1)
        self.output = nn.Linear(hidden_dim + total_read_dim, input_dim)

    def forward(self, x, memory_bank, hidden=None):
        lstm_out, hidden = self.lstm(x.unsqueeze(0), hidden)
        controller_state = lstm_out.squeeze(0)
        read_k = self.read_keys(controller_state).view(self.memory_config.num_read_heads, -1)
        read_s = F.softplus(self.read_strengths(controller_state))
        write_k = self.write_key(controller_state)
        write_v = torch.tanh(self.write_vector(controller_state))
        erase_v = torch.sigmoid(self.erase_vector(controller_state))
        write_s = F.softplus(self.write_strength(controller_state))
        read_vectors = memory_bank.read(read_k, read_s)
        memory_bank.write(write_k, write_v, erase_v, write_s)
        combined = torch.cat([controller_state, read_vectors], dim=-1)
        output = self.output(combined)
        return output, hidden
We implement the Neural Memory Bank and the Memory Controller, which together form the core of the agent's memory system. The controller dynamically accesses the memory through read and write operations, while the memory bank uses content-based addressing to store and retrieve information, allowing the agent to adapt and remember relevant inputs. Check out the FULL CODES here.
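The erase-then-add update inside `write` can also be isolated into a small sketch (the helper name `write_step` and the toy values are ours, for illustration): each slot forgets in proportion to its write weight times the erase vector, then receives the write vector scaled by the same weight.

```python
import torch

def write_step(memory, weights, erase_vec, write_vec):
    # memory <- memory * (1 - w ⊗ e) + w ⊗ v
    erase = torch.outer(weights, erase_vec)  # per-slot forgetting mask
    add = torch.outer(weights, write_vec)    # per-slot new content
    return memory * (1 - erase) + add

memory = torch.ones(4, 3)
weights = torch.tensor([0.0, 1.0, 0.0, 0.0])  # address only slot 1
new_mem = write_step(memory, weights,
                     erase_vec=torch.ones(3),
                     write_vec=torch.tensor([0.5, -0.5, 1.0]))
```

With a full erase vector and all the write weight on slot 1, that slot is overwritten with the write vector while every other slot is untouched; softer weights would blend old and new content instead.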
class ExperienceReplay:
    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)

    def push(self, experience, priority=1.0):
        self.buffer.append(experience)
        self.priorities.append(priority ** self.alpha)

    def sample(self, batch_size, beta=0.4):
        if len(self.buffer) == 0:
            return [], []
        probs = np.array(self.priorities)
        probs = probs / probs.sum()
        indices = np.random.choice(len(self.buffer), min(batch_size, len(self.buffer)), p=probs, replace=False)
        samples = [self.buffer[i] for i in indices]
        weights = (len(self.buffer) * probs[indices]) ** (-beta)
        weights = weights / weights.max()
        return samples, torch.FloatTensor(weights)

class MetaLearner(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def adapt(self, support_x, support_y, num_steps=5, lr=0.01):
        adapted_params = {name: param.clone() for name, param in self.model.named_parameters()}
        for _ in range(num_steps):
            pred, _ = self.model(support_x, self.model.memory_bank)
            loss = F.mse_loss(pred, support_y)
            grads = torch.autograd.grad(loss, self.model.parameters(), create_graph=True)
            adapted_params = {name: param - lr * grad for (name, param), grad in zip(adapted_params.items(), grads)}
        return adapted_params
We design the Experience Replay component and the Meta-Learner to strengthen the agent's learning ability. The prioritized replay buffer lets the model revisit significant past experiences, reducing forgetting, while the Meta-Learner applies MAML-style adaptation for rapid acquisition of new tasks. Together, these modules give the agent both stability and flexibility during training. Check out the FULL CODES here.
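Prioritized sampling is easy to see in isolation. In this small NumPy sketch (illustrative values, not part of the agent), experiences with higher priority are drawn proportionally more often, which is what `ExperienceReplay.sample` does after normalizing priorities into probabilities:

```python
import numpy as np

priorities = np.array([1.0, 1.0, 8.0])  # third experience had a high loss
probs = priorities / priorities.sum()   # -> [0.1, 0.1, 0.8]
rng = np.random.default_rng(0)
draws = rng.choice(len(priorities), size=10_000, p=probs)
frac_high = (draws == 2).mean()         # fraction of draws hitting item 2
```

Roughly 80% of the draws land on the high-priority experience; the `alpha` exponent in the buffer tempers this bias, and the `beta` importance weights correct for it in the loss.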
class ContinualLearningAgent:
    def __init__(self, input_dim=64, hidden_dim=128):
        self.config = MemoryConfig()
        self.memory_bank = NeuralMemoryBank(self.config)
        self.controller = MemoryController(input_dim, hidden_dim, self.config)
        self.replay_buffer = ExperienceReplay(capacity=5000)
        self.meta_learner = MetaLearner(self.controller)
        self.optimizer = torch.optim.Adam(self.controller.parameters(), lr=0.001)
        self.task_history = []

    def train_step(self, x, y, use_replay=True):
        self.optimizer.zero_grad()
        pred, _ = self.controller(x, self.memory_bank)
        current_loss = F.mse_loss(pred, y)
        self.replay_buffer.push((x.detach().clone(), y.detach().clone()), priority=current_loss.item() + 1e-6)
        total_loss = current_loss
        if use_replay and len(self.replay_buffer.buffer) > 16:
            samples, weights = self.replay_buffer.sample(8)
            for (replay_x, replay_y), weight in zip(samples, weights):
                with torch.enable_grad():
                    replay_pred, _ = self.controller(replay_x, self.memory_bank)
                    replay_loss = F.mse_loss(replay_pred, replay_y)
                    total_loss = total_loss + 0.3 * replay_loss * weight
        total_loss.backward()
        torch.nn.utils.clip_grad_norm_(self.controller.parameters(), 1.0)
        self.optimizer.step()
        return total_loss.item()

    def evaluate(self, test_data):
        self.controller.eval()
        total_error = 0
        with torch.no_grad():
            for x, y in test_data:
                pred, _ = self.controller(x, self.memory_bank)
                total_error += F.mse_loss(pred, y).item()
        self.controller.train()
        return total_error / len(test_data)
We construct the Continual Learning Agent, which integrates the memory bank, controller, replay buffer, and meta-learner into a single adaptive framework. This step defines how the agent trains on each task, replays previous data, and evaluates its performance, ensuring it retains prior knowledge while learning new tasks. Check out the FULL CODES here.
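The replay portion of `train_step` reduces to a weighted sum, sketched below with hypothetical loss values (the 0.3 replay coefficient matches the agent above; the specific numbers are made up for illustration):

```python
import torch

current_loss = torch.tensor(1.0)           # loss on the current sample
replay_losses = torch.tensor([0.5, 0.2])   # hypothetical losses on replayed samples
is_weights = torch.tensor([1.0, 0.8])      # importance-sampling weights from the buffer
# Each replayed loss is scaled by its importance weight and a 0.3 coefficient,
# so replay regularizes training without dominating the current task.
total_loss = current_loss + 0.3 * (replay_losses * is_weights).sum()
```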
def create_task_data(task_id, num_samples=100):
    torch.manual_seed(task_id)
    x = torch.randn(num_samples, 64)
    if task_id == 0:
        y = torch.sin(x.mean(dim=1, keepdim=True).expand(-1, 64))
    elif task_id == 1:
        y = torch.cos(x.mean(dim=1, keepdim=True).expand(-1, 64)) * 0.5
    else:
        y = torch.tanh(x * 0.5 + task_id)
    return [(x[i], y[i]) for i in range(num_samples)]
def run_continual_learning_demo():
    print("🧠 Neural Memory Agent - Continual Learning Demo")
    print("=" * 60)
    agent = ContinualLearningAgent()
    num_tasks = 4
    results = {'tasks': [], 'without_memory': [], 'with_memory': []}
    for task_id in range(num_tasks):
        print(f"\n📚 Learning Task {task_id + 1}/{num_tasks}")
        train_data = create_task_data(task_id, num_samples=50)
        test_data = create_task_data(task_id, num_samples=20)
        for epoch in range(20):
            total_loss = 0
            for x, y in train_data:
                loss = agent.train_step(x, y, use_replay=(task_id > 0))
                total_loss += loss
            if epoch % 5 == 0:
                avg_loss = total_loss / len(train_data)
                print(f"   Epoch {epoch:2d}: Loss = {avg_loss:.4f}")
        print(f"\n   📊 Evaluation on all tasks:")
        for eval_task_id in range(task_id + 1):
            eval_data = create_task_data(eval_task_id, num_samples=20)
            error = agent.evaluate(eval_data)
            print(f"   Task {eval_task_id + 1}: Error = {error:.4f}")
            if eval_task_id == task_id:
                results['tasks'].append(eval_task_id + 1)
                results['with_memory'].append(error)
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    ax = axes[0]
    memory_matrix = agent.memory_bank.memory.detach().numpy()
    im = ax.imshow(memory_matrix, aspect="auto", cmap='viridis')
    ax.set_title('Neural Memory State', fontsize=14, fontweight="bold")
    ax.set_xlabel('Memory Dimension')
    ax.set_ylabel('Memory Slots')
    plt.colorbar(im, ax=ax)
    ax = axes[1]
    ax.plot(results['tasks'], results['with_memory'], marker="o", linewidth=2, markersize=8, label="With Memory Replay")
    ax.set_title('Continual Learning Performance', fontsize=14, fontweight="bold")
    ax.set_xlabel('Task Number')
    ax.set_ylabel('Test Error')
    ax.legend()
    ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.savefig('neural_memory_results.png', dpi=150, bbox_inches="tight")
    print("\n✅ Results saved to 'neural_memory_results.png'")
    plt.show()
    print("\n" + "=" * 60)
    print("🎯 Key Insights:")
    print("   • Memory bank stores compressed task representations")
    print("   • Experience replay mitigates catastrophic forgetting")
    print("   • Agent maintains performance on earlier tasks")
    print("   • Content-based addressing enables efficient retrieval")

if __name__ == "__main__":
    run_continual_learning_demo()
To demonstrate the continual learning process, we generate several synthetic tasks and train the agent on them sequentially. As we train and visualize the results, we observe how memory replay improves accuracy and stability across all tasks. The final plots highlight how differentiable memory enhances the agent's ability to learn over time.
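For reference, the per-task target functions can be checked in isolation. This sketch mirrors the intended branching of `create_task_data` (the helper name `make_targets` is ours, for illustration): each task maps the same inputs through a different function, which is what forces the agent to adapt without forgetting.

```python
import torch

def make_targets(task_id, x):
    # Task 0: sin of the per-row mean; task 1: scaled cos; others: shifted tanh.
    if task_id == 0:
        return torch.sin(x.mean(dim=1, keepdim=True).expand(-1, x.shape[1]))
    elif task_id == 1:
        return torch.cos(x.mean(dim=1, keepdim=True).expand(-1, x.shape[1])) * 0.5
    return torch.tanh(x * 0.5 + task_id)

torch.manual_seed(0)
x = torch.randn(5, 8)
y0, y1 = make_targets(0, x), make_targets(1, x)
```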
In conclusion, we have built and trained an agent that adapts to changing tasks. The differentiable memory enables efficient storage and retrieval of learned representations, while the replay mechanism reinforces stability and knowledge retention. Combining these components with meta-learning shows how such agents can lead to more self-adaptive, resilient neural systems that remember, reason, and evolve without forgetting what they have already learned.
Asif Razzaq is the CEO of Marktechpost Media Inc. A visionary engineer and entrepreneur, he is dedicated to harnessing the potential of Artificial Intelligence for social good. His most recent venture is Marktechpost, a platform focused on machine learning and deep learning news that is both technically sound and accessible to a broad audience. With over 2 million monthly views, the platform illustrates its popularity among readers.

