AI-trends.today

Learn how to create a federated pipeline that protects privacy and fine-tunes large language models using LoRA, Flower, and PEFT

Tech | By Gavin Wallace | 10/02/2026 | 7 Mins Read

This tutorial shows how to federate the fine-tuning of a language model without ever centralizing private text. We simulate several organizations that each adapt a shared base model locally, exchanging only the lightweight LoRA parameters. Using Flower's simulation engine for federated LLMs together with parameter-efficient fine-tuning (PEFT), we demonstrate a practical, scalable way for organizations to tailor LLMs on sensitive data while preserving privacy.
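Before the code, it helps to see what the server actually aggregates each round. The sketch below is a simplified stand-in for Flower's built-in FedAvg strategy (the helper name `fedavg_lora` is hypothetical): it averages per-client lists of LoRA arrays, weighted by each client's example count.

```python
import numpy as np

def fedavg_lora(client_updates, client_weights):
    """Weighted average of per-client LoRA parameter lists (FedAvg).

    client_updates: one list of np.ndarray per client (same shapes across clients)
    client_weights: number of training examples each client contributed
    """
    total = float(sum(client_weights))
    aggregated = []
    for i in range(len(client_updates[0])):
        acc = np.zeros_like(client_updates[0][i], dtype=np.float64)
        for update, w in zip(client_updates, client_weights):
            acc += (w / total) * update[i]
        aggregated.append(acc)
    return aggregated

# Two clients, each sending one 2x2 LoRA array; the first has twice the data.
a = [np.array([[1.0, 1.0], [1.0, 1.0]])]
b = [np.array([[4.0, 4.0], [4.0, 4.0]])]
agg = fedavg_lora([a, b], client_weights=[2, 1])
print(agg[0])  # every entry is (2*1 + 1*4)/3 = 2.0
```

Because only these small adapter arrays cross the network, the base model weights and the raw client text never leave each client.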

!pip install -q -U flwr[simulation] transformers datasets peft accelerate bitsandbytes

# NOTE: the original setup cell was truncated; the imports and hyperparameters
# below are reconstructed assumptions so the remaining code runs end to end.
import math
import random
from typing import List, Optional, Tuple

import numpy as np
import torch
from torch.utils.data import DataLoader
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
import flwr as fl
from flwr.common import Context

MODEL_ID = "gpt2"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
MAX_LEN = 128
BATCH_SIZE = 4
LR = 2e-4
WEIGHT_DECAY = 0.0
LOCAL_EPOCHS = 1
GRAD_ACCUM = 1
WARMUP_STEPS = 10
LOG_EVERY = 10
NUM_CLIENTS = 2
ROUNDS = 2

CLIENT_TEXTS = {
    1: [
        # Client 1's remaining texts were truncated in the original; fragment kept as-is.
        "manager -> compliance for high-risk cases.",
   ],
   2: [
       "Fleet ops: preventive maintenance reduces downtime; prioritize vehicles with repeated fault codes.",
       "Dispatch note: optimize routes by time windows and driver hours to reduce empty miles.",
       "Safety policy: enforce rest breaks and log inspections before long-haul trips.",
       "Inventory update: track spare parts usage; reorder thresholds should reflect lead time and seasonality.",
       "Customer SLA: late deliveries require proactive notifications and documented root cause."
   ],
}
for cid in list(CLIENT_TEXTS.keys()):
    base = CLIENT_TEXTS[cid]
    CLIENT_TEXTS[cid] = base + [f"Q: Summarize this for leadership. A: {t}" for t in base]
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
   tokenizer.pad_token = tokenizer.eos_token
bnb_config: Optional[BitsAndBytesConfig] = None
if DEVICE == "cuda":
    compute_dtype = torch.bfloat16 if torch.cuda.get_device_capability(0)[0] >= 8 else torch.float16
    bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=compute_dtype)
if "gpt2" in MODEL_ID.lower():
   TARGET_MODULES = ["c_attn", "c_proj"]
else:
   TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]
LORA_R = 16
LORA_ALPHA = 32
LORA_DROPOUT = 0.05
lora_config = LoraConfig(r=LORA_R, lora_alpha=LORA_ALPHA, lora_dropout=LORA_DROPOUT, bias="none", task_type="CAUSAL_LM", target_modules=TARGET_MODULES)
def model_primary_device(model) -> torch.device:
   return next(model.parameters()).device
def build_model_with_lora():
    if DEVICE == "cuda":
       model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto", quantization_config=bnb_config, torch_dtype="auto")
       model = prepare_model_for_kbit_training(model)
   else:
       model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
       model.to("cpu")
   model = get_peft_model(model, lora_config)
   model.train()
    return model
def make_dataset(texts: List[str]) -> Dataset:
    ds = Dataset.from_dict({"text": texts})
    def tok(batch):
       return tokenizer(batch["text"], truncation=True, max_length=MAX_LEN, padding="max_length")
   ds = ds.map(tok, batched=True, remove_columns=["text"])
   return ds
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
def lora_state_keys(model) -> List[str]:
    sd = model.state_dict()
    keys = sorted([k for k in sd.keys() if "lora_" in k])
    if not keys:
        raise RuntimeError("No LoRA keys found. Your model might not have the target_modules specified. " f"Current TARGET_MODULES={TARGET_MODULES}, MODEL_ID={MODEL_ID}")
    return keys
def get_lora_ndarrays(model) -> List[np.ndarray]:
    sd = model.state_dict()
    keys = lora_state_keys(model)
    return [sd[k].detach().float().cpu().numpy() for k in keys]
def set_lora_ndarrays(model, arrays: List[np.ndarray]) -> None:
    keys = lora_state_keys(model)
    if len(keys) != len(arrays):
        raise ValueError(f"Mismatch: got {len(arrays)} arrays but expected {len(keys)}.")
    sd = model.state_dict()
    for k, arr in zip(keys, arrays):
        t = torch.from_numpy(arr).to(sd[k].device).to(sd[k].dtype)
        sd[k].copy_(t)
def cosine_warmup_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int) -> float:
    # Linear warmup then cosine decay; the body was lost and is reconstructed
    # from the signature and call sites.
    if step < warmup_steps:
        return base_lr * float(step + 1) / float(max(1, warmup_steps))
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))
@torch.no_grad()
def eval_loss(model, ds: Dataset, max_batches: int = 20) -> float:
    model.eval()
    dl = DataLoader(ds, batch_size=BATCH_SIZE, shuffle=False, collate_fn=collator)
    losses = []
    dev = model_primary_device(model)
    for i, batch in enumerate(dl):
        if i >= max_batches:
            break
        batch = {k: v.to(dev) for k, v in batch.items()}
        out = model(**batch, labels=batch["input_ids"])
        losses.append(float(out.loss.detach().cpu()))
    model.train()
    return float(np.mean(losses)) if losses else float("nan")
def train_one_client_round(model, ds: Dataset, epochs: int, lr: float, grad_accum: int, warmup_steps: int) -> Tuple[float, int]:
   dl = DataLoader(ds, batch_size=BATCH_SIZE, shuffle=True, collate_fn=collator)
   total_steps = max(1, (len(dl) * epochs) // max(1, grad_accum))
    step = 0
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=WEIGHT_DECAY)
    optimizer.zero_grad(set_to_none=True)
    running = []
   examples = 0
   dev = model_primary_device(model)
    for _ in range(epochs):
        for bi, batch in enumerate(dl):
            batch = {k: v.to(dev) for k, v in batch.items()}
           out = model(**batch, labels=batch["input_ids"])
           loss = out.loss / grad_accum
           loss.backward()
           running.append(float(loss.detach().cpu()) * grad_accum)
           examples += batch["input_ids"].shape[0]
           if (bi + 1) % grad_accum == 0:
               lr_t = cosine_warmup_lr(step, total_steps, lr, warmup_steps)
               for pg in optimizer.param_groups:
                    pg["lr"] = lr_t
               optimizer.step()
               optimizer.zero_grad(set_to_none=True)
                step += 1
                if step % LOG_EVERY == 0:
                    print(f"  step={step}/{total_steps} loss={np.mean(running[-LOG_EVERY:]):.4f} lr={lr_t:.2e}")
    return (float(np.mean(running)) if running else float("nan")), examples

With the environment set up, all configurations are in place: the private per-client text silos and the tokenizer are prepared, model loading adapts automatically to the available CPU or GPU, and helper utilities handle parameter-efficient fine-tuning and safe device placement across federated clients.
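The local trainer relies on a linear-warmup plus cosine-decay learning-rate schedule. The body of `cosine_warmup_lr` below is a sketch matching the signature used in the code (the exact decay floor is an assumption), and can be sanity-checked in isolation:

```python
import math

def cosine_warmup_lr(step, total_steps, base_lr, warmup_steps):
    # Linear warmup to base_lr, then cosine decay toward 0.
    if step < warmup_steps:
        return base_lr * (step + 1) / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))

# The schedule ramps up over the first 10 steps, peaks at base_lr, then decays.
lrs = [cosine_warmup_lr(s, total_steps=100, base_lr=2e-4, warmup_steps=10) for s in range(100)]
print(max(lrs))  # peak equals base_lr right after warmup
```

Warmup avoids large, noisy early updates on the randomly initialized LoRA adapters, while the cosine tail lets each local round settle gently before parameters are sent back to the server.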

class FedLoRAClient(fl.client.NumPyClient):
   def __init__(self, cid: int):
        self.cid = cid
        self._model = None
        self._ds_train = None
        self._ds_eval = None
   def _ensure(self):
        if self._model is None:
           print(f"[Client {self.cid}] Loading model + LoRA (MODEL_ID={MODEL_ID})...")
           self._model = build_model_with_lora()
            texts = CLIENT_TEXTS[self.cid].copy()
           random.shuffle(texts)
           split = max(1, int(0.8 * len(texts)))
           self._ds_train = make_dataset(texts[:split])
           self._ds_eval = make_dataset(texts[split:])
   def get_parameters(self, config):
       self._ensure()
       return get_lora_ndarrays(self._model)
   def fit(self, parameters, config):
       self._ensure()
       set_lora_ndarrays(self._model, parameters)
       loss_before = eval_loss(self._model, self._ds_eval, max_batches=10)
       print(f"[Client {self.cid}] eval_loss_before={loss_before:.4f}")
       train_loss, n_examples = train_one_client_round(self._model, self._ds_train, epochs=int(config.get("local_epochs", LOCAL_EPOCHS)), lr=float(config.get("lr", LR)), grad_accum=int(config.get("grad_accum", GRAD_ACCUM)), warmup_steps=int(config.get("warmup_steps", WARMUP_STEPS)))
       loss_after = eval_loss(self._model, self._ds_eval, max_batches=10)
       print(f"[Client {self.cid}] train_loss={train_loss:.4f} eval_loss_after={loss_after:.4f}")
       new_params = get_lora_ndarrays(self._model)
       metrics = {"eval_loss_before": loss_before, "eval_loss_after": loss_after, "train_loss": train_loss}
        return new_params, n_examples, metrics
   def evaluate(self, parameters, config):
       self._ensure()
       set_lora_ndarrays(self._model, parameters)
       loss = eval_loss(self._model, self._ds_eval, max_batches=20)
       return float(loss), len(self._ds_eval), {"eval_loss": float(loss)}
def client_fn(context: Context):
    cid = 0
    try:
        cid = int(context.node_config.get("partition-id"))
    except Exception:
        try:
            cid = int(context.cid)
        except Exception:
            cid = 0
    # Partition ids are 0-based, while CLIENT_TEXTS is keyed from 1 here.
    return FedLoRAClient(cid + 1).to_client()

This defines the federated client logic, simulating different organizations taking part in training. Each client holds a LoRA-augmented copy of the language model, and local datasets stay isolated. The client handles training, evaluation, and parameter exchange; only the LoRA adapter values are ever exposed to the server.
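To see why exchanging only LoRA adapters is cheap, note that for a weight matrix of shape (d_out, d_in), LoRA ships the factors A (r x d_in) and B (d_out x r), i.e. r*(d_in + d_out) values instead of d_in*d_out. Assuming GPT-2-small-like shapes for the targeted `c_attn` (768 -> 2304) and `c_proj` (768 -> 768) layers across 12 blocks with r = 16:

```python
def lora_params(d_in, d_out, r):
    # LoRA factors: A is (r, d_in), B is (d_out, r).
    return r * (d_in + d_out)

R = 16
BLOCKS = 12
shapes = [(768, 2304), (768, 768)]  # c_attn, c_proj in one GPT-2-small block

lora_total = BLOCKS * sum(lora_params(d_in, d_out, R) for d_in, d_out in shapes)
full_total = BLOCKS * sum(d_in * d_out for d_in, d_out in shapes)
print(lora_total, full_total)  # 884736 adapter values vs 28311552 full weights
```

At rank 16 the adapters amount to roughly 3% of the targeted weights, which is what makes per-round parameter exchange over the network practical.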

def fit_config(server_round: int):
   return {"local_epochs": LOCAL_EPOCHS, "lr": LR, "grad_accum": GRAD_ACCUM, "warmup_steps": WARMUP_STEPS}
strategy = fl.server.strategy.FedAvg(fraction_fit=1.0, fraction_evaluate=1.0, min_fit_clients=NUM_CLIENTS, min_evaluate_clients=NUM_CLIENTS, min_available_clients=NUM_CLIENTS, on_fit_config_fn=fit_config)
print("\nStarting Flower simulation...\n")
client_resources = {"num_cpus": 2, "num_gpus": 0.0}
if DEVICE == "cuda":
    client_resources = {"num_cpus": 2, "num_gpus": 0.25}
history = fl.simulation.start_simulation(client_fn=client_fn, num_clients=NUM_CLIENTS, config=fl.server.ServerConfig(num_rounds=ROUNDS), strategy=strategy, client_resources=client_resources, ray_init_args={"include_dashboard": False, "ignore_reinit_error": True})
print("\nSimulation done.")

We configure the federated-learning strategy and orchestrate global training: how many clients participate, how parameters are aggregated, and how training rounds are scheduled. The Flower simulation then handles communication and aggregation across the virtual clients.
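FedAvg weights each client's update by its example count (the second element returned from `fit`). The same weighting is often applied when combining per-client evaluation metrics; the helper below is a hypothetical illustration of that pattern, not part of the Flower strategy configured above:

```python
def weighted_eval_loss(results):
    # results: list of (num_examples, {"eval_loss": float}) tuples, one per client.
    total = sum(n for n, _ in results)
    return sum(n * m["eval_loss"] for n, m in results) / total

# A client with 30 examples pulls the average toward its loss more
# than a client with only 10.
avg = weighted_eval_loss([(10, {"eval_loss": 2.0}), (30, {"eval_loss": 1.0})])
print(avg)  # (10*2.0 + 30*1.0) / 40 = 1.25
```

Without this weighting, a tiny client with an unrepresentative silo would distort the global evaluation signal.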

demo_model = build_model_with_lora()
demo_model.eval()
prompt = "Summarize this internal note for leadership in 2 bullets:\nDispatch note: optimize routes by time windows and driver hours to reduce empty miles.\n\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
dev = model_primary_device(demo_model)
inputs = {k: v.to(dev) for k, v in inputs.items()}
with torch.no_grad():
   out = demo_model.generate(**inputs, max_new_tokens=80, do_sample=True, temperature=0.8, top_p=0.95, repetition_penalty=1.05, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.pad_token_id)
print("\n=== Generation output ===\n")
print(tokenizer.decode(out[0], skip_special_tokens=True))

Finally, we load a LoRA-augmented instance of the model to demonstrate generation after training. We pose a realistic prompt and generate text with the same architecture used during training, checking that the pipeline produces coherent, task-aligned output.
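The `generate` call samples with temperature and nucleus (top-p) filtering. Here is a minimal sketch of the top-p truncation step on a toy distribution (plain Python, independent of the model; the helper name is hypothetical):

```python
def top_p_filter(probs, top_p):
    # Keep the smallest set of tokens whose cumulative probability reaches
    # top_p, then renormalize. probs: dict mapping token -> probability.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, p in items:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    z = sum(p for _, p in kept)
    return {tok: p / z for tok, p in kept}

filtered = top_p_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, top_p=0.9)
print(sorted(filtered))  # ['a', 'b', 'c'] -- the 0.05 tail token is dropped
```

Compared with greedy decoding, top-p sampling keeps the summaries varied while still cutting off the low-probability tail that produces incoherent text.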

print(type(history))
print(history.__dict__.keys())

The federated run produces simulation and training logs. Inspecting the returned history object confirms that rounds, metrics, and aggregation completed successfully, which helps verify the reproducibility and integrity of the workflow.

We showed that end-to-end federated fine-tuning of LLMs can run today in a Colab environment. Without sharing raw text or full model weights, we coordinated client-side LoRA training with server-side aggregation and evaluation. Combined with PEFT, this workflow demonstrates how federated models can be adapted while preserving privacy, and it lays the groundwork for extensions such as personalization and enterprise deployment.

