In this tutorial, we build a cognitive complexity workflow with complexipy. We begin by measuring the complexity of code passed in as raw strings, then scale the analysis up to individual files and an entire directory. Along the way, we generate machine-readable reports, normalize them into DataFrames, and visualize complexity distributions to understand how decision depth accumulates across functions. Throughout, we demonstrate how cognitive complexity analysis fits naturally into Python development. Check out the FULL CODES here.
We start by installing complexipy and its companion libraries:

!pip -q install complexipy pandas matplotlib
import os
import json
import textwrap
import subprocess
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
from complexipy import file_complexity, code_complexity

print("✅ Installed complexipy and dependencies")
We set up the environment by installing the required libraries and importing every dependency needed for analysis and visualization. Keeping the notebook self-contained lets it run end to end in Google Colab, and this setup is the foundation for everything that follows.
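If you want the notebook to survive a fresh runtime without a manual install step, a small defensive import helper can make it genuinely self-contained. This `ensure` helper is our own illustrative sketch, not part of complexipy:

```python
import importlib
import subprocess
import sys

def ensure(pkg):
    """Import pkg by module name, pip-installing it first if it is missing."""
    try:
        return importlib.import_module(pkg)
    except ImportError:
        subprocess.run([sys.executable, "-m", "pip", "-q", "install", pkg], check=True)
        return importlib.import_module(pkg)

# Usage: modules that are already present import without touching pip.
json_mod = ensure("json")
```

Packages whose install name differs from their import name would need a mapping, which this sketch omits.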
snippet = """
def score_orders(orders):
    total = 0
    for o in orders:
        if o.get("valid"):
            if o.get("priority"):
                if o.get("amount", 0) > 100:
                    total += 3
                else:
                    total += 2
            else:
                if o.get("amount", 0) > 100:
                    total += 2
                else:
                    total += 1
        else:
            total -= 1
    return total
"""
res = code_complexity(snippet)
print("=== Code string complexity ===")
print("Overall complexity:", res.complexity)
print("Functions:")
for f in res.functions:
    print(f" - {f.name}: {f.complexity} (lines {f.line_start}-{f.line_end})")
We start by analyzing a single Python function to build intuition for cognitive complexity: we examine its nested control flow and conditionals directly and see how each extra level of nesting raises the score. This validates complexipy's core behavior before we scale up to real files.
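To see how the score responds to structure, consider a guard-clause refactor of the same logic. This `score_orders_flat` variant is our own sketch, not part of the tutorial's codebase: it returns the same totals with shallower nesting, which cognitive complexity metrics reward with a lower score:

```python
def score_orders_flat(orders):
    """Behaviorally equivalent to score_orders, but the invalid case exits
    early and the amount check is hoisted, so nesting stays shallow."""
    total = 0
    for o in orders:
        if not o.get("valid"):
            total -= 1
            continue  # guard clause replaces the outermost else branch
        big = o.get("amount", 0) > 100
        if o.get("priority"):
            total += 3 if big else 2
        else:
            total += 2 if big else 1
    return total
```

Passing the source of both versions through `code_complexity` and comparing the per-function scores makes the payoff of flattening concrete.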
root = Path("toy_project")
src = root / "src"
tests = root / "tests"
src.mkdir(parents=True, exist_ok=True)
tests.mkdir(parents=True, exist_ok=True)
(src / "__init__.py").write_text("")
(tests / "__init__.py").write_text("")

(src / "simple.py").write_text(textwrap.dedent("""
def add(a, b):
    return a + b

def safe_div(a, b):
    if b == 0:
        return None
    return a / b
""").strip() + "\n")
(src / "legacy_adapter.py").write_text(textwrap.dedent("""
def legacy_adapter(x, y):
    if x and y:
        if x > 0:
            if y > 0:
                return x + y
            else:
                return x - y
        else:
            if y > 0:
                return y - x
            else:
                return -(x + y)
    return 0
""").strip() + "\n")
(src / "engine.py").write_text(textwrap.dedent("""
def route_event(event):
    kind = event.get("kind")
    payload = event.get("payload", {})
    if kind == "A":
        if payload.get("x") and payload.get("y"):
            return _handle_a(payload)
        return None
    elif kind == "B":
        if payload.get("flags"):
            return _handle_b(payload)
        else:
            return None
    elif kind == "C":
        for item in payload.get("items", []):
            if item.get("enabled"):
                if item.get("mode") == "fast":
                    _do_fast(item)
                else:
                    _do_safe(item)
        return True
    else:
        return None

def _handle_a(p):
    total = 0
    for v in p.get("vals", []):
        if v > 10:
            total += 2
        else:
            total += 1
    return total

def _handle_b(p):
    score = 0
    for f in p.get("flags", []):
        if f == "x":
            score += 1
        elif f == "y":
            score += 2
        else:
            score += 1
    return score

def _do_fast(item):
    return item.get("id")

def _do_safe(item):
    if item.get("id") is None:
        return None
    return item.get("id")
""").strip() + "\n")
(tests / "test_engine.py").write_text(textwrap.dedent("""
from src.engine import route_event

def test_route_event_smoke():
    assert route_event({"kind": "A", "payload": {"x": 1, "y": 2, "vals": [1, 20]}}) == 3
""").strip() + "\n")

print(f"✅ Created project at: {root.resolve()}")
We then programmatically create a small project with multiple Python modules and a test file. We intentionally include varied control-flow patterns so the modules differ meaningfully in complexity. Check out the FULL CODES here.
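The repeated mkdir-then-write pattern above can be factored into a small helper when you build larger fixtures. This `write_module` function is a hypothetical convenience of our own, not part of the tutorial's codebase:

```python
import tempfile
import textwrap
from pathlib import Path

def write_module(root: Path, rel: str, body: str) -> Path:
    """Write a dedented module under root, creating parent folders as needed."""
    path = root / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(textwrap.dedent(body).strip() + "\n")
    return path

# Usage sketch against a throwaway directory:
tmp = Path(tempfile.mkdtemp())
write_module(tmp, "src/util.py", """
    def clamp(x, lo, hi):
        if x < lo:
            return lo
        if x > hi:
            return hi
        return x
""")
py_files = sorted(p.relative_to(tmp).as_posix() for p in tmp.rglob("*.py"))
```

Listing the written files with `rglob` is also a quick sanity check that the fixture matches what the analyzer will later scan.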
engine_path = src / "engine.py"
file_res = file_complexity(str(engine_path))
print("\n=== File complexity (Python API) ===")
print("Path:", file_res.path)
print("File complexity:", file_res.complexity)
for f in file_res.functions:
    print(f" - {f.name}: {f.complexity} (lines {f.line_start}-{f.line_end})")
MAX_ALLOWED = 8

def run_complexipy_cli(project_dir: Path, max_allowed: int = 8):
    cmd = [
        "complexipy",
        ".",
        "--max-complexity-allowed", str(max_allowed),
        "--output-json",
        "--output-csv",
    ]
    proc = subprocess.run(cmd, cwd=str(project_dir), capture_output=True, text=True)
    preferred_csv = project_dir / "complexipy.csv"
    preferred_json = project_dir / "complexipy.json"
    csv_candidates = []
    json_candidates = []
    if preferred_csv.exists():
        csv_candidates.append(preferred_csv)
    if preferred_json.exists():
        json_candidates.append(preferred_json)
    csv_candidates += list(project_dir.glob("*.csv")) + list(project_dir.glob("**/*.csv"))
    json_candidates += list(project_dir.glob("*.json")) + list(project_dir.glob("**/*.json"))
    def uniq(paths):
        seen = set()
        out = []
        for p in paths:
            p = p.resolve()
            if p not in seen and p.is_file():
                seen.add(p)
                out.append(p)
        return out
    csv_candidates = uniq(csv_candidates)
    json_candidates = uniq(json_candidates)
    def pick_best(paths):
        if not paths:
            return None
        paths = sorted(paths, key=lambda p: p.stat().st_mtime, reverse=True)
        return paths[0]
    return proc.returncode, pick_best(csv_candidates), pick_best(json_candidates)

rc, csv_report, json_report = run_complexipy_cli(root, MAX_ALLOWED)
After analyzing a single source file with the Python API, we run complexipy across the whole project. We invoke the CLI from the project directory so the generated reports land where we expect them. This step bridges the local API with production-style static analysis workflows.
df = None
if csv_report and csv_report.exists():
    df = pd.read_csv(csv_report)
elif json_report and json_report.exists():
    data = json.loads(json_report.read_text())
    if isinstance(data, list):
        df = pd.DataFrame(data)
    elif isinstance(data, dict):
        if "files" in data and isinstance(data["files"], list):
            df = pd.DataFrame(data["files"])
        elif "results" in data and isinstance(data["results"], list):
            df = pd.DataFrame(data["results"])
        else:
            df = pd.json_normalize(data)
if df is None:
    raise RuntimeError("No report produced")
def explode_functions_table(df_in):
    if "functions" in df_in.columns:
        tmp = df_in.explode("functions", ignore_index=True)
        if tmp["functions"].notna().any() and isinstance(tmp["functions"].dropna().iloc[0], dict):
            fn = pd.json_normalize(tmp["functions"])
            base = tmp.drop(columns=["functions"])
            return pd.concat([base.reset_index(drop=True), fn.reset_index(drop=True)], axis=1)
        return tmp
    return df_in

fn_df = explode_functions_table(df)

col_map = {}
for c in fn_df.columns:
    lc = c.lower()
    if lc in ("path", "file", "filename", "module"):
        col_map[c] = "path"
    if ("function" in lc and "name" in lc) or lc in ("function", "func", "function_name"):
        col_map[c] = "function"
    if lc == "name" and "function" not in fn_df.columns:
        col_map[c] = "function"
    if "complexity" in lc and "allowed" not in lc and "max" not in lc:
        col_map[c] = "complexity"
    if lc in ("line_start", "linestart", "start_line", "startline"):
        col_map[c] = "line_start"
    if lc in ("line_end", "lineend", "end_line", "endline"):
        col_map[c] = "line_end"
fn_df = fn_df.rename(columns=col_map)
We load the reports into pandas and normalize them into a function-level table. Handling several possible report schemas keeps the workflow robust across complexipy versions. This structured representation lets us reason about complexity with standard data analysis tools.
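Once the table is normalized, ordinary pandas aggregations answer file-level questions. A sketch with hypothetical rows standing in for real report data:

```python
import pandas as pd

# Hypothetical rows mimicking the normalized fn_df schema.
fn_df = pd.DataFrame({
    "path": ["src/engine.py", "src/engine.py", "src/simple.py"],
    "function": ["route_event", "_handle_b", "safe_div"],
    "complexity": [14, 5, 2],
})

# Per-file totals surface which modules concentrate the complexity.
per_file = (
    fn_df.groupby("path")["complexity"]
         .agg(total="sum", worst="max", functions="count")
         .sort_values("total", ascending=False)
)
```

Sorting by `total` ranks hotspots, while `worst` flags the single function most in need of attention within each file.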
if "complexity" in fn_df.columns:
    fn_df["complexity"] = pd.to_numeric(fn_df["complexity"], errors="coerce")
    plt.figure()
    fn_df["complexity"].dropna().plot(kind="hist", bins=20)
    plt.title("Cognitive Complexity Distribution (functions)")
    plt.xlabel("complexity")
    plt.ylabel("count")
    plt.show()
def refactor_hints(complexity):
    if complexity >= 20:
        return [
            "Split into smaller pure functions",
            "Replace deep nesting with guard clauses",
            "Extract complex boolean predicates"
        ]
    if complexity >= 12:
        return [
            "Extract inner logic into helpers",
            "Flatten conditionals",
            "Use dispatch tables"
        ]
    if complexity >= 8:
        return [
            "Reduce nesting",
            "Early returns"
        ]
    return ["Acceptable complexity"]

if "complexity" in fn_df.columns and "function" in fn_df.columns:
    for _, r in fn_df.sort_values("complexity", ascending=False).head(8).iterrows():
        cx = float(r["complexity"]) if pd.notna(r["complexity"]) else None
        if cx is None:
            continue
        print(r["function"], cx, refactor_hints(cx))

print("✅ Tutorial complete.")
Finally, we visualize the distribution of cognitive complexity and derive refactoring advice from numeric thresholds, translating an abstract metric into concrete engineering action. This closes the loop by connecting maintenance decisions directly to measurement.
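The same thresholds can label every function at once instead of only printing the top offenders. A sketch using `pd.cut` with the 8/12/20 cutoffs from `refactor_hints`; the `severity` helper and its labels are our own:

```python
import pandas as pd

def severity(complexities):
    """Bucket complexity scores into labels at the 8/12/20 thresholds."""
    bins = [-float("inf"), 7, 11, 19, float("inf")]
    labels = ["acceptable", "reduce-nesting", "extract-helpers", "split-function"]
    return pd.cut(pd.Series(complexities), bins=bins, labels=labels)
```

Attaching this as a `fn_df["severity"]` column makes it easy to group, count, and track the buckets over time.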
In conclusion, we demonstrated a practical, reproducible pipeline for assessing cognitive complexity in Python projects. We moved from ad-hoc inspection to data-driven reasoning, identified high-risk functions, and derived actionable refactoring advice from quantitative thresholds. With this workflow, we can reason about maintainability early and consistently.