Working with JSON and CSV Files

Who this is for: Grade 8–9 students who already know Python basics (loops, functions, lists/dicts) and want real-world data skills.

What you'll learn:

What JSON and CSV are and when to use them
Reading and writing JSON files in Python
Reading and writing CSV files in Python
Converting between JSON and CSV
Handling common pitfalls (encodings, numbers, headers)
A hands-on mini project: Gradebook Analyzer

1) Quick concepts

JSON (JavaScript Object Notation)

Great for structured data with nested objects and lists.
Common in web APIs and app configs.
Type mapping (JSON → Python):
- object → dict
- array → list
- string → str
- number → int/float
- true/false → True/False
- null → None

CSV (Comma-Separated Values)

Great for tabular data (rows and columns), like spreadsheets.
Simple and human-readable, but not good for deeply nested structures.

2) JSON in Python (module: json)

Read JSON from a file:

Use json.load to read files, json.loads to parse a string.

Example: reading, updating, and saving JSON

import json
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# Create a sample JSON file
sample = {
    "class": "9A",
    "teacher": "Ms. Lee",
    "students": [
        {"name": "Alex", "age": 14, "scores": {"math": 88, "science": 92}},
        {"name": "Rita", "age": 15, "scores": {"math": 95, "science": 85}}
    ]
}

with (data_path / "class.json").open("w", encoding="utf-8") as f:
    json.dump(sample, f, indent=2, ensure_ascii=False)

# Read it back
with (data_path / "class.json").open("r", encoding="utf-8") as f:
    data = json.load(f)

# Pretty print to console
print(json.dumps(data, indent=2, ensure_ascii=False))

# Update a value
data["students"][0]["scores"]["math"] = 90

# Save updated JSON
with (data_path / "class_updated.json").open("w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False, sort_keys=True)

Notes:

indent makes it human-readable.
ensure_ascii=False keeps non-English characters readable.
sort_keys=True writes keys in alphabetical order (useful for consistent files).

Handling bad JSON safely:

bad = "{ name: 'Alex' }"  # invalid JSON (unquoted key, single quotes)
try:
    parsed = json.loads(bad)
except json.JSONDecodeError as e:
    print("Invalid JSON:", e)

3) CSV in Python (module: csv)

Read CSV as rows (each row is a list):

Good for simple data without headers.

Read CSV as dictionaries (DictReader):

Best when the first row has column names.
You get rows as dicts like {"name": "Alex", "math": "88"}

Example: reading and writing CSV

import csv
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# Create a sample CSV file
with (data_path / "students.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "math", "science"])
    writer.writerow(["Alex", "88", "92"])
    writer.writerow(["Rita", "95", "85"])
    writer.writerow(["Sam", "", "78"])  # missing math score

# Read with DictReader
def to_float(s):
    try:
        return float(s)
    except (TypeError, ValueError):
        return None

rows = []
with (data_path / "students.csv").open("r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        math = to_float(row["math"])
        science = to_float(row["science"])
        avg = None
        nums = [x for x in (math, science) if x is not None]
        if nums:
            avg = sum(nums) / len(nums)
        rows.append({
            "name": row["name"],
            "math": math,
            "science": science,
            "average": avg
        })

print(rows)

# Write a new CSV with averages
with (data_path / "averages.csv").open("w", newline="", encoding="utf-8") as f:
    fieldnames = ["name", "math", "science", "average"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

Important CSV tips:

Always open CSV files with newline="" to avoid extra blank lines (especially on Windows).
Use encoding="utf-8" for international text.
DictReader requires that the first row has headers.
All values read from CSV are strings. Convert to int/float if needed.

4) Converting between JSON and CSV

CSV → JSON (flat rows):

Read with DictReader, convert types, dump with json.dump.

JSON → CSV:

Works well if JSON is a list of flat objects.
For nested objects, you can flatten keys like "scores.math" or split into multiple files.

Example: flat JSON to CSV

import json, csv
from pathlib import Path

data_path = Path("data")

# Suppose we have a flat JSON list
people = [
    {"name": "Lina", "city": "Lagos", "age": 14},
    {"name": "Jai", "city": "Mumbai", "age": 15}
]
with (data_path / "people.json").open("w", encoding="utf-8") as f:
    json.dump(people, f, indent=2, ensure_ascii=False)

# Read JSON and write CSV
with (data_path / "people.json").open("r", encoding="utf-8") as f:
    people_data = json.load(f)

fieldnames = sorted({k for p in people_data for k in p.keys()})

with (data_path / "people.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(people_data)

Example: flatten a simple nested JSON for CSV

def flatten(prefix, d, out):
    for k, v in d.items():
        key = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            flatten(key, v, out)
        else:
            out[key] = v
    return out

record = {"name": "Alex", "scores": {"math": 90, "science": 92}}
flat = flatten("", record, {})
# flat -> {"name": "Alex", "scores.math": 90, "scores.science": 92}

5) Practical project: Gradebook Analyzer

Goal:

Read a CSV of students and their test scores.
Compute each student's average and letter grade.
Save a JSON report for the class.
Save a CSV leaderboard of top students.

Input CSV (data/gradebook.csv):

Columns: name, math, science, english
Missing values allowed (blank cells)

Step-by-step code:

import csv, json
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# 1) Create a sample gradebook
with (data_path / "gradebook.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "math", "science", "english"])
    writer.writerow(["Alex", "88", "92", "84"])
    writer.writerow(["Rita", "95", "85", "91"])
    writer.writerow(["Sam", "", "78", "80"])      # missing math
    writer.writerow(["Maya", "100", "98", ""])    # missing english

# 2) Helpers
def to_float(s):
    try:
        return float(s)
    except (TypeError, ValueError):
        return None

def average(values):
    vals = [v for v in values if v is not None]
    return sum(vals) / len(vals) if vals else None

def letter_grade(avg):
    if avg is None:
        return "N/A"
    if avg >= 90: return "A"
    if avg >= 80: return "B"
    if avg >= 70: return "C"
    if avg >= 60: return "D"
    return "F"

# 3) Read CSV and compute metrics
students = []
with (data_path / "gradebook.csv").open("r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        scores = {
            "math": to_float(row.get("math")),
            "science": to_float(row.get("science")),
            "english": to_float(row.get("english"))
        }
        avg = average(scores.values())
        students.append({
            "name": row["name"],
            "scores": scores,
            "average": avg,
            "letter": letter_grade(avg)
        })

# 4) Save JSON report
report = {
    "class": "9A",
    "count": len(students),
    "students": students,
    "top_student": max(students, key=lambda s: (- (s["average"] or -1), s["name"]))["name"]
}

with (data_path / "grade_report.json").open("w", encoding="utf-8") as f:
    json.dump(report, f, indent=2, ensure_ascii=False)

# 5) Save CSV leaderboard (sorted by average desc)
leaderboard = sorted(
    (s for s in students if s["average"] is not None),
    key=lambda s: s["average"],
    reverse=True
)

with (data_path / "leaderboard.csv").open("w", newline="", encoding="utf-8") as f:
    fieldnames = ["name", "average", "letter"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for s in leaderboard:
        writer.writerow({
            "name": s["name"],
            "average": round(s["average"], 2),
            "letter": s["letter"]
        })

# 6) Print a quick summary
print("Saved:", data_path / "grade_report.json")
print("Saved:", data_path / "leaderboard.csv")

What you practiced:

Reading CSV with DictReader
Handling missing numbers
Writing formatted JSON with json.dump
Writing a CSV with DictWriter and headers
Sorting data and computing fields

6) Common pitfalls and pro tips

CSV numbers are strings: Always convert to int/float before math.
Newline handling: When opening CSV files, use newline="" to prevent blank lines on Windows.
Encodings: Use encoding="utf-8" for both JSON and CSV to support all characters.
Missing data: Use helper functions to safely convert or default values.
Nested JSON to CSV: Flatten nested dicts or choose a subset of fields.
Pretty JSON: Use indent=2 in dumps/dump for readability; skip for maximum speed/size.
Validation: Wrap json.load in try/except json.JSONDecodeError for untrusted files.
Big files: Process CSV line by line (streaming). For JSON, consider JSON Lines (each line is a JSON object) if you need streaming.

Writing JSON Lines:

# write each record on its own line
with open("data/students.jsonl", "w", encoding="utf-8") as f:
    for s in students:
        f.write(json.dumps(s, ensure_ascii=False) + "\n")

7) Summary

JSON is best for nested, structured data; CSV is best for tables.
In Python, use json.load/dump for files and csv.DictReader/DictWriter for labeled rows.
Always handle encodings, newlines, and type conversions carefully.
You can convert between JSON and CSV when data is flat or after flattening.
The Gradebook Analyzer project shows how to read data, compute results, and produce both JSON and CSV outputs—skills you'll use in real data tasks.

Next steps:

Try adding median and standard deviation to the Gradebook Analyzer.
Add per-subject leaderboards.
Explore pandas for more powerful data analysis once you're comfortable with csv/json.

Working with JSON and CSV Files

Who this is for: Grade 8–9 students who already know Python basics (loops, functions, lists/dicts) and want real-world data skills.

What you'll learn:

What JSON and CSV are and when to use them
Reading and writing JSON files in Python
Reading and writing CSV files in Python
Converting between JSON and CSV
Handling common pitfalls (encodings, numbers, headers)
A hands-on mini project: Gradebook Analyzer

1) Quick concepts

JSON (JavaScript Object Notation)

Great for structured data with nested objects and lists.
Common in web APIs and app configs.
Type mapping (JSON → Python):
- object → dict
- array → list
- string → str
- number → int/float
- true/false → True/False
- null → None

CSV (Comma-Separated Values)

Great for tabular data (rows and columns), like spreadsheets.
Simple and human-readable, but not good for deeply nested structures.

2) JSON in Python (module: json)

Read JSON from a file:

Use json.load to read files, json.loads to parse a string.

Example: reading, updating, and saving JSON

import json
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# Create a sample JSON file
sample = {
    "class": "9A",
    "teacher": "Ms. Lee",
    "students": [
        {"name": "Alex", "age": 14, "scores": {"math": 88, "science": 92}},
        {"name": "Rita", "age": 15, "scores": {"math": 95, "science": 85}}
    ]
}

with (data_path / "class.json").open("w", encoding="utf-8") as f:
    json.dump(sample, f, indent=2, ensure_ascii=False)

# Read it back
with (data_path / "class.json").open("r", encoding="utf-8") as f:
    data = json.load(f)

# Pretty print to console
print(json.dumps(data, indent=2, ensure_ascii=False))

# Update a value
data["students"][0]["scores"]["math"] = 90

# Save updated JSON
with (data_path / "class_updated.json").open("w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False, sort_keys=True)

Notes:

indent makes it human-readable.
ensure_ascii=False keeps non-English characters readable.
sort_keys=True writes keys in alphabetical order (useful for consistent files).

Handling bad JSON safely:

bad = "{ name: 'Alex' }"  # invalid JSON (unquoted key, single quotes)
try:
    parsed = json.loads(bad)
except json.JSONDecodeError as e:
    print("Invalid JSON:", e)

3) CSV in Python (module: csv)

Read CSV as rows (each row is a list):

Good for simple data without headers.

Read CSV as dictionaries (DictReader):

Best when the first row has column names.
You get rows as dicts like {"name": "Alex", "math": "88"}

Example: reading and writing CSV

import csv
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# Create a sample CSV file
with (data_path / "students.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "math", "science"])
    writer.writerow(["Alex", "88", "92"])
    writer.writerow(["Rita", "95", "85"])
    writer.writerow(["Sam", "", "78"])  # missing math score

# Read with DictReader
def to_float(s):
    try:
        return float(s)
    except (TypeError, ValueError):
        return None

rows = []
with (data_path / "students.csv").open("r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        math = to_float(row["math"])
        science = to_float(row["science"])
        avg = None
        nums = [x for x in (math, science) if x is not None]
        if nums:
            avg = sum(nums) / len(nums)
        rows.append({
            "name": row["name"],
            "math": math,
            "science": science,
            "average": avg
        })

print(rows)

# Write a new CSV with averages
with (data_path / "averages.csv").open("w", newline="", encoding="utf-8") as f:
    fieldnames = ["name", "math", "science", "average"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

Important CSV tips:

Always open CSV files with newline="" to avoid extra blank lines (especially on Windows).
Use encoding="utf-8" for international text.
DictReader requires that the first row has headers.
All values read from CSV are strings. Convert to int/float if needed.

4) Converting between JSON and CSV

CSV → JSON (flat rows):

Read with DictReader, convert types, dump with json.dump.

JSON → CSV:

Works well if JSON is a list of flat objects.
For nested objects, you can flatten keys like "scores.math" or split into multiple files.

Example: flat JSON to CSV

import json, csv
from pathlib import Path

data_path = Path("data")

# Suppose we have a flat JSON list
people = [
    {"name": "Lina", "city": "Lagos", "age": 14},
    {"name": "Jai", "city": "Mumbai", "age": 15}
]
with (data_path / "people.json").open("w", encoding="utf-8") as f:
    json.dump(people, f, indent=2, ensure_ascii=False)

# Read JSON and write CSV
with (data_path / "people.json").open("r", encoding="utf-8") as f:
    people_data = json.load(f)

fieldnames = sorted({k for p in people_data for k in p.keys()})

with (data_path / "people.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(people_data)

Example: flatten a simple nested JSON for CSV

def flatten(prefix, d, out):
    for k, v in d.items():
        key = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            flatten(key, v, out)
        else:
            out[key] = v
    return out

record = {"name": "Alex", "scores": {"math": 90, "science": 92}}
flat = flatten("", record, {})
# flat -> {"name": "Alex", "scores.math": 90, "scores.science": 92}

5) Practical project: Gradebook Analyzer

Goal:

Read a CSV of students and their test scores.
Compute each student's average and letter grade.
Save a JSON report for the class.
Save a CSV leaderboard of top students.

Input CSV (data/gradebook.csv):

Columns: name, math, science, english
Missing values allowed (blank cells)

Step-by-step code:

import csv, json
from pathlib import Path

data_path = Path("data")
data_path.mkdir(exist_ok=True)

# 1) Create a sample gradebook
with (data_path / "gradebook.csv").open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "math", "science", "english"])
    writer.writerow(["Alex", "88", "92", "84"])
    writer.writerow(["Rita", "95", "85", "91"])
    writer.writerow(["Sam", "", "78", "80"])      # missing math
    writer.writerow(["Maya", "100", "98", ""])    # missing english

# 2) Helpers
def to_float(s):
    try:
        return float(s)
    except (TypeError, ValueError):
        return None

def average(values):
    vals = [v for v in values if v is not None]
    return sum(vals) / len(vals) if vals else None

def letter_grade(avg):
    if avg is None:
        return "N/A"
    if avg >= 90: return "A"
    if avg >= 80: return "B"
    if avg >= 70: return "C"
    if avg >= 60: return "D"
    return "F"

# 3) Read CSV and compute metrics
students = []
with (data_path / "gradebook.csv").open("r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        scores = {
            "math": to_float(row.get("math")),
            "science": to_float(row.get("science")),
            "english": to_float(row.get("english"))
        }
        avg = average(scores.values())
        students.append({
            "name": row["name"],
            "scores": scores,
            "average": avg,
            "letter": letter_grade(avg)
        })

# 4) Save JSON report
report = {
    "class": "9A",
    "count": len(students),
    "students": students,
    "top_student": max(students, key=lambda s: (- (s["average"] or -1), s["name"]))["name"]
}

with (data_path / "grade_report.json").open("w", encoding="utf-8") as f:
    json.dump(report, f, indent=2, ensure_ascii=False)

# 5) Save CSV leaderboard (sorted by average desc)
leaderboard = sorted(
    (s for s in students if s["average"] is not None),
    key=lambda s: s["average"],
    reverse=True
)

with (data_path / "leaderboard.csv").open("w", newline="", encoding="utf-8") as f:
    fieldnames = ["name", "average", "letter"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for s in leaderboard:
        writer.writerow({
            "name": s["name"],
            "average": round(s["average"], 2),
            "letter": s["letter"]
        })

# 6) Print a quick summary
print("Saved:", data_path / "grade_report.json")
print("Saved:", data_path / "leaderboard.csv")

What you practiced:

Reading CSV with DictReader
Handling missing numbers
Writing formatted JSON with json.dump
Writing a CSV with DictWriter and headers
Sorting data and computing fields

6) Common pitfalls and pro tips

CSV numbers are strings: Always convert to int/float before math.
Newline handling: When opening CSV files, use newline="" to prevent blank lines on Windows.
Encodings: Use encoding="utf-8" for both JSON and CSV to support all characters.
Missing data: Use helper functions to safely convert or default values.
Nested JSON to CSV: Flatten nested dicts or choose a subset of fields.
Pretty JSON: Use indent=2 in dumps/dump for readability; skip for maximum speed/size.
Validation: Wrap json.load in try/except json.JSONDecodeError for untrusted files.
Big files: Process CSV line by line (streaming). For JSON, consider JSON Lines (each line is a JSON object) if you need streaming.

Writing JSON Lines:

# write each record on its own line
with open("data/students.jsonl", "w", encoding="utf-8") as f:
    for s in students:
        f.write(json.dumps(s, ensure_ascii=False) + "\n")

7) Summary

JSON is best for nested, structured data; CSV is best for tables.
In Python, use json.load/dump for files and csv.DictReader/DictWriter for labeled rows.
Always handle encodings, newlines, and type conversions carefully.
You can convert between JSON and CSV when data is flat or after flattening.
The Gradebook Analyzer project shows how to read data, compute results, and produce both JSON and CSV outputs—skills you'll use in real data tasks.

Next steps:

Try adding median and standard deviation to the Gradebook Analyzer.
Add per-subject leaderboards.
Explore pandas for more powerful data analysis once you're comfortable with csv/json.

python Topics

python Tutorial

Working with JSON and CSV Files

What you'll learn:

1) Quick concepts

JSON (JavaScript Object Notation)

CSV (Comma-Separated Values)

2) JSON in Python (module: json)

Example: reading, updating, and saving JSON

Notes:

Handling bad JSON safely:

3) CSV in Python (module: csv)

Example: reading and writing CSV

Important CSV tips:

4) Converting between JSON and CSV

Example: flat JSON to CSV

Example: flatten a simple nested JSON for CSV

5) Practical project: Gradebook Analyzer

Goal:

Input CSV (data/gradebook.csv):

Step-by-step code:

What you practiced:

6) Common pitfalls and pro tips

Writing JSON Lines:

7) Summary

Next steps:

python Topics

python Tutorial

Working with JSON and CSV Files

What you'll learn:

1) Quick concepts

JSON (JavaScript Object Notation)

CSV (Comma-Separated Values)

2) JSON in Python (module: json)

Example: reading, updating, and saving JSON

Notes:

Handling bad JSON safely:

3) CSV in Python (module: csv)

Example: reading and writing CSV

Important CSV tips:

4) Converting between JSON and CSV

Example: flat JSON to CSV

Example: flatten a simple nested JSON for CSV

5) Practical project: Gradebook Analyzer

Goal:

Input CSV (data/gradebook.csv):

Step-by-step code:

What you practiced:

6) Common pitfalls and pro tips

Writing JSON Lines:

7) Summary

Next steps: