Section 3: Control Flow
Your data isn’t always clean and predictable. Sometimes you need to make decisions: “If sales are above target, send a congratulations email.” Sometimes you need to repeat actions: “Process every row in this 10,000-record dataset.” Control flow lets your code handle both: it makes decisions based on the data and automates repetitive tasks that would take hours by hand in Excel.
Introduction
Control flow determines the order in which your code executes. In data science, you need to make decisions based on data and repeat operations across datasets. This section covers conditionals and loops, the building blocks of program logic.
Conditional Statements
Conditional statements let your program make decisions based on data.
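Every condition is an expression that evaluates to True or False, so you can inspect it before using it in an if statement. The sketch below illustrates this with comparison and logical operators. The names sales, target, and is_weekend and their values are assumptions made for illustration, not course data.

```python
# Hypothetical values, chosen only for illustration
sales = 12000
target = 10000
is_weekend = False

# A comparison produces a boolean value (True or False)
above_target = sales > target
print(above_target)

# Logical operators (and, or, not) combine conditions
send_email = above_target and not is_weekend
print(send_email)

# Comparisons can be chained to test a range
in_range = 10000 <= sales < 15000
print(in_range)
```

Storing a condition in a well-named variable is optional, but it can make a longer if statement easier to read.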
If Statements
# Basic if statement
temperature = 85
if temperature > 80:
    print("It's hot outside!")
    print("Stay hydrated")

# If-else statement
if temperature > 80:
    print("Hot weather")
else:
    print("Cool weather")

# If-elif-else statement
if temperature > 90:
    print("Very hot!")
elif temperature > 80:
    print("Hot")
elif temperature > 70:
    print("Warm")
else:
    print("Cool")

Comparison with Other Tools
| Task | Excel | Python |
|---|---|---|
| Simple condition | =IF(A1>80,"Hot","Cool") | print("Hot" if temp > 80 else "Cool") |
| Multiple conditions | =IF(A1>90,"Very Hot",IF(A1>80,"Hot","Cool")) | print("Very Hot" if temp > 90 else "Hot" if temp > 80 else "Cool") |
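Excel's IF returns a value, whereas a Python if statement runs a block of code. The closest Python counterpart to the formula is the conditional expression, sketched below. The helper name label_temperature and the sample temperatures are assumptions made for illustration.

```python
# Conditional expression: value_if_true if condition else value_if_false
temp = 85
label = "Hot" if temp > 80 else "Cool"
print(label)  # Hot


def label_temperature(temp):
    """Hypothetical helper mirroring =IF(A1>90,"Very Hot",IF(A1>80,"Hot","Cool"))."""
    return "Very Hot" if temp > 90 else "Hot" if temp > 80 else "Cool"


print(label_temperature(95))  # Very Hot
print(label_temperature(85))  # Hot
print(label_temperature(72))  # Cool
```

Chained conditional expressions become hard to read quickly, which is one reason to prefer if-elif-else once there are more than two or three branches.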
Real-World Example: Customer Segmentation
def categorize_customer(total_spent, orders_count):
    """Categorize customer based on spending and order count"""
    if total_spent > 5000 and orders_count > 10:
        return "VIP"
    elif total_spent > 2000 or orders_count > 5:
        return "Premium"
    elif total_spent > 500:
        return "Regular"
    else:
        return "New"

# Test the function
customers = [
    {"name": "Alice", "spent": 7500, "orders": 15},
    {"name": "Bob", "spent": 2500, "orders": 8},
    {"name": "Carol", "spent": 800, "orders": 3},
    {"name": "David", "spent": 200, "orders": 1}
]
for customer in customers:
    category = categorize_customer(customer["spent"], customer["orders"])
    print(f"{customer['name']}: {category}")

Loops
Loops let you repeat operations across data collections.
For Loops
# Basic for loop
fruits = ["apple", "banana", "orange"]
for fruit in fruits:
    print(f"I like {fruit}")

# Loop with index
for i, fruit in enumerate(fruits):
    print(f"{i+1}. {fruit}")

# Loop through range
for i in range(5):
    print(f"Number: {i}")

# Loop through dictionary
customer_data = {"name": "Alice", "age": 25, "city": "New York"}
for key, value in customer_data.items():
    print(f"{key}: {value}")

While Loops
# Basic while loop
count = 0
while count < 5:
    print(f"Count: {count}")
    count += 1

# While loop with condition
sales_target = 10000
current_sales = 0
month = 1
while current_sales < sales_target and month <= 12:
    monthly_sales = 1200  # Simulate monthly sales
    current_sales += monthly_sales
    print(f"Month {month}: ${current_sales:,} total sales")
    month += 1

if current_sales >= sales_target:
    print("Target achieved!")
else:
    print("Target not met by year end")

Advanced Loop Techniques
# List comprehensions (Pythonic way)
numbers = [1, 2, 3, 4, 5]
squares = [x**2 for x in numbers]
print(squares)  # [1, 4, 9, 16, 25]

# Conditional list comprehension
even_squares = [x**2 for x in numbers if x % 2 == 0]
print(even_squares)  # [4, 16]

# Nested loops
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
for row in matrix:
    for cell in row:
        print(cell, end=" ")
    print()  # New line after each row

Loop Control Statements
# Break - exit loop early
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for num in numbers:
    if num > 5:
        break
    print(num)  # Prints 1, 2, 3, 4, 5

# Continue - skip current iteration
for num in numbers:
    if num % 2 == 0:
        continue
    print(num)  # Prints only odd numbers

# Else with loops (runs if loop completes normally)
for num in numbers:
    if num > 15:
        break
else:
    print("All numbers processed")  # This will run

Data Processing with Loops
Loops are fundamental for processing data collections.
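Most loop-based processing follows one pattern: start with an empty result, visit each record, and update the result as you go. The sketch below applies that pattern to totals grouped by a key, using a dictionary so the groups do not have to be known in advance. The order records and field names are invented for illustration.

```python
# Hypothetical order records, invented for this sketch
orders = [
    {"category": "Books", "amount": 40},
    {"category": "Games", "amount": 60},
    {"category": "Books", "amount": 25},
    {"category": "Music", "amount": 15},
    {"category": "Games", "amount": 30},
]

totals = {}  # accumulator: category -> running total
for order in orders:
    category = order["category"]
    # dict.get returns 0 the first time a category is seen
    totals[category] = totals.get(category, 0) + order["amount"]

for category, total in totals.items():
    print(f"{category}: ${total:,}")
```

Because the dictionary grows as new keys appear, the same loop keeps working if another category shows up in the data.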
Processing Sales Data
# Sample sales data
sales_data = [
    {"month": "Jan", "sales": 15000, "region": "North"},
    {"month": "Feb", "sales": 18000, "region": "North"},
    {"month": "Mar", "sales": 12000, "region": "South"},
    {"month": "Apr", "sales": 22000, "region": "North"},
    {"month": "May", "sales": 19000, "region": "South"}
]

# Calculate total sales
total_sales = 0
for record in sales_data:
    total_sales += record["sales"]
print(f"Total sales: ${total_sales:,}")

# Find best month
best_month = None
best_sales = 0
for record in sales_data:
    if record["sales"] > best_sales:
        best_sales = record["sales"]
        best_month = record["month"]
print(f"Best month: {best_month} with ${best_sales:,}")

# Regional analysis
north_sales = 0
south_sales = 0
for record in sales_data:
    if record["region"] == "North":
        north_sales += record["sales"]
    else:
        south_sales += record["sales"]
print(f"North region: ${north_sales:,}")
print(f"South region: ${south_sales:,}")

Data Filtering and Transformation
# Filter high-value customers
customers = [
    {"name": "Alice", "purchases": [100, 200, 150]},
    {"name": "Bob", "purchases": [50, 75, 100]},
    {"name": "Carol", "purchases": [300, 400, 500]},
    {"name": "David", "purchases": [25, 30, 35]}
]
high_value_customers = []
for customer in customers:
    total_spent = sum(customer["purchases"])
    if total_spent > 200:
        high_value_customers.append({
            "name": customer["name"],
            "total_spent": total_spent
        })
print("High-value customers:")
for customer in high_value_customers:
    print(f"{customer['name']}: ${customer['total_spent']}")

# Transform data
product_prices = [10, 25, 50, 100]
discounted_prices = []
for price in product_prices:
    if price > 50:
        discounted_prices.append(price * 0.8)  # 20% discount
    else:
        discounted_prices.append(price * 0.9)  # 10% discount
print(f"Original prices: {product_prices}")
print(f"Discounted prices: {discounted_prices}")

Practice Exercise
Create a comprehensive sales analysis program:
# Monthly sales data
monthly_sales = [
    {"month": "January", "sales": 45000, "expenses": 30000},
    {"month": "February", "sales": 52000, "expenses": 32000},
    {"month": "March", "sales": 48000, "expenses": 31000},
    {"month": "April", "sales": 61000, "expenses": 35000},
    {"month": "May", "sales": 55000, "expenses": 33000},
    {"month": "June", "sales": 67000, "expenses": 38000}
]
print("Monthly Sales Analysis")
print("=" * 50)

# Calculate totals
total_sales = 0
total_expenses = 0
profitable_months = 0
for record in monthly_sales:
    sales = record["sales"]
    expenses = record["expenses"]
    profit = sales - expenses
    total_sales += sales
    total_expenses += expenses
    if profit > 0:
        profitable_months += 1
    print(f"{record['month']}: Sales ${sales:,}, Expenses ${expenses:,}, Profit ${profit:,}")

# Summary statistics
total_profit = total_sales - total_expenses
average_sales = total_sales / len(monthly_sales)
profit_margin = (total_profit / total_sales) * 100
print("\nSummary:")
print(f"Total Sales: ${total_sales:,}")
print(f"Total Expenses: ${total_expenses:,}")
print(f"Total Profit: ${total_profit:,}")
print(f"Average Monthly Sales: ${average_sales:,.0f}")
print(f"Profit Margin: {profit_margin:.1f}%")
print(f"Profitable Months: {profitable_months}/{len(monthly_sales)}")

# Find best and worst months
best_month = max(monthly_sales, key=lambda x: x["sales"])
worst_month = min(monthly_sales, key=lambda x: x["sales"])
print(f"\nBest Month: {best_month['month']} (${best_month['sales']:,})")
print(f"Worst Month: {worst_month['month']} (${worst_month['sales']:,})")

# Growth analysis
print("\nMonth-over-Month Growth:")
for i in range(1, len(monthly_sales)):
    current_sales = monthly_sales[i]["sales"]
    previous_sales = monthly_sales[i-1]["sales"]
    growth = ((current_sales - previous_sales) / previous_sales) * 100
    print(f"{monthly_sales[i]['month']}: {growth:+.1f}%")
Summary
Control flow statements let you make decisions and repeat operations in your code. Key concepts include if-elif-else statements, for and while loops, loop control with break and continue, and advanced techniques like list comprehensions. You will rely on these constructs whenever you process data or build logic into your programs.
© 2025 Prof. Tim Frenzel. All rights reserved. | Version 1.0.5