Section 1: Object-Oriented Programming

What is Object-Oriented Programming and why do data scientists use it? Object-Oriented Programming (OOP) is a way of organizing code that groups related information and actions together. Think of it like organizing your desk - instead of scattered papers everywhere, you put related documents in labeled folders.

What is Object-Oriented Programming?

Object-Oriented Programming organizes code by creating “objects” that combine:

  • Data (information stored)
  • Actions (things you can do with that information)

Real-World Analogy: A Filing Cabinet

Imagine a customer filing cabinet:

  • Data: Customer name, email, purchase history
  • Actions: Add new purchase, calculate total spent, check VIP status

This is exactly how OOP works - everything related to customers goes in one “customer object.”

Advantages of OOP

Pros:

  • Organization: Related code stays together (easier to find and fix)
  • Reusability: Write code once, use it many times
  • Collaboration: Team members can work on different classes with fewer conflicts
  • Real-world modeling: Code structure matches how we think about business problems

Cons:

  • Learning curve: Takes time to understand the concepts
  • Complexity: Can be overkill for simple scripts
  • Performance: Method calls and attribute lookups add a small overhead compared with simple procedural code (usually negligible for analysis work)

When to Use OOP vs Simple Scripts

Use OOP When                        | Use Simple Scripts When
Building data analysis systems      | Quick one-time calculations
Working with customer/product data  | Simple data transformations
Team projects                       | Personal analysis tasks
Reusable analytics tools            | Proof-of-concept work
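
To make the “Use Simple Scripts When” cases concrete, here is a minimal sketch of a quick one-time calculation written as a plain script. The purchase amounts are made-up illustration values, not data from this course.

```python
# Hypothetical purchase amounts, invented for illustration
purchases = [100, 50, 25]

# A one-off calculation needs no class: a list and built-in functions are enough
total = sum(purchases)
average = total / len(purchases)

print(f"Total: ${total}, average: ${average:.2f}")  # Shows: Total: $175, average: $58.33
```

If the same logic later has to be reused across many customers or shared with a team, that is usually the point where wrapping it in a class starts to pay off.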

Classes vs Objects: The Blueprint Concept

Think of building houses:

  • Class = House blueprint (shows where rooms go)
  • Object = Actual house built from that blueprint

Each house follows the same blueprint but has different details (address, paint color, furniture).

Simple Example: Customer Data

Customer Class (Blueprint):

  • Name
  • Email
  • Total purchases

Customer Objects (Actual customers):

  • Alice Johnson, alice@email.com, $1,250
  • Bob Smith, bob@email.com, $890

Your First Simple Class

Let’s create a Customer class to store customer information:

# Step 1: Define the class (blueprint)
class Customer:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.total_spent = 0

# Step 2: Create customer objects
alice = Customer("Alice Johnson", "alice@email.com")
bob = Customer("Bob Smith", "bob@email.com")

# Step 3: Use the customer data
print(alice.name)        # Shows: Alice Johnson
print(alice.email)       # Shows: alice@email.com
print(alice.total_spent) # Shows: 0

What each part means:

  • class Customer: - Creates the customer blueprint
  • def __init__: - Sets up each new customer (like filling out a form)
  • self.name = name - Stores the customer’s name
  • alice = Customer(...) - Creates a specific customer named Alice
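
A short sketch, reusing the Customer class from above, suggests how each object keeps its own copy of the data: changing one customer leaves the other untouched. The amount assigned below is an arbitrary illustration value.

```python
class Customer:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.total_spent = 0

# Two objects built from the same blueprint
alice = Customer("Alice Johnson", "alice@email.com")
bob = Customer("Bob Smith", "bob@email.com")

# Update one object's data directly (illustrative value)
alice.total_spent = 1250

print(alice.total_spent)  # Shows: 1250
print(bob.total_spent)    # Shows: 0
```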

Adding Actions (Methods)

Methods are actions that your objects can perform. Let’s add an action to track purchases:

class Customer:
    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.total_spent = 0
    
    # Method to add a purchase
    def add_purchase(self, amount):
        self.total_spent = self.total_spent + amount
        print(f"{self.name} spent ${amount}. Total: ${self.total_spent}")

# Create a customer
alice = Customer("Alice Johnson", "alice@email.com")

# Use the method
alice.add_purchase(100)  # Shows: Alice Johnson spent $100. Total: $100
alice.add_purchase(50)   # Shows: Alice Johnson spent $50. Total: $150

What happened:

  1. We created a method called add_purchase
  2. This method takes an amount and adds it to the total
  3. Each customer object can use this same action
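
The filing cabinet analogy earlier also listed “check VIP status” as an action. One possible way to sketch it is shown below. The is_vip method and the VIP_THRESHOLD value of 1000 are assumptions made up for this example, not rules from the course material. This version of add_purchase also skips the print statement to keep the output short.

```python
class Customer:
    # Hypothetical cutoff, chosen only for this example
    VIP_THRESHOLD = 1000

    def __init__(self, name, email):
        self.name = name
        self.email = email
        self.total_spent = 0

    def add_purchase(self, amount):
        # Add the purchase amount to the running total
        self.total_spent += amount

    def is_vip(self):
        # True once total spending reaches the assumed threshold
        return self.total_spent >= self.VIP_THRESHOLD

alice = Customer("Alice Johnson", "alice@email.com")
bob = Customer("Bob Smith", "bob@email.com")

alice.add_purchase(1250)
bob.add_purchase(890)

print(alice.is_vip())  # Shows: True
print(bob.is_vip())    # Shows: False
```

Both customers share the same method, but each call answers using that customer's own total_spent.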

When You’ll Use OOP in Data Science

Common scenarios:

  • Customer analysis: Track customer behavior over time
  • Product catalogs: Store product information and calculate metrics
  • Data processing: Create reusable data cleaning tools
  • Reporting systems: Generate consistent reports across different datasets
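
As an illustration of the “reusable data cleaning tools” scenario, here is a minimal sketch using only built-in Python. The ColumnCleaner class, its missing_label parameter, and the sample values are all hypothetical names and data invented for this example.

```python
class ColumnCleaner:
    """Hypothetical reusable cleaner for a list of text values."""

    def __init__(self, missing_label="unknown"):
        self.missing_label = missing_label
        self.rows_cleaned = 0  # running count across all calls

    def clean(self, values):
        cleaned = []
        for value in values:
            if value is None or value.strip() == "":
                # Replace missing or blank entries with the label
                cleaned.append(self.missing_label)
            else:
                # Trim surrounding whitespace and normalize case
                cleaned.append(value.strip().lower())
        self.rows_cleaned += len(values)
        return cleaned

cleaner = ColumnCleaner()
emails = cleaner.clean(["  Alice@Email.com ", None, "BOB@email.com"])
cities = cleaner.clean(["Dallas ", ""])

print(emails)                # Shows: ['alice@email.com', 'unknown', 'bob@email.com']
print(cities)                # Shows: ['dallas', 'unknown']
print(cleaner.rows_cleaned)  # Shows: 5
```

The same cleaner object is applied to two different columns, and it remembers how many rows it has processed. That combination of shared behavior and stored state is what a class offers over a standalone function.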

Resources

  • Python OOP basics: https://docs.python.org/3/tutorial/classes.html
  • Simple OOP examples: https://realpython.com/python3-object-oriented-programming/

Summary

Object-Oriented Programming helps organize code by grouping related data and actions together. Think of it as creating digital filing cabinets for different types of information.

Key concepts:

  • Classes: Blueprints that define what data and actions go together
  • Objects: Specific instances created from those blueprints
  • Methods: Actions that objects can perform

OOP is useful when building reusable data analysis tools and working with business data like customers, products, or transactions. Start simple - you don’t need to use every OOP feature right away.


© 2025 Prof. Tim Frenzel. All rights reserved. | Version 1.0.5