Working with Collections in Python
When we work with data, especially in fields like the humanities, we rarely deal with just one piece of information at a time. Instead, we work with entire sets of data: a list of authors, a collection of historical dates, the lines of a poem, or the characters in a play. A single variable can't hold all these related items, so how do we store and manage them in our code? The answer is that we need a container designed to hold multiple things at once.
Python's most fundamental container is the list. A list is an ordered, flexible collection that is well suited to holding multiple items.
Understanding the Python list
You can think of a Python list in the same way you think of a shopping list or a to-do list. It contains a number of items, they appear in a specific order, and you can easily add or remove items from it.
To create a list in Python, you use square brackets [] and separate each item with a comma. Remember to enclose any text (strings) in quotes.
Let's look at an example of creating a few different lists.
# A list of key figures from the Renaissance
renaissance_figures = ["Leonardo da Vinci", "Michelangelo", "Raphael", "Donatello"]
# A list of important years (integers)
key_ww1_years = [1914, 1916, 1918]
# We can print the whole list to see it
print(renaissance_figures)
The key takeaway here is that a list allows us to group related data together under a single, convenient variable name.
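The shopping-list comparison mentions adding and removing items. As a minimal sketch, assuming Python's built-in list methods append() and remove() (the variable name reading_list and its titles are purely illustrative), that could look like this:

```python
# Hypothetical example list; the titles are illustrative only
reading_list = ["Hamlet", "Macbeth"]

# append() adds a new item to the end of the list
reading_list.append("Othello")
print(reading_list)  # ['Hamlet', 'Macbeth', 'Othello']

# remove() deletes the first item equal to the given value
reading_list.remove("Macbeth")
print(reading_list)  # ['Hamlet', 'Othello']
```

Note that the order of the remaining items is preserved after each change.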
But what if we don't want the whole list, just one specific item from it? For that, we use its index. By specifying the numerical position of an item, we can retrieve just that single element. For instance, if we write print(renaissance_figures[1]), which name from our list do you think will be printed?
Let's see it in action.
# A list of key figures from the Renaissance
renaissance_figures = ["Leonardo da Vinci", "Michelangelo", "Raphael", "Donatello"]
# We can print the whole list to see it
print(renaissance_figures)
# Print the element at index 1
print(renaissance_figures[1])  # Note which one this is!
If you guessed "Michelangelo," you're correct! Python, like many programming languages, starts counting from 0. So, index 0 is "Leonardo da Vinci," and index 1 is "Michelangelo."
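To make the zero-based counting concrete, here is a small sketch that looks up a few positions in the same list. The use of len() and of a negative index goes beyond what the text covers and is included only as an illustration; the variable names first, second, last, and also_last are hypothetical.

```python
renaissance_figures = ["Leonardo da Vinci", "Michelangelo", "Raphael", "Donatello"]

first = renaissance_figures[0]   # counting starts at 0
second = renaissance_figures[1]

# With 4 items, the last valid index is 3 (the length minus one)
last = renaissance_figures[len(renaissance_figures) - 1]

# A negative index counts backwards from the end of the list
also_last = renaissance_figures[-1]

print(first)      # Leonardo da Vinci
print(second)     # Michelangelo
print(last)       # Donatello
print(also_last)  # Donatello
```

Asking for an index that does not exist, such as renaissance_figures[4] here, raises an IndexError.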
Using for Loops to Work with Collections
Now that we have a collection of items, how do we do something with each item inside it? This is where the for loop comes in. The for loop is specifically designed to iterate, or travel through, a collection from its beginning to its end. The core idea is simple: "For each item in my list, perform this specific action."
The general syntax for a for loop looks like this:
for temporary_variable in list_name:
    # Indented code block runs for each item
    # The temporary_variable holds the current item
Let's use this to print each artist from our list in a nicely formatted way.
renaissance_figures = ["Leonardo da Vinci", "Michelangelo", "Raphael", "Donatello"]
print("Key Artists of the Renaissance:")
# The loop will visit each name in the list
for figure in renaissance_figures:
    # 'figure' will be "Leonardo da Vinci" first, then "Michelangelo", etc.
    print(f"- {figure}")
print("The loop has finished.")
So, how does this actually work behind the scenes?
First, the for loop begins and picks the first item from the renaissance_figures list, which is "Leonardo da Vinci". It stores this value in the temporary variable we named figure. Then, it runs the indented code block, which is our print() command. Python sees the f-string f"- {figure}" and knows it needs to substitute the variable in the curly braces {}. It replaces {figure} with its current value and prints the final result: - Leonardo da Vinci.
The loop doesn't stop there. It runs again, this time assigning the next item, "Michelangelo", to the figure variable. It performs the same substitution and prints - Michelangelo. This process repeats for every single item in the list until all have been processed. The f-string is what allows us to format our output so neatly, combining the static text (the leading "- ") with the dynamic content of the figure variable in each step of the loop.
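One way to picture the walkthrough above is to write out by hand what the loop does on each pass. This is only a mental model: Python does not literally look items up by index number while looping, and the manual version is shown purely for comparison.

```python
renaissance_figures = ["Leonardo da Vinci", "Michelangelo", "Raphael", "Donatello"]

# Pass 1: 'figure' takes the first item, then the indented block runs
figure = renaissance_figures[0]
print(f"- {figure}")  # - Leonardo da Vinci

# Pass 2: 'figure' is reassigned to the next item, same block runs again
figure = renaissance_figures[1]
print(f"- {figure}")  # - Michelangelo

# Pass 3
figure = renaissance_figures[2]
print(f"- {figure}")  # - Raphael

# Pass 4: the last item; after this the loop would end
figure = renaissance_figures[3]
print(f"- {figure}")  # - Donatello
```

The for loop saves us from writing these repeated lines, and it keeps working unchanged if the list grows or shrinks.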
This looping capability isn't limited to lists. A string of text can also be treated as a collection of individual characters. This means we can use a for loop to analyze a string, character by character.
book_title = "Ulysses"
character_count = 0
print(f"Analyzing the title: {book_title}")
# The loop will visit each character: 'U', then 'l', then 'y', etc.
for char in book_title:
    print(f"Found character: '{char}'")
    character_count = character_count + 1
print(f"Total characters found: {character_count}")
Can you explain how this code works? Pay particular attention to the line character_count = character_count + 1. This is a common pattern for counting things. On each pass through the loop, it takes the current value of character_count, adds one to it, and saves the new result back into the same variable, effectively incrementing our counter for each character it finds.
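As a brief sketch of the same counting pattern, the version below uses the += shorthand and then compares the result with Python's built-in len() function. Neither += nor len() is introduced in the text, so treat them as optional extras rather than part of the lesson.

```python
book_title = "Ulysses"

character_count = 0
for char in book_title:
    # Shorthand for: character_count = character_count + 1
    character_count += 1

print(character_count)  # 7

# len() reports the number of characters directly, without a loop
print(len(book_title))  # 7
```

Writing the counter by hand is still worth practicing, because the same pattern lets you count only the items that meet some condition.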
What is a Method?
It's helpful to think of a piece of data, like a string, not just as a passive container for text, but as an object that comes with its own set of built-in "actions" or "tools." A method is simply a function that "belongs to" a specific piece of data.
You access these built-in actions using a dot (.). The syntax is always data.action(). This notation tells Python: "Take this specific piece of data and perform this action on it."
For example, let's say we have a string: my_name = "ada lovelace". This string object has a built-in method called .upper() that creates an all-caps version of it. To use it, we would write my_name.upper(), and the result would be the string 'ADA LOVELACE'.
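The example above can be run as the short sketch below. It also shows that .upper() produces a new string and leaves the original untouched; the variable name shouted_name is hypothetical and used only to hold the result.

```python
my_name = "ada lovelace"

# Call the method with a dot, and store the new string it creates
shouted_name = my_name.upper()

print(shouted_name)  # ADA LOVELACE
print(my_name)       # ada lovelace (the original is unchanged)
```

If you want to keep the transformed text, you need to store the result in a variable, as done here.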
The dot (.) is the bridge that connects your data to the method you want to use. Some methods, like .upper(), are simple and don't need any extra information to do their job. Other methods need you to provide arguments inside the parentheses to tell them how to perform the action.
Here are a few more useful string methods:
.lower(): Makes the entire string lowercase. For my_text = "A very long sentence.", my_text.lower() would produce 'a very long sentence.'.
.replace(old, new): Replaces a part of the string with something else. my_text.replace("long", "short") would give 'A very short sentence.'.
.startswith(text): Checks if the string begins with a certain sequence of characters and returns a boolean value, True or False. my_text.startswith("A very") would return True.
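The three methods listed above can be tried together in a short sketch. The final line, which checks a lowercase prefix, is an added illustration that is not in the text: it shows that .startswith() is case-sensitive.

```python
my_text = "A very long sentence."

# No arguments needed
print(my_text.lower())  # a very long sentence.

# Two arguments: the text to find and the text to put in its place
print(my_text.replace("long", "short"))  # A very short sentence.

# One argument: the prefix to check for
print(my_text.startswith("A very"))  # True
print(my_text.startswith("a very"))  # False (lowercase 'a' does not match 'A')
```

None of these calls changes my_text itself; each one returns a new value.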
But why does data have its own methods in the first place? The most important reason is that the actions that make sense for one type of data often don't make sense for another. Attaching methods directly to a data type ensures that you only have access to relevant and logical actions.
For instance, a string has text-related actions. It makes perfect sense to capitalize a book title.
my_title = "frankenstein"
my_title.capitalize() # Gives 'Frankenstein'
However, an integer has number-related properties, not text actions. What would it even mean to "capitalize" the number 42? It's an illogical operation.
my_number = 42
my_number.capitalize() # Raises an AttributeError: integers have no .capitalize() method
This system keeps your code organized and prevents you from trying to perform an action that is nonsensical for your data type.
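To see this protection at work without crashing the program, the sketch below wraps the call in try/except. This is a Python feature not covered in the text, used here only so the error can be displayed. The names error_was_raised and number_as_text are hypothetical.

```python
my_number = 42
error_was_raised = False

try:
    # Integers have no text methods, so this line fails
    my_number.capitalize()
except AttributeError as error:
    error_was_raised = True
    print(f"Python refused: {error}")

# Converting the number to a string first gives access to string methods
number_as_text = str(my_number)
print(number_as_text.startswith("4"))  # True
```

The error is Python's way of telling you that the action does not exist for that type of data.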
A Shortcut for Creating Lists: List Comprehensions
A very common task in programming is to create a new list by transforming items from an existing list. We can certainly do this with a standard for loop.
# The "long way"
names = ["Eliot", "Pound", "Joyce"]
uppercase_names = [] # 1. Create an empty list
for name in names: # 2. Loop through the old list
    uppercase_names.append(name.upper()) # 3. Append the transformed item to the new list
print(uppercase_names)
# Output: ['ELIOT', 'POUND', 'JOYCE']
Because this pattern of creating an empty list, looping, and appending is so common, Python provides a more elegant and compact way to do the exact same thing in a single, readable line. This is called a List Comprehension.
A list comprehension has a unique structure: [do_this_action for item in old_list].
Let's break down our names example using this structure: [name.upper() for name in names].
for name in names: This is the familiar loop part. It tells Python, "We are going to go through the names list one item at a time."
name.upper(): This is the action or expression. It says, "For each name we get from the loop, run the .upper() method on it."
[...]: Finally, the square brackets around the whole expression tell Python to collect all the results from the action and put them into a brand new list.
This powerful syntax combines the three steps of creating a list, looping, and appending into one clean line.
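As a quick check that the two approaches really are equivalent, the sketch below builds the list both ways and compares the results. The variable names loop_result and comprehension_result are hypothetical.

```python
names = ["Eliot", "Pound", "Joyce"]

# The "long way": empty list, loop, append
loop_result = []
for name in names:
    loop_result.append(name.upper())

# The list comprehension: the same three steps in one line
comprehension_result = [name.upper() for name in names]

print(comprehension_result)                 # ['ELIOT', 'POUND', 'JOYCE']
print(loop_result == comprehension_result)  # True
print(names)                                # ['Eliot', 'Pound', 'Joyce']
```

The last line shows that the original names list is left unchanged; both approaches produce a new list.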
Here is another simple example with numbers. Imagine we have a list of publication years and we want to create a new list showing how many years have passed since each publication, relative to the year 2025.
# The list of original publication years
pub_years = [1922, 1949, 1851, 1913]
# Using a list comprehension to calculate the age of each book
years_ago = [2025 - year for year in pub_years]
print(years_ago)
# Output: [103, 76, 174, 112]
In this case, 2025 - year is the action performed for each year in the pub_years list. The result is a new, clean list containing the calculated ages.
Summary
To wrap up, we've seen that we often need to store multiple pieces of related data, and a Python list is the perfect tool for that job. To perform an action on every single item in a collection (like a list or a string), we use a for loop. This combination of a list and a for loop is one of the most common and powerful patterns in all of programming.
In the next section, we'll look at a different kind of loop for situations where we don't have a list to go through, but instead want to loop based on a condition.