
Strings and String Methods

Explore Python string basics including creation, indexing, slicing, and immutability. Understand key string methods for data cleaning and transformations, and use f-strings for dynamic, readable output. Gain essential skills for working with text in Python applications.

Text is the primary medium through which software interacts with humans. Whether we are processing user input, reading files, or generating reports, our programs constantly read, modify, and display text. Python handles text well: its string methods cover common data-cleaning tasks, and its string formatting produces dynamic messages, all with clear, readable syntax. In this lesson, we will learn the tools needed to construct strings, extract specific data from them, and transform them to suit our application's logic.

Creating string literals

We define strings in Python by enclosing text in either single quotes (') or double quotes ("). Python treats both identically, but having two options gives us flexibility. If our text contains a single quote (like an apostrophe), we can wrap it in double quotes to avoid a syntax error, and vice versa.

Python 3.14.0
# Using single and double quotes
greeting = "Hello, Developer"
alert = 'System status: "Critical"'
contraction = "It's a great day to code"

print(greeting)
print(alert)
print(contraction)
  • Line 2: We define a standard string using double quotes, which is the most common way to represent text in Python.

  • Line 3: We leverage quote nesting by using single quotes on the outside. This tells Python that any double quotes found inside the string should be treated as literal text (part of the message) rather than the end of the string.

  • Line 4: We reverse the strategy to handle a contraction. By using double quotes around the string, we can safely include a single quote (an apostrophe) without the interpreter mistakenly thinking the string has closed early.

  • Lines 6–8: We output the variables to confirm that Python has correctly preserved the internal punctuation while successfully identifying the boundaries of each text block.
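As a quick check of the claim that Python treats both quote styles identically, the sketch below writes the same text both ways and compares the results. The variable names are illustrative only.

```python
# The same text written with each quote style
single = 'Hello, Developer'
double = "Hello, Developer"

# Both literals produce equal strings of the same type
print(single == double)   # True
print(type(single))       # <class 'str'>

# The outer quotes only mark the boundaries; the inner ones are stored text
quoted = '"Critical"'
print(len(quoted))        # 10: eight letters plus the two double quotes
```

The choice of quote style is therefore a matter of convenience and does not affect the value that is stored.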

Concatenation: Joining strings

We often need to combine small pieces of text into a larger message. To concatenate strings in Python, we use the + operator.

Python 3.14.0
first_name = "Ada"
last_name = "Lovelace"

# Concatenating strings
full_name = first_name + " " + last_name

print(full_name)
  • Line 5: We perform string concatenation by using the + operator to merge three distinct pieces of text. By including a literal space " " between the variables, we ensure the final result is a correctly formatted name rather than two words squashed together.

  • Line 7: We output the value of full_name to confirm the result. The concatenation produced a new string object; first_name and last_name are unchanged.
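One caveat is worth a short sketch: the + operator joins a string only with another string. The example below assumes a hypothetical birth_year variable and uses the built-in str() function to convert it before joining. The try/except block is there only so the script keeps running after the error.

```python
first_name = "Ada"
birth_year = 1815  # illustrative value, not part of the lesson's example

# "+" never inserts separators; we supply the space and parentheses ourselves
label = first_name + " (" + str(birth_year) + ")"
print(label)  # Ada (1815)

# Joining a str and an int directly raises a TypeError
try:
    broken = first_name + birth_year
except TypeError as error:
    print("TypeError:", error)
```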

Indexing and length

Python assigns a numerical position, or index, to every character in a string. Crucially, Python uses zero-based indexing, meaning the first character is at index 0, not 1.

To know how many characters a string contains (and thus what the valid indexes are), we use the built-in len() function. This counts every character, including spaces and punctuation.

We access characters using square brackets []. Python also supports negative indexing, which allows us to access characters starting from the end. Index -1 refers to the last character, -2 to the second to last, and so on.

This is incredibly useful because it lets us access the end of a string without needing to calculate its length manually.

Python 3.14.0
filename = "report_data.txt"

# Accessing by positive index
first_char = filename[0]

# Accessing by negative index
last_char = filename[-1]

print("First:", first_char)
print("Last:", last_char)
print("Length:", len(filename))
  • Line 4: We use zero-based indexing to target the absolute beginning of the string. By requesting index 0, we retrieve the character 'r' without affecting the original variable.

  • Line 7: We apply negative indexing to look at the string from the end. Using -1 provides a "shortcut" to the final character, 't', which remains reliable even if the content of the string changes in length.

  • Lines 9–10: We utilize the flexible nature of the print() function by passing it multiple arguments. Python handles the formatting for us by inserting a space between the label (the text) and the value (the character we extracted).

  • Line 11: We call the built-in len() function to count the characters in the string. This includes the letters, the underscore, and the file extension dot, for a total of 15. Note that len() reports the number of characters, not the string's size in memory.
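The relationship between positive and negative indexes can be sketched as follows. The variable names length, last_positive, and last_negative are illustrative, and the try/except block is used only to show the error without stopping the script.

```python
filename = "report_data.txt"
length = len(filename)  # 15

# Valid positive indexes run from 0 to length - 1
last_positive = filename[length - 1]

# A negative index i refers to the same character as index length + i
last_negative = filename[-1]
print(last_positive == last_negative)        # True
print(filename[-4] == filename[length - 4])  # True, both are "."

# Asking for an index past the end raises an IndexError
try:
    filename[length]
except IndexError as error:
    print("IndexError:", error)
```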

The image below shows how positive and negative indexes map onto the same characters:

Visual guide to positive and negative string indexing in Python

Slicing: Extracting substrings

When we need a range of characters rather than just one, we use slicing. The syntax is [start:stop].

Python extracts characters starting at the start index up to, but not including, the stop index. If we omit the start, it defaults to the beginning (0). If we omit the stop, it defaults to the end of the string.

Python 3.14.0
email = "user@example.com"

# Extracting 'user' (indices 0, 1, 2, 3)
username = email[0:4]

# Extracting 'example.com' (index 5 to the end)
domain = email[5:]

# Re-assigning or combining to create the final variable
complete_email = username + "@" + domain

print(f"Username: {username}")
print(f"Domain: {domain}")
print(f"Full Address: {complete_email}")
  • Line 4: We use slicing syntax [start:stop] to extract a segment. By setting the stop index to 4, we instruct Python to include every character up to, but not including, the @ symbol. This "stop-before" rule ensures that the length of the slice is exactly stop - start (in this case, 4 - 0 = 4 characters).

  • Line 7: We perform a partial slice. By omitting the stop value after the colon, we instruct Python to capture every character from index 5 through to the very end of the string. This is a concise way to capture the remainder of a string regardless of its length.

  • Line 10: We use string concatenation to build a new variable, complete_email. This demonstrates how we can break a string apart, process its pieces, and stitch them back together into a new, valid data object.

  • Lines 12–14: We use f-strings to output the results. This clarifies the "role" of each piece of data, showing that our slices successfully separated the user identity from the hosting domain.

Note that omitting both indexes, as in email[:], selects the entire string, so the result is equal to the original.
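The slicing defaults described above can be checked with a short sketch. The tld variable and the specific index values are chosen here for illustration.

```python
email = "user@example.com"

# Omitting start begins the slice at index 0
username = email[:4]   # "user"

# Negative indexes work in slices too
tld = email[-3:]       # "com"

# Slice length equals stop - start when both indexes are within range
middle = email[5:12]   # "example"
print(len(middle) == 12 - 5)  # True

# Omitting both indexes selects every character
print(email[:] == email)      # True

# Unlike indexing, an out-of-range slice bound is clipped instead of raising an error
print(email[5:100])           # example.com
```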

Immutability of strings

In Python, strings are immutable. This means that once a string is created, its contents cannot be changed. We cannot simply overwrite a character at a specific index.

Attempting to change a string directly raises an error, which stops the program if it is not handled:

Python
text = "Hyllo"

# This causes an error!
text[1] = "e"
  • Line 4: Python raises a TypeError: 'str' object does not support item assignment because we cannot modify the memory of the existing string.

To "modify" a string, we must create a new string containing the desired changes and assign it back to the variable.

Python 3.14.0
text = "Hyllo"

# Correct way: Create a new string by slicing and concatenation
fixed_text = text[0] + "e" + text[2:]

print(fixed_text)
  • Line 4: Since strings in Python are immutable (cannot be changed after they are created), we cannot simply swap out a single letter. Instead, we perform a "reconstruction." We extract the valid parts of the original string using slicing, insert our corrected character, and join them back together into an entirely new object in memory.

  • Line 6: We output the result to confirm that fixed_text now holds the correct spelling. Note that the original text variable still contains "Hyllo" in memory unless we explicitly choose to overwrite it.
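If we do want the original name to hold the corrected text, we can assign the new string back to it. The sketch below shows both the failed item assignment and the reassignment. The try/except block is included only so the script continues past the error.

```python
text = "Hyllo"

# Item assignment is rejected with a TypeError
try:
    text[1] = "e"
except TypeError as error:
    print("TypeError:", error)

# Rebinding the same name to a newly built string is allowed
text = text[:1] + "e" + text[2:]
print(text)  # Hello
```

Reassignment does not alter the old string; it makes the name text refer to a different, newly created string.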

Built-in string methods

Python string methods are built-in operations that perform common transformations on text. Since strings are immutable, these methods never change the original string; they return a new value instead (usually a new string, or a list in the case of split()).

We access these methods using the dot operator (.). While we will explore the mechanics of objects and dots later, for now, you can think of it as "asking" the string to perform an action on itself.

  • strip(): Removes whitespace (spaces, tabs, newlines) from the beginning and end of the string.

  • upper()/lower(): Converts text to all upper or lower case.

  • replace(old, new): Swaps occurrences of a substring with another.

  • split(delimiter): Breaks a string into a list of substrings based on a separator string (such as a comma or space).

Python 3.14.0
raw_data = " Python,Java,C++ "

# Strip removes surrounding whitespace
clean_data = raw_data.strip()
print(clean_data)

# Upper converts to uppercase
upper_data = clean_data.upper()
print(upper_data)

# Replace swaps text
modified_data = upper_data.replace("JAVA", "RUST")
print(modified_data)

# Split breaks the string into a list
languages = modified_data.split(",")
print(languages)
  • Line 4: strip() returns a copy of raw_data without the leading and trailing spaces; raw_data itself is unchanged.

  • Lines 8–9: upper() converts "Python,Java,C++" to "PYTHON,JAVA,C++". We print the result to confirm capitalization.

  • Lines 12–13: replace() finds "JAVA" and swaps it with "RUST".

  • Line 16: split(",") divides the string at every comma, resulting in a list: ['PYTHON', 'RUST', 'C++'].
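Because each method returns a new value instead of changing the original, calls can also be chained. The sketch below repeats the same steps in a single expression and uses the built-in repr() function to make the surrounding spaces visible. The variable name cleaned is illustrative.

```python
raw_data = "  Python,Java,C++  "

# Calling a method leaves the original string untouched
cleaned = raw_data.strip()
print(repr(raw_data))  # '  Python,Java,C++  '
print(repr(cleaned))   # 'Python,Java,C++'

# Each call returns a new value, so the next method can act on it directly
languages = raw_data.strip().upper().replace("JAVA", "RUST").split(",")
print(languages)       # ['PYTHON', 'RUST', 'C++']
```

Chaining is compact, but storing intermediate results in named variables, as the lesson's example does, is often easier to read and debug.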

String formatting with f-strings

The most readable way to combine variables with text in Python is the f-string (formatted string literal). Introduced in Python 3.6, this syntax has become the standard for formatting text because it is concise and easy to read.

By placing the letter f before the opening quote, we can embed variables directly inside the string using curly braces {}.

Python 3.14.0
item = "Laptop"
quantity = 2

# Using an f-string to embed values
summary = f"Order details: {quantity} x {item}"

print(summary)
  • Line 5: We start the string with f. Inside the braces {quantity} and {item}, Python inserts the variable values 2 and Laptop, respectively.

  • Line 7: We print the final formatted string: Order details: 2 x Laptop.

We can also place expressions or function calls directly inside the braces. We will not rely on this yet, but the syntax appears frequently in subsequent lessons.
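As a brief preview, here is a minimal sketch of expressions and calls inside f-string braces. The unit_price variable and its value are hypothetical and are not part of the lesson's order example.

```python
item = "Laptop"
quantity = 2
unit_price = 999.5  # hypothetical price, used only for illustration

# An arithmetic expression inside the braces
total_line = f"Total: {quantity * unit_price}"
print(total_line)   # Total: 1999.0

# A method call and a function call inside the braces
detail_line = f"{item.upper()} has {len(item)} letters"
print(detail_line)  # LAPTOP has 6 letters
```

Python evaluates whatever appears between the braces and inserts the result into the string.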

We have explored the foundations of working with text in Python. We know how to define strings, access their internal structure via indexing and slicing, and use powerful methods to clean and transform raw data. We also learned to use f-strings for clear, dynamic output. Since almost every application involves reading or writing text, these skills will serve as the building blocks for much of the code we write in the future.