Strings
Strings are immutable sequences of Unicode characters. They support concatenation, formatting, search, slicing, and type-checking, along with methods that return modified copies. Mastery of these operations is essential for text processing and interview problems.
Creating Strings
Strings are created with single, double, or triple quotes. Triple quotes allow multiline strings.
s = "hello"
s = 'world'
s = """line two
line four
line six"""
Immutability: Strings cannot be changed in place. Operations such as replace() and upper() return new strings; the original is unchanged.
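As a minimal sketch of what immutability means in practice (the variable names are only illustrative), the snippet below shows that methods hand back new strings, that item assignment fails, and that a "changed" string has to be rebuilt from slices:

```python
s = "hello"

# Methods return new strings; s itself is never altered.
upper = s.upper()
replaced = s.replace("l", "L")

# Item assignment is not supported on str objects.
try:
    s[0] = "H"
    error = None
except TypeError as exc:
    error = type(exc).__name__

# To "change" a character, build a new string from slices.
capitalized = "H" + s[1:]

print(s, upper, replaced, capitalized, error)
```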
Slicing and Indexing
Indexing
Indices start at 0. Negative indices count from the end: -1 is the last character.
s = "Python"
s[0] # "P"
s[-1] # "n"
s[2] # "t"
Out of range: s[20] raises IndexError.
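When an index may be out of range, one option is to catch the IndexError. The helper below is a hypothetical illustration (char_at is not a built-in), sketched under the assumption that an empty default is acceptable to the caller:

```python
def char_at(s: str, i: int, default: str = "") -> str:
    """Return s[i], or default when i is out of range (hypothetical helper)."""
    try:
        return s[i]
    except IndexError:
        return default


s = "Python"
print(char_at(s, 2))          # t
print(char_at(s, -1))         # n
print(repr(char_at(s, 20)))   # ''
```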
Slicing
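s[start:stop:step] selects the characters from start up to, but not including, stop, taking every step-th character. Omitted values use defaults: start=0, stop=len(s), step=1. With a negative step the defaults flip, so the slice runs from the last character back to the first.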
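The slice rule can be checked with a short sketch. It assumes the same example string used elsewhere in this section and also shows that, unlike indexing, slicing clamps out-of-range bounds rather than raising:

```python
s = "Python"

# Omitted bounds match the explicit defaults for a positive step.
assert s[:] == s[0:len(s):1] == "Python"

# A step of 2 takes every second character.
every_other = s[::2]        # "Pto"

# A negative step walks backwards from start down to (not including) stop.
backwards = s[4:1:-1]       # "oht"

# Out-of-range bounds are clamped instead of raising IndexError.
clamped = s[2:100]          # "thon"
empty = s[10:20]            # ""
```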
| Pattern | Description | Example |
|---|---|---|
| s[i] | Single character at index i | s[2] → "t" |
| s[:n] | First n characters | s[:2] → "Py" |
| s[n:] | From index n to end | s[2:] → "thon" |
| s[-m:-n] | Slice with negative indices | s[-4:-2] → "th" |
| s[::-1] | Reversed string | s[::-1] → "nohtyP" |
s = "Python"
s[:2] # "Py"
s[2:] # "thon"
s[-4:-2] # "th"
s[::-1] # "nohtyP"
reversed() and join()
reversed(s) returns a reverse iterator. Combine with join() to build a reversed string.
"".join(reversed("Python")) # "nohtyP"
len()
len(s) returns the number of characters in the string.
len("Python") # 6
Concatenation
The + Operator
+ joins two strings into a new string. Both operands must be strings.
"Hello" + "World" # "HelloWorld"
"2" + "4" + "6" # "246"
Type error: "2" + 4 raises TypeError. Convert numbers with str(): "2" + str(4) → "24".
The join() Method
sep.join(iterable) joins an iterable of strings with the separator. The separator is the string calling join(); the iterable provides the parts.
"-".join(["a", "b", "c"]) # "a-b-c"
"".join(["2", "4", "6"]) # "246"
" | ".join(["x", "y", "z"]) # "x | y | z"
Order: sep.join(parts) produces parts[0] + sep + parts[1] + sep + .... The separator appears only between elements.
Non-string elements: All elements must be strings. ",".join([2, 4, 6]) raises TypeError. Use ",".join(str(x) for x in [2, 4, 6]).
Empty separator: "".join(parts) concatenates with no separator.
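The join() rules above can be exercised in a few lines. This is a sketch using the same list of numbers as the text; the edge cases at the end (one element, no elements) follow from the separator appearing only between elements:

```python
numbers = [2, 4, 6]

# Non-string elements are rejected.
try:
    ",".join(numbers)
    raised = False
except TypeError:
    raised = True

# Convert each element first; a generator expression or map() both work.
joined = ",".join(str(n) for n in numbers)    # "2,4,6"
joined_map = ",".join(map(str, numbers))      # "2,4,6"

# A single element gets no separator; an empty iterable gives "".
single = "-".join(["a"])     # "a"
nothing = "-".join([])       # ""
```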
Formatting
F-Strings
F-strings embed expressions in {} within a string prefixed with f or F.
name = "Alice"
score = 86.8
f"Name: {name}, Score: {score}" # "Name: Alice, Score: 86.8"
Format specifiers: Use : followed by a format spec. :.2f formats a float to two decimal places.
x = 2.444
f"{x:.2f}" # "2.44"
Expressions in braces: Any valid expression can appear inside {}.
x, y = 4, 6
f"{x}*{y}={x*y}" # "4*6=24"
The format() Method
template.format(*args, **kwargs) fills placeholders in the template. Placeholders use {}; positional and keyword arguments fill them.
"{} and {}".format("a", "b") # "a and b"
"{0} and {1}".format("x", "y") # "x and y"
"{name} is {age}".format(name="Alice", age=24) # "Alice is 24"
Alignment and Padding
| Method | Description |
|---|---|
| s.center(width) | Center the string in a field of given width |
| s.ljust(width) | Left-justify; pad on the right |
| s.rjust(width) | Right-justify; pad on the left |
| s.zfill(width) | Pad with zeros on the left to reach width |
"hi".center(6) # "  hi  "
"hi".ljust(6) # "hi    "
"hi".rjust(6) # "    hi"
"42".zfill(6) # "000042"
Default padding: center, ljust, and rjust use spaces by default. An optional second argument specifies the fill character.
"hi".center(6, "-") # "--hi--"
Modify Strings
Strings are immutable; these methods return new strings.
Case Conversion
| Method | Description |
|---|---|
| s.upper() | All characters to uppercase |
| s.lower() | All characters to lowercase |
| s.casefold() | Aggressive lowercase for caseless comparison |
| s.capitalize() | First character uppercase; rest lowercase |
| s.swapcase() | Swap uppercase and lowercase |
| s.title() | First letter of each word uppercase |
"hello".upper() # "HELLO"
"HELLO".lower() # "hello"
"Straße".casefold() # "strasse" - ß → ss
"hello".capitalize() # "Hello"
"HeLLo".swapcase() # "hEllO"
"hello world".title() # "Hello World"
casefold() vs lower(): casefold() handles more Unicode cases (e.g. German ß). Use it for caseless comparison when lower() is insufficient.
Stripping Whitespace and Characters
| Method | Description |
|---|---|
| s.strip() | Remove leading and trailing whitespace |
| s.lstrip() | Remove leading whitespace |
| s.rstrip() | Remove trailing whitespace |
| s.strip(chars) | Remove leading/trailing characters that appear in chars |
" hi ".strip() # "hi"
" hi".lstrip() # "hi"
"hi ".rstrip() # "hi"
"xxhixx".strip("x") # "hi"
strip(chars): Removes all characters in chars from the start and end until a character not in chars is found. Order in chars does not matter.
Replace
| Method | Description |
|---|---|
| s.replace(old, new) | Replace all occurrences of old with new |
| s.replace(old, new, count) | Replace at most count occurrences |
"aabb".replace("a", "x") # "xxbb"
"one one one".replace("one", "two", 2) # "two two one"
Empty string: s.replace("", "x") inserts "x" between every character and at the ends. Use with care.
Split and Partition
| Method | Description |
|---|---|
| s.split(sep) | Split by separator; returns list of parts |
| s.split(sep, maxsplit) | Split at most maxsplit times from the left |
| s.rsplit(sep, maxsplit) | Split at most maxsplit times from the right |
| s.splitlines() | Split on line boundaries (\n, \r, \r\n, and others such as \v and \f) |
| s.partition(sep) | Split at first sep → (before, sep, after) |
| s.rpartition(sep) | Split at last sep → (before, sep, after) |
"a,b,c".split(",") # ["a", "b", "c"]
"a,b,c,d".rsplit(",", 2) # ["a,b", "c", "d"]
"line2\nline4\nline6".splitlines() # ["line2", "line4", "line6"]
"a:b:c".partition(":") # ("a", ":", "b:c")
"a:b:c".rpartition(":") # ("a:b", ":", "c")
split() with no argument: Splits on runs of whitespace; leading/trailing whitespace is ignored. Multiple spaces count as one separator.
"  a   b  c ".split() # ["a", "b", "c"]
partition when sep not found: partition() returns (s, "", ""), the whole string plus two empty strings. rpartition() returns ("", "", s), with the whole string last.
Prefix and Suffix Removal
| Method | Description |
|---|---|
| s.removeprefix(prefix) | Remove prefix if present (Python 3.9+) |
| s.removesuffix(suffix) | Remove suffix if present (Python 3.9+) |
"https://example.com".removeprefix("https://") # "example.com"
"file.txt".removesuffix(".txt") # "file"
No match: If the prefix or suffix is not present, the original string is returned unchanged.
Search
find() and index()
Both return the lowest index where the substring is found. The difference is behavior when the substring is absent.
| Method | When found | When not found |
|---|---|---|
| s.find(sub) | Index (int) | -1 |
| s.index(sub) | Index (int) | ValueError |
"hello".find("ll") # 2
"hello".find("x") # -1
"python".index("t") # 2
# "python".index("x") # raises ValueError
rfind() and rindex()
Both return the highest index where the substring is found. rfind() returns -1 when the substring is absent; rindex() raises ValueError.
"hello world".rfind("l") # 9
"hello woorld".rindex("o") # 8
count(), in, startswith(), endswith()
| Method | Description |
|---|---|
| s.count(sub) | Number of non-overlapping occurrences of sub |
| sub in s | True if sub is a substring |
| s.startswith(prefix) | True if string starts with prefix |
| s.endswith(suffix) | True if string ends with suffix |
"hello".count("l") # 2
"ell" in "hello" # True
"hello.py".startswith("hello") # True
"hello.py".endswith(".py") # True
Overlapping matches: count() counts non-overlapping occurrences. "aaa".count("aa") → 1, not 2.
Type-Checking Methods
These methods return True or False based on character properties. An empty string returns False for every one of them, including isspace().
| Method | Description |
|---|---|
| s.isalpha() | All characters are alphabetic |
| s.isalnum() | All characters are alphanumeric |
| s.isdigit() | All characters are digits (0–9, ², ₀, etc.) |
| s.isnumeric() | All characters are numeric (incl. ½, Ⅷ, etc.) |
| s.islower() | All cased characters are lowercase |
| s.isupper() | All cased characters are uppercase |
| s.istitle() | String is title-cased (each word capitalized) |
| s.isspace() | All characters are whitespace |
"letters".isalpha() # True
"abc246".isalnum() # True
"468".isdigit() # True
"½¾".isnumeric() # True (isdigit() → False)
"lowercase".islower() # True
"UPPERCASE".isupper() # True
"Title Case".istitle() # True
" ".isspace() # True
isdigit() vs isnumeric()
| String | isdigit() | isnumeric() |
|---|---|---|
| "246" | True | True |
| "²" | True | True |
| "½" | False | True |
| "Ⅷ" | False | True |
isdigit() covers digits (0–9) and some Unicode digit symbols. isnumeric() is broader: vulgar fractions, Roman numerals, and other numeric symbols.
Empty strings: Every method above returns False for "", including "".isspace().
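One practical consequence is that isdigit() returning True does not guarantee that int() will accept the text. The sketch below goes slightly beyond the methods listed above by using str.isdecimal(), the strictest of the three checks; to_int_or_none is a hypothetical helper and, as written, rejects signs such as "-5":

```python
def to_int_or_none(text: str):
    """Convert text to int only when every character is a decimal digit.

    Hypothetical helper; isdecimal() is not covered in the table above.
    """
    return int(text) if text.isdecimal() else None


# Compare the three checks on the samples from the table, plus "".
for sample in ["246", "²", "½", "Ⅷ", ""]:
    print(repr(sample), sample.isdecimal(), sample.isdigit(), sample.isnumeric())

# "²" passes isdigit(), yet int("²") raises ValueError.
try:
    int("²")
    superscript_ok = True
except ValueError:
    superscript_ok = False
```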
Tricky Behaviors
Strings are immutable
s.upper() returns a new string; it does not modify s. Assign the result: s = s.upper().
split() with empty separator
"ab".split("") raises ValueError. Use list("ab") to get a list of characters.
strip() removes characters, not substrings
strip(chars) removes leading and trailing characters that appear in chars, not the substring chars as a whole. For example, "hello".strip("ho") removes leading h and trailing o, yielding "ell".
replace() with count
replace(old, new, count) replaces from the left. The first count occurrences are replaced.
find() vs index()
Use find() when absence is expected; it returns -1. Use index() when absence is an error; it raises ValueError.
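The -1 result needs an explicit check, because -1 is also a valid index (the last character) and would otherwise be used silently. The two helpers below are hypothetical illustrations of each style:

```python
def after_marker(text: str, marker: str):
    """Return the text following marker, or None if absent (hypothetical helper)."""
    pos = text.find(marker)
    if pos == -1:                 # absence is an expected outcome
        return None
    return text[pos + len(marker):]


def after_marker_strict(text: str, marker: str) -> str:
    """Same lookup, but a missing marker propagates as ValueError."""
    pos = text.index(marker)      # raises ValueError when absent
    return text[pos + len(marker):]


print(after_marker("key: value", ": "))   # value
print(after_marker("novalue", ": "))      # None
```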
Interview Questions
How do you reverse a string?
s[::-1] or "".join(reversed(s)). Both create a new string.
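A typical follow-up is a palindrome check built on the same reversal. The sketch below assumes the usual interview convention of ignoring case and non-alphanumeric characters; is_palindrome is a hypothetical helper:

```python
s = "Python"

# Both idioms produce the same new string.
assert s[::-1] == "".join(reversed(s)) == "nohtyP"


def is_palindrome(text: str) -> bool:
    """Caseless palindrome check that ignores non-alphanumerics (hypothetical helper)."""
    cleaned = [ch for ch in text.casefold() if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))   # True
print(is_palindrome("Python"))                           # False
```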
What does "".join(parts) do?
Concatenates all strings in parts with no separator. Equivalent to parts[0] + parts[1] + ... when all elements are strings.
When to use casefold() instead of lower()?
Use casefold() for caseless comparison when lower() is insufficient (e.g. German ß). casefold() produces a more aggressive normalization.
What is the difference between find() and index()?
Both return the lowest index of the substring. find() returns -1 when not found; index() raises ValueError.
What is the difference between isdigit() and isnumeric()?
isdigit() is True for digits (0–9) and some Unicode digit symbols. isnumeric() is broader: vulgar fractions (½), Roman numerals (Ⅷ), and other numeric symbols.