I want to insert a word at the beginning of the string in each row of a column when the first word is not x. If the first word is already x, the row should be left unchanged. In the example below, x = BP, so if the first word in cat is not BP, BP should be inserted.
import pandas as pd

df = pd.DataFrame({
    'cat': ['BP STATION', 'STATION', 'BP OLD', 'OLD OLD'],
})
df['cat'] = df['cat'].str.replace(r'^\w+', 'BP')
Intended result:
          cat
0  BP STATION
1  BP STATION
2      BP OLD
3  BP OLD OLD
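You need your search regex to match "not BP", and you need to capture what it matches so that it isn't removed by the replacement. So you want r'^([^B][^P])', and the replacement is then r'BP \1'. Note that this pattern tests the first two characters separately rather than the word BP, so it also leaves strings such as 'BIG SHOP' or 'UP TOWN' unchanged. It is sufficient for the sample data, where no other row starts with B or has P as its second letter.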
>>> df['cat'] = df['cat'].str.replace(r'^([^B][^P])', r'BP \1')
<stdin>:1: FutureWarning: The default value of regex will change from True to False in a future version.
>>> df
          cat
0  BP STATION
1  BP STATION
2      BP OLD
3  BP OLD OLD
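If the column may contain values that start with B, or have P as the second letter, without BP being the first word, a negative lookahead is one possible alternative. The sketch below uses only the standard `re` module. The helper name `add_bp` and the extra sample strings 'BIG SHOP' and 'UP TOWN' are made up for illustration and are not part of the original answer.

```python
import re

# Hypothetical helper, not from the original answer.
# ^(?!BP\b) matches the empty string at the start of the text, but only
# when the text does not begin with the whole word "BP".
NOT_BP_FIRST = re.compile(r"^(?!BP\b)")


def add_bp(text: str) -> str:
    """Prepend 'BP ' unless the first word is already 'BP'."""
    return NOT_BP_FIRST.sub("BP ", text)


# The four strings from the question plus two assumed edge cases.
samples = ["BP STATION", "STATION", "BP OLD", "OLD OLD", "BIG SHOP", "UP TOWN"]
results = [add_bp(s) for s in samples]

# For comparison: the character-class pattern applied to the same strings.
# It leaves "BIG SHOP" and "UP TOWN" unchanged.
char_class = [re.sub(r"^([^B][^P])", r"BP \1", s) for s in samples]

for before, after in zip(samples, results):
    print(f"{before!r} -> {after!r}")
```

With the DataFrame from the question, the helper could be applied row by row with `df['cat'] = df['cat'].map(add_bp)`, assuming the column holds only strings.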