I want to insert a word at the beginning of a string for each row in a column where the first word is not x. If it is x, then move on. In the example below, x = BP. So if the first word in cat is not BP, then insert it.
import pandas as pd

df = pd.DataFrame({
    'cat': ['BP STATION', 'STATION', 'BP OLD', 'OLD OLD'],
})
df['cat'] = df['cat'].str.replace(r'^\w+', 'BP')
intent:
cat
0 BP STATION
1 BP STATION
2 BP OLD
3 BP OLD OLD
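You need your search regex to be “not BP”, and you need to capture what it matches so that it doesn’t get removed in the replacement. So you want r'^([^B][^P])'. The replacement is then r'BP \1'. Note that this pattern only checks that the first character is not B and the second character is not P, so it works for the sample data but would skip a value such as SPOT or BALL.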
>>> df['cat'] = df['cat'].str.replace(r'^([^B][^P])', r'BP \1')
<stdin>:1: FutureWarning: The default value of regex will change from True to False in a future version.
>>> df
cat
0 BP STATION
1 BP STATION
2 BP OLD
3 BP OLD OLD
>>>
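If the values can start with words like SPOT or BALL, a negative lookahead may be a more robust way to express "the first word is not BP". The following is a minimal sketch, not the only way to do it: the extra rows SPOT and BPX are made-up test values, and the pattern is compiled with `re.compile` on the assumption that this keeps pandas on Python's `re` engine regardless of the string dtype in use. `regex=True` is passed explicitly because of the FutureWarning shown above.

```python
import re

import pandas as pd

df = pd.DataFrame({
    # 'SPOT' and 'BPX' are hypothetical extra rows added to exercise edge cases
    'cat': ['BP STATION', 'STATION', 'BP OLD', 'OLD OLD', 'SPOT', 'BPX'],
})

# ^(?!BP\b) matches the empty position at the start of the string,
# but only when the string does not begin with the whole word "BP".
# Replacing that empty position inserts the prefix without removing anything.
not_bp_first = re.compile(r'^(?!BP\b)')

df['cat'] = df['cat'].str.replace(not_bp_first, 'BP ', regex=True)
result = df['cat'].tolist()
print(result)
```

Because the match is zero-width, no capture group or `\1` backreference is needed. The `\b` word boundary means a value like BPX is treated as not starting with the word BP, so it gets the prefix too.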