What I want to achieve is something like this:
    >>> camel_case_split("CamelCaseXYZ")
    ['Camel', 'Case', 'XYZ']
    >>> camel_case_split("XYZCamelCase")
    ['XYZ', 'Camel', 'Case']
So I searched and found this perfect regular expression:
(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])
As the next logical step, I tried:
    >>> re.split("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])", "CamelCaseXYZ")
    ['CamelCaseXYZ']
Why does this not work, and how do I get the result from the linked question in Python?
Edit: Solution summary
I tested all provided solutions with a few test cases:
    string: ''
    AplusKminus: ['']
    casimir_et_hippolyte: []
    two_hundred_success: []
    kalefranz: string index out of range # with modification: either [] or ['']

    string: ' '
    AplusKminus: [' ']
    casimir_et_hippolyte: []
    two_hundred_success: [' ']
    kalefranz: [' ']

    string: 'lower'
    all algorithms: ['lower']

    string: 'UPPER'
    all algorithms: ['UPPER']

    string: 'Initial'
    all algorithms: ['Initial']

    string: 'dromedaryCase'
    AplusKminus: ['dromedary', 'Case']
    casimir_et_hippolyte: ['dromedary', 'Case']
    two_hundred_success: ['dromedary', 'Case']
    kalefranz: ['Dromedary', 'Case'] # with modification: ['dromedary', 'Case']

    string: 'CamelCase'
    all algorithms: ['Camel', 'Case']

    string: 'ABCWordDEF'
    AplusKminus: ['ABC', 'Word', 'DEF']
    casimir_et_hippolyte: ['ABC', 'Word', 'DEF']
    two_hundred_success: ['ABC', 'Word', 'DEF']
    kalefranz: ['ABCWord', 'DEF']
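In summary, you could say that the solution by @kalefranz does not match the question (see the last case), and the solution by @casimir et hippolyte eats a single space, which violates the idea that a split should not change the individual parts. The only difference between the remaining two alternatives is that my solution returns a list containing the empty string for an empty-string input, while the solution by @200_success returns an empty list. I don't know where the Python community stands on that issue, so I say: I am fine with either one. And since 200_success's solution is simpler, I accepted it as the correct answer.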
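As @AplusKminus has explained, re.split() never splits on an empty pattern match (this holds for Python versions before 3.7). Therefore, instead of splitting, you should try finding the components you are interested in.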
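A note for readers on newer interpreters: since Python 3.7, re.split() also splits on empty matches, so the lookaround pattern from the question can be passed to it directly. The following is a minimal sketch under that assumption; the names BOUNDARY and split_on_boundaries are made up for illustration and are not taken from any of the answers.

```python
import re

# Pattern from the question: a zero-width boundary between a lowercase and an
# uppercase letter, or between an uppercase run and a following capitalized word.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")

def split_on_boundaries(identifier):
    # Hypothetical helper; relies on Python 3.7+ splitting on empty matches.
    return BOUNDARY.split(identifier)

print(split_on_boundaries("CamelCaseXYZ"))
print(split_on_boundaries("XYZCamelCase"))
```

Older versions do not split on empty matches, so code that has to run there still needs a match-based approach.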
Here is a solution using re.finditer() that emulates splitting:
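    from re import finditer

    def camel_case_split(identifier):
        # Lazily consume characters up to the next case boundary or the end of the string.
        matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
        return [m.group(0) for m in matches]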
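To check the function against the test strings from the question's summary, a small harness like the one below can be used. This is only a sketch: the cases dictionary is an illustrative name, and its expected values are taken from the two_hundred_success rows of that summary.

```python
from re import finditer

def camel_case_split(identifier):
    # Each match is one component: the shortest run that ends at a case
    # boundary or at the end of the string.
    matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]

# Test strings from the question's summary, with the expected components.
cases = {
    '': [],
    ' ': [' '],
    'lower': ['lower'],
    'UPPER': ['UPPER'],
    'Initial': ['Initial'],
    'dromedaryCase': ['dromedary', 'Case'],
    'CamelCase': ['Camel', 'Case'],
    'ABCWordDEF': ['ABC', 'Word', 'DEF'],
}

for text, expected in cases.items():
    result = camel_case_split(text)
    assert result == expected, (text, result)
    print(repr(text), '->', result)
```

Note that an empty input yields an empty list here, because '.+?' needs at least one character to produce a match.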