我需要将叙述字段(自由文本)拆分为多行。目前的格式大致如下:
Case_Reference | Narrative ```````````````|````````````````````````````````````` XXXX/XX-123456 | [Endless_Text up to ~50k characters]
在文本的叙述字段中,单个条目(当各种代理对情况有所帮助时)以条目日期开头,后跟两个空格(即'dd/mm/yyyy '),并且日期值随同一字段中的每个条目而变化。
'dd/mm/yyyy '
换句话说,在寻找更好的定界符之后,我只能使用这种格式的字符串,因此我需要在Narrative文本中标识格式(掩码是否是更好的词?)匹配的多个位置'dd/mm/yyyy '。
我可以确定一致字符串的多次出现没有问题,但是它可以从本质上确定要查找的位置:
'%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %'
PATINDEX当然会返回此事件的第一个出现/位置,但据我所知,没有办法“修改”此事件(即创建的函数)以允许他们选择此事件的其余出现/位置我们可以CHARINDEX(因为PATINDEX没有起始位置参数)。
PATINDEX
CHARINDEX
为了清楚起见,我并不是在寻找直接定界的代码,因为我需要进一步操作每个条目,因此,它纯粹是我要查找的Narrative文本中多次出现的字符串的位置。
任何帮助将不胜感激。
为了清楚起见,没有选择执行此预导入,因此需要在此着陆数据上进行。
所需的输出将是
Case_Reference1 | 1st_Position_of_Delimiter_String Case_Reference1 | 2nd_Position_of_Delimiter_String Case_Reference2 | 1st_Position_of_Delimiter_String Case_Reference2 | 2nd_Position_of_Delimiter_String Case_Reference2 | 3rd_Position_of_Delimiter_String
您可以通过递归CTE解决此问题
DECLARE @tbl TABLE (Case_Reference NVARCHAR(MAX),Narrative NVARCHAR(MAX)); INSERT INTO @tbl VALUES (N'C1',N'01/02/2000 Some text with blanks 02/03/2000 More text 03/04/2000 An even more') ,(N'C2',N'01/02/2000 Test for C2 02/03/2000 One more for C2 03/04/2000 An even more 04/05/2000 Blah') ,(N'C3',N'01/02/2000 Test for C3 02/03/2000 One more for C3 03/04/2000 An even more') ; WITH recCTE AS ( SELECT 1 AS Step,Case_Reference,Narrative,CAST(1 AS BIGINT) AS StartsAt,NewPos.EndsAt+10 AS EndsAt,LEN(Narrative) AS MaxLen ,SUBSTRING(Narrative,NewPos.EndsAt+10+1,999999) AS RestString FROM @tbl AS tbl CROSS APPLY(SELECT PATINDEX('%[0-3][0-9]/[0-1][0-9]/[1-2][0-9][0-9][0-9] %',SUBSTRING(Narrative,12,9999999))) AS NewPos(EndsAt) UNION ALL SELECT r.Step+1,r.Case_Reference,r.Narrative,r.EndsAt+1,CASE WHEN NewPos.EndsAt>0 THEN r.EndsAt+NewPos.EndsAt+10 ELSE r.MaxLen END,r.MaxLen ,SUBSTRING(r.RestString,NewPos.EndsAt+10+1,999999) FROM recCTE AS r CROSS APPLY(SELECT PATINDEX('%[0-3][0-9]/[0-1][0-9]/[1-2][0-9][0-9][0-9] %',SUBSTRING(r.RestString,12,99999999))) AS NewPos(EndsAt) WHERE r.EndsAt<r.MaxLen ) SELECT Step,Case_Reference,StartsAt,EndsAt ,SUBSTRING(Narrative,StartsAt,EndsAt-StartsAt+1) AS OutputString FROM recCTE ORDER BY Case_Reference,Step
结果
+------+----------------+----------+--------+---------------------------------------+ | Step | Case_Reference | StartsAt | EndsAt | OutputString | +------+----------------+----------+--------+---------------------------------------+ | 1 | C1 | 1 | 38 | 01/02/2000 Some text with blanks | +------+----------------+----------+--------+---------------------------------------+ | 2 | C1 | 39 | 60 | 02/03/2000 More text | +------+----------------+----------+--------+---------------------------------------+ | 3 | C1 | 61 | 84 | 03/04/2000 An even more | +------+----------------+----------+--------+---------------------------------------+ | 1 | C2 | 1 | 24 | 01/02/2000 Test for C2 | +------+----------------+----------+--------+---------------------------------------+ | 2 | C2 | 25 | 52 | 02/03/2000 One more for C2 | +------+----------------+----------+--------+---------------------------------------+ | 3 | C2 | 53 | 77 | 03/04/2000 An even more | +------+----------------+----------+--------+---------------------------------------+ | 4 | C2 | 78 | 93 | 04/05/2000 Blah | +------+----------------+----------+--------+---------------------------------------+ | 1 | C3 | 1 | 24 | 01/02/2000 Test for C3 | +------+----------------+----------+--------+---------------------------------------+ | 2 | C3 | 25 | 52 | 02/03/2000 One more for C3 | +------+----------------+----------+--------+---------------------------------------+ | 3 | C3 | 53 | 76 | 03/04/2000 An even more | +------+----------------+----------+--------+---------------------------------------+