Tuesday, May 1, 2012

SSIS Strip digits from a string.

Today's scenario is a "File Name" that contains digits and need to strip these digits out, it can be at the beginning of the string, middle or at the end.


If you are wondering, why there's digits! it's the size of that file and only comes at the end of the string, but just to demonstrate various scenarios that you might encounter.


For our demo, I'll create a table, has 2 columns, an ID and a FilePath column.
the output will be 4 columns, our 2 input columns along with 2 new columns, FileSize and CleanedFilePath.  Of course you can overwrite your FilePath with the digits stripped one.



Let's run the below script to create the test demo and insert 3 test rows.
<><>
SSIS Strip Digits



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
    /* Create Table */ 

    IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[STripDigits]') AND type in (N'U'))
    BEGIN 

    CREATE TABLE [dbo].[STripDigits](
     [ID] [int] NULL,
     [FilePath] [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    ) ON [PRIMARY]
    END
    GO


    /* Insert Test Records */ 

    INSERT INTO STripDigits VALUES
    (1,'6590FolderProcessed')
    ,(2,'Patients32PCP')
    ,(3,'Manual1817')


    /* Verify Records */ 

    Select * from STripDigits

Steps:

1-Basically I'll add a "OLE DB Source" to read from my SQL table, I'll assume you know how to configure that.



2-Will add a "Script Component", will use "Visual Basic .NET" as my scripting language.
Will grab the Input Column (FilePath)
Will Create 2 new output columns, a string for the cleaned File Path, and a File Size for the digits portion of the string.






 Add that VB Script...
I'm using Regular Expressions here, so simple 2 lines of code..


   
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
Private StripDigits As New System.Text.RegularExpressions.Regex("\d")

    Private StripTxt As New System.Text.RegularExpressions.Regex("\D")




 Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        '

        Row.CleanedFilePath = StripDigits.Replace(Row.FilePath, "")

        Row.FileSize = CInt(StripTxt.Replace(Row.FilePath, ""))

    End Sub







Your package should look like that...Note that i'm terminating using a Multitask so I can use a Data Viewer t ocheck the results..




 And the results.... 




The package was created using SQL 2012, but ti's the same for SQL 2008.
The package can be downloaded from https://www.box.com/s/df9e48f7800927aac46c

Hope that helps someone...

No comments:

Post a Comment