I am pretty new to SSIS, so I really appreciate the help!
I have been building a few packages for the ETL in a mini data warehouse and
I have come across a few tasks that are very common. These include things
that, ensuring the business/natural key(s) are unique, or ensuring
many-to-one hierarchies are not violated. These kinds of tasks would be used
over and over again in almost any ETL project with different inputs.
Does anyone know of some good websites were users share their custom
components? I would like to avoid building this kind of thing if someone els
e
has done it already.
ThanksThere are a couple of sites that share some custom SSIS components.
http://www.sqlis.com/
http://www.sqlbi.eu/
However I feel I should add that for both the examples that you gave, I
normally enforce at the relational database level using unique
contraints and foreign key relationships.
Regards
Darren Gosbell [MCSD]
Blog: http://geekswithblogs.net/darrengosbell
*** Sent via Developersdex http://www.codecomments.com ***|||Thanks Darren.
The two you mentioned are two of my favorites at the moment. I just want to
keep looking for more!
As for enforcing these using a relational database...It would be great if I
could do this, but I am going to be doing these kinds of checks on many
different inputs. It would take a LOT of time (both design and run time) to
put all of these into 3NF just to check for these flags. At the same time,
with the type 2 changes, I could only use the built ing relational
constraints if I filtered by "validAt" before importing the flat files into
the database.
Thank you very much for your help!
"Darren Gosbell" wrote:
> There are a couple of sites that share some custom SSIS components.
> http://www.sqlis.com/
> http://www.sqlbi.eu/
> However I feel I should add that for both the examples that you gave, I
> normally enforce at the relational database level using unique
> contraints and foreign key relationships.
> --
> Regards
> Darren Gosbell [MCSD]
> Blog: http://geekswithblogs.net/darrengosbell
> *** Sent via Developersdex http://www.codecomments.com ***
>|||> As for enforcing these using a relational database...It would be great if I">
> could do this, but I am going to be doing these kinds of checks on many
> different inputs. It would take a LOT of time (both design and run time) t
o
> put all of these into 3NF just to check for these flags. At the same time,
> with the type 2 changes, I could only use the built ing relational
> constraints if I filtered by "validAt" before importing the flat files int
o
> the database.
If you are tracking type 2 changes you would generally build a composite
unique constraint over something like SourceSystemId, BusinessKey and
ValidFromDate.
The problem that you would have with doing these sort of checks in an
SSIS component is that it would probably need to be a synchronous
component, which would dramatically increase memory usage as the entire
input would need to be buffered in order to check uniqueness. Typically
SSIS likes to work in smaller batches, pushing each batch right through
the entire pipeline if it can in order to reuse its internal memory
buffers.
Regards
Darren Gosbell - SQL Server MVP
Blog: http://www.geekswithblogs.net/darrengosbell|||Hello Darren,
Do you mean an Async component?
Regards
Allan Mitchell
Konesans Ltd
T +44 7966 476 572
F +44 2071 008 479
http://www.konesans.com
> If you are tracking type 2 changes you would generally build a
> composite unique constraint over something like SourceSystemId,
> BusinessKey and ValidFromDate.
> The problem that you would have with doing these sort of checks in an
> SSIS component is that it would probably need to be a synchronous
> component, which would dramatically increase memory usage as the
> entire input would need to be buffered in order to check uniqueness.
> Typically SSIS likes to work in smaller batches, pushing each batch
> right through the entire pipeline if it can in order to reuse its
> internal memory buffers.
>|||Hi Allan
I did mean synchronous. I was thinking that it would need to block
subsequent components from executing until it had processed all the
incoming rows so that it could check for uniqueness.
Although now you bring it up, I see that it could be written as an async
component. It would only need to maintain a buffer of all the business
keys that have passed through it and not necessarily buffer the entire
input, so the memory usage would not be as bad as I first thought.
So it might be a viable approach to check uniqueness in either an SSIS
component or in a relational store.
Regards
Darren Gosbell - SQL Server MVP
Blog: http://www.geekswithblogs.net/darrengosbell
In article <5aaa4480100ca8c8d70bf92cbae8@.msnews.microsoft.com>,
allan@.no-spam.sqldts.com says...
> Hello Darren,
> Do you mean an Async component?
>
> Regards
> Allan Mitchell
> Konesans Ltd
> T +44 7966 476 572
> F +44 2071 008 479
> http://www.konesans.com
>
>
>|||Hello Darren,
If we didn't want everything popping out the bottom of the component that
came in at the top we would have to be async. A sync component simply puts
everything it saw coming in, out, no change of buffer.
Regards
Allan Mitchell
Konesans Ltd
T +44 7966 476 572
F +44 2071 008 479
http://www.konesans.com
[vbcol=seagreen]
> Hi Allan
> I did mean synchronous. I was thinking that it would need to block
> subsequent components from executing until it had processed all the
> incoming rows so that it could check for uniqueness.
> Although now you bring it up, I see that it could be written as an
> async component. It would only need to maintain a buffer of all the
> business keys that have passed through it and not necessarily buffer
> the entire input, so the memory usage would not be as bad as I first
> thought.
> So it might be a viable approach to check uniqueness in either an SSIS
> component or in a relational store.
> In article <5aaa4480100ca8c8d70bf92cbae8@.msnews.microsoft.com>,
> allan@.no-spam.sqldts.com says...
>|||Thanks Allan, sounds like I did have my component terminology round the
wrong way.

But you can see where I was heading. Orginally I was thinking that it
would need to be a blocking component that would need to examine the
entire input before allowing subsequent component to execute. But I now
think it could be done in a much lighter fashion, only maintaining an
internal list of business keys and generating an error if it hits a
duplicate.
Regards
Darren Gosbell - SQL Server MVP
Blog: http://www.geekswithblogs.net/darrengosbell
In article <5aaa4480102118c8d7deb1db066c@.msnews.microsoft.com>,
allan@.no-spam.sqldts.com says...[vbcol=seagreen]
> Hello Darren,
>
> If we didn't want everything popping out the bottom of the component that
> came in at the top we would have to be async. A sync component simply pu
ts
> everything it saw coming in, out, no change of buffer.
>
> Regards
> Allan Mitchell
> Konesans Ltd
> T +44 7966 476 572
> F +44 2071 008 479
> http://www.konesans.com
>|||Thank you both for your comments. I did decide to build it in T-SQL for the
hierarchy check. Essentially I have a stored Procedure that uses dynamic SQL
to look for duplicates as at particular date. SSIS just loops through the
dates. For each insert, I just combine this stored procedure into the same
transaction (in SSIS) as the insert. this way, if there is ever a hierarchy
violation, the insert will fail. It will generate an SSIS error that I can
catch and handle.
But my original question still stands. If anyone knows of any other websites
for good custom components, please post here.
Thanks again,
"Darren Gosbell" wrote:
> Thanks Allan, sounds like I did have my component terminology round the
> wrong way.

> But you can see where I was heading. Orginally I was thinking that it
> would need to be a blocking component that would need to examine the
> entire input before allowing subsequent component to execute. But I now
> think it could be done in a much lighter fashion, only maintaining an
> internal list of business keys and generating an error if it hits a
> duplicate.
> --
> Regards
> Darren Gosbell - SQL Server MVP
> Blog: http://www.geekswithblogs.net/darrengosbell
> In article <5aaa4480102118c8d7deb1db066c@.msnews.microsoft.com>,
> allan@.no-spam.sqldts.com says...
>
No comments:
Post a Comment