Moving denormalized data to Reporting DB











up vote
1
down vote

favorite












Current situation:



We have a SQL database containing all of our business data. There are dozens of objects like customers, tickets and products. Each object is saved in multiple tables, usually between 8 and 12 tables per objects. So our database has way over 200 tables right now.



To make Reporting easier, we want to create a Reporting Database where we denormalize those objects to make it more accessible and improve the performance of our live DB since all reports take data from the live db right now.



The problem:



This is there the tricky past starts: we also need to create new columns with data in the target / reporting db where we can store calculated data for reporting. For example how long did a ticket take from start to finish or who did order a product (denormalized as text comma separated)



The question is , what is the best solution to transfer this data to a new database? We had different approaches in mind:




  • Using a stored procedure:


Creating a sql script that transfers all defined columns to the new database using for example merge. Downside here is that it will probabely be performance heavy since all caculations would have to be done by the database itself.




  • Creating a client application:


Creating a client application that selects the data from the live db, calculates all the necessary columns during runtime and inserts it into the reporting database.




  • Using an SSIS job:


Which is kind of a mix between the two above: A ssis job that gets the data, calculates all extended columns and puts them into the reporting db.



I know this might be a bit of a "opinion based" questions but there has to be a best practice solution since we cant be the only ones having this need. Unfortunately I really cand find any good answer on the web.



Any advice is really apreciated!










share|improve this question




















  • 1




    many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
    – Mitch Wheat
    21 hours ago












  • We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
    – colosso
    20 hours ago










  • "but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
    – Mitch Wheat
    20 hours ago












  • Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
    – JonTout
    20 hours ago










  • The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
    – colosso
    19 hours ago















up vote
1
down vote

favorite












Current situation:



We have a SQL database containing all of our business data. There are dozens of objects like customers, tickets and products. Each object is saved in multiple tables, usually between 8 and 12 tables per objects. So our database has way over 200 tables right now.



To make Reporting easier, we want to create a Reporting Database where we denormalize those objects to make it more accessible and improve the performance of our live DB since all reports take data from the live db right now.



The problem:



This is there the tricky past starts: we also need to create new columns with data in the target / reporting db where we can store calculated data for reporting. For example how long did a ticket take from start to finish or who did order a product (denormalized as text comma separated)



The question is , what is the best solution to transfer this data to a new database? We had different approaches in mind:




  • Using a stored procedure:


Creating a sql script that transfers all defined columns to the new database using for example merge. Downside here is that it will probabely be performance heavy since all caculations would have to be done by the database itself.




  • Creating a client application:


Creating a client application that selects the data from the live db, calculates all the necessary columns during runtime and inserts it into the reporting database.




  • Using an SSIS job:


Which is kind of a mix between the two above: A ssis job that gets the data, calculates all extended columns and puts them into the reporting db.



I know this might be a bit of a "opinion based" questions but there has to be a best practice solution since we cant be the only ones having this need. Unfortunately I really cand find any good answer on the web.



Any advice is really apreciated!










share|improve this question




















  • 1




    many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
    – Mitch Wheat
    21 hours ago












  • We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
    – colosso
    20 hours ago










  • "but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
    – Mitch Wheat
    20 hours ago












  • Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
    – JonTout
    20 hours ago










  • The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
    – colosso
    19 hours ago













up vote
1
down vote

favorite









up vote
1
down vote

favorite











Current situation:



We have a SQL database containing all of our business data. There are dozens of objects like customers, tickets and products. Each object is saved in multiple tables, usually between 8 and 12 tables per objects. So our database has way over 200 tables right now.



To make Reporting easier, we want to create a Reporting Database where we denormalize those objects to make it more accessible and improve the performance of our live DB since all reports take data from the live db right now.



The problem:



This is there the tricky past starts: we also need to create new columns with data in the target / reporting db where we can store calculated data for reporting. For example how long did a ticket take from start to finish or who did order a product (denormalized as text comma separated)



The question is , what is the best solution to transfer this data to a new database? We had different approaches in mind:




  • Using a stored procedure:


Creating a sql script that transfers all defined columns to the new database using for example merge. Downside here is that it will probabely be performance heavy since all caculations would have to be done by the database itself.




  • Creating a client application:


Creating a client application that selects the data from the live db, calculates all the necessary columns during runtime and inserts it into the reporting database.




  • Using an SSIS job:


Which is kind of a mix between the two above: A ssis job that gets the data, calculates all extended columns and puts them into the reporting db.



I know this might be a bit of a "opinion based" questions but there has to be a best practice solution since we cant be the only ones having this need. Unfortunately I really cand find any good answer on the web.



Any advice is really apreciated!










share|improve this question















Current situation:



We have a SQL database containing all of our business data. There are dozens of objects like customers, tickets and products. Each object is saved in multiple tables, usually between 8 and 12 tables per objects. So our database has way over 200 tables right now.



To make Reporting easier, we want to create a Reporting Database where we denormalize those objects to make it more accessible and improve the performance of our live DB since all reports take data from the live db right now.



The problem:



This is there the tricky past starts: we also need to create new columns with data in the target / reporting db where we can store calculated data for reporting. For example how long did a ticket take from start to finish or who did order a product (denormalized as text comma separated)



The question is , what is the best solution to transfer this data to a new database? We had different approaches in mind:




  • Using a stored procedure:


Creating a sql script that transfers all defined columns to the new database using for example merge. Downside here is that it will probabely be performance heavy since all caculations would have to be done by the database itself.




  • Creating a client application:


Creating a client application that selects the data from the live db, calculates all the necessary columns during runtime and inserts it into the reporting database.




  • Using an SSIS job:


Which is kind of a mix between the two above: A ssis job that gets the data, calculates all extended columns and puts them into the reporting db.



I know this might be a bit of a "opinion based" questions but there has to be a best practice solution since we cant be the only ones having this need. Unfortunately I really cand find any good answer on the web.



Any advice is really apreciated!







sql sql-server database reporting-services






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 21 hours ago

























asked 21 hours ago









colosso

2,16631739




2,16631739








  • 1




    many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
    – Mitch Wheat
    21 hours ago












  • We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
    – colosso
    20 hours ago










  • "but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
    – Mitch Wheat
    20 hours ago












  • Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
    – JonTout
    20 hours ago










  • The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
    – colosso
    19 hours ago














  • 1




    many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
    – Mitch Wheat
    21 hours ago












  • We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
    – colosso
    20 hours ago










  • "but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
    – Mitch Wheat
    20 hours ago












  • Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
    – JonTout
    20 hours ago










  • The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
    – colosso
    19 hours ago








1




1




many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
– Mitch Wheat
21 hours ago






many ETL processes are a combination of all 3 of the approaches you listed. Have you considered first using replication to a staging database? (or use AG) Otherwise, it's likely that the ETL will put load on your source DB (the problem you are trying to avoid!). Related: dba.stackexchange.com/questions/52449/…
– Mitch Wheat
21 hours ago














We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
– colosso
20 hours ago




We have considered using Always On to have a replication but unfortunately its too expensive (moneywise). But the load on the live DB could be avoided using a client application those few selects would not cost any perormance and all calculations could be done by the application server, right?
– colosso
20 hours ago












"but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
– Mitch Wheat
20 hours ago






"but unfortunately its too expensive (moneywise)" - but you need another server anyway? right? ... "But the load on the live DB could be avoided using a client application those few selects would not cost any performance" - why do you think that? It's you are performing transformations those selects could still take time....
– Mitch Wheat
20 hours ago














Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
– JonTout
20 hours ago




Write the output of the query into a table in the reporting server? We have at least 3 different student management systems where I work, all of the servers are linked and there are overnight (or hourly) queries run from one of those systems to store data for reporting in a "reports database"
– JonTout
20 hours ago












The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
– colosso
19 hours ago




The ReportingDB would be on the same instance as the live database. Do you think that would be a problem? Its more a political problem, we wont get any more finaces for a new server right now so we need to work with what we have. I thought about transfering a delta every 15 minutes. A select on the last changed rows within the last 15 minutes even if we target all 200 tables would not have any inpact at all in my opinion.
– colosso
19 hours ago

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53371067%2fmoving-denormalized-data-to-reporting-db%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes
















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53371067%2fmoving-denormalized-data-to-reporting-db%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Volksrepublik China

How to test boost logger output in unit testing?

Write to the output between two pipeline