Building a SQL Server Data Warehouse solution can be accomplished in a number of ways in Azure. For good data warehouse reference architecture you have a number of choices. At your disposal are SQL DB, Data Lake, SQL Server (both Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)). Choosing the right one for your business comes down to cost and computing power.
SQL Data Warehouse
The first and the easiest of the different reference architectures is to do everything in SQL DW. Since data staging, refinement and querying is done within this technology, it is very straight forward and easy to implement. You can also easily scale up and down as your needs change. Sounds good right? Well it is also the most expensive. SQL DW is not cheap. Depending on your needs you can also run into performance problems if you are doing data refinement while users are trying to do reporting. You can also run into concurrent query limit that is present in SQL DW.
Azure Data Lake & SQL Data Warehouse
The second option is to use Azure Data Lake for data staging and refinement and using SQL DW for reporting. This is a nice option. The heavy processing is done in Data Lake which scales nicely and avoids the high cost of using SQL DW for data staging and refinement. SQL DW only being used for reporting means you can pause it while users are not accessing the reporting structure. It also means you don’t need to store as much data in SQL DW. It is also possible to use PolyBase in SQL DW to access the data in Data Lake especially for archived data. It might be a little slower but again cost is the driving factor here.
SQL Data Warehouse & SSAS Cube
The third option is to use SQL DW for staging and refinement and then copying the data for SQL Server and create an SSAS cube (either IaaS or PaaS). SQL DW is nice option for data refinement since you can do the refinement using SQL Queries and have a very powerful engine for doing big queries. The SQL Server and SSAS option for reporting gives you the benefit of not running into the concurrent query limit of SQL DW and creating reports and queries off a less expensive query solution. It also gives you the advantage of row level security and a semantic layer that SQL DW does not. The drawback is extra ETL to do the data movement and multiples copies of data.
Azure Data Lake and SSAS Cube
The forth option is to use Azure Data Lake for staging and refinement and then copying the data for SQL Server and create an SSAS cube (either IaaS or Paas). This is a great option for data staging and refinement since you can do the heaving processing using the relatively cheap Azure Data Lake. The same advantages of the third option exist for reduced costs using the SQL Server and SSAS option as do the same drawbacks exist.
The bottom line is you have to really understand what you are trying to accomplish for your Data Warehouse needs and how much you are willing to spend as well as the maintenance and ongoing care and feeding needed with each reference architecture. If you need further guidance or assistance implementing your Data Warehouse in Azure, please feel free to reach out to me. I would love to help.