Congratulations American Airlines

I did not think it possible. American Airlines has beat all the odds and actually made the center seat more hellish! Some of you may not remember, roughly 10 years ago there were several law suits, some not valid, brought against airlines for the cramped travel conditions. Doctors even had a name for the injury done to frequent fliers who had to sit in the wrong position for so long. American Airlines, almost immediately, yanked out 2-3 rows of seats to spread them all out giving everyone more leg room.

On Monday I flew 4 hours in the center seat of a 1 month old aircraft ran by American and I can safely say the  leg room has been reduced back to pre “we want your business” levels. Not only was the distance between the seats ridiculously small, but the space under the seat in front was reduced 1/3 to accommodate some idiotic metal box that was bolted to the seat support that was also offset 3+ inches from where it should have been.

There was no lumbar support in the seat back. I was writhing to gain relief between lower back pain, upper back pain and calf spasms the entire journey. They only make the aisle seats available to customers with status. Well there’s a great way to discourage a person from flying with you enough to earn status. They’ll be wheelchair bound by 25k miles.

Coming Home

Almost 2 months ago I had my first day at Avanade. For those of you who don’t know, Avanade was cerated as a join venture between Microsoft and Accenture. Avanade has thier own business development streams but 99.9% of the Microsoft projects Accenture wins, are sent to the the Avanade team for execution.

Well let me just say what an absolute joy it has been to come back to the Microsoft family of products. After 13 months of wasting my life away fighting with Open Source garbage, I’ve come home to integrated enterprise solutions that work as advertised or at least have some reliable sources for support when they don’t. I was actually told to stop blogging about how much the Open Stack is a waste of time and money… Anyway, that’s behind me.

To add to the good vibes, Avanade is connected to Microsft in so many ways. We’ve actually had advanced looks at new technologies before the rest of the community. There 20+ MVPs in just the Midwest region, Avanade requires 80+ hours of training every year, and employees are encouraged to participate in developer community organizations.

I’m excited to talk about the first area of expertise they’d like me to look at, Avanade Touch Analytics (ATA). I haven’t completed the training yet, but this offering is fantastic. The easiest interface I’ve ever used to create dashboards that look and feel like Tableau or Spotfire, but perform lightyears ahead of both. Once the data sources are made available to the ATA server for any customer’s instance, the dashboards can be authored for or on any device. Switch between layout views to see how your dashboards will look on any device before releasing them. Publish multiple dashboards to different Active Directory security groups and let your users pick the information that’s important to them. It’s exciting, and I’m glad to see an offering addressing the shortcomings of the competition in a hosted or onsite instalations.

Well that’s enough advertising. Now that my censorship is at an end, I’ll be blogging mroe often I really want to discuss SQL Server’s memory resident database product, interesting things I’ve learned about the SSIS Service recently, and Service Broker.

Consulting 101: Credibility and Integrity

Let me preface this treatise with a message to those in my audience who actually know me in person. I’ve been doing what I do for almost 18 years. My blog posts are a compilation of observations stretching that whole time and back into my years in grade school. I do not refer to anyone in particular any of you and I may know. My blogs are mostly about me.
How many times can a restaurant you frequent get your order wrong before you stop spending your money there? How many times can a garage fail to fix your car before you take it somewhere else for service? As a consultant, contractor, or subject matter expert, how many mistakes is your customer willing to forgive? I don’t know either, so I always shoot for perfection.
In my practice, the struggle for perfection means I will not quickly offer my gut feeling on a solution to a problem. I want to research the situation and think it over for some time until I am comfortable taking a position. The discipline to be 99% sure about something before I share it helps me avoid mistakes. The more often I’m right the more my credibility builds. The buildup of credibility eventually leads to my customers’ increasing confidence in my work. And that’s great because, a lack of confidence in my expertise always manifests itself as more time wasted in explanations, healthy debate, and sometimes fruitless arguments about things I’m at least 99% sure of.
Relatively, I do not propose solutions that I cannot implement 100% myself. There is a theme of helplessness prevailing through some workplace environments; taking the shape of people who will not lift a finger to figure something out without being fully trained and having a stack of documentation. I’m going to put on my old fogey hat now and relate to you, my audience, how my first ASP web sites were written in notepad. My “simulator” was an actual Windows NT server with IIS and FrontPage extensions. In those days there wasn’t any documentation really because we were figuring it out as we went. I was handed a challenge that usually looked nothing like requirements and told to go figure it out. I did figure it out without training and it made a better professional out of me.
So when I say, “Let’s do it this way.” I mean I can do the whole thing this way myself if I have to. I’m 99% sure it will meet all the requirements on paper and the several that you haven’t thought of yet.
Now, I am human and I do make mistakes. Under the perfection mandate, I strive to find my mistakes and fix them before everyone notices. I once worked for a company where the products all had a call home feature. When there was an error the system would either dial in or FTP a message to a system in the home office that would create a ticket and kick off a workflow for resolution. I was so impressed by the fact that a customer could come in the office in the morning to find an email from tech support notifying them an error was detected and fixed remotely overnight and the customer suffered no outage as a result. I strive to conduct my business the same way by fixing an issue as soon as I determine it’s my responsibility and then explaining what happened and how I fixed it. That’s exercising integrity to build credibility. The value of building credibility is always greater than the perceived liability of admitting to bugs with integrity.
All that said, every action has its equal and opposite reaction. There will always be competitive forces… or persons who will work to build credibility through damaging yours. After all, it seems hard to build credibility by simply agreeing with someone else all the time even if the other person has a 99% success rate. The perception is that always agreeing with another makes one a follower or toady. Likewise, some resources are hiding the fact that they will not succeed with your proposal because it involves things they haven’t been trained on. Yes, the corporate business environment often mirrors school yard factions carving out various spheres of dominance. Woe unto the executive staff that has to always play teacher or referee. Truly, you have to pity decision makers who are constantly dealing with weak personalities who cannot tolerate others discovering they may not be perfect, so seek to advance solely through bringing down others.
The school yard provides the tactic for dealing with this. Get to teacher first! Luckily, if you’re catching, fixing, and admitting to your short comings before anyone notices, your competition shows up to tattle on you and looks rather foolish. Teacher says, “Yes I know. He told me and corrected the issue in such a seamless matter we never knew anything was wrong.”
Don’t misunderstand. It makes me sick that adults conduct themselves in this matter. It’s one of the reasons I sought the freedom of working for myself. Even now, when these situations arise, I suffer less than healthy rises in blood pressure. Why do we have to go through this schoolyard battle again after I’ve already built up all this credibility? The point is to revert back to the idea of not immediately going with gut reactions mentioned above. Don’t fall into the competitive traps. Diligently building credibility through accuracy and integrity should, in theory, pay off in the long run. Optionally, find a sub-contractor and throw them to the wolves.

I’m not a DBA, But I Play One on TV: Part 3 – Database Files

When a customer invites me to review their SQL Server or Oracle databases and server architecture, I start with the servers. I review the hard disk layout and a few server settings. The very next thing I do is review the data files and log files for the databases. In the case of SQL Server, when I see one data file and one log file in the same directory and the database has one file group called Primary, I know I am once again presiding over amateur hour at the local chapter of the Jr. Database Developer Wannabe Club.

 

One file pointing to one file group indicates to me:

  1. Someone went through the “create new database” wizard.
  2. There wasn’t any pre-development design analysis done before the database was created
  3. No one bothered to check readily available best practices for SQL Server
  4. I can anticipate equally uninformed approaches to table and index design and query authoring

 

This will antagonize the hardware striping advocacy group, but there are reasons to split up split up your data files and log files. Specifically in the case of TempDB files, you can greatly improve performance by creating the same number of log files as you have processors. With this configuration each processor will control the I/O for each file.

 

Check out number 8 here: http://technet.microsoft.com/en-US/library/cc966534

 

In addition to performance, recovery processes greatly benefit for splitting up the database files. Previously, if a data file failed, if everything was in one file or not, SQL Server would take the database offline. With SQL Server 2012 a new feature was added that will leave your database accessible, just not the data located in the corrupt or otherwise unavailable file. Well if all the data is in that one file your dataset is down until you can recover. Even if that data file contains only a subset of the data in a table, the rest of the data in that table is still available for querying.

 

Now, you might say ok we’re going to have a separate file for every table and multiple files for some. Ok, I’ve seen that configuration and there isn’t anything wrong with it. If your IT department isn’t using SQL Server to manage their backups, instead they’re backing up the actual files across all the drives, they’re going to be annoyed with you. However, this configuration gives you maximum flexibility.  For instance, placing tables that are commonly used at the same time on different spindles won’t conflict for disk I/O.

 

Splitting up your log files is also beneficial. Log files are populated in a round robin fashion. When one reaches the level you’ve set it starts filling up the next. Hopefully you have at least 4 and they are of a sufficient size. This gives you time to archive the transaction logs between backups making sure no transactions are lost due to the file rolling over before the backup removes completed transactions and shrinks the file.

 

Next episode will cover backup basics. The purpose in all these posts is to provide the understanding to apply the best configuration to the database system your building.

 

I’m not a DBA, But I Play One on TV: Part 2 – CPU and RAM

In Part 1 I discussed SQL Server and Hard Disk configurations. Now let’s have a look at CPU and RAM. This topic is actually kind of easy. More is better… most of the time.

CPU

It’s my opinion that most development environments should have a minimum of 4, 2.5+ GHz Processors, If that’s one socket with two cores, or one socket with 4 cores or, or two sockets with 2 cores, doesn’t really make that much of a difference. For a low utilization production system you’ll need 8, 2.5+ GHz processors. Look, you can get this level of chip in a mid-high grade laptop. Now if you’re looking at a very high utilization system it’s time to think about 16 processors or 32 split up over 2 or more sockets. Once you get to the land of 32 processors advanced SQL Server configuration knowledge is required. In particular you will need to know how to tweak the MAXDOP (Maximum Degree of Parallelism) setting.

Here’s a great read for setting a query hint: http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/

And here are instructions for a system wide setting: http://technet.microsoft.com/en-us/library/ms189094(v=sql.105).aspx

What does this setting do? It controls the number of parallel processes SQL Server will use when servicing your queries. So why don’t we want SQL Server to maximize the number of parallel processes all the time? There is another engine involved in the process that is responsible for determining which processes can and cannot be done in parallel and the order of the parallel batches. In a very highly utilized SQL Server environment this engine can get bogged down. Think of it like air traffic control at a large airport… but there’s only one controller in the tower and it’s Thanksgiving the biggest air travel holiday in the US. Well the one air traffic controller has to assign the runway for every plane coming in and going out. Obviously, he/she becomes the bottleneck for the whole airport. If this individual only had one or two runways to work with, they wouldn’t be the bottleneck; the airport architecture is the bottleneck. I have seen 32 processor systems grind to a halt with MAXDOP set at 0 because the parallelism rule processing system was overwhelmed.

For more information on the parallel processing process: http://technet.microsoft.com/en-us/library/ms178065(v=sql.105).aspx

RAM

RAM is always a “more is better” situation. Keep in mind that if you don’t set the size and location of the page file manually, the O/S is going to try and take 1.5 times of the RAM from the O/S hard drive. The more RAM on the system, the less often the O/S will have to utilize the much slower page file. For a development system 8GB will probably be fine, but now a days you can get a mid-high level Laptop with 16GB even 32GB is getting pretty cheap. For production 16GB is the minimum, but I’d really urge you to get 24GB. And like I said 32GB configurations are becoming very affordable.

I’m not a DBA, But I Play One on TV: Part 1 – Hard Drives

This is the first in a series of posts relating to hardware considerations for a SQL Server 2008 R2 or later server. In Part 1 – Hard Drives I’m going to discuss RAID levels and what works for the Operating System (O/S) versus what works for various SQL Server components.

As a consultant I always go through the same hardware spec dance. It sounds like this:

Q: How much disk space does your application database require?

A: Depends on your utilization.

Q: Ok, what’s the smallest server we can give you for a proof of concept or 30 day trial?

A: Depends on your utilization.

Q: Well we have this VM with a 40 GB disk, 8 GB RAM, and a dual Core virtual processor available. Will that work?

A: Depends on your utilization, but I seriously doubt it.

SQL Server 2008 R2, depending on the flavor will run on just about any Windows Server O/S 2005 and newer, Windows 7 and Windows 8. This isn’t really a discussion about the O/S, more of how the O/S services SQL Server hardware requests. At the hardware level the O/S has two main functions managing memory and the hard disks and servicing requests to those resources to applications.

In a later post we’ll look at memory in a little more depth, but for the hard disk discussion we’ll need to understand the page file. The page file has been part of Microsoft’s O/S products since NT maybe windows for workgroups, but I don’t want to go look it up. The page file is an extension of the physical memory that resides one or more of the system’s hard disks. The O/S will decide when to access this portion of the Memory available to services and applications (processes) requesting memory resources. Many times when a process requires more memory than is currently available the O/S will use the page file to virtually increase the size of the memory on the system in a manner transparent to the requesting process.

Let’s sum that up. The page file is a portion of disk space used by the O/S to expand the amount of memory available to processes running on the system. The implication here is that the O/S will be performing some tasks meant for lightning fast chip RAM, on the much slower hard disk virtual memory because there is insufficient chip RAM for the task. By default the O/S wants to set aside 1.5 times the physical chip RAM in virtual memory disk space. For 16GB of RAM that’s a 24GB page file. On a 40GB drive that doesn’t leave much room for anything else. The more physical chip RAM on the server the bigger the O/S will want to make the page file, but the O/S will actually access it less often.

Now let’s talk RAID settings! You may find voluminous literature arguing the case for software RAID versus Hardware Raid. I’ll leave that to the true server scientists. I’m just going to give quick list of which RAID configurations O/S and SQL Server components will perform well with and which will cause issues. I’m going for understanding here. There are plenty of great configuration lists you can reference, but if you don’t understand how this stuff works you’re relying on memorization or constantly referencing the lists.

Summarization from: http://en.wikipedia.org/wiki/RAID

But this has better pictures: http://technet.microsoft.com/en-us/library/ms190764(v=SQL.105).aspx

RAID 0 – Makes multiple disks act like one, disk size is the sum of all identical disk sizes and there isn’t any failover or redundancy. One disk dies and all info is lost on all drives.

RAID 1 – Makes all the disks act like one, disk size is that of one of the identical disks in the array. Full fail over and redundancy.

RAID 2 – Theoretical, not used. Ha!

RAID 3 – Not very popular, but similar RAID 1, except that each third byte switches to the next disk in the array.

RAID 4 – One drive holds pointers to which drive holds each file. All disks act independently buy access by one drive letter.

RAID 5 – Requires at least 3 identical drives. All but one are live at all times the last acts as a backup should one of the other drives fail.

RAID 6 – Like RAID 5 except, you need at least 4 identical disks and two are offline backup disks.

RAID 10 or 1+0 – A tiered approach where two groups of RAID 1 arrays form a RAID 0 array. So two fully redundant RAID 1 arrays of 500GB made up of 3 500GB disks come together to form 1 RAID 0 array of 1TB. Sounds expensive, 3TB in physical disks to get 1TB accessible drive.

At this point I’ll paraphrase the information found here: http://technet.microsoft.com/en-US/library/cc966534

SQL Server Logs are written synchronously. One byte after the other. There isn’t any random or asynchronous read requests performed against these files by SQL Server. RAID 1 or 1+0 is recommended for this component for two reasons 1. Having a full redundant backup of the log files for disaster recovery. 2. RAID 1 mirrored drives support the sequential write I/O (I/O is short for disk read and write Input and Output. I’m not going to write that 50 times.) of the log file process better than RAID configuration that will split one file over multiple disks.

TempDB is the workhorse of SQL Server. When a query is sent to the databases engine all the work of collecting, linking, grouping, aggregating and ordering happens in the TempDB before the results are sent to the requestor. This makes TempDB a heavy write I/O process. So the popular recommendation is RAID 1+0. Here’s the consideration, TempDB is temporary, and that’s where it gets its name from. So redundancy isn’t required for disaster recovery. However if the disk your TempDB files are on fails, no queries can be processed until the disk is replaced and TempDB restored/rebuilt. RAID 1+0 helps fast writes and ensures uptime. RAID 5 provides the same functionality with fewer disks, but decreased performance when a disk fails.

TempDB and the Logs should NEVER EVER reside on the same raid arrays, so if we’re talking a minimum two RAID 1+0 arrays, might be more cost effective to put TempDB on RAID 5.

Application OLTP (On-line Transaction Processing) databases will benefit the most from RAID 5, which equally supports read and write I/O. Application databases should NEVER EVER reside on the same arrays as the Log files and co-locating with TempDB is also not recommended.

SQL Server comes with other database engine components like the master database and MSDB. These are SQL Server configuration components and mostly utilize read I/O. It’s good to have these components on a mirrored RAID configuration that doesn’t need a lot of write performance, like RAID 1.

A best practice production SQL Server configuration minimally looks like this:

Drive 1: O/S or C: Drive where the virtual memory is also serviced – RAID 1, 80 to 100 GB.

Drive 2: SQL Server Components (master, MSDB, and TempDB) data files – RAID 1+0, 100-240 GB

Drive 3: SQL Server Logs – RAID 1+0, 100-240 GB

Drive 4: Application databases – RAID 5, As much as the databases need…

Where to skimp on a development system? Maybe RAID isn’t available either?

Drive 1: O/S or C: Drive where the virtual memory is also serviced, 80 to 100 GB.

Drive 2: SQL Server Components (master, MSDB, and TempDB) data files Application database files, As much as the databases need…

Drive 3: SQL Server Logs, 100-240 GB

Optimal Production configuration?

Drive 1: O/S or C: Drive – RAID 1, 60 GB.

Drive 2: SQL Server Components (master, MSDB) data files – RAID 5, 100GB

Drive 3: SQL Server Logs – RAID 1+0, 100-240 GB

Drive 4: Application databases – RAID 5, As much as the databases need…

Drive 5: TempDB RAID 1+0, 50–100 GB

Drive 6: Dedicated Page File only RAID 1, 40GB. You don’t want to see what happens to a Windows O/S when the page file is not available.

Buffer I/O is the bane of my existence. I have left no rock unturned on the internet trying to figure out how this process works. So if someone reading can leave a clarifying comment for an edit I’d appreciate it. This I do know, the buffer is kind of like SQL Server’s own page file. A place on a hard disk where information is staged before it is written to the memory pool managed by the O/S. If your system is low on memory and using the page file extensively you will see Buffer I/O waits in the SQL Server Management Studio activity monitor. Basically, this indicates that the staging process is waiting on memory to become available to move data out of the buffer and into the memory pool. The query can’t write more information to the buffer until there is space open in the buffer for it. In fact if the query resultset is big enough, the whole system will begin to die a slow and horrible death as information cannot move in and out of memory or in and out of the buffer because so much information is going in and out of the page file. This is why I highly recommend splitting up the disks so that SQL Server does not have to fight with the page file for Disk I/O.

Look if you have 10 records in one table used by one user 2 times a day that VM with a 40 GB disk, 8 GB RAM, and a dual Core virtual processor available is going to do just fine. But you might as well save some cash and move that sucker onto Access or MYSQL or some other non-enterprise level RDBMS.

 

 

Kimball Dimensional Modeling Practices Waterfall Only?

Kimball Dimensional Modeling theory and practices are the most widely accepted processes for consolidating data from different sources into a central “Delivery Area” for consolidated cross functional reporting. Or simply a process for normalizing and standardizing data from several data marts into a data warehouse for Business Intelligence (BI) reporting.

From “The Data Warehouse Tool Kit”; Second Edition; Kimball, Ross; 2002; John Wiley and Sons, Inc.; Page 22:

Finally, dimensional models are gracefully extensible to accommodate change. The predictable framework of a dimensional model withstands unexpected changes in user behavior. Every dimension is equivalent; all dimensions are symmetrically equal entry points into the fact table. The logical model has no built-in bias regarding expected query patterns. There are no preferences for the business questions we’ll ask this month versus the questions we’ll ask next month. We certainly don’t want to adjust our schemas if business users come up with new ways to analyze the business.

The main tool used to discover the applicable Dimensions for a model is the Business Process Dimensions Matrix, see Figure 1. Each row represents a Business Process and Each Column a Dimension available within the Delivery Area that is populated manually or through the consolidation of the source data.

BPM

Figure 1. Kimball, Ross; et al.; Page 79.

On this surface this looks like a Waterfall approach… Identifying all the requirements upfront before development starts. I disagree. This document should be an organic repository that is constantly updated with changes like new Dimensions or Business process as additional data sources are added to the system. Further, I believe this matrix is the perfect primer for authoring User Stories. For Instance the first Business Process would translate to:

As an Inbound Contact Center Supervisor I want to see Voice, Chat, Email and Fax metrics summarized by Date, Time, Agents (Users), Goals and Locations So that I can…

The last section of the User Story where the reason or benefit is recorded also derive from a central tenant of BI practices. That central tenant is, “Every report must answer a question to aid in the conclusion of one or more business decisions.” The question we’re answering does not show up on the matrix, but should be part of the BI project management artifacts and the User story is the perfect place to record it.

If you’re reading this you may be a BI solution developer and suffered the frustrations of pointless and repetitive presentations (reports and dashboards) because your customers don’t know what they want. Someone on the project must take it upon themselves to get the stake holders to commit to the questions they want to answer. In Agile Scrum, it would make sense that the Product Owner maintains the matrix and the user stories and therefore should be responsible for those commitments. Here’s an Example of the resulting user story.

As an Inbound Contact Center Supervisor I want to see Voice, Chat, Email and Fax metrics summarized by Date, Time, Agents (Users), Goals and Locations So that I can more accurately forecast future staffing needs.

The wide acceptance of Kimball practices predates the wide acceptance of Agile Iterative Development practices. Therefore, several professionals in the space are unwilling to adapt their practice of Kimball methodologies. Hopefully this discussion will aid in efforts to convince these BI resources to modify their approach to conform to the Software Development Life Cycle (SDLC) methodology the rest of the development team uses.