Coming Home

Almost 2 months ago I had my first day at Avanade. For those of you who don’t know, Avanade was created as a joint venture between Microsoft and Accenture. Avanade has their own business development streams, but 99.9% of the Microsoft projects Accenture wins are sent to the Avanade team for execution.

Well let me just say what an absolute joy it has been to come back to the Microsoft family of products. After 13 months of wasting my life away fighting with Open Source garbage, I’ve come home to integrated enterprise solutions that work as advertised or at least have some reliable sources for support when they don’t. I was actually told to stop blogging about how much the Open Stack is a waste of time and money… Anyway, that’s behind me.

To add to the good vibes, Avanade is connected to Microsoft in so many ways. We’ve actually had advance looks at new technologies before the rest of the community. There are 20+ MVPs in just the Midwest region, Avanade requires 80+ hours of training every year, and employees are encouraged to participate in developer community organizations.

I’m excited to talk about the first area of expertise they’d like me to look at, Avanade Touch Analytics (ATA). I haven’t completed the training yet, but this offering is fantastic. It’s the easiest interface I’ve ever used to create dashboards that look and feel like Tableau or Spotfire, but perform light-years ahead of both. Once the data sources are made available to the ATA server for any customer’s instance, the dashboards can be authored for or on any device. Switch between layout views to see how your dashboards will look on any device before releasing them. Publish multiple dashboards to different Active Directory security groups and let your users pick the information that’s important to them. It’s exciting, and I’m glad to see an offering addressing the shortcomings of the competition in hosted or onsite installations.

Well, that’s enough advertising. Now that my censorship is at an end, I’ll be blogging more often. I really want to discuss SQL Server’s memory-resident database product, interesting things I’ve learned about the SSIS Service recently, and Service Broker.

Consulting 101: Credibility and Integrity

Let me preface this treatise with a message to those in my audience who actually know me in person. I’ve been doing what I do for almost 18 years. My blog posts are a compilation of observations stretching across that whole time and back into my grade school years. I am not referring to anyone in particular whom any of you and I may know. My blogs are mostly about me.
How many times can a restaurant you frequent get your order wrong before you stop spending your money there? How many times can a garage fail to fix your car before you take it somewhere else for service? As a consultant, contractor, or subject matter expert, how many mistakes is your customer willing to forgive? I don’t know either, so I always shoot for perfection.
In my practice, the struggle for perfection means I will not quickly offer my gut feeling on a solution to a problem. I want to research the situation and think it over for some time until I am comfortable taking a position. The discipline to be 99% sure about something before I share it helps me avoid mistakes. The more often I’m right, the more my credibility builds. The buildup of credibility eventually leads to my customers’ increasing confidence in my work. And that’s great because a lack of confidence in my expertise always manifests itself as more time wasted in explanations, healthy debate, and sometimes fruitless arguments about things I’m at least 99% sure of.
Relatedly, I do not propose solutions that I cannot implement 100% myself. There is a theme of helplessness prevailing through some workplace environments, taking the shape of people who will not lift a finger to figure something out without being fully trained and having a stack of documentation. I’m going to put on my old fogey hat now and relate to you, my audience, how my first ASP web sites were written in Notepad. My “simulator” was an actual Windows NT server with IIS and FrontPage extensions. In those days there wasn’t much documentation because we were figuring it out as we went. I was handed a challenge that usually looked nothing like requirements and told to go figure it out. I did figure it out without training, and it made a better professional out of me.
So when I say, “Let’s do it this way,” I mean I can do the whole thing this way myself if I have to. I’m 99% sure it will meet all the requirements on paper and the several you haven’t thought of yet.
Now, I am human and I do make mistakes. Under the perfection mandate, I strive to find my mistakes and fix them before everyone notices. I once worked for a company where the products all had a call-home feature. When there was an error, the system would either dial in or FTP a message to a system in the home office that would create a ticket and kick off a workflow for resolution. I was impressed that a customer could come into the office in the morning to find an email from tech support notifying them an error was detected and fixed remotely overnight, and that they suffered no outage as a result. I strive to conduct my business the same way by fixing an issue as soon as I determine it’s my responsibility and then explaining what happened and how I fixed it. That’s exercising integrity to build credibility. The value of building credibility is always greater than the perceived liability of admitting to bugs with integrity.
All that said, every action has its equal and opposite reaction. There will always be competitive forces… or persons who will work to build credibility through damaging yours. After all, it seems hard to build credibility by simply agreeing with someone else all the time even if the other person has a 99% success rate. The perception is that always agreeing with another makes one a follower or toady. Likewise, some resources are hiding the fact that they will not succeed with your proposal because it involves things they haven’t been trained on. Yes, the corporate business environment often mirrors school yard factions carving out various spheres of dominance. Woe unto the executive staff that has to always play teacher or referee. Truly, you have to pity decision makers who are constantly dealing with weak personalities who cannot tolerate others discovering they may not be perfect, so seek to advance solely through bringing down others.
The school yard provides the tactic for dealing with this: get to the teacher first! Luckily, if you’re catching, fixing, and admitting to your shortcomings before anyone notices, your competition shows up to tattle on you and looks rather foolish. Teacher says, “Yes, I know. He told me and corrected the issue in such a seamless manner we never knew anything was wrong.”
Don’t misunderstand. It makes me sick that adults conduct themselves in this manner. It’s one of the reasons I sought the freedom of working for myself. Even now, when these situations arise, I suffer less than healthy rises in blood pressure. Why do we have to go through this schoolyard battle again after I’ve already built up all this credibility? The point is to revert to the idea, mentioned above, of not immediately going with gut reactions. Don’t fall into the competitive traps. Diligently building credibility through accuracy and integrity should, in theory, pay off in the long run. Optionally, find a sub-contractor and throw them to the wolves.

I’m not a DBA, But I Play One on TV: Part 2 – CPU and RAM

In Part 1 I discussed SQL Server and Hard Disk configurations. Now let’s have a look at CPU and RAM. This topic is actually kind of easy. More is better… most of the time.

CPU

It’s my opinion that most development environments should have a minimum of 4 processors at 2.5+ GHz. Whether that’s one socket with two cores, one socket with four cores, or two sockets with two cores doesn’t really make that much of a difference. For a low-utilization production system you’ll need 8 processors at 2.5+ GHz. Look, you can get this level of chip in a mid- to high-grade laptop. Now, if you’re looking at a very high-utilization system, it’s time to think about 16 processors, or 32 split up over 2 or more sockets. Once you get to the land of 32 processors, advanced SQL Server configuration knowledge is required. In particular, you will need to know how to tweak the MAXDOP (Maximum Degree of Parallelism) setting.

Here’s a great read for setting a query hint: http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/

And here are instructions for a system wide setting: http://technet.microsoft.com/en-us/library/ms189094(v=sql.105).aspx

What does this setting do? It controls the number of parallel processes SQL Server will use when servicing your queries. So why don’t we want SQL Server to maximize the number of parallel processes all the time? There is another engine involved that is responsible for determining which processes can and cannot be done in parallel and the order of the parallel batches. In a very highly utilized SQL Server environment this engine can get bogged down. Think of it like air traffic control at a large airport… but there’s only one controller in the tower and it’s Thanksgiving, the biggest air travel holiday in the US. That one air traffic controller has to assign the runway for every plane coming in and going out. Obviously, he or she becomes the bottleneck for the whole airport. If this individual only had one or two runways to work with, they wouldn’t be the bottleneck; the airport architecture would be. I have seen 32-processor systems grind to a halt with MAXDOP set at 0 because the parallelism rule processing system was overwhelmed.
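
To make that concrete, here’s a rough sketch of both knobs the links above describe, the instance-wide setting and the per-query hint. The numbers and the table name are placeholders, not recommendations; you’d tune them against your own workload:

-- Instance-wide cap on parallelism (requires 'show advanced options')
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 8;   -- placeholder value, tune for your hardware
RECONFIGURE;

-- Per-query override for a single statement (dbo.OrderDetail is a hypothetical table)
SELECT od.ProductID, SUM(od.Quantity) AS TotalQuantity
FROM dbo.OrderDetail AS od
GROUP BY od.ProductID
OPTION (MAXDOP 2);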

For more information on parallel query processing: http://technet.microsoft.com/en-us/library/ms178065(v=sql.105).aspx

RAM

RAM is always a “more is better” situation. Keep in mind that if you don’t set the size and location of the page file manually, the O/S is going to try to take 1.5 times the amount of RAM from the O/S hard drive. The more RAM on the system, the less often the O/S will have to utilize the much slower page file. For a development system 8GB will probably be fine, but nowadays you can get a mid- to high-level laptop with 16GB, and even 32GB is getting pretty cheap. For production 16GB is the minimum, but I’d really urge you to get 24GB. And like I said, 32GB configurations are becoming very affordable.
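
If you want a quick sanity check on how that RAM is holding up once SQL Server is on the box, the standard memory DMVs will show you what the O/S has and what the SQL Server process is actually using; a minimal sketch:

-- What the O/S reports for physical memory and the page file
SELECT  total_physical_memory_kb / 1024     AS total_physical_memory_mb,
        available_physical_memory_kb / 1024 AS available_physical_memory_mb,
        total_page_file_kb / 1024           AS total_page_file_mb,
        available_page_file_kb / 1024       AS available_page_file_mb,
        system_memory_state_desc
FROM sys.dm_os_sys_memory;

-- What the SQL Server process itself is holding on to
SELECT  physical_memory_in_use_kb / 1024 AS sql_memory_in_use_mb,
        page_fault_count
FROM sys.dm_os_process_memory;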

Kimball Dimensional Modeling Practices: Waterfall Only?

Kimball Dimensional Modeling theory and practices are the most widely accepted processes for consolidating data from different sources into a central “Delivery Area” for consolidated cross-functional reporting. Put more simply, it’s a process for normalizing and standardizing data from several data marts into a data warehouse for Business Intelligence (BI) reporting.

From The Data Warehouse Toolkit, Second Edition; Kimball, Ross; 2002; John Wiley and Sons, Inc.; page 22:

Finally, dimensional models are gracefully extensible to accommodate change. The predictable framework of a dimensional model withstands unexpected changes in user behavior. Every dimension is equivalent; all dimensions are symmetrically equal entry points into the fact table. The logical model has no built-in bias regarding expected query patterns. There are no preferences for the business questions we’ll ask this month versus the questions we’ll ask next month. We certainly don’t want to adjust our schemas if business users come up with new ways to analyze the business.

The main tool used to discover the applicable Dimensions for a model is the Business Process Dimensions Matrix (see Figure 1). Each row represents a Business Process, and each column a Dimension available within the Delivery Area that is populated manually or through the consolidation of the source data.

[Figure 1: Business Process Dimensions Matrix. Kimball, Ross; et al.; page 79.]

On the surface this looks like a Waterfall approach: identifying all the requirements upfront before development starts. I disagree. This document should be an organic repository that is constantly updated with changes, like new Dimensions or Business Processes, as additional data sources are added to the system. Further, I believe this matrix is the perfect primer for authoring User Stories. For instance, the first Business Process would translate to:

As an Inbound Contact Center Supervisor I want to see Voice, Chat, Email and Fax metrics summarized by Date, Time, Agents (Users), Goals and Locations So that I can…

The last section of the User Story, where the reason or benefit is recorded, also derives from a central tenet of BI practices. That central tenet is, “Every report must answer a question to aid in the conclusion of one or more business decisions.” The question we’re answering does not show up on the matrix, but it should be part of the BI project management artifacts, and the User Story is the perfect place to record it.

If you’re reading this, you may be a BI solution developer who has suffered the frustrations of pointless and repetitive presentations (reports and dashboards) because your customers don’t know what they want. Someone on the project must take it upon themselves to get the stakeholders to commit to the questions they want to answer. In Agile Scrum, it would make sense that the Product Owner maintains the matrix and the user stories and therefore should be responsible for those commitments. Here’s an example of the resulting user story.

As an Inbound Contact Center Supervisor I want to see Voice, Chat, Email and Fax metrics summarized by Date, Time, Agents (Users), Goals and Locations So that I can more accurately forecast future staffing needs.
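
For what it’s worth, here’s a minimal sketch of how that first matrix row might land in the Delivery Area as a star schema. The table and column names are hypothetical, purely to illustrate the point from the quote above that every dimension is a symmetric, equal entry point into the fact table:

-- Hypothetical dimensions drawn from the matrix columns (Date, Agents, Channel, ...)
CREATE TABLE dbo.DimDate    (DateKey    INT PRIMARY KEY, CalendarDate DATE NOT NULL);
CREATE TABLE dbo.DimAgent   (AgentKey   INT PRIMARY KEY, AgentName NVARCHAR(100) NOT NULL);
CREATE TABLE dbo.DimChannel (ChannelKey INT PRIMARY KEY, ChannelName NVARCHAR(20) NOT NULL); -- Voice, Chat, Email, Fax

-- One fact row per handled contact; each dimension key is an equal entry point
CREATE TABLE dbo.FactContact
(
    DateKey       INT NOT NULL REFERENCES dbo.DimDate(DateKey),
    AgentKey      INT NOT NULL REFERENCES dbo.DimAgent(AgentKey),
    ChannelKey    INT NOT NULL REFERENCES dbo.DimChannel(ChannelKey),
    HandleSeconds INT NOT NULL,
    QueueSeconds  INT NOT NULL
);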

The wide acceptance of Kimball practices predates the wide acceptance of Agile Iterative Development practices. Therefore, several professionals in the space are unwilling to adapt their practice of Kimball methodologies. Hopefully this discussion will aid in efforts to convince these BI resources to modify their approach to conform to the Software Development Life Cycle (SDLC) methodology the rest of the development team uses.

 

To Proc or Not to Proc

I’ve had some interesting conversations and fun arguments about how to author queries for SQL Server Reporting Services (SSRS) reports. There are a lot of professionals out there who really want hard and fast answers on best practices. The challenge with SSRS is the multitude of configurations available for the system. Is everything (Database Engine, SSAS, SSRS, and SSIS) on one box? Is every service on a dedicated box? Is SSRS integrated with a SharePoint cluster? Where are the hardware investments made in the implementation?

Those are a lot of variables to try to make universal best practices for. Luckily for us, Microsoft provides a tool to help troubleshoot report performance. Within the Report Server database there is a view called ExecutionLog3, which links together various logging tables in that database. Here are some of the more helpful columns exposed.

  • ItemPath – The path and name of the report that was executed.
  • UserName – The user the report was run as.
  • Format – The format the report was rendered in (PDF, CSV, HTML4.0, etc.).
  • Parameters – The prompt selections made.
  • TimeStart – Server local date and time the report execution started.
  • TimeEnd – Server local date and time the report finished rendering.
  • TimeDataRetrieval – Amount of time in milliseconds to get report data from the data source.
  • TimeProcessing – Amount of time in milliseconds SSRS took to process the results.
  • TimeRendering – Amount of time in milliseconds required to produce the final output (PDF, CSV, HTML4.0, etc.).
  • Status – Succeeded, Failed, Aborted, etc.

I always provide two reports based on the information found in this view. The first report utilizes the time columns to give me insight into how the reports are performing and when the system’s utilization peaks. The second report focuses on which users are using which reports, to gauge the effectiveness of the reports for the audience.
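
Here’s a sketch of the base query behind the first of those two reports. The view and its columns are as documented; the hour-of-day bucketing and the success filter are just my habit, so check the Status values on your own instance:

-- Report performance and peak utilization from the SSRS execution log (ReportServer database)
SELECT  ItemPath,
        DATEPART(HOUR, TimeStart)  AS HourOfDay,
        COUNT(*)                   AS Executions,
        AVG(TimeDataRetrieval)     AS AvgDataRetrievalMs,
        AVG(TimeProcessing)        AS AvgProcessingMs,
        AVG(TimeRendering)         AS AvgRenderingMs
FROM dbo.ExecutionLog3
WHERE Status = 'rsSuccess'   -- successful executions on my systems; verify yours
GROUP BY ItemPath, DATEPART(HOUR, TimeStart)
ORDER BY Executions DESC;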

Generally, I’m a big fan of stored procedures, mostly because my reports are usually related to a common data source and stored procedures provide me with a lot of code reuse. Standardizing the report prompt behavior with stored procedures is also handy. A simple query change can cascade to all the reports that use a stored procedure, alleviating the need to open each report and perform the same change. Additionally, I like to order the result sets in SQL, not after the data is returned to the report. But that doesn’t mean you’re not going to find better performance moving some functionality between tiers based on the results you find in ExecutionLog3.
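
As an illustration of the reuse point, not a prescription, here’s a hypothetical report procedure that several SSRS datasets could share, with the prompt values passed in as parameters and the ordering done in SQL (the table and column names are made up):

-- Shared report query: change it once here and every report that uses it follows suit
-- (dbo.Orders is a hypothetical source table)
CREATE PROCEDURE dbo.rpt_OrderVolumeByCustomer
    @StartDate    DATE,
    @EndDate      DATE,
    @CustomerName NVARCHAR(100) = NULL   -- NULL = all customers (prompt default)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT  o.CustomerName,
            CAST(o.OrderDate AS DATE) AS OrderDay,
            COUNT(*)                  AS OrderCount,
            SUM(o.OrderTotal)         AS TotalSales
    FROM dbo.Orders AS o
    WHERE o.OrderDate >= @StartDate
      AND o.OrderDate <  DATEADD(DAY, 1, @EndDate)
      AND (@CustomerName IS NULL OR o.CustomerName = @CustomerName)
    GROUP BY o.CustomerName, CAST(o.OrderDate AS DATE)
    ORDER BY o.CustomerName, OrderDay;   -- ordered in SQL, not in the report
END;

Each SSRS dataset then just calls the procedure with its report parameters mapped to @StartDate, @EndDate, and @CustomerName, so a query change lands in every report at once.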

I’m sorry, there just isn’t a one-size-fits-all recommendation for how SSRS reports are structured. Which means: 1) you’ll have to do some research on your configuration, and 2) don’t accept a consultant’s dogma on the topic.

How are you coming with those TPS reports?

Does anyone remember the original “Weekend at Bernie’s”? The scene where the two accountants are poring over the green and white dot matrix printouts of the accounts on the hot tar roof of their apartment building? That’s the traditional report: pages and pages of numbers. Until the invention of spreadsheets, this was the means by which accountants reviewed the accounts. Larger companies have since outgrown even spreadsheets and demanded larger data storage, like databases. However, a majority of the reporting provided from these robust data stores still looks like a spreadsheet.

Detailed row data has its uses. Financial transactions and system audit logs are very useful when displayed as uniform rows of data for visual scanning. You can easily find the row that doesn’t look like the others when searching for an error, but how easy is it to determine transaction volume, or the frequency of a particular event? Are you going to count the lines and keep a tick-mark tally on another sheet? You can calculate some of these statistics, group them by date, and compare the groups if all the data is still available at the source, hopefully without the query slowing down the system while users are trying to do their work on it. Or do you save the data in monthly spreadsheets that are backed up regularly? In most cases, the generation of these reports just becomes a meaningless process and a waste of paper.
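
To make that concrete, the difference between scanning rows and answering the volume question is usually one grouped query; here’s a sketch against a hypothetical transaction log table:

-- Daily transaction volume and error frequency instead of a tick-mark tally
SELECT  CAST(TransactionDate AS DATE) AS TransactionDay,
        COUNT(*)                      AS TransactionCount,
        SUM(CASE WHEN Status = 'Error' THEN 1 ELSE 0 END) AS ErrorCount
FROM dbo.TransactionLog
GROUP BY CAST(TransactionDate AS DATE)
ORDER BY TransactionDay;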

Business Intelligence (BI), and I don’t know who coined the term, is meant to communicate the difference between a report (any formatted delivery of data) and the display of information in a way that aids in the business decision-making process. BI reporting answers questions like: How are this month’s sales compared to last month’s? Has there been a statistically significant increase in defects with the new modifications to our product?

Many professionals familiar with BI reporting assume that it’s really only applicable to data collected and aggregated over a long period of time. Contact center management is the best example of why this isn’t the case. A contact center is much like an old amateur radio that requires constant tuning to produce the best receiving and transmitting signals. These machines come with a panel full of dials and switches used to make sure the radio and the antenna are in perfect attunement. Similarly, contact center managers are constantly monitoring call handle and queue times, making sure the correct proportion of agents are staffed for email, voice, or chat processing. These managers require timely reports, with only 15 or 30 minutes of latency, to determine short-term staffing levels. Most companies see their customer service departments as necessary expenses to keep their customers happy. Decision makers need nearly real-time information to make constant adjustments, maximizing the efficiency of the staff and keeping their customers happy.

The challenge for BI professionals is understanding the users’ needs well enough to deliver the correct solution for the need. There isn’t a one-size-fits-all approach to BI delivery. The assembly manager needs metrics on how many completed plastic toys are failing inspection every half hour. Management needs to compare this month’s inspection failures to the samples from before switching to the new vendor, perhaps a few times a week. The executive might want to know how sales are going this year compared to the last five, but she only needs this information on the first of the month when she first walks into the office. Each one of these examples has different requirements for the size of the data set, how long the report needs to stay on screen, and how near-term or historical the data needs to be.

What’s the point? Go run a search on any technology job board for Business Intelligence or BI. Employers are looking for qualified BI professionals to deliver reporting solutions that aid in the business decision-making process. It’s a growing space/niche on par with security and mobile development. If you can get past the stigma placed on this practice by developers that “Reporting Work” is somehow inferior to software development, there is a lot of opportunity to be had.