Coming Home

Almost 2 months ago I had my first day at Avanade. For those of you who don’t know, Avanade was created as a joint venture between Microsoft and Accenture. Avanade has its own business development streams, but 99.9% of the Microsoft projects Accenture wins are sent to the Avanade team for execution.

Well let me just say what an absolute joy it has been to come back to the Microsoft family of products. After 13 months of wasting my life away fighting with Open Source garbage, I’ve come home to integrated enterprise solutions that work as advertised or at least have some reliable sources for support when they don’t. I was actually told to stop blogging about how much the Open Stack is a waste of time and money… Anyway, that’s behind me.

To add to the good vibes, Avanade is connected to Microsoft in so many ways. We’ve actually had advance looks at new technologies before the rest of the community. There are 20+ MVPs in just the Midwest region, Avanade requires 80+ hours of training every year, and employees are encouraged to participate in developer community organizations.

I’m excited to talk about the first area of expertise they’d like me to look at, Avanade Touch Analytics (ATA). I haven’t completed the training yet, but this offering is fantastic. It has the easiest interface I’ve ever used to create dashboards that look and feel like Tableau or Spotfire, but perform lightyears ahead of both. Once the data sources are made available to the ATA server for any customer’s instance, the dashboards can be authored for or on any device. Switch between layout views to see how your dashboards will look on any device before releasing them. Publish multiple dashboards to different Active Directory security groups and let your users pick the information that’s important to them. It’s exciting, and I’m glad to see an offering addressing the shortcomings of the competition in hosted or onsite installations.

Well that’s enough advertising. Now that my censorship is at an end, I’ll be blogging more often. I really want to discuss SQL Server’s memory resident database product, interesting things I’ve learned about the SSIS Service recently, and Service Broker.

PostgreSQL, AWS, and Musical Bottlenecks

I have had the misfortune of working with PostgreSQL for the last 8 months. Working is a relative term. For me, little work has been done; mostly I’ve been kicking off queries, waiting forever for the returns, and then trying to run down the bottleneck.

I am not a Linux professional and have to rely on those professionals to diagnose what’s going on with the AWS instance that runs PostgreSQL 9.3. Everyone who looked at the situation has had a different opinion. One person looked at one set of performance data and said the system isn’t being utilized at all, someone else would say it’s IO bound, still someone else would say it’s the network card… So we went through all these suppositions. We added more RAM, then more processors, then we used the SSD drives more. Finally, switching from non-provisioned IOPS to Provisioned IOPS got the system roughly as far as we could push it, to where the complex queries would drive one CPU core to 100%.

Now those of you who work with real enterprise RDBMSs might say, “Wait… One CPU core reached 100%?” Well yes, of course, because you see PostgreSQL 9.3 does not have parallel query processing. Yeah…

No matter how many CTEs or subqueries are present in a query statement sent to PostgreSQL, the processing of said query will happen in a synchronous, single-threaded fashion on one CPU core. I’m thinking SQL Server had parallel processing in the late 90’s or early 2000’s? It’s 2014 for crying out loud.

And it gets better! According to my observations, the Postgres process is also single threaded. This process is responsible for writing to the transaction logs. So there isn’t any benefit to creating multiple log files for software striping and efficient log writing. In fact, one big insert seemed to back up all the smaller transactions while the first insert wrote to the transaction log.

This is one of the joys of Open Source offerings. If the development community doesn’t think a feature is important you have to fork the code and write the feature yourself. What blows me away is that companies are willing to gamble the success of their products and implementations on something so hokey.

I’m not a DBA, But I Play One on TV: Part 3 – Database Files

When a customer invites me to review their SQL Server or Oracle databases and server architecture, I start with the servers. I review the hard disk layout and a few server settings. The very next thing I do is review the data files and log files for the databases. In the case of SQL Server, when I see one data file and one log file in the same directory and the database has one file group called Primary, I know I am once again presiding over amateur hour at the local chapter of the Jr. Database Developer Wannabe Club.

 

One file pointing to one file group indicates to me:

  1. Someone went through the “create new database” wizard.
  2. There wasn’t any pre-development design analysis done before the database was created
  3. No one bothered to check readily available best practices for SQL Server
  4. I can anticipate equally uninformed approaches to table and index design and query authoring

 

This will antagonize the hardware striping advocacy group, but there are reasons to split up your data files and log files. Specifically in the case of TempDB, you can greatly improve performance by creating the same number of data files as you have processors. With this configuration the processors are not all contending for allocations in a single file.

 

Check out number 8 here: http://technet.microsoft.com/en-US/library/cc966534
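As a rough sketch of that TempDB recommendation, the Python snippet below generates the T-SQL to bring TempDB up to one data file per processor. The drive path, file sizes, and the extra logical file names are all assumptions I made up for illustration; tempdev is the default logical name of the first TempDB data file, so the script only adds the additional ones.

```python
import os


def tempdb_add_file_statements(cpu_count, data_dir="T:\\TempDB", size_mb=1024, growth_mb=256):
    """Build ALTER DATABASE statements so TempDB ends up with one data file per CPU.

    TempDB ships with one data file (logical name tempdev), so cpu_count - 1
    files are added. data_dir, size_mb and growth_mb are illustrative
    assumptions, not sizing recommendations.
    """
    statements = []
    for n in range(2, cpu_count + 1):
        statements.append(
            "ALTER DATABASE tempdb ADD FILE ("
            f"NAME = N'tempdev{n}', "
            f"FILENAME = N'{data_dir}\\tempdev{n}.ndf', "
            f"SIZE = {size_mb}MB, FILEGROWTH = {growth_mb}MB);"
        )
    return statements


if __name__ == "__main__":
    # Prints one statement per additional file for the machine running the script.
    for stmt in tempdb_add_file_statements(os.cpu_count() or 1):
        print(stmt)
```

Review the generated statements before running them; keeping all the files the same size is the usual advice so the work spreads evenly across them.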

 

In addition to performance, recovery processes greatly benefit from splitting up the database files. Previously, if a data file failed, whether everything was in one file or not, SQL Server would take the database offline. With SQL Server 2012 a new feature was added that will leave your database accessible, just not the data located in the corrupt or otherwise unavailable file. Well, if all the data is in that one file, your dataset is down until you can recover. Even if that data file contains only a subset of the data in a table, the rest of the data in that table is still available for querying.

 

Now, you might say ok, we’re going to have a separate file for every table and multiple files for some. Ok, I’ve seen that configuration and there isn’t anything wrong with it. If your IT department isn’t using SQL Server to manage their backups, and instead they’re backing up the actual files across all the drives, they’re going to be annoyed with you. However, this configuration gives you maximum flexibility. For instance, tables that are commonly used at the same time can be placed on different spindles so they won’t conflict for disk I/O.
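To make the file and file group idea concrete, here is a minimal Python sketch that builds a CREATE DATABASE statement with a primary file, one or more named file groups, and a log file on its own drive. The database name, file group names, drive letters, and logical file naming scheme are hypothetical; only the overall shape of the statement is the point.

```python
def create_database_sql(db_name, primary_file, filegroups, log_file):
    """Build a CREATE DATABASE statement with several file groups.

    filegroups maps a file group name to a list of physical file paths.
    All names and paths passed in are the caller's own; nothing here is
    validated against a real server.
    """
    def spec(logical_name, path):
        return f"    (NAME = N'{logical_name}', FILENAME = N'{path}')"

    sections = ["ON PRIMARY\n" + spec(f"{db_name}_primary", primary_file)]
    for fg_name, paths in filegroups.items():
        files = ",\n".join(
            spec(f"{db_name}_{fg_name}_{i}", path)
            for i, path in enumerate(paths, start=1)
        )
        sections.append(f"FILEGROUP [{fg_name}]\n{files}")

    return (
        f"CREATE DATABASE [{db_name}]\n"
        + ",\n".join(sections)
        + "\nLOG ON\n"
        + spec(f"{db_name}_log", log_file)
        + ";"
    )


if __name__ == "__main__":
    print(create_database_sql(
        "Sales",
        r"D:\Data\Sales.mdf",
        {"Orders": [r"E:\Data\Orders1.ndf", r"F:\Data\Orders2.ndf"]},
        r"L:\Logs\Sales.ldf",
    ))
```

Tables and indexes then get assigned to a file group when they are created, which is how commonly joined tables end up on different spindles.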

 

Splitting up your log files is also beneficial. Log files are populated in a round robin fashion. When one reaches the level you’ve set it starts filling up the next. Hopefully you have at least 4 and they are of a sufficient size. This gives you time to archive the transaction logs between backups making sure no transactions are lost due to the file rolling over before the backup removes completed transactions and shrinks the file.

 

Next episode will cover backup basics. The purpose of all these posts is to provide the understanding to apply the best configuration to the database system you’re building.

 

I’m not a DBA, But I Play One on TV: Part 2 – CPU and RAM

In Part 1 I discussed SQL Server and Hard Disk configurations. Now let’s have a look at CPU and RAM. This topic is actually kind of easy. More is better… most of the time.

CPU

It’s my opinion that most development environments should have a minimum of 4 2.5+ GHz processors. Whether that’s one socket with two cores, one socket with 4 cores, or two sockets with 2 cores doesn’t really make that much of a difference. For a low utilization production system you’ll need 8 2.5+ GHz processors. Look, you can get this level of chip in a mid-high grade laptop. Now if you’re looking at a very high utilization system it’s time to think about 16 processors, or 32 split up over 2 or more sockets. Once you get to the land of 32 processors, advanced SQL Server configuration knowledge is required. In particular you will need to know how to tweak the MAXDOP (Maximum Degree of Parallelism) setting.

Here’s a great read for setting a query hint: http://blog.sqlauthority.com/2010/03/15/sql-server-maxdop-settings-to-limit-query-to-run-on-specific-cpu/

And here are instructions for a system wide setting: http://technet.microsoft.com/en-us/library/ms189094(v=sql.105).aspx

What does this setting do? It controls the number of parallel processes SQL Server will use when servicing your queries. So why don’t we want SQL Server to maximize the number of parallel processes all the time? There is another engine involved in the process that is responsible for determining which processes can and cannot be done in parallel and the order of the parallel batches. In a very highly utilized SQL Server environment this engine can get bogged down. Think of it like air traffic control at a large airport… but there’s only one controller in the tower and it’s Thanksgiving, the biggest air travel holiday in the US. Well, the one air traffic controller has to assign the runway for every plane coming in and going out. Obviously, he/she becomes the bottleneck for the whole airport. If this individual only had one or two runways to work with, they wouldn’t be the bottleneck; the airport architecture would be. I have seen 32 processor systems grind to a halt with MAXDOP set at 0 because the parallelism rule processing system was overwhelmed.

For more information on the parallel processing process: http://technet.microsoft.com/en-us/library/ms178065(v=sql.105).aspx
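Here is a small Python sketch of the system wide setting, under some loud assumptions. The sp_configure option name 'max degree of parallelism' is the real one, but the cap of 8 in suggested_maxdop is just a rule of thumb I picked for illustration, not an official formula. The only point is to avoid leaving a big box at 0 (unlimited).

```python
def suggested_maxdop(logical_cpus, cap=8):
    """Illustrative heuristic only: use every CPU up to `cap`, never 0 (unlimited).

    The cap of 8 is an assumption for this sketch; test against your own workload.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return min(logical_cpus, cap)


def maxdop_statements(max_degree):
    """Return the T-SQL batch that applies a server-wide MAXDOP value."""
    if max_degree < 0:
        raise ValueError("max_degree cannot be negative")
    return [
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max degree of parallelism', {max_degree};",
        "RECONFIGURE;",
    ]


if __name__ == "__main__":
    # A 32 processor server gets capped at 8 by this sketch's heuristic.
    for line in maxdop_statements(suggested_maxdop(32)):
        print(line)
```

For a single troublesome query, the same limit can be applied with the query hint OPTION (MAXDOP n) instead of changing the whole server.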

RAM

RAM is always a “more is better” situation. Keep in mind that if you don’t set the size and location of the page file manually, the O/S is going to try and take 1.5 times the RAM from the O/S hard drive. The more RAM on the system, the less often the O/S will have to utilize the much slower page file. For a development system 8GB will probably be fine, but nowadays you can get a mid-high level laptop with 16GB, and even 32GB is getting pretty cheap. For production 16GB is the minimum, but I’d really urge you to get 24GB. And like I said, 32GB configurations are becoming very affordable.

I’m not a DBA, But I Play One on TV: Part 1 – Hard Drives

This is the first in a series of posts relating to hardware considerations for a SQL Server 2008 R2 or later server. In Part 1 – Hard Drives I’m going to discuss RAID levels and what works for the Operating System (O/S) versus what works for various SQL Server components.

As a consultant I always go through the same hardware spec dance. It sounds like this:

Q: How much disk space does your application database require?

A: Depends on your utilization.

Q: Ok, what’s the smallest server we can give you for a proof of concept or 30 day trial?

A: Depends on your utilization.

Q: Well we have this VM with a 40 GB disk, 8 GB RAM, and a dual Core virtual processor available. Will that work?

A: Depends on your utilization, but I seriously doubt it.

SQL Server 2008 R2, depending on the flavor, will run on just about any Windows Server O/S from 2003 and newer, as well as Windows 7 and Windows 8. This isn’t really a discussion about the O/S, more of how the O/S services SQL Server hardware requests. At the hardware level the O/S has two main functions: managing memory and the hard disks, and servicing requests for those resources from applications.

In a later post we’ll look at memory in a little more depth, but for the hard disk discussion we’ll need to understand the page file. The page file has been part of Microsoft’s O/S products since NT, maybe Windows for Workgroups, but I don’t want to go look it up. The page file is an extension of the physical memory that resides on one or more of the system’s hard disks. The O/S will decide when to access this portion of the memory available to services and applications (processes) requesting memory resources. Many times when a process requires more memory than is currently available, the O/S will use the page file to virtually increase the size of the memory on the system in a manner transparent to the requesting process.

Let’s sum that up. The page file is a portion of disk space used by the O/S to expand the amount of memory available to processes running on the system. The implication here is that the O/S will be performing some tasks meant for lightning fast chip RAM, on the much slower hard disk virtual memory because there is insufficient chip RAM for the task. By default the O/S wants to set aside 1.5 times the physical chip RAM in virtual memory disk space. For 16GB of RAM that’s a 24GB page file. On a 40GB drive that doesn’t leave much room for anything else. The more physical chip RAM on the server the bigger the O/S will want to make the page file, but the O/S will actually access it less often.
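The arithmetic is worth seeing on its own. This little Python sketch assumes the 1.5 times default described above and nothing else; the function names are mine.

```python
def default_page_file_gb(ram_gb, factor=1.5):
    """Page file size the O/S wants by default, using the 1.5 x RAM rule of thumb."""
    return ram_gb * factor


def disk_left_after_page_file_gb(disk_gb, ram_gb, factor=1.5):
    """Space remaining on the drive once the default page file is carved out."""
    return disk_gb - default_page_file_gb(ram_gb, factor)


if __name__ == "__main__":
    print(default_page_file_gb(16))               # 24.0
    print(disk_left_after_page_file_gb(40, 16))   # 16.0
```

That remaining 16GB still has to hold the O/S itself, SQL Server, and every database file, which is why the 40GB drive doesn’t go very far.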

Now let’s talk RAID settings! You may find voluminous literature arguing the case for software RAID versus hardware RAID. I’ll leave that to the true server scientists. I’m just going to give a quick list of which RAID configurations the O/S and SQL Server components will perform well with and which will cause issues. I’m going for understanding here. There are plenty of great configuration lists you can reference, but if you don’t understand how this stuff works you’re relying on memorization or constantly referencing the lists.

Summarization from: http://en.wikipedia.org/wiki/RAID

But this has better pictures: http://technet.microsoft.com/en-us/library/ms190764(v=SQL.105).aspx

RAID 0 – Makes multiple disks act like one, disk size is the sum of all identical disk sizes and there isn’t any failover or redundancy. One disk dies and all info is lost on all drives.

RAID 1 – Makes all the disks act like one, disk size is that of one of the identical disks in the array. Full fail over and redundancy.

RAID 2 – Theoretical, not used. Ha!

RAID 3 – Not very popular. Similar to RAID 0, except the data is striped byte by byte across the disks and one dedicated disk holds parity so a single failed disk can be rebuilt.

RAID 4 – Like RAID 3, but striped in blocks instead of bytes. One dedicated drive holds the parity for all the others, and the whole array is accessed by one drive letter.

RAID 5 – Requires at least 3 identical drives. Data and parity are striped across all of them, so you give up one drive’s worth of space and the array survives the failure of any one drive.

RAID 6 – Like RAID 5 except you need at least 4 identical disks, you give up two drives’ worth of space to parity, and the array survives two drive failures.

RAID 10 or 1+0 – A tiered approach where two groups of RAID 1 arrays form a RAID 0 array. So two fully redundant RAID 1 arrays of 500GB made up of 3 500GB disks come together to form 1 RAID 0 array of 1TB. Sounds expensive, 3TB in physical disks to get 1TB accessible drive.
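To put numbers on how much disk you actually get out of each level, here is a minimal Python sketch. It assumes identical disks and the standard capacity rules for each level; the function name and the mirror_copies parameter are my own, the latter added so the 3 disk mirrors in the RAID 10 example above can be reproduced.

```python
def usable_capacity_gb(level, disk_gb, disks, mirror_copies=2):
    """Usable space for an array of identical disks at a given RAID level.

    level is a string: "0", "1", "5", "6" or "10".
    mirror_copies only applies to RAID 10 (disks per mirrored group).
    """
    if level == "0":
        return disk_gb * disks            # striping only, no redundancy
    if level == "1":
        return disk_gb                    # every disk holds the same data
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return disk_gb * (disks - 1)      # one disk's worth goes to parity
    if level == "6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return disk_gb * (disks - 2)      # two disks' worth go to parity
    if level == "10":
        groups, leftover = divmod(disks, mirror_copies)
        if leftover or groups < 2:
            raise ValueError("RAID 10 needs at least 2 complete mirror groups")
        return disk_gb * groups           # stripe across the mirrored groups
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Six 500GB disks as two 3-disk mirrors: 3TB raw for 1TB usable.
    print(usable_capacity_gb("10", 500, 6, mirror_copies=3))  # 1000
    # The same six disks in RAID 5.
    print(usable_capacity_gb("5", 500, 6))                    # 2500
```

Same six disks, 1TB versus 2.5TB usable, which is the whole cost argument between RAID 1+0 and RAID 5 in one comparison.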

At this point I’ll paraphrase the information found here: http://technet.microsoft.com/en-US/library/cc966534

SQL Server Logs are written sequentially, one byte after the other. There aren’t any random or asynchronous read requests performed against these files by SQL Server. RAID 1 or 1+0 is recommended for this component for two reasons: 1. Having a full redundant backup of the log files for disaster recovery. 2. RAID 1 mirrored drives support the sequential write I/O (I/O is short for disk read and write Input and Output. I’m not going to write that 50 times.) of the log file process better than RAID configurations that split one file over multiple disks.

TempDB is the workhorse of SQL Server. When a query is sent to the database engine, all the work of collecting, linking, grouping, aggregating and ordering happens in the TempDB before the results are sent to the requestor. This makes TempDB a heavy write I/O process. So the popular recommendation is RAID 1+0. Here’s the consideration: TempDB is temporary, and that’s where it gets its name from. So redundancy isn’t required for disaster recovery. However, if the disk your TempDB files are on fails, no queries can be processed until the disk is replaced and TempDB restored/rebuilt. RAID 1+0 helps fast writes and ensures uptime. RAID 5 provides the same functionality with fewer disks, but decreased performance when a disk fails.

TempDB and the Logs should NEVER EVER reside on the same RAID arrays, so if we’re talking a minimum of two RAID 1+0 arrays, it might be more cost effective to put TempDB on RAID 5.

Application OLTP (On-line Transaction Processing) databases will benefit the most from RAID 5, which equally supports read and write I/O. Application databases should NEVER EVER reside on the same arrays as the Log files and co-locating with TempDB is also not recommended.

SQL Server comes with other database engine components like the master database and MSDB. These are SQL Server configuration components and mostly utilize read I/O. It’s good to have these components on a mirrored RAID configuration that doesn’t need a lot of write performance, like RAID 1.

A best practice production SQL Server configuration minimally looks like this:

Drive 1: O/S or C: Drive where the virtual memory is also serviced – RAID 1, 80 to 100 GB.

Drive 2: SQL Server Components (master, MSDB, and TempDB) data files – RAID 1+0, 100-240 GB

Drive 3: SQL Server Logs – RAID 1+0, 100-240 GB

Drive 4: Application databases – RAID 5, As much as the databases need…

Where to skimp on a development system? Maybe RAID isn’t available either?

Drive 1: O/S or C: Drive where the virtual memory is also serviced, 80 to 100 GB.

Drive 2: SQL Server Components (master, MSDB, and TempDB) data files and Application database files, As much as the databases need…

Drive 3: SQL Server Logs, 100-240 GB

Optimal Production configuration?

Drive 1: O/S or C: Drive – RAID 1, 60 GB.

Drive 2: SQL Server Components (master, MSDB) data files – RAID 5, 100GB

Drive 3: SQL Server Logs – RAID 1+0, 100-240 GB

Drive 4: Application databases – RAID 5, As much as the databases need…

Drive 5: TempDB – RAID 1+0, 50–100 GB

Drive 6: Dedicated Page File only – RAID 1, 40GB. You don’t want to see what happens to a Windows O/S when the page file is not available.
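If you want to sanity check a layout against the co-location rules from this post (logs never with TempDB, logs never with application databases, TempDB with application databases not recommended), a sketch like the one below does it. The dictionary structure and the labels such as "app_data" are hypothetical names I chose; the two sample layouts mirror the optimal and development configurations listed above.

```python
# Hypothetical labels for what lives on each drive.
OPTIMAL_LAYOUT = {
    "Drive 1": {"raid": "1", "holds": {"os"}},
    "Drive 2": {"raid": "5", "holds": {"system_dbs"}},
    "Drive 3": {"raid": "1+0", "holds": {"logs"}},
    "Drive 4": {"raid": "5", "holds": {"app_data"}},
    "Drive 5": {"raid": "1+0", "holds": {"tempdb"}},
    "Drive 6": {"raid": "1", "holds": {"page_file"}},
}

DEV_LAYOUT = {
    "Drive 1": {"raid": None, "holds": {"os", "page_file"}},
    "Drive 2": {"raid": None, "holds": {"system_dbs", "tempdb", "app_data"}},
    "Drive 3": {"raid": None, "holds": {"logs"}},
}

# Pairs of components this post says should not share an array.
CONFLICTS = [
    ({"logs", "tempdb"}, "logs and TempDB share an array"),
    ({"logs", "app_data"}, "logs and application databases share an array"),
    ({"tempdb", "app_data"},
     "TempDB and application databases share an array (not recommended)"),
]


def layout_problems(layout):
    """Return one message per drive that breaks a co-location rule."""
    problems = []
    for drive, spec in layout.items():
        for pair, message in CONFLICTS:
            if pair <= spec["holds"]:  # both components live on this drive
                problems.append(f"{drive}: {message}")
    return problems


if __name__ == "__main__":
    print(layout_problems(OPTIMAL_LAYOUT))
    print(layout_problems(DEV_LAYOUT))
```

The development layout gets flagged for putting TempDB next to the application databases, which is exactly the corner being cut on purpose there.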

Buffer I/O is the bane of my existence. I have left no rock unturned on the internet trying to figure out how this process works. So if someone reading can leave a clarifying comment for an edit I’d appreciate it. This I do know, the buffer is kind of like SQL Server’s own page file. A place on a hard disk where information is staged before it is written to the memory pool managed by the O/S. If your system is low on memory and using the page file extensively you will see Buffer I/O waits in the SQL Server Management Studio activity monitor. Basically, this indicates that the staging process is waiting on memory to become available to move data out of the buffer and into the memory pool. The query can’t write more information to the buffer until there is space open in the buffer for it. In fact if the query resultset is big enough, the whole system will begin to die a slow and horrible death as information cannot move in and out of memory or in and out of the buffer because so much information is going in and out of the page file. This is why I highly recommend splitting up the disks so that SQL Server does not have to fight with the page file for Disk I/O.

Look, if you have 10 records in one table used by one user 2 times a day, that VM with a 40 GB disk, 8 GB RAM, and a dual core virtual processor is going to do just fine. But you might as well save some cash and move that sucker onto Access or MySQL or some other non-enterprise level RDBMS.

 

 

Open Suck… I mean Open Source

If you’re reading this from a socialist country, I’m sorry but you’re going to struggle to understand the basic premise of this discussion. The application of a common cliché in capitalist societies, “You get what you pay for,” I believe is universally appropriate. From my father-in-law, who bought the cheapest satellite service and complains incessantly about how much he wishes he had the same cable service I have but is unwilling to pay the higher service charges, to outsourcing call centers to regions of the world that speak a different language than the users of this service, to booking a cheaper hotel near the Orlando amusements with free shuttle service that’s just a glorified, overcrowded city bus without the graffiti. Going cheap is almost always going to disappoint. But this is a technical blog and my focus is Business Intelligence.

I’m working on a favor for a friend and I wanted to take this opportunity to explore some new technology. This friend of mine doesn’t have any budget for this project, so I’m looking for cost effective components for this application that’s simply a client front end to an RDBMS. My friend runs a small collection of Windows 7 desktops, I love Entity Framework, I’m proficient in Visual Studio, and I don’t need a “Big Data” solution. So I start thinking Open Source. Alright, hurdle 1, I’m not a Java guy, and some of you might start harping about how Ruby, Rails, PHP running on Apache, Beans and Java are all vastly different things…. I’m not into any of them; they’re all Java to me. A lifetime ago I played with Swing and it sucked on Windows. Most Java apps I see run in Windows are crap.

I don’t want to go into an in depth discussion on all the options, but I decided to investigate PostgreSQL based on a recommendation from someone in my network who swears by it. One of the things I liked is the multi-OS support. Just in case the world turns upside down and I want to install the database on something other than a Microsoft OS, I thought I’d work with an RDBMS that would work the same no matter where it was installed, with one common client. The installation was smooth enough. I installed everything and clicked next, next, next… no errors. Good. Then I started researching ADO .NET clients to support Entity Framework, and that’s where the wheels fell off.

In the realm of free providers to go with the free RDBMS, there is an OLEDB provider pgnpoledb, multiple JDBC drivers, and one ADO.NET provider, npgsql. Now, I’m a skeptical man, and before I went down the path of actually trying to connect Entity Framework to the PostgreSQL database I decided to read the npgsql wiki. Pages were devoted to all the different issues and bugs, and what was or wasn’t being submitted for acceptance in GitHub. From the headache mounting on my cranium, I could tell this option was going to require maybe a bit more effort than I was willing to invest in a favor for a friend. A lot of posters were pointing to the .NET provider for PostgreSQL from DevArt. Long story short, $199 for what I wanted… Wait a second, I thought this crap was all Open Source and free!

Let’s just explore this concept, which has long been my complaint with the Open Source stack. If your goal is to create a mission critical high availability enterprise application with the Open Source offerings, you must be prepared to not only code your application, but also the platform on which it runs, or abandon the “Potentially Free” benefits of Open Source by purchasing licensed products to augment and stabilize the Open Source platforms. Option 1 means roughly doubling your workforce or your time to market. You need resources to code the platform and resources to code the application or resources that do both, but really only one at a time. Option 2 cuts into your equipment and tools budget and you need to verify what the vendor’s royalty and redistribution requirements are. No one wants to depend on a component that requires $1000 royalty for every user on a 40,000 seat client server application, right?

There are other Open Source challenges I love to joke about with the diehard apologists I know. Like the fact that your favorite platform was written by one talented foreigner who doesn’t speak your language and only responds to email questions once a week when the internet service satellite flies over his bunker. I like a challenge as much as the next person, and I sympathize with the desire to revolt against the powerful software companies that are so slow to accommodate user needs. But I’m just not willing to chance providing a service, where contractually I have to pay a refund for every minute of down time, dependent on a platform that was developed by hobbyists and amateurs.

Look at the example I stated above where the free provider has lots of challenges and the paid one is stable and supports all features of the toolset it’s meant to service. Developers whose livelihood (paycheck) is dependent on the successful execution of a project are naturally going to be more motivated to generate a better product than those who are working merely to support a community. Likewise, those tasks that facilitate the collection of said paycheck will take priority over the needs of a community, which leads you to have more down time as you wait for someone to get off from work (or high school marching band practice and homework) to fix a bug in the platform your product depends on and publish it to GitHub.

 

 

Job Req. Sanity Check

Let me start by saying I am not an HR guy. Nor have I ever been a full-time recruiter of any sort. So perhaps, I’m way off base with my thoughts on this topic. PLEASE straighten me out if I am because there are a lot of practices within this space that make no sense to me.

I.  The Skill Set Years Experience Mismatch

Lately I have seen a flood of open position postings on the various job boards that will say something to the effect of “Jr. Developer\Recent College Grad\1-2 Years experience” as the headline of the posting, only to list in the requirements section experience (which to me means more than just exposure or reading a help doc online) with some 30 different technologies. Maybe, yes maybe, with the right set of circumstances a Jr. resource as described in the headline might have started in an environment where he or she was given free rein to provide solutions through whatever means. I was lucky enough to have started my career as the only software developer for a successful insurance company where I was able to explore whatever new technology came along and experiment with different techniques. I think this is pretty rare. Some companies spend the first 6 months breathing over a new resource’s shoulder with weekly code reviews before they’re promoted to level one and the code reviews come when the developer is ready. Many companies only let their resources sustain existing code and teach them just the basics to troubleshoot the existing technologies while the more senior staff works on innovation.

So are the hiring managers or recruiters looking for 80% of the required skills? One or two? Software design and development professionals are detail oriented and precise personalities. If I can’t talk about every skill listed, I’m not going to apply for a position.

II.   Competing Technologies

Another favorite of mine is when the laundry list of experience includes market competitors. The posting is looking for someone with 5 years experience and expert knowledge of Oracle, DB2 and SQL Server, or expert level .NET and Java. First, can you really become expert in 5 years, especially if for maybe 2 of those years you were just doing maintenance work (i.e. spell checking websites)? Secondly, how many companies invest tens of thousands of dollars in SQL Server and more tens of thousands on Oracle? As a vendor software developer your product may need to support more than one database platform. However, what percentage of the candidates in the job market hail from vendor software companies? Are there really any transferrable skills between .NET and Java? It seems to me trying to grow one resource into an expert of both is far more expensive than cultivating two specialists, and most companies would do the latter.

These types of requirements lead to a lot of confusion for candidates. They don’t know if they should bother applying or not. The recruiters are inundated with resumes that don’t fit the request from the hiring customer.

III.   Automated Phone Recruiting

This year in particular I have been flooded with outsourced call center recruiter calls. These calls always follow the same format.

  • I answer the phone to silence
  • A few seconds later someone in a very thick accent says, “Hello may I speak to George?”
  • “Yes this is George.”
  • Faster than any normal human being should be able to speak -“Uh hi. My name is gibberish. gibberish gibberish gibberish gibberish gibberish gibberish gibberish gibberish gibberish gibberish gibberish gibberish …”
  • Me, “Whatever you’re talking about I’m not interested. Thanks.”
  • Hang up.

It’s as bad as the campaign calls around supper time during an election cycle. Who in their right mind thinks this is in any way an effective means to find a qualified candidate? I seriously doubt these individuals understand the technical requirements well enough to successfully phone screen much less are able to fight through the language barrier well enough to have a real conversation about the candidate or the opportunity.

IV.   Don’t Read the Resume

Another new interesting fishing tactic is the mail blast, or I guess that’s what’s going on. Why else am I getting emails for Jr. or Intermediate 5 years or less positions from the job boards where my resume clearly showing 16 years of experience is posted? Or the expert Java Architect roles I was sent when Java J2EE doesn’t appear anywhere on my resume? Recruiters, does this tactic work?

I understand there is a perception in the US job market right now that a lot of people are out of work and some companies are hoping to cash in on getting better qualified candidates for less compensation. This perception has created a recruiter feeding frenzy atmosphere. The truth is most of the top ranked talent is aware of what’s going on and they’re sitting this cycle out, or contracting. The unemployment rate among software development professionals is not nearly as high as other skill sets like manufacturing and construction. I believe these tactics will not be successful, and may land your corporation with a lot of negative feedback on a site like GlassDoor.com.