How Do You Solve a Problem Like…

ITIL is not an abbey, but was Maria an incident, or a problem?

At what point does an incident become a problem?  That is a question that unfortunately may have multiple answers.  After all, in ITIL, an incident is something that happens.  If it continues to happen, or if multiple people have it happen to them, it can become problematic.  But a problem?  As much as I detest ‘squishy’ answers, the answer to this conundrum is, it depends upon your organization.

Some organizations take a lot of pain before declaring a problem.  Some open a problem almost immediately, depending on how long the incident has been open or the number of people calling in who do not want to follow the instructions in the knowledge base.  It really does depend. There’s the squish.

However, there is a good rule of thumb.  First, you need to know the severity of the incident. Is this almost all of your customers or just a handful?  If this is something that is happening to just some people, then is there a knowledge base article (KBA) about it?  If not, there needs to be.  If you do have a KBA out there, are people using it and what are their reactions to it? Is it perceived as an easy fix or do they need to perform a voodoo ritual in order for it to work?  Trust me, your customers will let you know.

Then you need to assess the incidents, your customers and the fix itself.  As this is still an incident, it has passed through all of the levels of support, so you will have some general idea as to what needs to happen to fix it.  And by now, you know about how many people this incident affects.  The next question is, how much pain is your organization capable of enduring?  No one likes to hear the same complaint over and over.  But sometimes, the fix is worse than the disease.  Sometimes, however, a fix can easily be found and there you are.  Problem it is and soon to be problem solved.
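For those who like their rules of thumb written down, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names, the 50 percent "widespread" cutoff and the idea of a "pain budget" in hours are made up, and your organization will draw these lines somewhere else.

```python
from dataclasses import dataclass


@dataclass
class IncidentPattern:
    """A recurring incident, summarized. All fields are hypothetical."""
    affected_users: int
    total_users: int
    has_kba: bool             # is there a knowledge base article?
    kba_works: bool           # do customers say the KBA fix is workable?
    fix_cost_hours: float     # rough effort to fix the underlying cause
    pain_budget_hours: float  # effort the organization is willing to spend


def should_open_problem(p: IncidentPattern, widespread_ratio: float = 0.5) -> bool:
    """Return True when a recurring incident looks worth promoting to a problem."""
    if p.total_users <= 0:
        raise ValueError("total_users must be positive")
    # Almost all of your customers: stop debating and open the problem.
    if p.affected_users / p.total_users >= widespread_ratio:
        return True
    # Just a handful, and the KBA gets them going again: leave it an incident.
    if p.has_kba and p.kba_works:
        return False
    # No usable workaround: promote only if the fix is not worse than the disease.
    return p.fix_cost_hours <= p.pain_budget_hours
```

The thresholds are the squishy part. The order of the questions is the rule of thumb.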

But sometimes, problems cannot be solved by marrying them off to well-known Austrian war heroes.  More about that in the next post.

Just How Many is “Everyone”?

If your company has more than three people, this is not “Everyone”.

Ah, language.  I remember a while back, being a passenger in a car with a friend of mine. We were at a fast food drive-through because he had a coupon.  So he placed our order and told the attendant on the other side of the speaker he had a coupon.  The attendant asked him “How many people are in the car?”

My friend replied, “All of us!”

I say this because when you are opening up an incident ticket, it may ask you how many people this incident is affecting.  Even if you have multiple personalities, if you are the only one affected, then you are the only one affected.  Saying that this is affecting the whole company when obviously it is not is more likely to get a hearty laugh from the person who gets your incident (along with a few other people on the desk). This is basically known on the desk as lying about your situation in order to have someone look at it “right now!”

Protip: This never, ever works.

You see, yes, we jump on emergencies with both feet.  We’re the high tech firemen and women, trying to make sure that real four alarm fires are taken care of quickly.  We want to help everyone, but we have to set priorities.  Sometimes we can get to you immediately. Sometimes, we can’t.  But we will, as soon as we can.

Do it enough times, and your name will become well known as someone who is trying to game the system.  Yes, we talk about people like you in between putting out fires. When we see that the incident is not the level of everyone in the country catching Ebola at the same time, we set the ticket to reality.  If you really are adamant about doing this continuously, we’ll have a conversation with your boss about it.

And that is a ticket I don’t think you want.

Show Me On the Ticket Where the Bad Thing Happened

If you picture your Help Desk like this, it can make you feel a little better.

Not to sound dramatic, but an Incident is kind of like a crime scene. In order to piece everything together the person on the help desk needs to know exactly what happened.  Everything.  Yes I know, what you need doesn’t work, but I need to reassemble the crime scene in order to see what happened.

That is why putting everything on your ticket is helpful.  This is not a blame game, but an investigation. Sometimes (actually, more often than you think) one software program is at odds with another, and yes, this can happen when the software packages are created by the same company (I’m looking at you, Microsoft).  Sometimes it is user error.  Hey, we all make mistakes.  And sometimes what is happening can actually be a symptom of something far worse.  But unless the incident team knows what is going on, the incident is nothing more than “Damned if I know.”

So, what needs to be on the ticket?  First, besides your name and how to contact you (which should be automatically recorded on the ticket), I need to know the following:

  1. The time it happened.  This helps us if we need to check server logs.  Things happen all the time, so knowing when it happened helps us dig through the chaff.
  2. Exactly what happened.  The more detailed you can be, the better.  Just saying “It doesn’t work” means we have to call in Miss Cleo and her tarot cards to divine what happened.  And while I love her fake Jamaican accent, she’s never right.
  3. What else was running in the background.  Excel is giving me an error message and I have Word and Visio also running.  This may have something to do with it, it may not.  But we know that there are other avenues we might be able to check if our first assumption is wrong.
  4. Screenshots.  Just like a crime scene investigation, pictures record a lot more than people think.  If you have an error message, get a screenshot to add to the ticket.
  5. If your ticketing system does not capture the information concerning the computer you are on (some do, some are stupid), then please add that to the ticket as well.  There are times when the hardware does not play nice with the software.
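If you want to picture that list as something a ticketing system could check before the ticket ever reaches the desk, here is a minimal sketch in Python. The class, its field names and the "fewer than five words is not a description" rule are all assumptions for illustration; no real ticketing product is being described.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentTicket:
    """Hypothetical ticket holding the five items from the list above."""
    reporter: str
    contact: str
    occurred_at: Optional[datetime] = None      # 1. the time it happened
    description: str = ""                        # 2. exactly what happened
    other_running_apps: List[str] = field(default_factory=list)  # 3. background
    screenshots: List[str] = field(default_factory=list)         # 4. file names
    machine_info: str = ""                       # 5. the computer you are on

    def missing_details(self) -> List[str]:
        """Name every item the desk would otherwise have to divine."""
        missing = []
        if self.occurred_at is None:
            missing.append("time it happened")
        # Assumed rule: "It doesn't work" is not a description.
        if len(self.description.split()) < 5:
            missing.append("exactly what happened")
        if not self.other_running_apps:
            missing.append("what else was running")
        if not self.screenshots:
            missing.append("screenshots")
        if not self.machine_info:
            missing.append("computer information")
        return missing


ticket = IncidentTicket("Maria", "ext. 1234", description="It doesn't work")
print(ticket.missing_details())  # all five items are still missing
```

An empty list back from missing_details() means nobody has to call Miss Cleo.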

In other words, nothing gets ruled out at the start.  Once we can verify the alibis for various parts, then we can find the perp, solve the problem and wrap up the case a lot neater than Law and Order sometimes does.  If we find that what you are experiencing is part of a larger problem, well, we have a larger case to solve.  We’ll keep you updated.  Olivia Benson never gives up.  Neither do we.  *Chu-CHUNK!*

You Are Not The Most Interesting Developer in the World

Where I work, we deal with developers who were apparently dropped on their heads at a very young age.  Repeatedly.  Onto concrete.  Repeatedly.  I also work with project managers who apparently had the same thing happen to them.

The reason I point this out is because some people truly believe there is a cost savings in combining development, test and production into one happy little world.  You know, because every developer does everything right the first time. Or at least combining development and test, because, really, those are one and the same.  I actually had a PM tell me this in the last week.  As my father used to say, “That boy is about as sharp as a sack of wet mice.”  I’ll simply add, “Bless his heart”.  Because thinking it can be all in one is the coding equivalent of “Hold my Beer”.

For the uninitiated, the development area is the place where the developers can make multiple horrendous mistakes, or develop, take your choice of phrasing.  The test area is a place where you actually make sure that whatever you have developed installs as it should and that everything works as it is designed to.  Production is that area that everyone actually uses and they don’t take kindly to things changing or not working as designed in the middle of an actual transaction.  If you think they do, then I’ll give them your number and you can be a customer service rep as well.  Have fun.  Some groups also include a fourth level – user acceptance testing, or UAT.  This is an environment that your test group of actual business users beat on to make sure everything works to their standards after QA, but before production. I am personally fond of the four-tier layout, because it brings the business in on the final product, and anything found afterwards is on their heads, not the development team’s.

Now in this multilayer fiesta there are some points that as operations, I demand.  First, all environments  should resemble production in the fact that the server structure is the same, and you have the same basic software running in the background, because there are indeed differences between, say, .NET 3.5 and .NET 4.5.  Really, there are.  That’s why they are different numbers.  Someone at Microsoft didn’t just decide to change it because they were bored.  If your production server is running .NET 3.5, then why are you developing on something else? Do not talk to me about upgrading on the fly because of backward compatibility.  Sorry, it is something that may work 90 percent of the time, but that remaining ten percent always bites you in the ass.  Replication applies to your databases as well.  Table structures need to be the same.  Change one thing, it needs to go through the whole set and it needs to be tested.  Yes, it is a pain in the neck.  Yes, it takes time.  Why all these restrictions?  Because you are not a cowboy.  You are in IT.  You want freedom, move to Montana and raise dental floss.  But do not ask me to hold your beer.
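A minimal sketch of what "resemble production" could look like as a check, in Python. The setting names (dotnet, db_tables) and the dictionaries are invented for illustration; in real life you would pull these values from wherever your environments actually record them.

```python
def environment_drift(production: dict, other: dict) -> dict:
    """Return {setting: (production_value, other_value)} for every mismatch."""
    drift = {}
    for key in sorted(set(production) | set(other)):
        prod_value = production.get(key)    # None when the setting is absent
        other_value = other.get(key)
        if prod_value != other_value:
            drift[key] = (prod_value, other_value)
    return drift


# Hypothetical environment descriptions.
production = {"dotnet": "3.5", "db_tables": ("customers", "orders")}
development = {"dotnet": "4.5", "db_tables": ("customers", "orders")}

print(environment_drift(production, development))
# {'dotnet': ('3.5', '4.5')}
```

An empty dictionary means the environments match on everything you bothered to list. Anything else is a conversation you get to have before the change moves, not after.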

And you are moving this into production, because…?

And I said “Get out of here, you Loch Ness Monster!”

Let us say you are world-famous fashion designer Karl Lagerfeld. You are getting ready to present your fall line to the public.  The lights go down, the models are ready to go and the show begins.

The first model hits the runway and about halfway down, Anna Wintour, editor of Vogue Magazine, suddenly jumps up and says “Darling, that makeup is all wrong” and stops the model and proceeds to redo the model’s makeup in front of the world’s fashion press.

What would you do?  Well, if you were Karl Lagerfeld, chances are you’d be in a Paris jail for beating Anna Wintour to death with a stiletto in front of Kanye West.  But basically this is what we in operations have to deal with on a daily basis.  Someone on the business side sees “something wrong”, runs down to IT, grabs a developer and yells “FIX IT NOW”.  Now, there are a few things blatantly wrong with this.  To start:

  1. Is it really wrong?  The same rules should be applying to everything, so if one data point is off for one, shouldn’t it be off for everyone?
  2. How is it wrong?  How off is the calculation? Exactly what should it be?
  3. Who are you and why are you in here yelling at a developer?  I’m  Operations.  I should be yelling at the developer, not you.  Shouldn’t you be talking to the project manager and going over what the rules are?

And of course, the one question that everyone misses.  What time is it? You see, making changes in production is something that is dependent upon timing.  And timing in operations is like timing in comedy – it is everything.  There are risks involved with dropping changes into production in the middle of the day like a hot mic.  I need a reason why you are hell bent on shoving this in at lunch time.  Not because it is my lunch time (which really never matters), but because I have to answer for its possible failure.  There are reasons why changes are done during low traffic hours.  It doesn’t affect as many people if and when it blows up.  There is also the question of verification.  When this change goes in, how long do we need to wait to make sure that everything is OK?  If it is so important that your boss is two steps from an aneurysm, then why does it take you three days to verify the procedure was done correctly?  Just saying.

Oh yeah, do the paperwork.  It’s not a problem if there’s not a ticket.  And straighten your tie.  Anna hates that.
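The "what time is it?" question can be sketched in a few lines of Python. The window of 10 PM to 5 AM and both function names are assumptions for illustration; your low traffic hours are whatever your traffic says they are.

```python
from datetime import datetime, time


def in_change_window(when: datetime,
                     start: time = time(22, 0),
                     end: time = time(5, 0)) -> bool:
    """True if `when` falls in the low traffic window (start inclusive, end exclusive)."""
    t = when.time()
    if start <= end:
        return start <= t < end
    # The window wraps past midnight, e.g. 22:00 to 05:00.
    return t >= start or t < end


def may_deploy(when: datetime, emergency_reason: str = "") -> bool:
    """Outside the window, a change needs a stated reason that someone can answer for."""
    return in_change_window(when) or bool(emergency_reason.strip())


lunch = datetime(2024, 1, 15, 12, 30)
print(may_deploy(lunch))                               # False
print(may_deploy(lunch, "billing totals are wrong"))   # True
print(may_deploy(datetime(2024, 1, 15, 23, 0)))        # True
```

Note that the sketch only asks for a reason. Whether the reason is any good is still a human judgment call, usually mine.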

There are Incidents, and then there are Incidents

What IT Operations does on a regular basis. You’re Welcome.

OK, we all know that every single one of us is a delicate hothouse flower, full of potential, so that when we are unable to get to that spreadsheet or we are unable to log into the report server, those folks in IT must drop everything they are doing right now and help us continue with our  work.

Yeah, you and the other 25 delicate hothouse flowers that just called in, each with their own special set of problems that is keeping them from reaching their potential.

In Incident management, events are set up according to Priority. Put it this way, your inability to access your spreadsheet will probably take a back seat to a fire in the data center. Why? Because chances are, your problem affects only you. The data center, on the other hand, probably will affect you, your department, and maybe every customer you have. That is how priorities are set – chances are that a problem affecting you is further down the chart than one that affects lots of people. I apologize, but sometimes the cold gust of reality can make a flower stronger.  Sometimes it kills it, but that’s not my problem.

But, you may say, my inability to access that spreadsheet is going to have major ramifications, because it is needed in Federal court tomorrow at 8:00 AM. OK, that changes the topology a little, as no one wants to jerk around a lawyer or a judge. But even that has to fall on the same scale as everyone else calling in. How do you handle it?

In ITIL, Priorities are based on a combination of two things: Impact and Urgency.

The needs of the many outweigh the needs of the few. Or the one. That would be you.

Impact counts the number of people impacted: the more people impacted, the higher you go.  If there are only a handful of people affected, it’s a low impact, regardless of social status.  Urgency is basically when you need it.  Yeah I know.  Everyone needs it RIGHT NOW.  Relax, princess, you do not need it right now.  You know it.  I know it.  Stop acting as if it’s life or death.  It isn’t.  If something is broken, the urgency is going to be higher than a request for access.  Why?  BECAUSE IT’S BROKEN.
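To take some of the squish out, here is a minimal sketch in Python. It assumes a common three-level setup (high, medium, low) where priority is impact plus urgency minus one, so 1 is the four-alarm fire and 5 can wait. The headcount cutoffs and function names are made up for illustration; your organization will have its own chart.

```python
# Lower number means "deal with it sooner".
LEVELS = {"high": 1, "medium": 2, "low": 3}


def impact_from_headcount(affected: int, total: int) -> str:
    """Impact counts people, not job titles. Cutoffs are assumed, not standard."""
    share = affected / total
    if share >= 0.5:
        return "high"
    if share >= 0.1:
        return "medium"
    return "low"


def priority(impact: str, urgency: str) -> int:
    """Combine impact and urgency into a priority from 1 (first) to 5 (last)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


# Fire in the data center: everyone is affected and it is broken right now.
print(priority(impact_from_headcount(500, 500), "high"))  # 1
# One person, but the spreadsheet is due in Federal court at 8:00 AM.
print(priority(impact_from_headcount(1, 500), "high"))    # 3
# One person asking for access to something, whenever.
print(priority(impact_from_headcount(1, 500), "low"))     # 5
```

Notice that the court date moves the spreadsheet up the chart, but not past the fire.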

Now, most of the time major incidents and minor stuff can co-exist peacefully, as there is usually enough staff to take care of everything.  So yes, you’re going to be able to get that spreadsheet that eventually will send your boss to jail.  So not to worry.  Just remember that there are things out there that are more urgent and go from there.

Welcome to IT!

There are more of these guys than you think.

Yeah, you’re going to have a lot of fun here. From the inability to sign on to the network, to people deleting three weeks of work in a single keystroke, to Ms. Whitcombe not understanding why her computer keeps freezing up, even though she has been warned about that coupon site five times, it is best to understand the one rule of the jungle:

It is all your fault. Even though you have absolutely no control over the stupidity of your fellow co-workers, it will always be your fault.

OK, so what are you as an IT organization going to do about it? You could take your lumps, or become cynical because there are some people who should not be around a computer under any circumstances. Or you could find a way to collect data on every problem that plagues your company and find out how to prevent it in the future. Most IT Departments are looking for 2 things: Excellent customer service, leading to happy customers and great productivity. Of course achieving these lofty goals for little or no money is also on our mind. But these are the basics that drive us: Productivity and Customer Service.

Allow me to introduce you to ITIL.

Yes, another ingredient in the IT alphabet soup. Groan all you want, but unlike cleverly named languages, ITIL is a common sense process, created to make sure that the things that drive your co-workers crazy are reported and looked at, problems are given solutions and changes are made with knowledge and forethought. Because while everything may be your fault, it is still your responsibility to fix it. ITIL gives you a roadmap. Get into the car. Time to drive.

The Resume: Spring Collection

The time has come to face the facts. I came to a realization a couple of weekends ago watching CNN. Specifically, the conversation was about the job market and finding that “dream job” that everyone talks about but very few apparently have. And so the conversation boiled down to that old standby, the resume. Listening to these people going back and forth over what needs to be seen and what is a faux-pas of epic proportions, it hit me that really they were talking about fashion; that the resume is basically what one’s style is, not one’s substance. And I came to the conclusion: much like Anna Wintour at Vogue every now and then proclaims that Brown is the new Black, I am proclaiming the heresy of heresies: the resume is dead.

You read it here, folks. The resume is dead. Gone. Not ever to return. Not even a “long live the resume”. Finished. Ka-put, dahling.

Why, you may ask? Let’s look at this in the cold light of day, shall we? If you look at all the “tips” you hear about writing the “perfect” resume, you’ll drive yourself over the cliff. From all the tips you hear, the perfect resume is tailored specifically to the exact position you are going for, so that it reflects the fact that you are the person for that position. It must be one page long, cover everything you ever did in twelve point type, have every keyword a recruiter will ever use and never, never look too crowded or too sparse. Also there must be space for someone to take notes.

In addition, if you are changing professions, a functional resume is great as long as it is in chronological order. You also need to let any future employer know exactly what you have done, but since your work responsibilities always fit underneath that job title perfectly, mentioning what those responsibilities actually were is redundant and therefore should not be included. You need to boldly list your accomplishments, but do so in a way that does not look like you’re bragging.

This would be all well and good, except it really doesn’t matter anyway, because all those resumes that are handed out at job fairs and the like are basically thrown into the trash once they make it back to the HR office. Well, that is if someone even accepts them. “You’ll need to visit our website” is the latest catchphrase that basically means “Keep that filthy piece of paper out of my face”. And what does that website ask you to do? Put your resume in their format. So there you are, rewriting your resume yet again in a format that does not allow you to do any of the things the talking heads on CNN just told you to do.

I think I had come to this realization some time ago, when a recruiter basically took my resume and immediately asked where the keywords were. I replied that the systems I have used are at the top of the page. “No, no,” she said, “You need to list keywords for every single job you have on the resume, even if it seems redundant. Hiring managers are only looking at keywords these days.” I wanted to tell her that any hiring manager that makes decisions based on single words and not the substance should be looking for a job themselves, but decided against it.  I am glad that I held off, because after some research, it turned out she was right.

There is an interesting white paper from The Sierra Group, entitled “The Traditional Resume is Dead: The Technology Behind Recruiting”. In four pages, they point out that recruiting has become an assembly line process for most employers, and that the practice of resume blasting has increased the load of resumes greatly for companies. The result? You are never going to stand out no matter how good your resume is. Given this economy, the chances of them finding you through the noise of useless resumes are greatly reduced. Remember, Human Resource Managers are more interested in finding the best candidate or candidates, not just those who meet some minimum standard of a screening process.

So, with all these barriers to the traditional resume, how do you break through? Networking sites like LinkedIn are the main way that people are reaching each other. I have known people who basically do nothing but find out who the best recruiters are for their area, link to them and then cyber stalk them. Creepy for some, but some folks are that desperate.

Other than that, the answer to the question of breaking through to get that job varies from person to person. One thing is certain. The old ways of job searching are long gone. Just like Brown is the new Black. At least for this season.

The Handy Dandy Mission Statement

This is a reboot of sorts.  Before, this blog was more of a snarky review of everything technology related.  Now, it is more of a review of technology, methods and madness that the IT community faces on a daily basis.  OK, there will be some snark, but the bulk of this will be attempted with a semi-straight face.

Mainly, I hope to show the difference between the hype and the reality of technology.  There is always that magic moment when people realize that brand new technology X will not solve all their problems.  It doesn’t mean that the technology is bad or poor or evil, it simply means that X does not do what people originally thought it would do.  Every company’s needs are slightly different.  The phrase “Your Mileage May Vary” is key.

So, here we go again.