Tuesday, March 19, 2013

The one where Discovery fails

Hi there,

Ever had one of those issues where you know it wasn't triggered by recent changes, because there were none, and you look around and really can't find any? Let me tell you: you need to look harder. There's always someone changing something right next to you without you knowing it. I've heard the answer "Nothing has changed" so many times that I think it's safe to say... I don't trust people anymore when it comes to this.
Back to our issue. We had a XenApp 6 farm (under Citrix PVS) that all of a sudden decided not to perform discovery. The farm has nearly a dozen servers and one data collector. The basic error message:



Error occurred when using {server} in the discovery. Double-clicking the message showed the following: An unexpected error occurred. Check that the server name is correct, that the server is on, that Citrix Presentation Server is installed on this server, and that the Citrix MFCOM Service is running.

The usual suspects: IMA and MFCOM. You've got to love them.

After running all the sanity checks, everything seemed in place. IMA started, MFCOM started, LHC recreated. The farm was functioning properly, people could connect, and there were no relevant errors in the Event Logs. We even tried unregistering and re-registering MFCOM: no luck.

I started thinking of creepier scenarios: someone had played a joke and removed all Citrix admins from the farm (you get the same error message when you're not an admin). I used DSView to check that admins were still listed, and they were. Time to pull out the big guns: CDFControl.

So I ran CDFControl with only the Access Management Console category selected (AMC... nice memories, huh?) and performed a discovery. Here's the output:



*the picture was cropped to show only the relevant data.

Aha - a server entry in the farm is causing problems. I bet dscheck will show more details.

C:\Users\user>dscheck

Data Store Validation Utility.    Version: 6.23

Server Consistency Check:  The Citrix XenApp record with HostName {server} at
DN 00001CAE may be invalid.  The Load Manager for Citrix XenApp entry was not found.

Finished data store validation.


I created a backup of the SQL database holding the Citrix data store and then ran the same command as above with the /clean switch to fix the problems. For confirmation I ran dscheck again. There was still an error present, so I ran dscheck /clean again. The second run cleared the remaining error and I was able to run discovery.
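If you have to repeat that "run, check, run again" loop, the checking part can be scripted. Below is a minimal sketch in Python, assuming you have captured the dscheck output as text and that problem records are worded like the single message shown above (other dscheck messages may well look different). The function name `find_invalid_records` and the regular expression are my own illustration, not part of any Citrix tooling.

```python
import re
from typing import List, Tuple

# Assumption: this pattern is derived only from the one sample message
# in this post; it is not an official description of dscheck output.
RECORD_RE = re.compile(
    r"record with HostName\s+(?P<host>\S+)\s+at\s+DN\s+"
    r"(?P<dn>[0-9A-Fa-f]+)\s+may be invalid"
)


def find_invalid_records(dscheck_output: str) -> List[Tuple[str, str]]:
    """Return (hostname, DN) pairs for records flagged as possibly invalid."""
    # dscheck wraps long messages across lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(dscheck_output.split())
    return [(m.group("host"), m.group("dn")) for m in RECORD_RE.finditer(flat)]


# The output quoted earlier in this post, with {server} left as the placeholder.
SAMPLE = """Data Store Validation Utility.    Version: 6.23

Server Consistency Check:  The Citrix XenApp record with HostName {server} at
DN 00001CAE may be invalid.  The Load Manager for Citrix XenApp entry was not found.

Finished data store validation.
"""

if __name__ == "__main__":
    for host, dn in find_invalid_records(SAMPLE):
        print(f"Suspect record: host={host} DN={dn}")
```

An empty list after a /clean run would suggest the data store has nothing left to report, at least for this message type. I would still read the raw output before trusting it.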

To finish the story I started with: the server causing the problems was a recently created test server booted from a new, completely rebuilt test vdisk. So the answer to the question "What recent changes were there in the environment?" should have been "We tested a new vdisk in the production farm."

Happy Citrixing.


Monday, March 4, 2013

The one with insufficient resources

A colleague of mine recently worked on a case that had a pretty interesting resolution, so I thought I might as well share it. The client was running a CAG 504 VPX serving a XenDesktop infrastructure. Every now and then during login, users were shown the error: HTTP 405: page cannot be displayed
Refreshing the page would then display the resources.

I will not go into details on this one, but after months of troubleshooting Citrix was not able to work out why this was happening. We narrowed down where it happened by analyzing the network traces and other logs from the CAG, but we simply could not get to the WHY.

Then one day we were in the office talking about VM resource reservations, and about how a colleague of ours had seen some weird things with an NS VPX that was not given enough "juice". So we decided to go down this route and proposed that our client ensure the CAG VPX VM had at least 500 MHz of CPU and 500 MB of RAM reserved on the hypervisor side.
I'm happy to say that the issue never presented itself after that.