Sunday, July 11, 2010

Checking current job status from the SQL Server Agent

Wrote a bit of code for the LinkedIn Nagios users group and figured I'd post it here as well. It reports the most recent recorded outcome of every enabled job in SQL Server at the time it is run, and can be used to identify failed, hung or long-running jobs. Tested on SQL 2000, 2005 and 2008.

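-- Latest recorded outcome for every enabled SQL Server Agent job.
SELECT
    [sj_].[name]
    ,[sjh].[run_status]
    ,[status] = CASE [sjh].[run_status]
        WHEN 0 THEN 'Failed'
        WHEN 1 THEN 'Succeeded'
        WHEN 2 THEN 'Retry (step only)'
        WHEN 3 THEN 'Canceled'
        WHEN 4 THEN 'In-progress message'
        WHEN 5 THEN 'Unknown'
    END
FROM [msdb]..[sysjobs] [sj_]
INNER JOIN [msdb]..[sysjobhistory] [sjh] ON [sj_].[job_id] = [sjh].[job_id]
WHERE
    [sj_].[enabled] = 1 AND
    -- step_id 0 is the job-level outcome row, not an individual step
    [sjh].[step_id] = 0 AND
    -- newest job-level row; filtering on step_id here as well keeps a job
    -- listed while step rows from a run in progress are still being logged
    [sjh].[instance_id] = (SELECT MAX([instance_id]) FROM [msdb]..[sysjobhistory] WHERE [job_id] = [sj_].[job_id] AND [step_id] = 0)
ORDER BY
    [sj_].[name]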
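
To feed the query results into Nagios, the run_status values need to be turned into plugin exit codes. The sketch below is one way that could look. The Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are standard, but the mapping from run_status to a state, the function names, and the tab-separated input format are all assumptions made for illustration. How you get the rows out of SQL Server (sqlcmd, osql, etc.) is left to you.

```shell
#!/bin/sh
# Map a sysjobhistory run_status value to a Nagios plugin exit code.
# Nagios exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
# The mapping below is an assumption for illustration; adjust to taste.
run_status_to_nagios() {
    case "$1" in
        0) echo 2 ;;   # Failed       -> CRITICAL
        1) echo 0 ;;   # Succeeded    -> OK
        2) echo 1 ;;   # Retry        -> WARNING
        3) echo 1 ;;   # Canceled     -> WARNING
        4) echo 0 ;;   # In progress  -> OK
        *) echo 3 ;;   # anything else -> UNKNOWN
    esac
}

# Read "job_name<TAB>run_status" lines on stdin and print the numerically
# highest exit code seen. Note that this ranks UNKNOWN (3) above CRITICAL (2).
worst_job_state() {
    worst=0
    tab=$(printf '\t')
    while IFS="$tab" read -r job status; do
        code=$(run_status_to_nagios "$status")
        if [ "$code" -gt "$worst" ]; then
            worst=$code
        fi
    done
    echo "$worst"
}
```

A wrapper plugin could pipe the query output through worst_job_state and exit with the value it prints.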

Friday, October 9, 2009

Careful what you post on LinkedIn

Interesting article on 128-bit on Windows 8 & 9. More interesting, though, is how the information leaked out: it was posted by a developer as part of their LinkedIn profile. It makes you wonder what other proprietary information is buried in the profiles of LinkedIn participants.

Friday, September 18, 2009

Nagios Remote Monitoring Unix/Linux with check_by_ssh

Nagios (http://www.nagios.org) does full remote monitoring by ssh'ing into the remote server and running the Nagios plugin on that server. For it to work, SSH has to work without interaction, and only the plugins need to be installed on the remote server. This monitoring does not require any special privileges (i.e., it doesn't need to run as root).

On the remote (to be monitored) server (assumes a user name of nagmon):
useradd -d /export/home/nagmon -m nagmon *create our local monitoring service account

passwd nagmon *the password you set is unimportant; we won’t ever use it to log in, you just need to set one to enable the account

mkdir /export/home/nagmon/.ssh

chown nagmon /export/home/nagmon/.ssh


On the nagios (monitor) server:
ssh-keygen -t rsa -b 4096 *generate an RSA key pair; when prompted, save it to /etc/ssh/nagmon_rsa and just hit enter for the passphrase
cat /etc/ssh/nagmon_rsa.pub *this is the public key file that will be used to authenticate ssh, copy all the text in the file


On the remote (to be monitored) server:
vi /export/home/nagmon/.ssh/authorized_keys *paste the text from the previous command into the file, then save and exit the file

chown nagmon /export/home/nagmon/.ssh/authorized_keys *make sure our service account can read the key file

chmod 600 /export/home/nagmon/.ssh/authorized_keys *ssh will reject the connection if the proper permissions are not set on the file

mkdir -p /usr/lib/nagios/plugins *-p also creates /usr/lib/nagios if it is missing. Install/copy the Nagios plugins to this directory, usually by copying from the Nagios server. Make sure the plugins are compiled for the system/processor you are using.
chmod 755 /usr/lib/nagios/plugins/*
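
If you have more than a couple of servers to set up, the key installation steps above can be wrapped in a small function. This is only a sketch: install_pubkey is a made-up name, it assumes the public key has already been copied to a file on the remote server, and it leaves the chown to nagmon to you since that needs root. It also tightens .ssh to 700, which the steps above don't do explicitly.

```shell
#!/bin/sh
# install_pubkey HOME_DIR PUBKEY_FILE
# Hypothetical helper: creates HOME_DIR/.ssh, appends the public key to
# authorized_keys (once), and sets the permissions sshd expects.
install_pubkey() {
    home_dir=$1
    pubkey_file=$2
    ssh_dir="$home_dir/.ssh"
    auth_keys="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_keys" || return 1

    # Append only if this exact key line is not already present,
    # so running the function twice does not duplicate the key.
    if ! grep -qxF -f "$pubkey_file" "$auth_keys"; then
        cat "$pubkey_file" >> "$auth_keys" || return 1
    fi

    # ssh will reject the connection if the file is too permissive.
    chmod 600 "$auth_keys"
}
```

Usage would be something like install_pubkey /export/home/nagmon /tmp/nagmon_rsa.pub (the /tmp path is just an example), followed by chown -R nagmon /export/home/nagmon/.ssh as root.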

*Example Nagios Config (monitoring a remote disk):
Add to the file that contains your command configurations:
define command{
name check_remote_disk
command_name check_remote_disk
command_line /usr/local/nagios/libexec/check_by_ssh -H $HOSTADDRESS$ -2 -C '/usr/lib/nagios/plugins/check_disk -w $ARG2$ -c $ARG3$ -p $ARG1$' -l nagmon -i /etc/ssh/nagmon_rsa
}
Add to the file that contains your service configurations:
define service{
use local-service *this is a template, yours will probably be named differently
host_name
service_description usr partition
check_command check_remote_disk!/usr!25%!10%
}
That's it! You can run any of the local check nagios plugins via the tunnel and return the results to nagios.
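
Before relying on the service check, it can help to see exactly what command line Nagios ends up running once it substitutes the macros, and then run that by hand as the nagios user. The sketch below just prints the expanded command for the example above. build_remote_disk_check is a made-up helper name and 192.0.2.10 is a placeholder address; substitute your own host.

```shell
#!/bin/sh
# Print the check_by_ssh command line that the check_remote_disk definition
# produces after Nagios substitutes $HOSTADDRESS$ and $ARG1$..$ARG3$.
# Hypothetical helper: it only prints the command, it does not run it.
build_remote_disk_check() {
    host=$1   # $HOSTADDRESS$
    path=$2   # $ARG1$ (partition to check)
    warn=$3   # $ARG2$ (warning threshold)
    crit=$4   # $ARG3$ (critical threshold)
    printf "%s -H %s -2 -C '%s -w %s -c %s -p %s' -l nagmon -i /etc/ssh/nagmon_rsa\n" \
        /usr/local/nagios/libexec/check_by_ssh "$host" \
        /usr/lib/nagios/plugins/check_disk "$warn" "$crit" "$path"
}

# check_remote_disk!/usr!25%!10% against the placeholder host:
build_remote_disk_check 192.0.2.10 /usr 25% 10%
```

If the printed command works when pasted into a shell as the nagios user, the service definition should work too. If it prompts for a password or a host key confirmation, fix that first.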

Sunday, September 13, 2009

Updating Vendor Certifications

I've been working on getting my certifications up to date. Mainly because I see certifications as a good benchmark to keep your general knowledge level up to date. It's easy to get deep into a specific function of a technology but certifications make you step back and look at the big picture again.

But I digress. I just re-certified my CCNA (Cisco Certified Network Associate) and it made me think a little. I did the same certification about a decade ago and frankly the test was significantly easier and covered a lot less material back then. I suspect that Cisco did this as a result of introducing its new entry-level CCENT certification. But as a hiring manager, my impression of the CCNA was clearly off base prior to retaking it this year. We were making this certification a requirement for our entry-level networking folks and I don't think that was appropriate.

So how often do vendors update certifications by making them significantly more difficult, and how do vendors get the word out on these changes to hiring managers? Now I know, but I clearly hadn't gotten this message prior to completing the test myself.


*Example: I still recall on my first CCNA being annoyed that I had learned STP inside and out. The only question I had on it was something like "what protocol would you use to prevent layer 2 loops". For the recent CCNA exam you had to know VTP, trunking protocols, port trunk modes, STP, RSTP etc... So there was clearly a lot more content.

Monday, May 18, 2009

Integrating IT Services

It's been one of my pet peeves for a while that IT doesn't always seem to eat its own dog food. Specifically, we extol the virtues of definitive data sources and application integration, but we very rarely use definitive sources and integrate our own systems.

So I haven't been posting much because all of my free time over the last few months has gone into developing something to do just that.  I'm definitely still in the Alpha stage of development but I do have something presentable.  I'm looking for any and all feedback on:

Tuesday, February 10, 2009

Security breach = malpractice?

The FAA had a security breach of non-air traffic control systems that resulted in loss of confidential employee info.

http://www.newsweek.com/id/184051

*Official statement from the FAA:  http://www.faa.gov/news/press_releases/news_story.cfm?newsId=10394

Some interesting points

-The end of the article implies (without proving) that this was the second breach of this same FAA network and nothing was done the first time.

-Some of the data stolen was encrypted employee medical information. (*I'm wondering why the FAA would need to store medical records. Is this common in non-health organizations? Is this data covered by the same standards as health information technology?)


But the reason I wanted to post this was the following sentence: 

"Our information technology systems people need to take a long hard look at themselves and their capabilities. This is malpractice in their world." -Tom Waters president of American Federation of State, County and Municipal Employees Local 3290


So the questions here are:

Is there such a thing as IT malpractice? Is a security breach indicative of IT malpractice? Are multiple breaches proof of malpractice? Let's take these on one by one:

-Is there such a thing as IT malpractice?

malpractice - Mistakes or negligent conduct by a professional person, especially a physician, that results in damage to others, such as misdiagnosis of a serious illness. Damaged parties often seek compensation by bringing malpractice suits against the offending physician or other professional.

I think the key part of the definition here is "professional". While IT as an industry has all of the challenges of any other "professional" industry, we do not have a central body to certify professionals. By that I mean that while there are various vendor and organizational certifications, we do not have a formal licensing body. So I don't believe we meet the legal definition of professionals, which means that we are incapable of malpractice.

-Is a security breach indicative of IT malpractice?

delinquency - Failure in or neglect of duty or obligation; dereliction; default: delinquency in payment of dues.

Based on the above, let's substitute the word delinquency for malpractice. Here I think it depends on the specifics of the incident itself. If you maintained good security best practices then you were probably not delinquent. If you did not (leaving the default/easy/no password on your firewalls and routers, not maintaining audit trails, not restricting access rights to the minimum necessary) then I would call that delinquent. If the IT department was delinquent it should certainly be held accountable, but we do not have enough information here to indicate that.

-Are multiple breaches proof of malpractice (delinquency)?

Proof? Probably not. But it is certainly indicative. Any security breach should be followed by a post-mortem investigation and response. Two security incidents using the same attack vector over a period of time would seem to suggest that the post-mortem was not done or wasn't done well and would be a reflection on the IT team.



malpractice. (n.d.). The American Heritage® New Dictionary of Cultural Literacy, Third Edition. Retrieved February 10, 2009, from Dictionary.com website: http://dictionary.reference.com/browse/malpractice

delinquency. (n.d.). Dictionary.com Unabridged (v 1.1). Retrieved February 10, 2009, from Dictionary.com website: http://dictionary.reference.com/browse/delinquency

Lesson Learned: MS Terminal Services - Volatile Memory

Had an interesting issue with Microsoft Terminal Services the other day. I'm still wrapping my head around this so there may be some inaccuracies here; I'll do my best to correct them as I find them. Apparently environment variables are stored in "volatile memory", which can cause problems in applications that use common logins and environment variables.

Specifically, we have a single service account that is used by several thin client devices.  The account logs in to the terminal server automatically and launches an application.  (The application itself requires a login so our security exposure is tolerable.)  As the application is launched the %clientname% environment variable is read and sent to the application so that workstation specific workflows can be configured.

Now the interesting part. When 2 or more thin clients log in within 1 second of each other, they can "steal" each other's name. This was tied back to the %clientname% environment variable changing between the initial login and the moment %clientname% is sent to the application. It seems that when the second thin client logs in while the first is launching the application, the second overwrites the environment variables (all within the same user profile, because a shared service account is used), resulting in the second thin client's name being used for both. So... environment variables are user specific, not session specific.

Workarounds:
1)  Configure different service accounts for each workstation/client.
2)  Require end users to log in with their own credentials rather than using a service account.
3)  Use non-volatile, session-specific variables from WMI instead of environment variables.