OSG Grid Operations Center: Internal Developer Page

Change log for Probes

RSV Version 2

Version 2.3.2 (2008-06-10, from v2.3.1 (2008-06-02) version) -- [VDT 1.10.1d]

  • RSV-Core Package: Changes to configure_osg_rsv script.

    • Fixed bug related to use of quotes in Condor-cron job submission file for SRM probes
    • Added valid-Unix-user check to osg-rsv init script.
  • cacert-crl-expiry-probe: Very minor typo in print statement fixed.

Version 2.3.1 (2008-06-02, from v2.3.0 (2008-05-09) version) -- [VDT 1.10.1c]

  • jobmanagers-status-probe: Enables the default jobmanager, i.e., resource_name/jobmanager. This will help with availability calculations at the GOC level as well as in WLCG SAM.
  • srm-ping-probe and srmcp-srm-probe: These probes now take --srm-client-loc argument to specify location of SRM client -- default is $VDT_LOCATION/srm-client-fermi.
  • cacert-expiry-local-probe and certificate-expiry-local-probe: These probes print probeType to be OSG-Local-Monitor now - helps with configuration of local probes separately.
  • ALL Probes: Increased process timeouts to 300 seconds (from 120s).
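A process timeout of this kind can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the probes' own Perl code: the helper name run_with_timeout and its return shape are hypothetical, and only the 300-second value comes from the entry above.

```python
import subprocess

# Timeout value taken from the change log entry (previously 120 seconds).
DEFAULT_TIMEOUT_SECONDS = 300


def run_with_timeout(argv, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Run a command and return (timed_out, stdout).

    Hypothetical helper for illustration only; the real probes
    implement their timeouts in Perl.
    """
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run before this is raised.
        return True, ""
    return False, completed.stdout
```

A caller would typically map a timed-out run to a non-OK metric status rather than let the probe hang.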

Version 2.3.0 (2008-05-09, from v2.2.4 (2008-04-29) version) -- [VDT 1.10.1]

  • gums-authorization-validator-probe: Multiple fixes by Jay Packard - expected to work more accurately.
  • srm-ping-probe and srmcp-srm-probe: These probes now take --srm-webservice-path argument so BestmanXrootd admins can specify different web service path.

Version 2.2.4 (2008-04-29, from v2.2.3 (2008-04-25) version)

  • cacert-crl-expiry-probe: Minor bug fix to print the proper metricName for failed probe runs (i.e., when the probe prints error/critical status).

Version 2.2.3 (2008-04-25, from v2.2.2 (2008-04-14) version)

  • gums-authorization-validator-probe: Multiple fixes - expected to work more accurately -- uses a worker script now.
  • certificate-expiry-local-probe: Prints the certificate that was tested, for clarity. Fixed a bug where an inability to read the cert file was overlooked.

Version 2.2.2 (2008-04-14, from v2.2.1 (2008-04-09) version)

  • cacert-crl-expiry-probe: Fixed bug related to metricName that was being printed.
  • gridftp-simple-probe: Added --gridftp-delay parameter and a related sleep() statement in OSG_RSV_Probe_Base.pm (per conversation with the MWT2_IU admin, to enable a delay between copying a file and retrieving it back).

Version 2.2.1 (2008-04-09, from v2.2.0 (2008-04-05) version)

  • gridftp-simple-probe: Added --gridftp-destination-dir parameter and related changes in OSG_RSV_Probe_Base.pm, including cleaned-up code that uses gsiftp URIs instead of constructing them in each routine (per conversation with the MWT2_IU admin, to enable a gsiftp host separate from the CE host).
  • osg-directories-probe: Fixed bug related to misleading error message if GRAM auth failed.
  • vo-supported-probe: Fixed bug related to misleading error message if GRAM auth failed, etc.
  • General change in Perl module: Uses &Exit_Error() where applicable instead of setting and then printing error.
  • Minor change in srmcp probe: Uses hash 'srmcpDestinationDir' instead of 'srmcpDestinationDirectory' to keep things consistent.


RSV Version 1

Version 1.6.2 (2008-04-25, from v1.5.0 (2007-10-24) version)

Mainly a back-port of version probes for VDT 1.8.1 / OSG CE 0.8.0 resources with some new probes thrown in for good measure.

  • Several new probes: CE: vdt-version-probe, vo-supported-probe; SRM: srm-ping-probe, srmcp-srm-probe; GUMS: gums-authorization-validator-probe.
  • Bug fixes in several other probes:
    • certificate-expiry-local-probe prints clearer error messages.
    • classad-valid-probe uses a more accurate check, and uses the condor_cron_status command.
    • osg-directories-probe considers whether a directory is required by OSG before checking on its setup.
    • gridftp-simple-probe allows a destination directory to be specified, as well as a delay between write and read-back.
    • cacert-crl-expiry-probe uses a new test based on the CA-Certs version number for the CACert metric.

Version 1.5.0 (2007-10-24, from v1.4.1 (2007-10-01) version)

  • One new probe: classad-valid-probe. This probe tests if ALL the classad attributes are valid for a resource. The probe takes the given service URI and checks with the appropriate ReSS collector.
  • Local time switch: All probes take a --print-local-time switch, which forces the probe to print the timestamp in the local timezone. Note that using this option makes the probe output non-conformant to the WLCG data exchange specs; it is provided for the local site admin's convenience (i.e., the probe will still report UTC to the central RSV database).
  • Renamed perl module: OSG_Probe_Functions.pm to OSG_RSV_Probe_Base.pm
  • Minor changes: All probes print metricType now; The perl module is better documented now.
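The effect of the --print-local-time switch on the printed timestamp can be sketched as below. This is a minimal Python illustration, not the Perl module's code. The function name format_timestamp is hypothetical, and the local-time format (numeric UTC offset) is an assumption; only the UTC form, as in 2007-07-06T18:28:40Z, appears elsewhere on this page.

```python
from datetime import datetime, timezone


def format_timestamp(moment, print_local_time=False):
    """Format a timezone-aware datetime for probe output.

    Hypothetical helper: the default is UTC with a trailing 'Z';
    with print_local_time the system's local zone is used instead.
    """
    if print_local_time:
        # Assumed local format: same layout with a numeric offset, e.g. -0400.
        return moment.astimezone().strftime("%Y-%m-%dT%H:%M:%S%z")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(format_timestamp(now))
    print(format_timestamp(now, print_local_time=True))
```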

Version 1.4.1 (2007-10-01, from v1.4 (2007-09-14) version)

  • Minor bug fixes: osg-directories-probe tests OSG_LOCATION instead of OSG_GRID; fixed copy-and-paste error where the OSG_APP permission was getting reset by AG's testing case; fixed 'pythonToUse' value to include the word python itself.
  • Cacert-crl-expiry-probe's worker scripts now execute the CE's setup.sh before doing anything else.
  • Gridftp-simple-probe has slightly better structure in terms of readability, as we work towards documenting how to develop new probes.

Version 1.4 (2007-09-14, from v1.3.1 (2007-08-20) version)

  • Jobmanager-status-probe:
    • Returns UNKNOWN status if the particular job manager being tested for is not supported by the CE. Previously it would return CRITICAL status after going through the test unnecessarily.
    • Cleaned up the subroutine that does the actual testing to print appropriate messages; previously there were a couple of erroneous cases.
    • The error printed when there are proxy/auth errors is improved in this version of the probe.
  • osg-directories-probe changes: Updated metricName to reflect that we're indeed testing a CE; this metric does not test for OSG_WN_TMP. Also introduced the notion of CRITICAL vs. non-CRITICAL failures.
  • Cacert-crl-expiry-probe takes new switch to specify CAcert/CRL location on remote CE; also default is now $X509_CERT_LOCATION.
  • Some other minor changes.
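The jobmanager status change in this version can be sketched as follows. This is a minimal Python illustration, not the probe's Perl code: the function name jobmanager_status, its arguments, and the "OK" status string are assumptions made for the example; UNKNOWN and CRITICAL are the statuses named above.

```python
def jobmanager_status(jobmanager, supported_jobmanagers, run_test):
    """Return a metric status for one job manager.

    Hypothetical sketch: run_test is a callable that performs the
    actual check and returns True on success.
    """
    if jobmanager not in supported_jobmanagers:
        # Skip the test entirely; older versions ran it and reported CRITICAL.
        return "UNKNOWN"
    return "OK" if run_test(jobmanager) else "CRITICAL"
```

Checking support first means an unsupported job manager never triggers a test run at all.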

Version 1.3.1 (2007-08-20, from v1.3 (2007-07-18) version)

  • Added --vdt-location command line argument, so non OSG users of the probe-set may provide $VDT_LOCATION or equivalent, using a command line argument.
  • Modified code to use $VDT_LOCATION/osg-rsv/bin/probes in places where I was using pwd before.
  • Tarball has version number on its filename now!

2007-08-18 version (from the 2007-08-09 version)

  • Added --python-loc switch to all the probes
    • The gratia python script that is generated now also includes a shebang line; if the option above is used, then that python will be used.
  • Fixed bug in processing of non-standard proxyfile. Now specifying -x on command line should work.

2007-08-09 version (from the 2007-07-06 version)

  • Added test-rsv-probes-by-hand.sh script. This script can be used to do a test-run of all the probes by hand, on the command line.
  • The condor-cron tarball and readme are no longer available in this tarball, since it's available through the VDT in its entirety.

2007-07-06 version (from the 2007-07-04 version) -- MINOR BUG FIX

This version is what is available in the current VDT package (as of writing of this page); the package also includes the Condor-cron scheduling infrastructure and Gratia collector bits.

  • Fixed basename error: in places where the probe's name was being used, the code was grabbing it with the complete path. Now it uses only the probe name.

    For example, the following command execution:
    /foo/ping-host-probe --uri foo.com --ggs --gsl /bar/gratia
    would have given a warning about not being able to write to the Gratia output file:
    2007-07-06T18:28:40Z-/foo/ping-host-probe-send-gratia-record.py
    within the /bar/gratia directory.

    Now the code will instead write to:
    /bar/gratia/2007-07-06T18:28:40Z-ping-host-probe-send-gratia-record.py
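The basename fix can be sketched as below. This is a minimal Python illustration of the idea, not the probes' Perl code; the function name gratia_script_path and its parameters are hypothetical, while the file name layout follows the example above.

```python
import os.path


def gratia_script_path(probe_invocation, timestamp, gratia_dir):
    """Build the Gratia uploader script path for a probe run.

    Hypothetical helper: strips any directory part from the way the
    probe was invoked, so only the probe name ends up in the file name.
    """
    probe_name = os.path.basename(probe_invocation)
    file_name = f"{timestamp}-{probe_name}-send-gratia-record.py"
    return os.path.join(gratia_dir, file_name)
```

Without the basename step, the slashes in /foo/ping-host-probe would end up inside the file name and point at a directory that does not exist under /bar/gratia.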

2007-07-04 version (from the 2007-06-18 version)

  • Added --ggs option: Generates Gratia record uploader script (in Python) and stores it in standard location: /tmp by default, can be changed using command line option --gsl <directory>.

    This option is disabled by default

  • All probes will honor -m all -l if anyone needs to get all the metric names programmatically
  • Uses default proxy /tmp/x509up_UID; can be changed using the -x option
  • Verbose output goes to STDERR now
  • Clearer usage information using option -h.
  • Ping test sets metric to WARNING (and not CRITICAL) if a resource is alive but not pingable (say, due to firewall restrictions)
  • Several bug fixes
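The default proxy lookup described above can be sketched as follows. This is a minimal Python illustration, not the probes' Perl code. The function name resolve_proxy_path is hypothetical, and the sketch assumes that UID in /tmp/x509up_UID expands in the usual Globus form x509up_u<numeric uid>.

```python
import os


def resolve_proxy_path(x_option=None, uid=None):
    """Pick the proxy file a probe should use.

    Hypothetical helper: a path given with -x wins; otherwise fall
    back to the per-user default under /tmp.
    """
    if x_option:
        return x_option
    if uid is None:
        uid = os.getuid()  # numeric UID of the user running the probe
    # Assumption: Globus-style default name, x509up_u<uid>.
    return f"/tmp/x509up_u{uid}"
```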

2007-06-18 version (from the 2007-05-25 version)

  • The probes are consistent with the WLCG standard specs 0.91:
    • Including updated output metric format (list of fields)
    • Takes argument for proxy file, warning and critical time checks (on the proxy)
    • New service URI format -- the WLCG specs group has been unable to settle on one format or the other; I am going to let it sit with the current format (hostname[:port][/service]) for the next 6 months or so.
  • The CRL-expiry-probe has been renamed cacert-crl-expiry-probe, and it's probably self-explanatory -- it does CA cert expiry check too on a remote resource (assuming the one CA cert that really matters for connectivity from the monitoring host to resource being tested is valid :-)
  • The probes that initially need remote worker scripts ... now they stage those worker scripts (Background -- initially I was not staging them since I thought the probes would be part of VDT, and would live in a standard location; but it's been decided the probes will not be part of the VDT distro .. so ...)
  • The multimetric probes (cacert-crl-expiry-probe, jobmanagers-status-probe, and the local cacert-expiry-local-probe) now require the -m switch with a metric name value. Check the help page for metric name values and such. (A deprecated -m all option is also supported if indicated by the usage information).
  • Usage information is a bit more detailed now.
  • And of course, they all have bugfixes, etc. from the previous version.
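A service URI of the form hostname[:port][/service] could be split into its parts roughly as shown below. This is a minimal Python sketch under stated assumptions, not the probes' own parsing code: the function name parse_service_uri and the regular expression are hypothetical, and the sketch assumes a numeric port and a hostname without ':' or '/'.

```python
import re

# Assumed grammar: hostname, optional ":port" (digits), optional "/service".
SERVICE_URI = re.compile(
    r"^(?P<host>[^:/]+)(?::(?P<port>\d+))?(?:/(?P<service>.+))?$"
)


def parse_service_uri(uri):
    """Split a service URI into (host, port, service).

    Hypothetical helper: port and service are None when absent.
    """
    match = SERVICE_URI.match(uri)
    if match is None:
        raise ValueError(f"not a valid service URI: {uri!r}")
    port = match.group("port")
    return match.group("host"), int(port) if port else None, match.group("service")
```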

-Arvind Gopu < Last Modified: Mon Apr 14 19:01:15 UTC 2008 >