Joyent

How to troubleshoot SMF

This page assumes you already know what SMF is. It walks through some day-to-day operations with it.

Before you begin

SMF can take snapshots and do rollbacks, but that's advanced stuff. For now, it's best to just dump a copy of the SMF database. It's good for reference, just in case.

svccfg archive > all_smf_export
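
If you take these dumps regularly, a dated file name keeps them from overwriting each other. The following is only a sketch: the function name smf_backup and the file naming are my own invention, and it assumes your system has "svccfg archive" (which writes an XML description of every service to stdout).

```shell
# Sketch: dump the SMF repository to a dated file and print its path.
# smf_backup and the file name pattern are illustrative, not part of SMF.
smf_backup() {
    dir=${1:-.}
    out="$dir/all_smf_export.$(date +%Y%m%d).xml"
    # On failure, remove the partial file so it is not mistaken for a backup.
    svccfg archive > "$out" || { rm -f "$out"; return 1; }
    echo "$out"
}

# Usage on a live system:
#   smf_backup /var/tmp
```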

There are 3 major tools for playing with SMF:

  1. svcs - This tool shows running (and stopped) services, and also reports on errors.
  2. svcadm - This tool is for starting/stopping services. Each service can specify which users are allowed to start/stop it. If you configure it right, you don't need to be root to restart your web stuff, even if your web server requires root to run.
  3. svccfg - This tool imports and exports service configurations.

Getting service information (All services)

For all services:

  1. To see all running services
    svcs
  2. To see all services, including not-running
    svcs -a
  3. To check for services having problems (for example, stuck in the “maintenance” state)
    svcs -x

Getting service information (A specific service)

Every service has a name (called an FMRI). In the examples below, “NAME” stands for your service name. It can be abbreviated. For example, if you have the following service:

	svc:/network/smtp:sendmail

Then the following are valid abbreviations:

	sendmail
	:sendmail
	smtp
	smtp:sendmail
	network/smtp

And the following are invalid abbreviations:

	mail
	network
	network/smt
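
The rule behind those examples seems to be: the instance name, or a trailing run of whole path components of the service name, optionally followed by the instance. Here is a small sketch of that rule in shell. It is my reading of the examples above, not SMF's own matching code, and fmri_matches is a made-up name.

```shell
# Sketch of the abbreviation rule illustrated above; NOT SMF's own matcher.
# Usage: fmri_matches FULL_FMRI ABBREVIATION   (exit status 0 means "matches")
fmri_matches() {
    [ "$1" = "$2" ] && return 0
    full=${1#svc:/}          # e.g. network/smtp:sendmail
    abbr=$2
    service=${full%%:*}      # e.g. network/smtp
    instance=${full#*:}      # e.g. sendmail
    case $abbr in
        "$instance"|":$instance") return 0 ;;
    esac
    # Try each trailing run of whole path components,
    # with and without the ":instance" suffix.
    rest=$service
    while :; do
        case $abbr in
            "$rest"|"$rest:$instance") return 0 ;;
        esac
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   return 1 ;;
        esac
    done
}

# Usage:
#   fmri_matches svc:/network/smtp:sendmail smtp && echo "valid"
```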

You can get details about any one service:

svcs -l NAME

It looks like this:

fmri         svc:/network/postfix:default
name         Postfix SMTP Server
enabled      true
state        online
next_state   none
state_time   Fri May 09 19:29:30 2008
logfile      /var/svc/log/network-postfix:default.log
restarter    svc:/system/svc/restarter:default
contract_id  4980239
dependency   require_all/none file:///opt/local/etc/postfix/main.cf (online)
dependency   require_all/error svc:/network/loopback:default (online)
dependency   require_all/error svc:/network/physical:default (online)
dependency   require_all/none svc:/system/filesystem/local (online)
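
Since that output is just "key, whitespace, value" on each line, it is easy to pull a single field out in a script. A minimal sketch, assuming an awk that supports -v (nawk or gawk); svcs_field is a hypothetical helper name.

```shell
# Print the value(s) of one field from "svcs -l" output read on stdin.
# Fields that repeat (such as dependency) print one value per line.
svcs_field() {
    awk -v key="$1" '$1 == key { sub(/^[^ \t]+[ \t]+/, ""); print }'
}

# Usage on a live system:
#   svcs -l postfix | svcs_field logfile
```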

To see what processes a service is running:

svcs -p NAME

Controlling existing services

svcadm enable -t NAME    # Start service (temporary; reverts at next reboot)
svcadm disable -t NAME   # Stop service (temporary; reverts at next reboot)
svcadm enable NAME       # Start service (persists over reboots)
svcadm disable NAME      # Stop service (persists over reboots)
svcadm restart NAME      # Stop, then start the service
svcadm refresh NAME      # Re-read configuration; usually does a "kill -HUP" if defined
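
These commands return right away, before the service has actually changed state. If a script needs to wait, one option is to poll svcs. The sketch below assumes a POSIX shell such as bash or ksh; wait_online and its return codes are my own convention, and it uses "svcs -H -o state" to print just the state column without a header.

```shell
# Sketch: wait until a service reports "online".
# Usage: wait_online NAME [TIMEOUT_SECONDS]
# Returns 0 when online, 2 if it lands in maintenance, 1 on timeout.
wait_online() {
    name=$1
    remaining=${2:-30}
    while [ "$remaining" -gt 0 ]; do
        state=$(svcs -H -o state "$name" 2>/dev/null)
        case $state in
            online)      return 0 ;;
            maintenance) return 2 ;;
        esac
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Usage on a live system:
#   svcadm enable NAME && wait_online NAME 60
```

It expects NAME to match exactly one instance; with several matches svcs prints several lines and the comparison never succeeds.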

If a service is in maintenance

  1. Services have dependencies, so a service may be down because a dependency is down.
  2. When you run “svcs -x” and it reports something, that means SMF is having a problem starting your service. The output looks like this:
    svc:/network/security/kadmin:default (Kerberos administration daemon)
    State: offline since Tue May 06 07:10:47 2008
    Reason: Start method is running.
    See: http://sun.com/msg/SMF-8000-C4
    See: kadmind(1M)
    See: /var/svc/log/network-security-kadmin:default.log
    Impact: This service is not running.
    
  3. To get a little extra information (not always useful)
    svcs -v -x
  4. First, take a look at the SMF log file (under /var/svc/log/NAME). For example, in the above output, we would run “tail /var/svc/log/network-security-kadmin:default.log”
  5. Sometimes, this log will help you out by giving you an error message. Other times, you will see “Method 'start' exited with status 1” with no explanation. In this case, you need to hunt down the log for your specific service and tail that (e.g. /var/log/postfix.log, or your application's mongrel.log).
  6. If the SMF manifest is set up correctly, editing the application config file will automatically cause SMF to “try again” like magic. (Not all of them are set up correctly.)
  7. To manually clear a service from maintenance
     svcadm clear NAME

    The service should then go offline and, if all is well, return to online. Alternatively, you could disable it, then enable it. (Running enable a second time doesn't seem to help.)
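
The log file names in the examples on this page follow a pattern: drop the leading "svc:/", turn each "/" into "-", and add ".log". A small sketch that builds the path from a full FMRI, assuming that pattern holds on your system (the logfile line of "svcs -l NAME" is the authoritative answer); smf_log_path is a made-up name.

```shell
# Sketch: derive the SMF log file path from a FULL FMRI.
# Assumes the naming pattern seen in the examples on this page.
smf_log_path() {
    name=${1#svc:/}
    echo "/var/svc/log/$(printf '%s' "$name" | tr '/' '-').log"
}

# Usage on a live system (svcs expands the abbreviation to a full FMRI):
#   tail "$(smf_log_path "$(svcs -H -o fmri kadmin)")"
```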

How to add permissions to an existing service

You don't always want to become root to restart your web app.

TODO.. write some documentation here…

How to create your own manifests

Start with a similar service. Export its manifest, then start hacking.

svccfg export postfix > my.smf

Before importing, you can test for XML problems. (I find that this is not really needed.)

svccfg validate my.smf

Importing the service will overwrite an existing service (unless you changed the name). Normally, your service is started right away, but the manifest file can specify otherwise.

svccfg import my.smf

After you get your SMF file working, you don't really need the file anymore, since you can export it if you need it again.
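
If you end up importing manifests often, the validate and import steps can be chained so that a broken file never gets imported. A minimal sketch; smf_install_manifest is a hypothetical helper, and it relies only on "svccfg validate FILE" and "svccfg import FILE".

```shell
# Sketch: validate a manifest and import it only if validation passes.
# Usage: smf_install_manifest my.smf
smf_install_manifest() {
    manifest=$1
    [ -r "$manifest" ] || { echo "cannot read $manifest" >&2; return 1; }
    svccfg validate "$manifest" || return 1
    svccfg import "$manifest"
}
```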

 
accelerators/kb/smf2.txt · Last modified: 2008/07/14 22:05 by joshr
 