===== How to troubleshoot SMF =====

This document assumes you already know what SMF is. It covers common day-to-day operations.

==== Before you begin ====

SMF can take snapshots and do rollbacks, but that's advanced stuff. For now, it's best to just dump a copy of the SMF database. It's good for reference, just in case.

  svccfg archive > all_smf_export

There are 3 major tools for working with SMF:

  - svcs - Shows running (and stopped) services, and also reports on errors.
  - svcadm - Starts and stops services. Each service can specify which users are allowed to start/stop it. If you configure it right, you don't need to be root to restart your web stuff, even if your web server requires root to run.
  - svccfg - Imports and exports service configurations.

==== Getting service information (All services) ====

For all services:

  - To see all enabled services (normally these are the running ones)

  svcs

  - To see all services, including those not running

  svcs -a

  - To check for services having problems ("in maintenance mode")

  svcs -x

==== Getting service information (A specific service) ====

Every service has a name (called an FMRI). In the examples below, "NAME" stands for your service name. It can be abbreviated. For example, if you have the following service:

  svc:/network/smtp:sendmail

Then the following are valid abbreviations:

  sendmail
  :sendmail
  smtp
  smtp:sendmail
  network/smtp

And the following are invalid abbreviations:

  mail
  network
  network/smt

You can get details about any one service:

  svcs -l NAME

It looks like this:

  fmri         svc:/network/postfix:default
  name         Postfix SMTP Server
  enabled      true
  state        online
  next_state   none
  state_time   Fri May 09 19:29:30 2008
  logfile      /var/svc/log/network-postfix:default.log
  restarter    svc:/system/svc/restarter:default
  contract_id  4980239
  dependency   require_all/none file:///opt/local/etc/postfix/main.cf (online)
  dependency   require_all/error svc:/network/loopback:default (online)
  dependency   require_all/error svc:/network/physical:default (online)
  dependency   require_all/none svc:/system/filesystem/local (online)

To see what processes a service is running:

  svcs -p NAME

==== Controlling existing services ====

  svcadm enable -t NAME     # Start service (temporary; reverts at the next reboot)
  svcadm disable -t NAME    # Stop service (temporary; reverts at the next reboot)
  svcadm enable NAME        # Start service (persists over reboots)
  svcadm disable NAME       # Stop service (persists over reboots)
  svcadm restart NAME       # Restart service (runs the stop method, then the start method)
  svcadm refresh NAME       # Re-read configuration; runs the refresh method if the manifest defines one (often a "kill -HUP")

===== If a service is in maintenance =====

  - Services have dependencies, so a service may be down because a dependency is down.
  - When you run "svcs -x" and it reports something, that means SMF is having a problem starting your service. The output looks like this:

  svc:/network/security/kadmin:default (Kerberos administration daemon)
   State: offline since Tue May 06 07:10:47 2008
  Reason: Start method is running.
     See: http://sun.com/msg/SMF-8000-C4
     See: kadmind(1M)
     See: /var/svc/log/network-security-kadmin:default.log
  Impact: This service is not running.

  - To get a little extra information (not always useful)

  svcs -v -x

  - First, take a look at the SMF log file (under /var/svc/log/NAME). For example, in the above output, we would run "tail /var/svc/log/network-security-kadmin:default.log"
  - Sometimes, this log will help you out by giving you an error message. Other times, you will see "Method 'start' exited with status 1" with no explanation. In that case, you need to hunt down the application's own log and tail that (e.g. /var/log/postfix.log, or your application's mongrel.log, etc.).
  - If the SMF manifest is set up correctly, editing the application config file will automatically cause SMF to "try again" like magic. (Not all of them are set up correctly.)
  - To manually clear a service from maintenance

  svcadm clear NAME

The service should then go offline and, if all is well, return to online. Alternatively, you could disable it, then enable it. (Running enable a second time doesn't seem to help.)

===== How to add permissions to an existing service =====

You don't always want to become root to restart your web app.

TODO: write some documentation here.

===== How to create your own manifests =====

Start with a similar service. Export its manifest, then start hacking.

  svccfg export postfix > my.smf

Before importing, you can test for XML problems. (I find that this is not really needed.)

  svccfg validate my.smf

Importing the service will overwrite an existing service (unless you changed the name). Normally, your service is started right away, but the manifest file can specify otherwise.

  svccfg import my.smf

After you get your SMF file working, you don't really need the file anymore, since you can export it if you need it again.
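
===== Finding a service's log file from its FMRI =====

The log files shown in the troubleshooting section follow a visible pattern: take the FMRI, drop the leading "svc:/", replace each "/" with "-", append ".log", and look under /var/svc/log. The sketch below turns that pattern into a small shell helper. The function name smf_log_path is made up for this page, and the pattern is inferred only from the two examples in this document (postfix and kadmin), so treat the logfile field of "svcs -l NAME" as the authoritative answer.

```shell
# smf_log_path: hypothetical helper, not part of SMF.
# Prints the expected log file path for a FULL FMRI, assuming the
# naming pattern seen in this document's examples:
#   svc:/network/postfix:default -> /var/svc/log/network-postfix:default.log
# Abbreviated names (e.g. "postfix") are NOT expanded.
smf_log_path() {
    printf '%s\n' "$1" | sed \
        -e 's|^svc:/||' \
        -e 's|/|-|g' \
        -e 's|^|/var/svc/log/|' \
        -e 's|$|.log|'
}
```

With the function loaded, the path for the kadmin example can be printed with "smf_log_path svc:/network/security/kadmin:default" and then passed to tail. It only does string manipulation, so it works even when the service is too broken for svcs to be helpful, but it does not check that the file exists.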