In the previous post we learned a bit about servers and sysadmins, and how the sysadmin’s job is usually fairly unexciting.
But then occasionally something will go wrong. A server application which was working perfectly well five minutes ago is now giving the users nothing but error messages. This is when the job becomes terrifying, because some users feel compelled to explain to the sysadmin how (s)he has personally failed them. The phone starts ringing off the hook, the email inbox fills up in a hurry, and people stop by the sysadmin’s desk to point out the painfully obvious fact that the system is down. Boredom is far preferable to this.
When things go off the rails, there are a number of things the sysadmin can do to diagnose the problem:
- She might look at a list of active processes running on the server to see if something important is missing. Sometimes a service will stop for some reason, and simply restarting the service will get things moving again. After everyone settles down a bit, the sysadmin can try to figure out why the damn thing stopped in the first place.
- A program called a packet-sniffer can help analyze the server’s network connections. It could be that something about the network (completely external to the server itself) has changed, and that this is causing connectivity problems. This is the sysadmin’s favorite explanation, because everything immediately becomes someone else’s problem, and it gives the sysadmin an excuse to go yell at the network pukes.
- Log files may be the most common diagnostic tool. If the server application experiences some kind of problem, it will hopefully write a useful message to a log file. Often (not always, but often) looking at the log file will reveal the problem, and hopefully there will be a straightforward solution that the sysadmin can apply promptly. Getting things back to normal in a hurry is certainly a priority in these situations, but it’s not always that easy. Sometimes it takes a while to diagnose a problem, and the solution may require unscheduled downtime.
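To make the log file idea a little more concrete, here is a minimal sketch in Python of the kind of thing a sysadmin might throw together to pull the worrying lines out of a log. Everything specific here is my own assumption for illustration: the function name `summarize_log`, the severity keywords, and the made-up sample lines. Real log formats vary a lot from one application to the next.

```python
import re
from collections import Counter

# Assumption: problem lines contain one of these severity keywords.
# Real applications use all sorts of formats, so adjust to taste.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARN|WARNING|CRITICAL|FATAL)\b")


def summarize_log(lines):
    """Return (counts per severity keyword, list of matching lines)."""
    counts = Counter()
    problems = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
            problems.append(line.rstrip("\n"))
    return counts, problems


if __name__ == "__main__":
    # Made-up sample lines; in practice you would pass an open file instead.
    sample = [
        "2024-01-01 12:00:00 INFO service started\n",
        "2024-01-01 12:05:13 ERROR could not connect to database\n",
        "2024-01-01 12:05:14 WARNING retrying in 30 seconds\n",
    ]
    counts, problems = summarize_log(sample)
    for line in problems:
        print(line)
    print(dict(counts))
```

Because a file object yields one line at a time, something like `with open(path) as f: summarize_log(f)` should work for a real log, where `path` is whatever log file your application happens to write.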
The adage about an ounce of prevention governs the work of an experienced sysadmin, who will expend no small amount of effort putting a lot of canaries in the coal mines. A big part of this job is avoiding common or recurring problems. Examples of this might include some of the following:
- Setting up a process that emails the sysadmin when a hard drive starts to run out of space.
- Reviewing log files every day. (This is deadly dull, but sometimes it identifies problems before they break things.)
- Keeping a detailed list of upgrades and configuration changes so you can put stuff back the way it was days or weeks later.
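The disk space canary is a good one to sketch out, since it is only a few lines of Python. This is a rough illustration and not how any particular shop does it: the 90% threshold, the `example.com` addresses, the function names, and the assumption that a mail server is listening on `localhost` are all mine.

```python
import shutil
import smtplib
from email.message import EmailMessage


def percent_used(total, used):
    """Percentage of the disk that is in use."""
    return 100.0 * used / total


def build_alert(path, total, used, threshold=90.0):
    """Return a warning email if usage is at or above threshold, else None."""
    pct = percent_used(total, used)
    if pct < threshold:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Disk space warning: {path} is {pct:.0f}% full"
    msg["From"] = "monitor@example.com"    # hypothetical sender address
    msg["To"] = "sysadmin@example.com"     # hypothetical recipient address
    msg.set_content(f"{path} has used {used} of {total} bytes.")
    return msg


def check_and_notify(path="/", threshold=90.0, smtp_host="localhost"):
    """Check one filesystem and email a warning if it is getting full."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    msg = build_alert(path, usage.total, usage.used, threshold)
    if msg is not None:
        # Assumption: an SMTP server accepts mail on smtp_host, default port.
        with smtplib.SMTP(smtp_host) as server:
            server.send_message(msg)
    return msg is not None
```

A scheduler such as cron could call `check_and_notify()` every hour or so. I split the decision (`build_alert`) from the sending so the threshold logic can be tested without a mail server handy.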
Anyone who has been doing this kind of work for a while has horror stories, and I have a few of my own. I’ll write those up as short posts from time to time.
A sysadmin is someone whose job is the care and feeding of one or more servers (sysadmins frequently look after several servers). This is a job which is mostly boring (day-to-day stuff) but which is occasionally terrifying (when something unexpectedly stops working). There’s often not a lot in between those two things. If you are writing a story with a character who does this kind of work, don’t portray them as someone whose job is constant excitement, because no one will believe that.
A sysadmin’s celebrity among the non-technology people in the organization is often inversely proportional to how well she does her job. If a sysadmin is effective at her work, people tend not to know who she is. If she is not good at her job, many people tend to know her name. This is a job where anonymity can be desirable.
Servers tend to run server applications (also known as services), which can be all kinds of things:
- A server might run an application commonly called a web server (Apache and Microsoft’s IIS are popular web servers). You typically interact with this service through your web browser.
- A server might be a file server. If you’ve ever used the “map a network drive” feature, then you’ve downloaded files from (or uploaded files to) a file server.
- If your organization runs a centralized accounting system, then that accounting system likely runs as a service on one or more servers. You might interact with this service through a desktop application or even a sophisticated web-based interface.
- Email is another good example of a high-profile server application. Although users interact with email through a single desktop application (like Outlook), email usually happens by way of a pair of services: one for sending and receiving email (to and from other email servers), and one for retrieving email messages from the server to the desktop application.
A single server might run several different services.
Ideally a server and its server applications run smoothly and don’t need a lot of help beyond ordinary maintenance. This is the relatively tedious part of the job:
- The server requires backups, which are frequently automated processes.
- If the sysadmin manages a complicated application running on a server, that application may have many users. As people come and go, there’s a lot of giving this new person access to that, taking away that person’s access because they left, etc.
- Upgrades often become available for the server and its applications. This is a never-ending chore. It’s not uncommon for the organization to schedule a maintenance period during which the services will not be available to users. This is called “downtime,” and it is unpopular with users. The sysadmins use this time to apply the updates. This is typically routine, but can be stressful, because it can be difficult to recover from a failed upgrade (this may require restoring data and/or the operating system from a backup). These maintenance periods are commonly at night, over a weekend, or even during a holiday, which is not a fun time for the sysadmin to have to work.
In the next post we’ll look at what happens when things don’t go smoothly.
For years I’ve had a ridiculous fantasy of being a fiction writer. It seems that the best-selling novel I want to have written isn’t going to write itself. I’m having trouble getting motivated, so maybe what I need is another distraction: a blog.
I thought that technology in writing might be an interesting theme. Nothing ruins a story for me faster than a character hacking the FBI network after tapping on a keyboard for ten seconds. It probably works for many readers/viewers, but some of us see it as lazy writing.
In my day job I write lots of web applications for a public university. Many of my assignments are to convert paper processes into online forms. My job also involves a fair bit of Linux server administration. Most of this goes on the open Internet and is subject to daily cyber-attacks from all over the world (my server logs once revealed malicious traffic from Antarctica).
So this blog has a couple of goals. One is to get me in the habit of writing. The other is to share some of what I’ve learned in a format that may be helpful to other prospective writers. I may also write about how technology can affect a writer. Here are some topics I have in mind:
- credible hacking
- port scanning
- realistic exploitable security vulnerabilities
- case studies of actual security breaches (like Target)
- a writer’s technology
- safe(r) Internet use (account security, security-related Firefox extensions, password managers)
- affordable and effective backups
- writing tools like Scrivener and WordPress (I know a fair bit about the latter and would like to learn more about the former)
- the day-to-day life of a web programmer
- server administration is not sexy
- the importance (and challenge) of making web sites accessible
- the horrors of working with vendors and ticketing systems
This blog may at times earn a PG-13 rating. I’ll mostly keep it clean, but there may be the occasional bit of salty language.
I’ll try to post every seven to fourteen days (historically I’ve really struggled with self-imposed routines like that), and I’ll try to keep individual posts fairly short (preferring to break up longer topics into multiple posts).