Personal backups are important

This post is a little different. It’s not about writing believable characters or situations. It’s about protecting your (the writer’s) computer files.

The hard drive in your computer is eventually going to fail. It’s inevitable, and it’ll likely happen with little or no warning. If you don’t have good backups, you’ll lose everything on your computer: your manuscript(s), research for your writing projects, pictures you’ve downloaded from your camera, banking information, all of it.

This happened recently to a relative. She didn’t have good backups, and her hard drive became unreadable. This was devastating to her, especially the loss of ten years’ worth of photographs. She later realized that she could recover many of the photos from an online photo service she’d been using, but she still lost a lot of files.

Backups can protect you against several other things, too:

  • accidental deletions
  • the theft of your computer
  • malware that encrypts your files and asks you to purchase the decryption key (ransomware)

When you set up backups, you need to figure out where you are going to store them. Storing the backups on the same hard drive that you are backing up only protects you from accidental deletions. An external hard drive (something that connects via a USB cable) can be a good choice here. A quick search on amazon.com shows 1 terabyte external hard drives for around $55.

I’m not much of a Windows user, but it looks like backing up Windows files to an external device is a straightforward affair via the Control Panel.

Backing up a Mac is also pretty easy with Time Machine (I did this on my relative’s new Mac, and it just took a couple of minutes).
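
If you're curious what a bare-bones backup looks like under the hood, here is a minimal Python sketch. It is not how Time Machine or the Windows tools work; it just copies a folder into a new timestamped folder on another drive. The function name and the example paths are my own placeholders.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    destination = backup_root / f"{source.name}-{stamp}"
    # copytree refuses to overwrite an existing folder, so an older
    # snapshot is never clobbered by a newer one.
    shutil.copytree(source, destination)
    return destination
```

You would call it with something like snapshot(Path.home() / "Documents", Path("/Volumes/Backup")), where both paths are assumptions about your own setup. A real backup tool does a lot more (it skips unchanged files and verifies the copies), so treat this as an illustration only.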

Backing up to an external hard drive is a great start, but it still doesn’t protect you from really terrible things like a fire, theft, or natural disasters (like a tornado). Even some malware may be able to delete or mangle your backup files.

You may also want to consider setting up off-site backups. This is something that backs up your files to some place that isn’t in your house. I’ve used CrashPlan for this [1]. It runs in the background on your computer (Windows, Mac OS X, even Linux) and backs up your files to the cloud. It also backs up every version of each file, so if your computer becomes infected by ransomware, you should be able to recover versions of your files prior to infection.

UPDATE (22 August 2017): CrashPlan is no longer offering personal or family plans.

Looks like current pricing for CrashPlan is around $5 or $10 per month per user. I have a family plan which is around $15 per month, and it lets me run backups on up to ten computers.

The files on your computer are probably even more important to you than you realize. They are worth protecting.

[1] I don’t own stock in CrashPlan or anything; I’m just a happy customer.

Insecure databases

People love storing information in databases, because databases make it easy to store, sort, and search large amounts of data. Sometimes those databases are not as secure as they should be.

Traditional databases are great for storing structured data, like a list of books. Books are sort of uniform, in that you describe books pretty well with a small set of identifiers (like title, ISBN, author, year of publication, publisher, etc.), and those identifiers don’t change a lot over time or from book to book. A spreadsheet will often suffice for this kind of thing.

Describing people is harder, because people are weird. Consider medical records. Women would need lots of columns men don’t need, and vice versa. A patient with diabetes would have lots of columns not relevant to a non-diabetic. Likewise for a cancer patient.

A relatively new class of database called NoSQL is good at storing records on people and other complicated subjects, because NoSQL databases can store (and sort and search) unstructured data. MongoDB is a popular open-source NoSQL database product.
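
Here is a rough sketch of the idea in plain Python (this is not MongoDB itself, and the patient records are invented for illustration). Each record carries only the fields that apply to that person, and a search simply skips the records that lack the field.

```python
# Invented example records: each one has a different set of fields.
patients = [
    {"name": "Alice", "age": 34, "pregnancies": 2},
    {"name": "Bob", "age": 51, "diabetes": {"type": 2, "insulin": False}},
    {"name": "Carol", "age": 67, "cancer": {"site": "lung", "stage": 1}},
]


def having(records, field):
    """Return the records that contain `field`, whatever else they hold."""
    return [record for record in records if field in record]


names = [record["name"] for record in having(patients, "diabetes")]
```

In a spreadsheet, every one of those fields would need its own column, and most cells would be empty.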

The idea is that a company installs MongoDB on their server, pours data into it, and writes a web application (or some other kind of interface) to access the data. Earlier versions of MongoDB had some poorly-chosen default settings which would make the database itself directly available over network connections. More recent versions of the software have better defaults, but the damage is done: lots of people installed MongoDB with the network-available default, and they never changed it.

So even if they wrote a web application with good access controls, the database itself might be open to the internet. If the database’s network port wasn’t firewalled, anyone could completely bypass the web application’s access restrictions by connecting directly to the database (and they could download as much data as they wanted).
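
Checking whether a port is reachable takes very little code, which is part of why this exposure is so easy to find. The sketch below is a minimal example using Python's standard socket module; the function name is mine, and it only tells you whether a TCP connection succeeds, not whether the database would let you in.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

MongoDB listens on port 27017 by default, so an administrator might run port_is_open("db.example.com", 27017) from outside the firewall (the hostname is a placeholder). Only try this against servers you are responsible for.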

It’s important to note that this problem is not specific to MongoDB. This could happen with any network-enabled database system. But because of some recent discoveries of internet-accessible MongoDB databases, they’re in the spotlight. The Office of Inadequate Security has reported on several high-profile examples of open MongoDB databases, including a voter registration database with 191 million records. A security researcher named Chris Vickery used Shodan to find these databases.

That bears repeating: an ordinary guy used a search engine to find a database with the voter registration data of 191 million Americans.

All too often people don’t take care of their data. The Office of Inadequate Security reports on data breaches large and small all the time. Sometimes it’s 191 million voter records over a network connection, and sometimes it’s patient records left on a sidewalk next to a trash can when a doctor’s office goes out of business. That site might be a good place to look for inspiration when you’re writing a character that needs to acquire data that wouldn’t (or shouldn’t) be widely available. Whether your character needs to do some port scanning or some dumpster diving, she might be able to get her hands on all kinds of data.

HTTPS is not infallible

When your browser connects to a web site whose address starts with https, you’re connecting to a “secure server.” It’s considered secure, because (at least some of) the traffic between your browser and the web server is encrypted.

This business has a formidable amount of jargon. Your browser connects via one of several types and versions of protocols, and it uses one of many possible ciphers. Newer protocols are more secure than older protocols, and ciphers with longer encryption keys are more secure than ciphers with shorter keys. If an attacker can exploit some protocol vulnerability, he may be able to capture enough information to decipher encrypted data.
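
To get a feel for why key length matters, here is a back-of-the-envelope sketch. It assumes the crudest possible attack (trying every key) and a guessing rate that I made up for illustration. Real attacks are usually cleverer than this, so treat it as intuition rather than a security estimate.

```python
def keyspace(bits: int) -> int:
    """Number of possible keys of the given length."""
    return 2 ** bits


def years_to_try_all(bits: int, guesses_per_second: float) -> float:
    """Worst-case brute-force time in years at an assumed guessing rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return keyspace(bits) / guesses_per_second / seconds_per_year
```

Every extra bit doubles the number of keys. At an assumed one billion guesses per second, a 40-bit key falls in about 18 minutes, while a 128-bit key would take on the order of 10^22 years.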

When your browser and a web server negotiate a connection, they try to pick the most secure combination of protocol and cipher that they can both understand. If an older browser connects to an up-to-date server, one of two things will happen:

  1. If the server has been configured to support older protocols, the server will use one of those older protocols in order to be able to talk to the browser. This is the less secure choice.
  2. If the server has been configured not to support older protocols, the browser won’t be able to connect at all (the user will get an error message in their browser). This choice is more secure, but it causes problems for users with older browsers.
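
The two outcomes above can be sketched as a toy model. This is not the real handshake (which also settles on a cipher and has many more moving parts); it only picks the newest protocol that both sides list. The function and variable names are mine.

```python
# Protocol names ordered oldest (weakest) to newest (strongest).
PROTOCOL_ORDER = ["SSLv3", "TLSv1.0", "TLSv1.1", "TLSv1.2", "TLSv1.3"]


def negotiate(browser_supports, server_supports):
    """Return the newest protocol both sides support, or None."""
    shared = set(browser_supports) & set(server_supports)
    if not shared:
        return None  # outcome 2: the browser cannot connect at all
    return max(shared, key=PROTOCOL_ORDER.index)


old_browser = ["SSLv3", "TLSv1.0"]
permissive_server = PROTOCOL_ORDER              # still accepts old protocols
strict_server = ["TLSv1.2", "TLSv1.3"]          # old protocols disabled
```

With these lists, negotiate(old_browser, permissive_server) settles on "TLSv1.0" (outcome 1), and negotiate(old_browser, strict_server) returns None (outcome 2).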

SSLLabs has a nifty web page that lets you test a web server. Type the address of your online banking site into the SSLLabs server test page and see how your bank’s site looks. My bank’s site got an F, because it supports older protocols and weak ciphers. The knuckle-dragging server pukes that work for my bank had to choose between requiring strong encryption and getting complaints from customers (and they clearly made the wrong choice).

(SSLLabs also has a page which lets you test your browser, and the browser I use for banking is not vulnerable to the things the page tests. That assuages some of the misgivings I have about using my bank’s web site.)

In the past year or so, several protocol vulnerabilities have been revealed (and corrected). These flaws often have catchy names, like Heartbleed, FREAK, and Logjam.

This class of vulnerability is typically exploited by a man-in-the-middle (MITM) attack (see footnote). Imagine that Alice and Bob communicate with each other using written messages which they encrypt using some method that they both know how to decrypt. Alice writes a message, encrypts it, and then gives the encrypted message to a courier named Eve who takes it to Bob (these names are traditional: the courier is named Eve, because she likes to eavesdrop).

If Eve learns how to decrypt the messages, then she (the “man” in this MITM attack) can read what Alice and Bob are saying to each other. Eve could even alter the messages she delivers. In a real example, Alice would be your browser, Bob would be the web server, and Eve is someone who is somehow able to capture the traffic between the two (like someone who has tapped into the network at Alice’s ISP).

A couple of those vulnerabilities (FREAK and Logjam) allow the attacker to force the server and browser to use an older protocol and/or a cipher with a shorter key length than the browser and server might otherwise elect to use. Eve then has an easier time decrypting the traffic that she’s able to capture.

That’s easier, not easy. The traffic is still encrypted, and it takes time and computing resources to break the encryption. There are a couple of things to take away from all of this:

  1. It’s really important to keep your browser up-to-date so that it has the most modern set of protocols and ciphers.
  2. If you’re writing about a character who wants to eavesdrop on a target’s encrypted traffic, the attacker probably has to overcome two formidable obstacles: compromising the target’s network connection and finding the computing resources to break the encrypted traffic. It might be more believable to have your character try to get the target to fall for a phishing attack that installs a keylogger.

Footnote: Heartbleed is the exception here. It potentially gave the attacker the ability to read the contents of a web server’s memory (which might include the private keys that would decrypt the server’s connections).

Default passwords

A network router is a device which forwards traffic between two networks. Your computer is on one segment of the internet, and your favorite web site is (likely) on a different segment. There’s at least one network router between you and your favorite web site moving the data packets back and forth.

Routers will typically more-or-less work right out of the box, but they generally need some configuration to do their jobs well (and securely). Routers frequently offer a web interface for this: you connect a computer to the router, go to a particular web address (specified by the product’s documentation), and then configure the device for its particular purpose. For example, if you’re setting up a router for an elementary school, you might configure the router to send all web traffic through some kind of content filter.

More and more devices are like this: you buy a shiny new gizmo, connect it to your network, and it offers some feature you can control with an app on your phone. This is the “Internet of Things” (IoT).

Network-enabled security cameras are another interesting example of this kind of thing. Imagine being able to log on to a camera hundreds of miles away, have it take pictures on demand, and view the images.

These devices typically ship with a default password. And that’s the big problem with these things: they don’t necessarily force you to change the password, and those default passwords are well documented and widely available: they’re in the product documentation that the manufacturer probably puts on their web site for anyone to download.

(Sometimes the manufacturer will try to assign a unique default password to every unit they sell. This is great when they do it right, but sometimes they fail hilariously.)

Shodan and Censys are projects which portscan the internet and make the data available to anyone who wants to look at it. This data often reveals the manufacturer and model number of internet routers. Netgear devices often give the full model number in the remote administration password prompt. And there are web sites (like routerpasswords.com) devoted to making it easy to look up the default password for a particular network device model.

There are two important points to remember here:

  1. If you are writing about a character who wants to compromise a network target, and if she can determine the manufacturer and/or model number of the router protecting her target (either through Shodan or by portscanning it herself), she can look up the default password either through something like routerpasswords.com or by downloading product documentation from the manufacturer. If the network pukes at the target haven’t secured their router, your character could add routing table rules allowing her direct access to resources on the internal network.
  2. If you haven’t changed the password on the home router that may be sitting on your desk, now would be a good time to do so. (And unless you REALLY need it, you should disable the remote administration feature which was probably enabled by default.)
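
The lookup itself is trivial, which is the whole problem. Here is a minimal sketch of the defensive version: checking whether a device you own is still on its factory password. The table is hypothetical; the model names and passwords are placeholders I invented, not real vendor defaults.

```python
# Hypothetical lookup table: these models and passwords are placeholders,
# not real vendor defaults.
DEFAULT_PASSWORDS = {
    "ExampleRouter-100": "admin",
    "ExampleCam-2": "1234",
}


def still_on_default(model: str, current_password: str) -> bool:
    """True if the device's password matches its documented default."""
    default = DEFAULT_PASSWORDS.get(model)
    return default is not None and current_password == default
```

If still_on_default("ExampleRouter-100", "admin") comes back True for something on your desk, change that password before anyone else looks it up.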

Target, Home Depot, Ashley Madison, and third-party vendors

If you are interested in writing about large-scale data and credit card theft, you could look to the Target, Home Depot, and Ashley Madison data breaches for inspiration. Much of what we know about these breaches comes from reporter Brian Krebs. His blog is fascinating, and I recommend it very highly. This post will refer heavily to his reporting.

(This post will refer to Target the retailer and targets of crime. Mind the capitalization to tell the difference.)

The retailer Target was the victim of a large data breach during the 2013 holiday shopping season. Criminals stole credit card information of 40 million customers and personal information (names, email and mailing addresses, phone numbers) of 70 million customers. The numbers here are so large that the thieves had trouble selling all the stolen credit card numbers before banks were able to cancel the credit cards, and some banks had trouble re-issuing cards, because the people who turn plastic into credit cards had a huge backlog of orders. (Target recently agreed to a $39.4 million settlement with banks and credit unions as a result of this breach.)

The picture that Krebs’ reporting paints about the Target breach is that it involved an external HVAC company that worked for Target. Someone at the HVAC company fell for a phishing attack, which probably installed a keylogger or some other malware on that person’s (the HVAC company employee’s) PC, and this enabled the criminals to acquire login information to servers that Target’s vendors use to interact with Target (for work orders, billing, etc.). The criminals were able to use this access to install malware on the point-of-sale (POS) devices at Target stores. (Yes, there are probably several steps missing there, which I don’t understand, either, but it’s not the point of this post.) The POS malware was able to upload credit card data to another compromised server on Target’s internal network, and then that internal server exfiltrated the stolen data (gigabytes of it) to external FTP servers all over the world. (See Krebs’ coverage of the Target data breach for more details.)

Much the same thing happened to Home Depot in 2014. Criminals installed malware on thousands of self-checkout lanes at nearly every Home Depot location. The criminals got away with 56 million credit card numbers and 53 million customer email addresses. As happened with Target, the Home Depot network was initially breached using login credentials stolen from a third-party vendor. (Again, Krebs has more details about the Home Depot data breach.)

Although it didn’t involve credit card theft, the Ashley Madison story is similar. Ashley Madison is a social networking site created with the specific intention of enabling illicit (e.g., extra-marital) affairs. Someone managed to download and publish the account information of many or all of the AM users. Little is publicly known about how that information was acquired, but the CEO of AM’s parent company implied that it was the work of a non-employee who had previously had access to the AM information resources.

The takeaway here is something that might be useful for writing any kind of story about corporate hacking and espionage. In all three of these examples, a confirmed or suspected method of infiltration involved a vendor hired by the target company. Even if the vendor isn’t complicit, the vendor may be a softer target with lower standards of security (or with more access than they really needed). Breaching the vendor may give the attacker a foothold into the larger target.