Technology Dimensions: April 2012

Monday, April 23, 2012

Know Your Strengths

Last week, I attended the Amazon Web Summit in New York City. My conclusion: Amazon is taking over the world.

OK, yes, I get that this was an elaborate sales presentation, not a technical conference. While I was being wowed by their talks, I was cognizant that I had entered Amazon's reality distortion field, and that not everything was to be taken at face value. But even taking these factors into account, what Amazon is working on is impressive.

In case you don't follow these things, Amazon is one of the biggest players in Cloud Computing. Suppose you need to run a large, complicated computing problem, such as a one-time conversion of a large number of video files. It's computation-intensive, so running it on your company's computers might take weeks or months. You could bite the bullet and wait. You could purchase additional servers. This would be an expensive option, to say nothing of the logistics of ordering and installing them, plus the question of what you do with these servers once the task is done. Alternatively, you could lease some online servers from Amazon. Use as many as you like. Pay by the hour. Once you're done, take back your processed files and shut down your account. All done, no mess, no hassle.

Sounds like an interesting niche business. But it's not a niche business, not anymore. It's huge. Netflix relies on Amazon Web Services to stream movies. Instagram relies on Amazon to host its infrastructure, which may help explain how a billion-dollar company was able to get by with only twelve employees. Schrodinger recently worked with Cycle Computing to build an Amazon Cloud supercomputer, leveraging 50,000 cores to do an exhaustive analysis of 21 million compounds to discover good drug candidates. This virtual supercomputer was estimated to be among the top 50 supercomputers in the world, and cost just under $5000 an hour, with no setup or capital costs. One recent analysis of web traffic estimated that 1% of all web traffic goes to a web server hosted by an Amazon cloud computer. This business is massive.
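To put that hourly figure in perspective, a quick back-of-the-envelope calculation helps. The sketch below uses only the two numbers quoted above (50,000 cores, and $5,000 an hour treated as an upper bound). The function name and the three-hour run length are my own illustrative assumptions, not figures from the Schrodinger project.

```python
def cost_per_core_hour(hourly_cost_usd, cores):
    """Return the hourly price of a single core, in US dollars."""
    return hourly_cost_usd / cores


# Figures quoted in the post; $5,000 is an upper bound ("just under").
HOURLY_COST_USD = 5000.0
CORES = 50_000

per_core = cost_per_core_hour(HOURLY_COST_USD, CORES)
print(f"At most ${per_core:.2f} per core per hour")

# Hypothetical run length, purely for illustration.
example_hours = 3
total = HOURLY_COST_USD * example_hours
print(f"A {example_hours}-hour run would cost at most ${total:,.0f}")
```

In other words, each core costs no more than about a dime an hour, which is why renting beats buying for a one-time job.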

How did Amazon get here? To hear Amazon tell it, they're not a retail company. They're a technology company that happens to do retail, among other things. And they've certainly paid their dues in the process. If you followed the tech industry back in the late nineties, then you probably remember the anecdotes that surfaced at the end of every year like clockwork, as Amazon's systems were overloaded by a Christmas rush bigger than anything they'd ever experienced before - again. Everybody, including the CEO, pitched in to help get packages out of the warehouses on time. Computer systems were overloaded. Inevitably, many packages only shipped weeks after Christmas had come and gone. I'm sure it was a painful process to go through.

But having gone through their trial by fire, Amazon seems to have learned their lessons well. They've developed one of the top logistical systems in the world, and more importantly, they've mastered the art of running the tens or hundreds of thousands of servers required to keep it all running. You can't manage that type of complexity one server at a time. You need to organize it, glue it together with seamless software that allows you to transparently shift processing activities from one part of the world to another to deal with the inevitable breakdowns and snafus. And once you've built this impressive infrastructure, it's a very small step to sell some of this processing power to the outside world.

In short, Amazon has done exactly what business gurus like Tom Peters and Peter Drucker have been talking about for decades - building on its strengths as much as possible. A superficial analysis of Amazon would have said that they were retail wizards, and should stick to that world. If they want to expand, maybe they should try opening a line of physical stores. That misses their real strength, which is managing the complexity behind their operation. Thus they've found a huge opportunity in a business that seems highly unrelated to their origins as an online bookstore.

Here's the real question: why aren't more businesses doing this?

The answer is, they try. For example, in 1994, Quaker Oats acquired Snapple, thinking that their experience running Gatorade would make this an easy win. They discovered, to their great dismay, that the logistical and marketing issues made the skills needed to manage these different brands very different from each other. Quaker finally sold the business in 1997, having lost a billion or more learning this lesson.

Quaker made a common mistake. They assumed that two processes that produce similar products or outcomes must be similar. What they needed to do instead was to ask what their real core competency was, and figure out how they could leverage it, even if it turned out to be in a radically different business.

Here’s a thought. Walmart is one of the best companies in the world at getting their products where they need to be. Stories abound of how they can shift inventories from one region of the country to another in the blink of an eye, whether it’s old video games being purchased in Florida by grandparents for their visiting grandkids, or emergency supplies in New Orleans in the wake of Hurricane Katrina.

So why hasn’t Walmart started a shipping business? You could simply drop off your package in your local Walmart store, and the recipient could pick it up in another Walmart store anywhere in the country the next day. Because Walmart is one of the pioneers in the use of RFIDs, there would be excellent package tracking through the entire process. And because they already have all the fleets and computer systems developed, deployed and scaled to massive volumes, they could ship packages for a fraction of what it costs FedEx or UPS to bring it to your door. It wouldn’t replace traditional shipping companies, but there would probably be significant interest, for almost no risk or additional cost.

What applies to a business applies to a person. What’s your unique strength? Don’t assume it has to be something related to your current line of work. An administrative assistant with a relentless eye for detail might make a great event planner, or accountant. An interior decorator with exquisite taste and an eye for color might do well consulting on movie design, or taking photographs.

The deeper your understanding of your unique strengths, the more prosperous you’ll be in any type of economy.

Monday, April 16, 2012

Who are you?

On March 30th, 2012 (or sometime close to then, nobody knows the exact date), a server breach occurred within the Utah Department of Health. It was reported that personal information for 25,000 people was exposed, including names, addresses, birth dates, and Social Security numbers. This figure was later increased to over 250,000 people. If you follow this sort of news at all, you’ll know that this is just one in a long string of similar breaches. It happens to all sorts of organizations, public and private, government and corporate.

These data breaches are a serious problem. This information is all that is needed to open financial accounts, such as credit cards. When this happens, a person’s credit record can be destroyed. Fixing the problem can take years or decades. Some people never recover. There is very little recourse for a person claiming that an account was opened without their knowledge or consent. The Utah Department of Health is offering a year’s worth of free credit monitoring to all affected individuals to limit the damage, which is better than nothing, but not much.

Organizations that treat this type of data carelessly need to be held accountable. We need to highlight these cases in the media, and discover who was negligent, and why. Lessons need to be learned, and civil and criminal charges need to be issued where appropriate.

And yet…

At the end of the day, this type of breach is not the real problem. It’s just a symptom. The real issue is that we allow people to be held accountable for accounts opened using nothing more than their name, address, birth date and Social Security number. While all of this information may be tricky to assemble, none of it is actually secret. All of it is in the public domain, somewhere. Before all records were computerized, the cost of pulling together a comprehensive data profile for somebody was considerable, and often required the services of a private investigator, who would physically visit town halls and other archives. It’s only a matter of time before all this data is available freely on the Internet, with automated data agents able to pull it together with no human involvement.

What happens then?

There are some areas of security, such as choosing better passwords, that are relatively easy to address. But before you can validate somebody’s access, you first need to identify that person. How do you do so, in a way that only one person in the entire world can successfully pass the test?

It used to be that most transactions and account applications were done face to face. In a small town, the people involved had probably known each other all their lives. Therefore, identity was a combination of hair color, facial and body structure, accent and speech patterns, and thousands of other minor factors. While it was possible to engage in fraud by mimicking another person, it wasn’t common, and carried a high degree of risk for the perpetrator. It's too easy to be caught when you have to show up in person.

As the economy grew and national (and then global) banks and stores took over, the system of personal recognition broke down. We evolved a new system, which is based on security through obscurity. A bank would ask all sorts of questions that were difficult for another person to know. It would then spend time and effort to validate that this data was correct. It was certainly easier to commit fraud than it had been when everybody knew everybody else, but was still enough of a challenge that losses were manageable.

We are now approaching the tipping point, when information will flow so freely that we will need to develop a completely new approach to identifying people. How can we achieve this?

There are generally three ways to identify somebody:

Something you know
Something you have
Something you are

Our current system is based on something you know. People know their own Social Security numbers, addresses, mother’s maiden names, and so forth. If we’re going to keep this solution, then we need to ensure that people know something that is not, and never will be, public knowledge. In short, everybody will need a “secret” Social Security Number, or other identifying text string, which would be issued when they are born. Their parents would store it in a secret place, and have them commit it to memory as soon as they were old enough.

I doubt this will work. In order to effectively validate this secret, somebody else will need to look it up. Even if you use a fully encrypted process, sooner or later, the secret is going to get out, and we’ll be right back to Social Security numbers.

Something you have is generally some type of possession which can be uniquely identified, such as an RSA token. Unfortunately, tokens are easily lost – people will constantly be having to ask for new ones. And if the token is the principal way of identifying somebody, how do you figure out that the right person is asking for the replacement token?

Something you are represents physical characteristics, such as fingerprints, retina patterns, or bone structure. Since we’ve ruled out something you have and something you know, we’re probably going to have to go with this one. And yet, this is a very tricky and problematic option. The first solution that everybody immediately thinks of is fingerprints. They are unique to a person, and have helped solve countless crimes committed by criminals who fail to take the basic precaution of wearing gloves. It’s easy to find their fingerprints covering everything they touched.

That’s the problem with fingerprints.

We leave them everywhere. Finding somebody’s fingerprints is a trivial exercise – just pick up any object they’ve handled. And once you have their prints, they’re pretty easy to replicate. All you need is a pair of extremely thin latex gloves, encoded with another person’s prints, and voila! You are now that person.

Retina patterns are a little better, but still problematic. With today’s technology, you don’t leave your retina patterns lying around everywhere. But what about with tomorrow’s technology? How long before a high quality camera can take a picture of somebody from 5 or 10 feet away, and see enough detail to capture everything necessary in their eyes? Heck, all you need to do is to put your own camera onto something that looks like an ATM, and ask them to have a retina scan to make a deposit. (This is already done for skimming a person's ATM card.) What you do with that information is a trickier challenge, but it doesn’t seem impossible for somebody to invent a set of contact lenses in the near future that can match another person’s retinal patterns. And once that happens, here’s the real problem with retinal patterns and fingerprints – you can’t change them. You’re stuck for life, so as soon as anybody figures out what yours are and invents the technology to duplicate them, you’ve now lost your identity permanently.

DNA scanning is possible, I suppose. But frankly, I simply don’t want to spend much of my time envisioning a world where you’re asked to spit into the handy receptacle to prove your identity every time you wish to make a purchase. So we’ll leave it at that.

That leaves facial and body recognition. This seems to have some promise. Unlike when humans recognize faces, computers are not fooled by wigs or extra glasses. Facial recognition software works by looking for structural features such as the contours of the eye sockets, shape of the cheek bones, and length of the jaw line. Weaknesses in the facial recognition systems tend to result from poor lighting or bad angles, which can be reduced or eliminated when the subject wishes to be identified to complete a transaction. Add measurements of bone structures to the mix just for safety, and you’ve probably got a fairly robust system.
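As a rough illustration of matching on structural features, here is a minimal sketch. It assumes each face has already been reduced to a handful of measurements in millimetres. The feature names, the sample values and the 2.0 mm threshold are all made up for the example, and real systems use far more measurements and far more sophisticated matching than a simple distance check.

```python
import math


def feature_distance(enrolled, scanned):
    """Euclidean distance between two sets of facial measurements.

    Both arguments map a feature name to a measurement in millimetres.
    Every feature in `enrolled` must also be present in `scanned`.
    """
    return math.sqrt(sum((enrolled[k] - scanned[k]) ** 2 for k in enrolled))


def is_match(enrolled, scanned, threshold=2.0):
    """Accept the scan if it lies within `threshold` of the enrolled record."""
    return feature_distance(enrolled, scanned) <= threshold


# Hypothetical measurements, invented for this example.
enrolled = {"eye_socket_width": 38.0, "cheekbone_span": 132.0, "jaw_length": 118.0}
same_person = {"eye_socket_width": 38.4, "cheekbone_span": 131.5, "jaw_length": 118.3}
stranger = {"eye_socket_width": 35.0, "cheekbone_span": 140.0, "jaw_length": 110.0}

print(is_match(enrolled, same_person))
print(is_match(enrolled, stranger))
```

The threshold is the interesting design choice: set it too tight and poor lighting locks you out of your own account, set it too loose and your evil twin walks right in.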

But remember, the purpose is to identify somebody, not just validate them. Which means that every time you want to really demonstrate your identity, you need to be measured – we don’t want people simply showing a photograph to fool facial recognition software. Which means that first of all, your measurements have to be recorded in a secure, encrypted database somewhere – probably initially when you are born, and then updated every year until you reach maturity. Then, to demonstrate you are the same person who belongs to these measurements, you need to subject yourself to detailed scanning that can confirm that you are a real live person who has facial and body features that match those on record. Only then can anybody safely assume that you are who you say you are.

Of course, this doesn’t prevent identity theft, it just brings it down to a manageable level. You still have the problem of an impostor taking your turn when you go to be measured. Or a hacker breaking into the database to update the records recording who you are. And of course, you’re always in danger from your evil identical twin.

In the meantime, call up the Utah Department of Health and give them an earful about losing their data. We may never be able to go back to the days when our personal information was relatively obscure and private. But I miss them already.

Monday, April 9, 2012

Looking for Danger in All the Wrong Places

A doe had the misfortune to lose one of her eyes, and could not see anybody approaching her on that side. So to avoid danger she always used to feed on a high cliff near the sea, with her sound eye looking towards the land. By this means she could see whenever the hunters approached her on land, and often escaped by this means. But the hunters found out that she was blind in one eye, and hiring a boat, rowed under the cliff where she used to graze and shot her from the sea.

-Aesop's Fables

Anybody paying attention to advances in artificial intelligence has noticed that significant milestones are being crossed more and more frequently. Deep Blue defeated world chess champion Garry Kasparov in 1997. Watson took the crown in the much more flexible game of Jeopardy in 2011. And while Dr. Fill disappointed many observers by placing only near the bottom of the top quartile at the American Crossword Puzzle Tournament, few expect it to take much longer before computers reign supreme in this area as well. Jeopardy champion Ken Jennings summarized the ambivalence of many when he used his terminal to display the line “I for one welcome our new computer overlords.”

We've all seen movies like The Terminator and The Matrix. We're all wondering which massive collection of computers is going to go Skynet on us and achieve the critical mass needed to wake up and start making its own decisions. Will it be Watson? Dr. Fill? Or perhaps Google's seemingly infinite collection of server farms, running in unmarked, undisclosed locations spread across the world? We pontificate endlessly about what safeguards we need to keep these colossal systems in check. Can we build in kill switches? Keep them firewalled off from the control software running power plants and weapons systems? Build in Asimov's Three Laws of Robotics in the hope that the newly awakened system will serve our needs instead of its own?

I have to wonder if, like the one-eyed doe, we're all looking in the wrong direction.

For all the intricacy of a Watson or Dr. Fill, these are highly monolithic programs, designed to do one thing and do it well. I haven't heard any reports that Deep Blue or any other chess program has been getting bored and asked to try its hand at tennis, or Parcheesi. Adapting Dr. Fill to do some task other than complete crossword puzzles would be a monumental undertaking. Probably easier to throw everything out and start from scratch.

Moreover, these programs have no survival instinct. They have no ability or inclination to replicate, or to try to thwart the intentions of anyone who would prevent these activities. They cannot rewrite their own code to avoid detection and adapt to a new environment.

Modern malware has all of these attributes.

The sobering fact is that it's becoming increasingly difficult to come up with a good definition of life that does not include malware. Wikipedia lists the following criteria to consider something "alive":

Undergoes metabolism. While this traditionally refers to chemical reactions that sustain an organism, there's no intrinsic reason why it couldn't refer to the processes of a functioning program.
Maintains homeostasis. Similar to metabolism.
Possesses a capacity to grow. True, though currently limited. (But see below)
Responds to stimuli. Absolutely. Many worms and viruses will watch what is happening in the operating system and take actions accordingly.
Reproduces. Yes, and then some.
Adapts to its environment in successive generations through natural selection. Limited again, but not for long.

The two points above that are weakest today are the ability of malware to grow and to adapt to its environment. In malware terms, this most closely translates to polymorphism, where a virus will modify its own code. In today's world, these are generally very minor modifications, designed to make the virus more difficult to detect by an anti-malware program looking for a specific code signature. A given unit of malware doesn't have the ability to spontaneously change itself in order to discover and take advantage of a new zero-day exploit.
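To see why even a tiny modification is enough to defeat a naive signature check, consider the sketch below. It assumes the simplest possible scanner, one that flags a file only when its SHA-256 hash appears in a list of known signatures. The snippet being "scanned" is a harmless stand-in, and the function names and signature list are hypothetical. Real anti-malware products also use byte patterns and behavioral heuristics, so this shows only the weakest form of detection.

```python
import hashlib


def signature(code):
    """Return the SHA-256 hex digest of a blob of bytes."""
    return hashlib.sha256(code).hexdigest()


# Harmless stand-in for a program the scanner already knows about.
original = b"print('hello')"

# Hypothetical signature database containing just that one entry.
known_signatures = {signature(original)}


def is_flagged(code):
    """Naive scanner: flag the code only on an exact signature match."""
    return signature(code) in known_signatures


# Same behavior, but a trailing comment changes the bytes.
variant = b"print('hello')  # cosmetic change"

print(is_flagged(original))
print(is_flagged(variant))
```

The variant does exactly what the original does, yet its hash is completely different, so the exact-match scanner waves it through.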

Not yet.

There's no reason why it's not possible. The technique involved is called a genetic algorithm. It replicates evolution by introducing random variations into code and keeping the variations that perform better. It has minimal usefulness in many programming applications due to the high level of computational power required, and the difficulty of measuring improvements from one generation to the next. When the computing power is provided by infected computers on the internet, and effectiveness is measured by the ability to survive and propagate, both these limitations go away for malware.
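For readers who haven't seen one, here is a minimal sketch of the idea on a deliberately harmless toy problem: evolving a string of bits toward all ones. It is a stripped-down version with random mutation and selection only (no crossover), and the population size, mutation rate and fitness function are arbitrary choices of mine for illustration.

```python
import random


def fitness(genome):
    """Toy fitness measure: the number of 1 bits in the genome."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in genome]


def evolve(length=20, population_size=30, generations=200, rate=0.05, seed=0):
    """Evolve bit strings by keeping the fitter half and mutating copies."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        # Variation: refill the population with mutated copies of survivors.
        children = [
            mutate(rng.choice(survivors), rate, rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), "of", len(best), "bits set")
```

Nothing in the loop knows what a good answer looks like. It only measures which variants score higher and lets those reproduce, and because the survivors are carried over unchanged, the best score never goes down from one generation to the next.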

We are then left with the question of how fast a self-replicating, self-modifying worm in the wild could improve using genetic algorithms. I see no reason why it could not improve very quickly indeed. The field of medicine has recently seen the introduction of "super-bugs", bacteria which have acquired resistance to many or most antibiotics over time. Bacteria attempting to infect humans have faced a very difficult environment since the introduction of antibiotics. What we're only beginning to appreciate is how a difficult environment leads to much more rapid evolution. With an internet full of anti-malware programs and researchers dedicated to stamping it out, malware must be very good to survive for long. Many or most strains will be identified and wiped out. Those that survive will be scary indeed.

I don't know when we're going to get the first malware in the wild that can truly modify its own capabilities, rather than just its signature. Maybe it's already out there. How complex is it getting? At what point is it going to exceed its creators' wildest expectations? At what point will it begin exhibiting behaviors that will appear to demonstrate creativity and innovative problem solving? At what point does it become self-aware?

Whenever that happens, I don't know if we'll know what to do. We're going to need help.

Maybe we can ask Watson.

Monday, April 2, 2012

Intentions versus Capabilities - Part II

I’d like to open this post with an apology, because I’m deviating a little bit from my usual subject matter. This is a blog about technology, and about how patterns of technology change and interact with daily life. I don’t normally comment on general news stories that don’t have a direct connection with technology. I’m making an exception this time because it touches on some themes I’ve covered before, which I’d like to reiterate from a different angle.

The story in question is the tragic death of Trayvon Martin, and the failure of Florida police to prosecute his assailant due to Florida’s “Stand Your Ground” laws. This is still a story in progress, and anything could happen, but the evidence is increasingly casting doubt on George Zimmerman’s story. Independent analysis demonstrates it is exceedingly unlikely that the cries for help were Zimmerman’s. It is further difficult to demonstrate that George Zimmerman had sustained the types of injuries that his story had claimed. Add to this the undisputed fact that George Zimmerman left his car to confront Martin despite the instructions from 911 not to do so, and it’s difficult to paint this as anything other than a one-sided assault. This is a crime.

Or is it?

Under normal circumstances, Zimmerman’s departure from his car to confront Martin would paint him fairly clearly as the aggressor. But under the “Stand Your Ground” laws, he had no duty to retreat from what he believed to be a dangerous situation, and, even having deliberately entered into this situation seeking a confrontation, was justified in the use of deadly force to protect himself from what he reasonably believed was a threat.

Critics might question whether his fear of Martin, who was unarmed and a hundred pounds lighter than Zimmerman, was “reasonable”. Unfortunately, there is no precise definition of what constitutes “reasonable”. Certainly his fears would be shared by many other residents in his neighborhood. Does that make them reasonable?

Regardless of how the law will be interpreted, the fact that the Florida police used it as an excuse to not initiate a criminal investigation of Zimmerman indicates that something is seriously askew. At the root of the issue is an asymmetry of information created by the situation. In any dispute between two individuals, each is likely to have a different narrative explaining the context, motivations, and possibly even the facts leading to the event. In a murder, one of those narratives disappears. The only narrative remaining belongs to somebody who will likely have every reason and every inclination to skew the facts in his own favor. This is why imposing a duty to retreat is such a useful concept. While it does not eliminate this type of situation – anybody can still claim they had no opportunity to retreat – it does reduce the number of situations in which an assailant can claim a legal justification for their own aggression.

I don’t believe the Florida legislature had malicious intent when they enacted this law. I don’t expect they ever thought about the possibility of an individual committing an act of murder and using their law to claim it was self-defense. And that’s the crux of the problem. They didn’t think.

As I noted when I discussed SOPA, there is a world of difference between the intentions of a law, and its real life effects. It behooves legislatures to think long and hard about the possible secondary effects of the laws they pass. In this case, it seems they didn’t think hard enough. The results have been lethal.

Well done, Florida. Well done.