How Steve Jobs May Have Snookered the TV Industry

| Editorial

It is widely believed that when Steve Jobs told his biographer, Walter Isaacson, that he had “finally cracked it,” he was referring to a new way to operate TVs. That’s been interpreted as Siri for an Apple HDTV. That’s driven the TV manufacturers into a frenzy. Maybe a blind alley.

Mr. Jobs’s comment (Isaacson, p 830) is taking on the stature of Fermat’s Last Theorem. That is, everyone believes Mr. Jobs (and Fermat) had the solution, but no one could figure out what it was.

The TV industry, without vision, research or a demonstrated understanding of first principles, seems to be jumping to the conclusion that voice input is the Holy Grail that Mr. Jobs was referring to. It has hastily surmised that, to get a jump on the boogeyman of Apple’s own HDTV, it should introduce voice input to its TVs. Or, in the style of Kinect, gestures. This is the conventional wisdom.

Human Ergonomics

Before I can jump on that bandwagon myself, I tend to think in terms of history, the human experience of watching television, and why people have problems with modern TV systems. There are several parts to the problem.

Selecting Input

By far, the number one problem people have using modern HDTV systems, from my reading and chatting with other people, is the selection of the input source. Many people have several devices plugged into an AV receiver or just the TV: a Blu-ray player, an Apple TV or Roku box, and likely a cable/satellite box, in some cases a DVR. In order to watch the desired show, you must understand which input to use, then select the right remote (if not using a Logitech Harmony) and then pick the right button to cycle through (or select) the inputs until you estimate, from the visual appearance, that you have the right input source. Worse, the button may be cryptically labelled and hard to find. This is the process that drives non-technical people crazy, especially those who did not participate in the setup of the system and don’t understand what’s going on.

Items of Interest

Steve Jobs came to understand that music customers aren’t interested in the Labels or even the albums they create. Music fans are interested in songs. Applying that understanding to the TV audience, it isn’t hard to understand that what people focus on is the show.

Studio   <=>   Label
Network  <=>   Album
Show     <=>   Song

What people hunt for when they’re in a TV watching mood isn’t the studio or the network. That might be a crutch, however. You know that Justified is on FX. FX is channel 248 on DIRECTV. So you back into that show by tuning to channel 248 at the appointed time. (Or set the DVR.)

Most people have come to the idea that, because it’s the show people are interested in, all they need to do is announce the show verbally, and the TV system will go find that show. But there are nuances. Which episode? The latest? Last week’s? The rerun of the season finale from last season? Distinguishing exactly what you want to watch leads to thoughtful articulation to a voice input system — a process that will still challenge both humans and computers.
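To make the nuance concrete, here is a minimal sketch in Python of what a TV system faces when it hears only a show title. Everything in it is an assumption for illustration: the `Airing` record, the `candidates` and `latest_new_episode` helpers, and the sample listings and dates are invented, not taken from any real guide service or product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Airing:
    """One scheduled airing of an episode (hypothetical record layout)."""
    show: str
    season: int
    episode: int
    air_date: date
    is_rerun: bool


def candidates(listings, spoken_title):
    """Return every airing whose show title matches the spoken title."""
    wanted = spoken_title.strip().lower()
    return [a for a in listings if a.show.lower() == wanted]


def latest_new_episode(matches):
    """One possible tie-break: the most recent first-run airing, or None."""
    first_runs = [a for a in matches if not a.is_rerun]
    return max(first_runs, key=lambda a: a.air_date, default=None)


# Invented sample listings, for illustration only.
LISTINGS = [
    Airing("Justified", 2, 13, date(2012, 1, 10), True),   # rerun of a finale
    Airing("Justified", 3, 1, date(2012, 1, 17), False),   # last week's
    Airing("Justified", 3, 2, date(2012, 1, 24), False),   # the latest
    Airing("Star Trek", 1, 1, date(2012, 1, 20), True),
]

matches = candidates(LISTINGS, "Justified")
pick = latest_new_episode(matches)
print(f"{len(matches)} airings match; guessing S{pick.season}E{pick.episode}")
```

The title alone matches several airings, so the system has to guess. "Latest first-run episode" is only one plausible rule; a viewer who wanted last week's episode or the finale rerun would have to say more, which is exactly the articulation burden described above.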

Articulation

That leads me to the final issue: thoughtful articulation. Just because you can apply a technology to a problem doesn’t mean it’s the best solution. Accurate, speaker-independent voice recognition is a fairly new technology, not thoroughly researched for human interactions. Yet, it’s being rushed out by TV makers.

Historically, TV operation has been “See and do.” In the ancient past, if you wanted to watch Star Trek, you’d remember that it’s on NBC and then tune to, say, channel 4. You can see that the dial is on channel 4. That concept has evolved over the years with cable, a plethora of channels and remotes, but it still depends on the idea of “see and do.”

Voice input requires one to “think and then articulate.” One must form the right thought, compatible with the abilities of the TV system, then enunciate the proper command. Or make the right gesture. My suspicion is that this can be tiresome and challenging, especially in a household with a lot of kids and yelling going on. Or background noise, like a vacuum cleaner. Or a sports bar.

At least, when you have physical possession of the remote, you’re in charge. I think this is just one of the human factors issues that needs to be addressed. It’s a major technological shift without the backing of extensive research. And we know how much money is spent on human factors research in this era of cost cutting and cut-throat TV competition.

The Way Forward

Solving the problem of selecting the right input is easy if you constrain the user to stay in the Apple HDTV realm. That is, if you can deliver everything the customer wants with no other inputs, then the input problem is solved. Lots of people are hungry to cut the cord. However, issues remain. Not many people will throw away their DVD collection — they’ll want to keep their player. The last time I checked, Comcast will reduce your Internet speed if you cut Basic Cable (“bundling”). So there still needs to be a way to select the desired input. Unless the Apple HDTV has zero additional HDMI inputs. Not likely.

Siri is designed for a device that’s small and has a small, virtual keyboard. Voice input makes sense there. However, when you have a giant 60-inch screen, there are a lot of things you can do that you can’t do on an iPhone. Not to mention that the environment is different. It may make more sense to just walk up to the TV, touch the “window” that’s showing the desired input, and then swipe until you get to the show and episode you want. Or do it on your iPad/iPhone — if you have one. Carefully constrained voice input may still be an option, not a requirement.

If Apple does an HDTV, I guarantee that it will be a thoughtful byproduct of all Apple technologies, an understanding of human nature and ergonomics, and be fun, not frustrating.

My point is that TV technology is fairly stupid while Macs and iOS devices are fairly smart. The challenge is to select the best technologies at hand that cover all the bases and make TV watching a delight. Jumping on the voice input or gesture technology alone, as a geeky replacement for the remote control, is like the ill-conceived rush into household 3D TVs. Just as 3D is now considered a feature but not a sea change in TV viewing, I also think voice input will be an ancillary feature, not the end-all, be-all solution that is being rushed out in a worrisome response to a cryptic comment by Steve Jobs before he passed away.

If Apple does an HDTV, it’s too important a change, and too big a challenge, to simply throw Siri at the problem and be done.

16 Comments

Alan

I think the current AppleTV shows the way - additional HDMI inputs will be in another menu stack (or a whole stack for each one).

If Apple works with set top box manufacturers they might be able to get basic functions (like choosing a channel) done via the CEC control channel in HDMI.

Similarly if they work with the cable companies they could get the correct channel / program listings, then they are at least a lot closer to the goal.

Typing in search queries on the current AppleTV using their remote is a poor experience. They could improve that by adding ‘clickwheel’ to the remote so you could scan through each letter but that’s really not substantially better, just a little better.

skipaq

SSSHHH! John, stop trying to corral a stampeding herd of appliance manufacturers. Let them run wild on every bit of spooky sound that emanates from One Infinite Loop. There is just too much to enjoy watching them try to beat Apple at something as they trample all over one another.

Next thing they are trying to figure out is what Steve really meant when he uttered “Wow” when departing this life. Rumor has it that he recorded one last epiphany on his iPhone for the creatives at Apple to develop.

Samsung execs are currently in a frenzy trying to figure out how they can demonstrate they have had “Wow” under development for years. In an attempt to head off another round of IP challenges, Google is buying up everything related to “Wow” beginning with WoW. Microsoft thought it was “Woe”. So they are comfortable knowing they already have the biggest share of that market.

So, I ask you to please stop trying to point in the right direction. Don’t take the fun away. ;)

John Martellaro

Skipaq: :D

mhikl

Succinct, John. I am bored with prognosticators who get any Apple future wrong. The raising of hopes on a GarageBand for E-books is the latest garbage.

Laying out the problems and posing possibilities or strategies, as you have done here, may not be the candy floss most people want to hear, but it certainly stimulates thought and introspection.

And should Apple figure it all out, and figure it out right, then it will be intuitive and seen to be “obvious” and Sumsung will be delighted.

Was reading how old keyboard layouts were carry overs from typewriters which reminded me how early cars had running boards and looked like carriages which should remind us that thinking outside the box is a particularly difficult skill to acquire. This article is a step in the right direction for honing this difficult skill.

BurmaYank

“Mr. Jobs’s comment… is taking on the stature of Fermat’s Last Theorem.”

@John: :)

@Skipaq:
snortingcoffeefrommynostrilsROTFLMFAO!!!
:)

1stplacemacuser

... Was reading how old keyboard layouts were carry overs from typewriters which reminded me how early cars had running boards and looked like carriages which should remind us that thinking outside the box is a particularly difficult skill to acquire. This article is a step in the right direction for honing this difficult skill.

At the same time, being too innovative won’t work either as people can’t grasp what the newfangled device is.  There is a reason why evolution progresses in steps.  There is no intelligent design to create a brand new better-than-ever product.  Such a new product may not fit within the existing environment or may overwhelm the environment.

Having something that’s incrementally better than existing will allow users to easily adapt and provides a roadmap for future improvements.

nytesky

Here’s my analysis. The biggest problem is that cable companies charge you a lot of money for bundles that include a lot of channels you don’t want. The way Apple can counter this is to offer network apps that you can subscribe to. You can pick and choose only the channels you want. A good start would be to offer ESPN, MTV and other ‘anchor’ channels.

serandip

Nice article.

I am one of those people who thought siri was the answer.  But I can see now that there could be problems (background noise etc).  I still think siri would be an interesting input method to change channels, select inputs, and content.

The ideas presented here are very thoughtful.  I like the idea of using an iPad/iPhone/iPod touch to control the TV.  This fits into my activation barrier concept, i.e. the barrier that I have to doing anything while I am watching TV.  It would be too much of an activation barrier to get up and actually touch the TV to select input etc.  But I can see myself using an iPad to do the same (minimal muscle movement, low activation barrier).

bitsinmotion

I don’t think you’ve thought through some of this particularly well.  Yes, the user should not have to select an input source, but that really isn’t a problem—certainly not in the case of the DVD. First of all, the DVD player is built into the television.  When you insert your disc, it gets read and recognized (perhaps even cloudified à la iTunes Match if licensing can be worked out, but not essential). Once the disc has been recognized, its content is presented as an option just like any other.

The cable source is of course a tougher nut to crack, but one that may not even need cracking. Or a deal with the cable companies might have been worked out to integrate the cable tuner into the TV and somehow preserve the unified interface for that content as well.

Henrik

Steve Jobs came to understand that music customers aren’t interested in the Labels or even the albums they create. Music fans are interested in songs.

hmmm….

iTunes, why didn’t Steve fix that piece of crap?

You always see the albums when it’s the name of the song and the artist I’m interested in.

wilf53

Or do it on your iPad/iPhone — if you have one.

Or the TV-set comes with a remote which is very much like a track pad… No buttons. Remember Jobs’ aversion to buttons:)

Lee Dronick

iTunes, why didn’t Steve fix that piece of crap?

You always see the albums when it’s the name of the song and the artist I’m interested in.

You can change how you view things, how you sort, in iTunes.

Lee Dronick

Or the TV-set comes with a remote which is very much like a track pad… No buttons. Remember Jobs’ aversion to buttons:)

That would be the rumored 7” iPad

CityGuide

@skipaq: that was xkcd-quality.

John: I agree it is all about the content. The navigation to and portal from which it comes should be irrelevant. My ideal screen menu would provide a browser-like smorgasbord of viewing choices. I’m ready for the post-television era.

SpeechGuy

I agree with a basic issue being multiple devices and multiple remotes. I disagree with your claim that speech recognition isn’t ready. Siri works in very noisy environments because it contains noise cancellation software (ever notice how a call from an iPhone doesn’t seem to transmit background noise?). Companies such as Sensory (as an example) specialize in software and chips that listen in noise and at a distance; Sensory is in many voice-activated consumer products. Cars are noisy, yet systems such as the Ford Sync have voice control at a distance. If for no other reason, use of hands-free voice control in autos will drive continuing advances in this field. In the TV space, it can be even easier—the software has access to the TV audio signal and can subtract what sounds the TV is making with commonly known technology.

zewazir

Voice-activated television will never fly.  It’s bad enough when 3 kids fight over the remote.  Imagine them all trying to out-scream each other for their choice….. “Vampire Diaries!!”  “Ben Ten!!”  “FOOTBALL!!”  VAMPIRE DIARIES!!!!!!!

BE QUIET!!

Now, if it would allow me to program it to only recognize MY voice, it might be worth exploring.
