Mihai Budiu’s Blog

Computer Concepts

Monday, December 28, 2009

An interview with Kurt Akeley

Interview done in February 2009

Kurt Akeley is currently a principal researcher at Microsoft Research in Silicon Valley. He has played a significant role in the history of computing several times; perhaps his greatest claim to fame is his work on the specification of the OpenGL graphics system, done while working at Silicon Graphics; OpenGL is still one of the two major graphics architectures still in wide use (the other one being Direct 3D). Kurt is endowed with a magic touch: he has worked on virtually all layers of the computer system stack, including hardware design, system software, CAD tools, display technology, programming languages, and many other. This interview was conducted in February 2009.

The contents of this interview is licensed under a Creative Commons with Attribution License.

IMG_5277 kurtid

Left: Kurt Akeley seated next to an optics table that he is using to test the design of a 3D display (October 2009, Microsoft).

Right: Kurt’s badge showing employee #10 at Silicon Graphics. Courtesy of Kurt Akeley.

Q: When did you start using computers?

A: I started using computers in my last year in high school. We had one of these teletype machines hooked up to a computer I believe that was at the University of Delaware (although I never saw it). I remember not knowing what an array was – something mysterious and wonderful. We saved our work on little yellow tapes.

Q: Your work spans all levels of the computer system architecture: from hardware design, tools for doing CAD, system software, compilers, graphics libraries. What insights do you carry from one domain to another? Do you find it easy to move between areas?

A: I did find it easy to move between areas. The general direction of my motion — when I was at Silicon Graphics — was "up". When I started at Silicon Graphics I expected that I would do silicon; that was part of the name, and that was part of the excitement — to design custom integrated circuits. In fact, I started with hardware and integrated circuit design, moved to library design, and then higher level designs (but not really application design).

GE Die Image Clark 1982 founders

Left: Die photo of the SGI geometry engine in 1982. Courtesy of Kurt Akeley.

Right: some of the founders of SGI. From the left: Marcia Allen, Mark Grossman, Dianle Wilford, Rocky Rhodes, Kurt Akeley, Marc Hannah, Tom Davis. Several of the founders are missing, including Jim Clark, Vern Anderson, Abbey Silverstone and David Brown. Courtesy of Kurt Akeley.

SGI started with Jim Clark implementing a ge­ometry engine — his name for the custom integrated circuit that he and his students at Stanford had designed. The company was built around that piece of technology. We would start with a highly complex integrated circuit, then we would build circuit boards, and then interfaces, and then a library and finally make people use it. We did it this way because it took very long to design the custom integrated circuits; the people doing the ICs had to get started long before the people doing the interfaces and the system design. This was an implementation problem that cried out for a solution. Our solution, in the mid ’80s, was to stop building custom ICs and to use gate-array technology. That greatly reduced the technology cycle for the components.

A lot of my contribution at SGI in the first 10 years was roughly inverting that order, so that we started from the problem we were trying to solve, followed by the interface to do that, drivers and hardware systems, and finally components needed to implement that system. There’s no right answer here: it’s not right to go bottom-up or top-down; there’s some happy in-between. It’s definitely wrong to go purely bottom-up the way we were: the systems we were building were not doing as well because we had designed the wrong components. This was true in the beginning at SGI and it was largely remedied. We went from being a nothing company to more or less dominating the space of 3D graphics for workstations in the late eighties and early nineties. It is more important to design the right system than to use the very best and fastest components and to design the wrong system.

Q: Would you use today something like FPGAs (Field-Programmable Gate Arrays) for such systems?

A: If by “such systems” you mean "graphics accelerators for computers," probably not. That market has matured considerably. The handful of companies that still exist have developed a considerable expertise in building ASICs. AMD, Intel and NVIDIA do highly specialized and highly engineered components. You could not compete today. But in some new marketplace, that isn’t yet mature, and where the architectures still have to be defined to allow it to mature and to grow, the answer would be "yes".

Q: Who was SGI competing with in the ’80s?

A: It varied over the decade. [The company] Evans and Sutherland pioneered computer graphics in devices for professional application (other than simulating an airplane). Jim Clark (the founder of SGI) worked there when he was in Utah. They built very high performance 3D graphics terminals. The "terminals" were not programmable, but instead hooked to a host computer that was programmed; the host downloaded data sets to the terminal to view them interactively. That was the state of the art for professional use. There were a few very domain-specific applications in the late seventies that used these terminals. SGI distinguished itself by fully integrating the computer graphics into the workstation, building the equivalent of today’s inexpensive personal computer. It was a programmable machine with an operating system, a what-you-see-is-what-you-get terminal, which could do high-performance 3D graphics. In the first 10 years SGI focused on building tightly integrated 3D graphics capabilities into a programmable computer. Apollo and a couple of other companies were also doing it, but SGI was clearly the most successful during that decade. At the beginning of the nineties we had a clear position to drive the market.

Q: So, what happened?

A: SGI went like a cannonball launched from the cannon at ground level. SGI decided early on to cater to professional customers and to help them solve their difficult problems. So we built a sales force with a very deep competence in the field. The business model was to charge high premiums for the hardware, and give away the consulting. The value of the expertise was part of the sales transaction. We sold for high margin but low volume. We were selling billions of dollars per year, but that is small stuff compared to personal computers. The cannonball followed its predictable path, and the customers abandoned us for less expensive technology, with a lower margin.

Q: It sounds like the innovator’s dilemma (Clayton Christensen’s book).

A: There are aspects of that. Such a company has difficulty in introducing products that compete with its higher margins. SGI built graphics for personal computers earlier than anyone else, but what it didn’t do — which is classic innovator’s dilemma — it didn’t ship clearly inferior, but less expensive products. We had defined a graphics architecture called OpenGL, we had defined a set of rigorous compatibility standards with OpenGL, and we weren’t willing to ship inexpensive products that weren’t compatible. Competitors like NVidia and ATI were, and did a good job. Now they are defining the extensions to the architecture that we created. In that sense, that is a classic innovator’s dilemma.

Q: Can you point some innovations that everyone’s using today and which originated at SGI?

A: The concept of stencil buffer from OpenGL (it was actually introduced in IRIS GL, our proprietary library / architecture). The stencil buffer allows you to perform some operations when a pixel is updated. It is not z-buffering (we didn’t invent that). The stencil buffer implemented a very simple state-machine, but provided a lot of useful capabilities in the end. One of its useful applications was for solving "the capping problem" in solid design.

Let me explain: SGI did polygonal 3D graphics. The things rotating on the screen appeared to be solid, but in fact they were modeled with polygons. If you were to look inside you would see nothing. You could look inside by passing a clipping plane, like those cut-away drawings that you see in magazines. Now, when you clipped a polygonal drawing it looks really ugly, because you see the rear sides of the back-facing polygons, and they would be displayed incorrectly or not at all. The designers wanted to clip objects and see them shaded with the correct color. This had been a long-standing problem, and we solved it using the stencil buffer. It’s a simple idea, in retrospect it is obvious, but people hadn’t done it before, and it endures to this day. In all modern graphics systems the final pixel operations in the frame buffer are still not generally programmable; they are specified as a set of modes, and one of the modes is a stencil buffer. The stencil buffer mechanism used today is virtually identical to the one we created.

Q: The SGI system was much less programmable: you had in hardware all the features that now are done in software.

A: In fact, they were all software. The question was "who got to write the programs?" Did an SGI box from 1988 have programmable vertex shaders? Yes, it absolutely did. The geometry engine, that I implemented, used an off-the-shelf floating-point hardware microcode engine; we wrote the microcode. The microcoded engine looked to the external world as a hardwired mechanism: the applications could not change the vertex shaders. The applications could control the shading just through some mode bits. That was true at the vertex level for all Silicon Graphics machines (and — I think – for all competitors of Silicon Graphics). At the pixel fragment level and rasterization level some of these machines were programmable, and some were not. In fact, SGI built machines that were programmable at the pixel level too, all the way to the frame buffer. That level of programmability still hasn’t been revealed in the world of OpenGL and Direct3D.

Q: One of the most enduring things to come out of SGI was OpenGL. You were one of the main architects of OpenGL. How was Open GL born?

A: The career path that I described previously, of inverting the order of system layers, gave rise to my interest in graphics interfaces. I realized that we needed to design the interface first, and then figure out the implementation (e.g., what integrated circuits to use).

There was a Silicon Graphics standard called IRIS GL. Around 1987 I specified a lot of the additions to IRIS GL that enabled capabilities of the new graphics system we were developing. In the next few years a lot of competition in this area emerged. In particular, the PHIGS evolved from a relatively uninteresting retained-mode scene graphics library to becoming a 3D extension to the X-Windows system called PEX (PHIGS for X), with an active standardization process.

A lot of the value SGI had delivered to customers was in specifying the correct interface. I had put a lot of energy into this for 10 years. It was a frightening thought to lose control, but, more fundamentally, to have a bad interface as the industry standard. We had to choose between supporting IRIS GL and improving the X-based interface, knowing that unfortunately it was predicated on PHIGS, which we thought was quite bad.

Q: Who supported PHIGS?

A: Almost everybody at this time. SGI got a third-party company to implement PHIGS on IRIX. Our competitors were SUN and HP and maybe Apollo. Probably all these companies had PHIGS implementations. People at SUN were particularly excited about it. These companies were actively moving from that specification to an X extension and making it into a standard architecture.

By the way: it’s important and very valuable for an industry to have a standard architecture. But it’s also important that it be a good architecture. I think we are in a better place with OpenGL than we would have been by building on PHIGS or PEX. That could be debated, though of course it’s a test that we will never do.

I realized that specifying the interface was the key design decision. So we cranked out an interface that we thought was better. Along the way we gathered a several partners and made this work a collaborative exercise. In the 1.0 version of OpenGL, published on July 1st 1992,a few of us at SGI made all the decisions, based on a lot of input [from our partners]. It wasn’t a design by committee, although Digital, Microsoft, and Compaq contributed considerably to it. We made all the hard decisions for the first revision. We also promised in advance that the future governance of the standard would be done in a less heavy-handed way: "trust us, give us your input, we’ll do a good job on the first revision, and then we’ll govern it going forward in a much more democratic way." We wanted to get it off to a good start. As they say "a camel is a horse designed by a committee." There was a lot of fear and angst in the community, because, remember, SGI was the dominant player. And people sort-of trusted us, but there still was a lot of concern. I think that in retrospect they made a good choice.

IMG_5218

The cover of the v 1.0 Open GL specification (Kurt’s signature is on the top-right corner).

Q: You said that OpenGL is a better interface. How do you measure the quality of an interface?

A: Let me back up: something that’s not fully appreciated about OpenGL is that it is an architecture. I’ve been using that term consistently. It is an architecture, in the sense used by Fred Brooks for the IBM 360 ISA (Instruction Set Architecture) in the sixties. OpenGL is a library interface, but it should be evaluated against the same metrics as an ISA:

  • how suitable is it to its task,
  • how effectively can it be applied to tasks not envisioned by its designers
  • how effectively can it be implemented consistently on different machines
  • how well can it be evolved into the future while still maintaining the compatibility
  • how well does it encourage application compatibility

Fred points out a classic set of design mistakes; for example, if you leave any things undefined in an architectural specification (and, by the way, there needs to be a written specification), then they become defined de-facto by early implementations, and often these de-facto definitions aren’t in the best interest of evolving the architecture. One way to measure the quality of and architecture, is to see how much undefined behavior there is in the architecture. You can get picky about the exact metric of "how much," but by using hand-wavy comparisons we can claim that before OpenGL no graphics architecture was very tightly defined, and many had all kinds of undefined behaviors. As a result, each implementation was different, and application compatibility failed. To be fair, SGI’s proprietary implementation IRIS GL was quite bad. We built OpenGL based on experience from our early effort at defining a flawed architecture.

Q: The effort to generalize IRIS GL into OpenGL made it better?

A: Absolutely. Because it wasn’t just an effort to generalize it. For example, we had licensed IRIS GL to IBM so they could use it on their workstations. This turned out to be an awful mess, because IRIS GL wasn’t very well specified. Having them implement it required constant communication: "what happens in this case?" The correctness was defined by "he who yells loudest:" whichever implementation had the most applications would define correctness. That’s a very poor way to implement an architecture. We also had our own bad experience with Iris GL that we were attempting improve with OpenGL. This process was not generalizing or popularizing; it was an attempt to build a library at the same time as creating an industry standard. Those weren’t inconsistent goals. Some of the wrong things in IRIS GL made it difficult for IRIS GL to become an industry standard.

Q: Do you think that this architecture is still fit given the current evolution of hardware?

A: Software interfaces, even if they are not done very well, but which get a large following, last a long time. There’s a kind of "fitness by default." I would predict that people will be using OpenGL and Direct3D (which is more or less a different evolution of the OpenGL architecture), for some time to come. They are not going to go away.

On the other hand, for some really important application spaces, such as first person shooter interactive 3D graphics games, or games that push the state of the art in computer graphics, I think that Direct3D and OpenGL have little life left. The way implementation technology changes — I am thinking specifically the Intel Larabee graphics chip – will push the standard interface for game development to a higher level. Like the Unreal Game Engine. People creating those game engines need a different kind of interface to the underlying hardware and microcode, and they will bypass Direct3D and OpenGL.

Q: So what will the new graphics architecture be?

A: The architecture has been generalized through programmability to become a high-level pipeline. You can think of it in terms of datatypes. Let’s ignore high-level surface specifications for a moment and focus just on polygons or triangles. The base type for these is the vertex. The graphics system converts vertexes to triangles, converts triangles to pixel fragments and finally converts those to pixels, and puts them into the framebuffer. It’s a three-stage pipeline with specialized tidbits to conversion from one datatype to another. That sequence is incredibly useful and powerful, but it isn’t the only reasonable sequence of transformations for computer graphics. For example, you can turn things around with ray tracing: instead of starting with vertexes and pushing them toward the framebuffer, you can start at the framebuffer and cast rays back into the scene and figure out what each ray comes from in the scene. You can think of this as a loop-unrolling technique. The standard graphics architecture really can’t deal with ray tracing. This is just one example. In the near term the world will not switch from the pipeline sequence to ray tracing. There has already been a lot of research for mixing the strengths of both, and this research will turn into products soon. These products are not implemented naturally on top of an architecture like OpenGL or Direct3D.

Q: But the world still needs an architecture!

A: But it might not be a standard architecture. The Larabee system will expose something that we can call an architecture: a sea of IA32-like processors; that will be well-defined. It won’t be an industry standard, it will be an Intel standard. And it won’t be licensed; it won’t be available to NVidia. OpenGL or Direct3D will continue to be available, but you will also see the architecture exposed through something like the Unreal game engine. There won’t be any software layer between Larabee and the Unreal game engine.

Q: The graphics engine itself is evolving in unexpected ways. You worked on Cg and there are other languages like CUDA (for parallel programming on graphics cards) that are becoming very popular. What do you think about this evolution?

A: I like it: it’s fun! At SGI we made some pretty weak attempts to move in this direction. In 1990 we shipped a machine which provided a programmable interface for general-purpose computation. Some of the early people at SGI were hired by Jim Clarke specifically to expose the geometry engine for more general-purpose calculations in the graphics pipeline. It’s an old idea, and probably older than that, but that’s when I became aware of it. It’s old, but it’s really getting some traction now, because the technology and expertise have evolved far enough to make these architectures practically programmable. They were programmable by the people who built them, but they weren’t programmable by general-purpose applications developers. You need to implement the right language architecture, and to build the right compilers to make all this work. There really is a convergence between GPUs and CPUs. If you pry open a modern NVidia part and the Intel Larabee you find that each of them has on the order of 10 cores. And each of these cores is a programmable engine, and is implemented with an instruction set augmented with SIMD capabilities. And the widths are on the order of 16 for both of these machines.

Q: The memory management seems to be very different.

A: Yes. The NVidia part still isn’t designed to run an operating system. Its relationship with memory is rather different. There are other important differences: the Intel architecture exposes this implementation quite directly, while the NVidia does not, even for CUDA. It hides that implementation behind an architecture that’s higher-level, like OpenGL. It still implements the OpenGL pipeline and it time-shares the same underlying compute engines for all the important stages of this pipeline. But it hides this fact from the programmer. The other difference is that NVidia does not expose an individual core. Its architecture makes you think that you are programming "lanes". You are programming only one of the SIMD paths, and NVidia put in mechanisms to handle divergent code paths, when the SIMD hardware executes the same program on 16 datasets and a branch that goes in different ways in different data sets. Even CUDA, where the graphics pipeline abstraction goes away, still maintains a Single Program Multiple Data model. Intel will expose a lower level for Larabee. I think we’ll converge to something more like Larabee than NVidia, something that’s designed to do well with thread parallelism, and with data parallelism. Then we can say that graphics has led the underlying implementation toward a multi-core computer architecture.

Q: Can you tell us about your current research?

A: I am now in two broad areas. One is a combination of human vision and computer graphics. My goal is to implement stereo display technology that provides better focus cues to the viewer. This work is straddling psychology, perception and human vision, and computer graphics. I continue to publish in this area and to collaborate with people at Berkeley, where I did my dissertation work. The second area is computer architecture: evolving graphics architecture to provide a platform for data-parallel computation.

IMG_5214

Kurt’s image reflected in a prototype 3D display with correct focusing cues (the subject of his Ph.D. thesis), using 3 semi-transparent mirros.

Q: What’s wrong with current displays?

A: I did my dissertation on stereo displays. All existing ones have a bad property: they create a simulated object in your view whose apparent distance is determined by its vergence (disparity), meaning that it appears slightly different to your two eyes. Because of disparity you have a sense of how far away it is. That’s called “depth cues” in a stereo image. But there are other cues for the brain besides the vergence, and in particular there is a cue based on where your eyes are focusing. If you are focusing at one distance but verging (or looking at) an object at a different simulated distance, you are getting conflicting cues. That’s bad, but not tragic, the focusing cues are less important than display latency or hidden surface occlusion (another fundamental depth cue). Occlusion and vergence are solved, but people were unaware that focus was a problem at all. It turns out that where you focus is important. Focus doesn’t matter so much for a non-stereo display, but it does matter in stereo displays.

Q: Or at the 3D cinema.

A: Same thing. You are focusing on the screen, but you display images which are farther than the screen or in many cases much closer than the screen. We implemented a display that allows you to focus at approximately the correct distance (i.e., same as the vergence distance). It does that without any moving parts and without tracking where you are actually looking. Using that system we’ve done some measurements to quantify the problems. Does decoupling focus from vergence cause discomfort? The answer is "yes". There had never been a rigorous experiment to show it. We also postulated, based on the mathematics, that even with very low resolution in depth, we can make people focus at the right distance, “in between” depth pixels. We are collecting data that shows you can build good a display with a very low resolution and depth.

You can build a display with 10 transparent screens, but we have shown we can simulate correct cues at a finer granularity with many fewer than 10. It’s also interesting from an engineering standpoint because the obvious ways to solve this problem were just insanely expensive. To have thousands of pixels of depth resolution, you need billions of pixels. There is no easy jump from the current million pixel displays to a billion pixels, but conceivably engineering can jump from 1 million to 10 millions. In this work fundamental science (understanding human vision) and engineering (building a display that can solve the requirements) converge.

Q: Do you see applications? For example, in flight simulators?

A: Focus matters most in the near field [of vision], within an arm’s reach. The flight simulators solved this long ago by setting the focus at infinity. In cinema it’s almost the same, except that sometimes the simulated distances are very close. The screen is effectively infinity, but the movie may provide a disparity that makes you think that you are looking two feet in front of you. In flight simulators by the time you are two feet from the ground the game is over. Remote surgery, augmented reality, telepresence: are practical applications.

What’s computing used for these days? It’s mostly communication. Facebook. We are still doing only rudimentary visual communication. We need to provide comfortable, intuitive viewing systems. That will mean stereo, and that will mean near field as well as far field. I think that this work isn’t just for flight simulation. There is a place for stereo 3D graphics. If you think you know the details, it’s a good place to start a company.

Q: How about the architectural direction in your research?

A: I work with some Stanford students and a professor on making graphics architectures more general-purpose both for graphics, and for high-performance computing. We have a paper in the January issue of Transactions on Graphics. We are thinking about the architecture at the level below Direct3D: how many cores there are, how they are managed, how they communicate.

Q: What ideas in your field of expertise surprised you? The converse is: what ideas were ahead of their time and still haven’t been accepted?

A: I am on record saying that I don’t think that exposing the programmability of GPUs will be a successful approach, a wildly wrong prediction. This is a classic expert problem: which things that I know should I question? You can’t move forward if you doubt everything, but you move incorrectly if you don’t reevaluate things that have changed

Something that the world still isn’t ready for? The kind of rigor that we brought to OpenGL. in particular, rigor about (computation) state. We identified, named and provided mechanisms to change, query and utilize all of the state in OpenGL. The document specified that implementation could do as it pleases, but it must appear at the architectural level that implementation is managing state in exactly the way described. OpenGL’s forward evolution also took this very seriously. And it works. It allows people to use OpenGL effectively, and to extend it effectively, to reason about it effectively, and it is antithetical to modern software practice!

Q: Do you know any other API which satisfies these criteria?

A: We patterned OpenGL on the X Windows system. The people who did the core of X Windows took state very seriously (not the whole software stack on top of the X Windows — I am not making a case for Motif here). Not a perfect job, but an admirable one.

As far as I can tell, at least at the application level, pretty much nobody is rigorous about exposing state. How is the state modified, stored, queried? All of these things are very ill-defined at higher levels of the software stack. It drives me crazy. I cannot think of a fundamental reason why we shouldn’t be as rigorous at the higher levels. My take is that we are just being sloppy. Perhaps most people just don’t think this way: if you rigorously exposed the state of the telephone, Outlook, Photoshop, it will be confusing and not helpful to the average person. I find that unimaginable, but it may be so. What seems clear to me to be the right way to do something may in fact not. But I continue to have this hunch: that we’ll all be in a better place when we take the notion of state, and specifying the interactions of state clearly, much more rigorously than we do now.

posted by Mihai at 3:36 pm  

Sunday, August 5, 2007

More on Presentation Mistakes

More on Presentation Mistakes

A few years ago I have given a very short talk about giving effective talks; I still think that was a good summary, so I am providing the link above, to the PowerPoint slides.

BTW, I love PowerPoint. I think that it is a great tool, which can be extremely effective when used properly. Like any other tool, it can be used improperly, but this doesn’t make it bad. (Are hammers bad because they can cause injuries?) I think PowerPoint enables many people to present and organize their ideas in much better ways than before. Can you give a great speech without it? Sure. Can you have a terrible slide show? Certainly. But that’s not how you measure its effectiveness.

I have also written some (extended) advice on how to give presentations some time ago, but I guess that web page it too long, and not terribly original.

I also know quite a few people who manage to contradict a very large number of pieces of advice from my write-up and still give amazingly good talks (one example I remember vividly is Amir Pnueli). The secret in these cases is almost always crystal-clear thinking and a very sharp flow of ideas. So these rules are not the only way to do things.

But here I thought to mention a few (other) mistakes which I see frequently in job interviews (and many other presentations):

  • Not listening the questions. It is amazing how many people answer a different question from the one that is being asked. Watching tapes of my talks I have realized that this has happened to me too. The reason is that I often have already in mind a list of questions that I expect, and I tend to match the words I hear to one my mental models. (Some people start answering before the questioner has even finished, thinking they have heard everything!)

    The best way to avoid this symptom is to listen carefully, repeat or rephrase the question, and (if possible) ask whether this is the intended meaning.

  • Giving too many details. A good talk is a fine balance between advertising and teaching hard facts. In truth, most people will forget almost everything you tell them, so the point is to make them interested in the work, and to stick a few salient facts in their mind. To remember, (the vast majority of) people will have to rehearse, and for this they will have to go to a more persistent material than just a talk. A corolary of this observation is that one should rely on intution when it is appropriate, and not try to explain everything in detail.
posted by Mihai at 11:25 pm  

Sunday, July 22, 2007

Some Common Job Interview Mistakes

You have worked hard to earn a Ph.D., and now you are looking for a job, either in an academic position (in a university), or in an industrial research lab, or perhaps in some other place where you have to use your recently earned qualifications. For this transition you have to go through a job interview. This is one of the most important “exams” in your life, since depending on its outcome your future may change in completely unexpected ways.

Having seen the job interview process from both sides, once when I was looking for jobs, and many times while interviewing candidates, I want to point out a few mistakes that I see many candidates making during their interviews. None of these mistakes is “fatal”, and stengths in other areas of the interview can overcome any of them. But why not avoid them if you can?

  • Argue about assumptions instead of substance. Most job interviews contain a presentation given by the candidate. This presentation gives the candidate to focus on an important project which is representative of his or her past research. In many locations this presentation is interactive: the members of the audience can interrupt and ask questions during the presentation.

    A good presentation starts with an introduction, which explains the setting of the research, and (some of) the assumptions under which the problem is solved.

    (Here are some examples: we assume that “the cost of communication is much larger than the cost of computation”, or “we assume that we have a universal public key infrastructure“, or “we assume that we can completely replace the internet protocol IP with the protocol we have designed”, or “we assume that we can recompile all software with our compiler”.)

    Many talks leave some of the assumptions unstated, and listeners have to be deduce them during the talk. I have seen two things happen quite often:

    • The speaker and the listeners have different assumptions sets in mind during the presentation, and thus they do not understand each other, and they argue constantly about the quality of the solution.
    • Even if the speaker and the listeners do agree on a set of assumptions, the listeners may not find the assumptions reasonable. For example, many listeners may object to most assumptions presented above. This can turn out into a religious debate.

    Both these situations are highly counterproductive to the presentation.

    If you are giving a talk and you sense that the debate steers towards assumptions, it is time to stop it immediately and re-focus it. The research has been done, and there is nothing you can do to change the assumptions you have used. You should not spend too much time to defend them either, religious wars are endless, and as a job candidate you have much more to loose than the interviewers by wasting precious time during the talk.

    What you can say is the following: “These are the assumptions behind this piece of work, let’s make all of them clear. You may like them or not, but we cannot argue about them anymore. In this talk, I will show you what I have built on top of these assumptions. What you should judge me for is this construction — my research — and not the assumptions. The value is in the work, not in the premises. If you like the construction, perhaps I can do more constructions in the future, after you hire me, starting from a set of assumptions which is less controversial.”

  • Attempt to hide obvious weaknesses. Sometimes during a presentation an audience member discovers some weakness in the presented work. For example, a very old benchmark was used for measurements, which does not stress enough current machine capabilities. Or there is some related research that the candidate did not know about. Or a theorem was used, for which the assumptions were not all satisfied.

    One of the worst attitudes to take in this case is to defend your work, and to attempt to hide the mistake. “No, this benchmark is actually very good, because the machines in our lab are old.” “No, I didn’t read that work, but I know that that group does not use a sophisticated compiler like mine.” “I am sure that the theorem can be proved anyway.”

    You have to realize that often during your interview in the audience there may be experts in the field of your research. Attempting to trick them with subterfuges will backfire badly.

    The best thing, when discovering a mistake, is to first understand it (this shows that you can think), and second acknowledge it, and then to present the rest of the work, and show how it stands on its own. Hopefully, this mistake will only have a local impact, and it is not the basis for everything you have done in the last 5 years.

  • Not care about the place where you are going. If you interview some place, you better care about that place: the people who work there and who might become your colleagues, the work atmosphere, the reward system, if you plan to do research how you get money for your research, and, very important, the health and wealth of the mother organization which is hiring you. If you interview some place and you never display an interest in any of these things, it shows you are not really committed to go there.

    These are good subjects to discuss for some of the one-on-one meetings that usually occur during the job interview.

  • Show modesty. Some people don’t like to boast about their achievement. The interview is the worst place to hide your qualities.

    Some people prefer for their abilities to “speak for themselves”. Well, I have news for you: it doesn’t really work. People are too busy to guess your abilities, you have to use your mouth.

    In general, do not assume people will think and infer something about you. “They will see I published papers in conferences both in theory and systems, so they will realize that I am an interdisciplinary guy.” If you think that interdisciplinarity is one of your strengths, put it in writing in your statement of purpose, or cover letter, or even better, on a slide in the talk (or all of them).

    I am not saying you have to go around saying how good you are, that does not work, but one thing you have to be very careful during the interview process is about using negative labels for yourself. Even when joking. This is a long post already, so I will write some more about this topic another time.

posted by Mihai at 11:30 pm  

Thursday, May 3, 2007

An interview with Leslie Lamport

Leslie Lamport
Leslie Lamport in his office, May 2007

Leslie Lamport is a legendary figure of computing. While he is probably most well-known because of the open-source typesetting LaTeX macro package and book, arguably his most important contributions are in the domain of distributed systems; this is also the subject of this interview.

This interview was conducted in April 2007. Leslie Lamport typed the answers to my questions and reviewed the final editing. I thank my colleagues who have helped me think about these questions.

This interview is licensed under a Creative Commons Attribution License.

Q: You have always been a member of an industrial research lab. What is the difference between an industrial research lab and a university department?

A: Jean Renoir wrote in his autobiography that someone once asked his father, the painter Auguste, why he painted from nature. Renoir père answered that if he were to try painting a tree in the studio, he would be able to draw four or five different kinds of leaves, and the rest would all look like them. But nature creates millions [his count] of different kinds of trees. I like working in industry for the same reason Renoir liked to paint from nature.

Q: What was the first computer you have used/programmed?

A: The IBM 705.

Q: Your paper Time, Clocks and the Ordering of Events in a Distributed System (Lamport Clocks) (1978) taught programmers once and for all how to think about clocks. The key message had been known by physicists since Einstein: that there exist events in a computer system which do not occur one before another, (i.e., time is not a total order).

A: I hope it didn’t teach anyone once and for all how to think about clocks. For one thing, I’ve written papers describing other ways to think about them. The most important idea in that paper is thinking about distributed systems in terms of state machines — an idea that most people who read the paper seem not to notice.

Q: The clocks are really “stealing the show” in this paper, and I can understand why people can overlook the state-machine. Could you explain the essence of the state-machine idea?

A: Almost any system can be described as a state machine. So an easy way to implement a system is to describe it as a state machine and use a general algorithm for implementing state machines.

Q: The Byzantine Generals Problem paper (1982) describes the first provably correct algorithm for making several computers agree when some of them may give deliberate wrong answers. What are the its practical applications?

A: The only practical applications I know of are in real-time process control — in particular, for systems that fly airplanes.

Q: The Part-Time Parliament (Paxos), (1989 and 1998) paper shows how to make a (server) program more reliable by making several copies, which continue to operate as long as a majority of them are functioning correctly. Paxos is deployed in the Google Chubby lock server and in the Microsoft Autopilot cluster management service from Live Search. Where else is Paxos deployed?

A: The problem that Paxos solves is common to a wide range of distributed systems. The normal-case behavior of Paxos is what sensible programmers come up with when faced with the problem. If those programmers also succeeded in correctly handling the abnormal cases, then they would almost certainly have some variant of Paxos. Of course, they almost certainly won’t come up with Paxos unless they’ve already learned about it. I don’t know how many programmers have. As of a few years ago, I knew of two or three groups at Microsoft who were planning to use Paxos in their next releases. I don’t know if they did.

Q: If I understand the history right, Paxos was appreciated first by the system builders, and later by the theoretical community, despite one of its main contribution being the formal correctness proof. Why did things happen this way?

A: If that is indeed the case, then I suspect Butler Lampson was responsible. He is the only person who understood the algorithm and its significance as soon as he saw it. He advocated essentially the state machine approach and the use of Paxos as the general state-machine implementation. His influence was greater among system builders than among theoreticians.

Q: I noticed that you worked on an operating system (1962—1965). What large programs have you implemented, besides LaTeX — A Document Preparation System (1985)?

A: I haven’t implemented any programs that my colleagues would consider large (LaTeX included). People seem not to appreciate the virtues of writing small useful programs. Years ago, Butler Lampson proposed a project for capturing every keystroke and mouse click and permitting you to restore your computer to its state at any instant. This would have been a large, multi-person project. I thought that was a neat idea. Since all machine state that I ever needed to restore was contained in files that I created with Emacs, I spent an hour writing some Emacs macros that called RCS to checkpoint files whenever I saved them. (Since I grew up in the days when computers crashed frequently and am in the habit of saving the file I’m editing every minute or so, that was often enough. Otherwise, it would have been easy to do the checkpointing on some other regular basis.) I can therefore recreate the state of any of the source files I’ve written since 1993 within about a minute of any given time. I told my colleagues about my macros, but no one was ever interested. Grand projects are much more exciting.

Q: All throughout your writings you emphasize the necessity not only of coming up with sound algorithms for solving problems, but also of proving formally, mathematically, their correctness. Not just using program testing, but using mathematical reasoning for all possible circumstances. However, the currently available formal methods are unable to prove correctness of large software systems, such as real operating systems. What is your advice to software developers for bridging this gap between algorithms (which can be analyzed) and full software systems?

A: You seem to be implicitly asserting the usual argument that program verification cannot prove the correctness of a complete operating system, so it is useless for real systems. The same reasoning would hold that because you can’t implement a complete operating system in C (since you need a fair amount of assembly code), C is useless for building real systems. While this argument has obviously never been applied to C, it has in fact been used to dismiss any number of other programming languages.

People fiercely resist any effort to make them change what they do. Given how bad they are at writing programs, one might naively expect programmers to be eager to try new approaches. But human psychology doesn’t work that way, and instead programmers will find any excuse to dismiss an approach that would require them to learn something new. On the other hand, they are quick to embrace the latest fad (extreme programming, templates, etc.) that requires only superficial changes and allows them to continue doing things basically the same as before. In this context, it is only fair to mention that people working in the area of verification are no less human than programmers, and they also are very reluctant to change what they do just because it isn’t working.

The fundamental idea behind verification is that one should think about what a program is supposed to do before writing it. Thinking is a difficult process that requires a lot of effort. Write a book based on a selection of distorted anecdotes showing that instincts are superior to rational judgment and you get a best seller. Imagine how popular a book would be that urged people to engage in difficult study to develop their ability to think so they could rid themselves of the irrational and often destructive beliefs they now cherish. So, trying to get people to think is dangerous. Over the centuries, many have been killed in the attempt. Fortunately, when applied to programming rather than more sensitive subjects, preaching rational thought leads to polite indifference rather than violence. However, the small number of programmers who are willing to consider such a radical alternative to their current practice will find that thinking offers great benefits. Spending a few hours thinking before writing code can save days of debugging and rewriting.

The idea of doing something before coding is not so radical. Any number of methods, employing varying degrees of formalism, have been advocated. Many of them involve drawing pictures. The implicit message underlying them is that these methods save you from the difficult task of thinking. If you just use the right language or draw the right kind of pictures, everything will become easy. The best of these methods trick you into thinking. They offer some incentive in the way of tools or screen-flash that sugar coats the bitter pill of having to think about what you’re doing. The worst give you a happy sense of accomplishment and leave you with no more understanding of what your program is supposed to do than you started with. The more a method depends on pictures, the more likely it is to fall in the latter class.

At best, a method or language or formalism can help you to think in one particular way. And there is no single way of thinking that is best for all problems. I can offer only two general pieces of advice on how to think. The first is to write. As the cartoonist Guindon once wrote, “writing is nature’s way of showing you how fuzzy your thinking is.” Before writing a piece of code, write a description of exactly what that piece of code is supposed to accomplish. This applies whether the piece is an entire program, a procedure, or a few lines of code that are sufficiently non-obvious to require thought. The best place to write such a description is in a comment.

People have come up with lots of reasons for why comments are useless. This is to be expected. Writing is difficult, and people always find excuses to avoid doing difficult tasks. Writing is difficult for two reasons: (i) writing requires thought and thinking is difficult, and (ii) the physical act of putting thoughts into words is difficult. There’s not much you can do about (i), but there’s a straightforward solution to (ii) — namely, writing. The more you write, the easier the physical process becomes. I recommend that you start practicing with email. Instead of just dashing off an email, write it. Make sure that it expresses exactly what you mean to say, in a way that the reader will be sure to understand it.

Remember that I am not telling you to comment your code after you write it. You should comment code before you write it.

Once you start writing what your program is supposed to do, you will find that words are often a very inconvenient way to express what you want to say. Try describing the mean of a set of numbers in words. If you succeed in describing it precisely and unambiguously, you’ll find that you’ve written a formula as a sentence. This is a silly thing to do. In a great many cases, mathematics is a much more convenient language than English for describing what a piece of code is supposed to do. However, it is only a convenient language if you are fluent in it. You are undoubtedly fluent in arithmetic. You have no trouble understanding

The mean of the numbers a1, … , an equals (a1 + … + an) / n.

The kinds of things you need to describe in programming require more than simple arithmetic. They also require simple concepts of sets, functions, and logic. You should be as familiar with these concepts as you are with arithmetic. The way to achieve familiarity with them is the same way you did with arithmetic: lots of practice.

As you get better at using math to describe things, you may discover that you need to be more precise than mathematicians usually are. When your code for computing the mean of n numbers crashes because it was executed with n = 0, you will realize that the description of the mean above was not precise enough because it didn’t define the mean of 0 numbers. At that point, you may want to learn some formal language for writing math. But if you’ve been doing this diligently, you will have the experience to decide which languages will actually help you solve your problem.

Q: You performed an interesting experiment analyzing quicksort implementations that you found on the web using a search engine.

A: I did a Web search on “quick sort” and tested the first 10 actual algorithms that came up. Half of them were wrong. Every one that was written in pseudo-code, and hence could not have been tested, was wrong. Not one Web page made any attempt to show that the algorithm worked, although several of those pages were notes from computer science courses [at universities].

Q: You have written a book (2002) and many papers about TLA+, which is a very high-level specification language. What is TLA+ good for, and what isn’t it good for?

A: TLA+ is good for writing formal, high-level descriptions of certain aspects of concurrent and distributed systems. The major part of TLA+ (in terms of the number of constructs and operators) consists of a language for ordinary math. Since ordinary math is wonderful for a great many things, TLA+ is potentially useful for any task that requires very rigorous mathematical statements. For which of those tasks it is actually good is something that can only be discovered by trying.

Q: Should programmer’s and computer science education emphasize distributed system concepts more in the Internet era? What are some of the fundamental concepts one should internalize?

A: I’ve spent much of my career in search of fundamental concepts. I haven’t found any that compare with concepts such as energy and momentum in physics. There are a few results that seem to be of fundamental importance, such as the Fischer, Lynch, Patterson theorem. One important concept that is underappreciated in the world of distributed systems is that of invariance.

Q: Is invariance used in distributed algorithms in a different way from traditional algorithms?

A: It’s used in the same way as for any concurrent algorithm. The term invariant is used somewhat differently when talking about sequential algorithms — meaning an assertion that is true when control is at some point in the program rather than one that is true at all times.

Q: What software tools do you use in your day to day work?

A: Emacs, LaTeX, SANY (the TLA+ parser) and TLC (the TLA+ model checker).

Q: What hard open problems are there remaining in distributed systems?

A: On the theory side, recognizing a problem has generally been harder than solving it. Since you are asking about problems that are already recognized to be problems, the answer is probably none. On the practical side, I don’t think I have any more insight into that than most other people.

Leslie Lamport in 2003

Photo (c) Keith Marzullo, 2003

Q: A big part of your research effort seems to be building the appropriate abstraction of reality, which is accurate enough to model essential details, but also precise enough to be modeled mathematically. How does one learn the art of abstraction?

A: An engineer at Intel once told me that writing TLA+ specs taught them to think more abstractly about their systems, which improved their designs. But I don’t know if that made them any better at looking at a completely new situation and abstracting its essence. It’s probably a skill one is either born with or learns at an early age.

Q: Did you get lots of fan email because of LaTeX?

A: Perhaps one or two a week.

Q: What are some things you would do if you were without access to a computer for three months?

A: Think and try to find some pencils and paper.

posted by Mihai at 10:02 pm  

Thursday, February 22, 2007

An interview with Chuck Thacker

Chuck
Chuck Thacker in his office, October 2006
Chuck Thacker is one of the 16 technical fellows of Microsoft – the highest technical position one can achieve among the 75000+ employees of Microsoft.   He has been one of the main designers and builders of the Xerox Alto, the first personal computer, in 1972, and one of the inventors of the Ethernet.  At Microsoft he led the creation of the Tablet PC prototype.

This interview was conducted in mid-October 2006. The audio transcript has been edited for continuity. I thank my colleagues who have reviewed the transcript and have made useful suggestions. Chuck Thacker has reviewed this text prior to posting.

This interview is licensed under a Creative Commons Attribution License.


alto

Image courtesy of Chuck Thacker.

Q: My first question is somewhat tongue-in-cheek. The Alto computer was designed and built at Xerox PARC in 1972. It featured a lot of very advanced technologies: microprogramming, a mouse, bitblt (high-speed screen manipulations), a bitmapped display, overlapping windows, menus, icons, wysiwyg text editing, object-oriented programming, simultaneous applications, Ethernet networking, CAD systems, email, and laser printing, many of the things we take for granted today, more than 30 years ago. So, my question is: What has happened in the last 30 years that wasn’t discovered at PARC?

A: We did miss a lot of things. We missed the IBM PC. And in particular, we missed the significance of the Apple II. I had looked at an Apple II. The Apple II came out about 10 years after the Alto. But it was a very minimalist design, so it would not cost a lot of money. Alto was very expensive: it cost $12,000 in 1973, that’s now $100,000. The idea of putting $100,000 of stuff on someone’s desk just didn’t work. And although Alto went to a lot of places, including the Carter White House, for document preparation, for most things it was really too expensive, it was just before its time. And the reason IBM PC and Apple were so successful is that people had begun to make microprocessors which were powerful enough to run real software. It’s a little humorous, the processor in the Apple II, the 6502, was the processor that we used in the keyboard of the Dorado [a successor of Alto], that we built at about the same time — our keyboard controller. We didn’t think of the 6502 as a computer at all, we thought of it as a controller.

We missed spreadsheets, but only by a little. Peter Deutsch had written a memo about a programming system that would work with a concept called spiders under the paper, which is the spreadsheet paradigm, but he never built it. And the primary reason he never built it is that he didn’t have anybody who needed it.

One of the things that worked out so well with the Alto is that we built it for ourselves. You can’t do that today, because what nerds want is not what the public wants. But we wanted a lot of things that people did want at that time. In particular: document preparation.

We missed RISC. That’s primarily because every machine we had built was microprogrammed. [Microprograms] would allow you to build quite complex instruction sets. In the Alto and in some of the other machines the microprocessor was used for doing I/O control as well. You could do very efficient things with microprograms. You could do things like bitblt, which are very difficult with a hardwired machine. We missed RISC until we read their papers: What a good idea!

We missed Unix. The Alto system was a single-user machine, it was not a time-sharing system. Unix started out as a multi-user system, to allow multiple users to share a PDP-11. [Note: it was actually PDP-7.] We missed that one completely.

TCP/IP is questionable. Some of the people from PARC did participate in the meetings with the TCP/IP group, but the lawyers told them they could not say anything. They could only ask questions, so they asked clarifying questions. I was told that one of the organizers turned towards the Xerox guys and said: You guys have done this before, haven’t you? Xerox had its own network protocols — XNS. That was at a time when there were a zillion protocols. TCP/IP hadn’t really emerged as the dominant one.

Q: Why did you decide to work in computers?

A: I didn’t start in computing at all. I had known for a long time, since I was kid, as a matter of fact, that I wanted to be an engineer. I decided when I was nine to become a physicist. But I would not become a theoretician, because I am a pretty bad mathematician. Instead I would become an experimentalist, and in particular, I was to design particle accelerators. I had worked at the Caltech synchrotron laboratory for a while which had one of the first medium-sized particle accelerators — it was actually a prototype for the Berkeley Bevatron. And I really liked it a lot. You are doing physics, and you are doing cutting-edge engineering, because there is a lot of very hard engineering in building stuff for an accelerator.

I bounced around through various schools. Caltech, UCLA, and finally Berkeley, where I was in physics. I was pretty much self-supporting through undergraduate school. Student loans were not a concept then. When I graduated I said I am going to work for a year or so, to get enough money to pay for graduate school, and I will apply to various graduate schools along the way.

One day a friend came by and said: You know, Chuck, there’s an opening at Berkeley in the Genie Project. So I went and talked to the guy who managed the project — Mel Pirtle, and wound up working for them as a staff engineer. This was in 1968. What Genie had just finished was the 940 time-sharing system. They were augmenting it in various ways. I worked on various pieces, such as the printer controller, some modems, redoing the teletype channels.

Of course, I got totally sucked into computers. I had programmed a bit before, mostly in FORTRAN, for physics. The way I put it, I fell among bad companions: Pirtle, Wayne Lichtenberger, Butler Lampson, Peter Deutsch, a lot of very good people.

We decided we could not do the follow-on machine, which was very large, in a university. Although we probably could get ARPA to pay for it — because at that time ARPA was a big spender, since Bob Taylor was running ARPA — the strictures of the university system would just not let us build something at that scale.

So we went to do a start-up, called Berkeley Computer Corporation (BCC). And we spent through 1970 building the machine. There was only one built, because the company could not get enough financing to make a big success out of it. But the final machine worked, and the company was dissolved. And the machine was sold to the University of Hawaii where it was the primary machine of the Computer Science Department for about 5 years. A lot of people went to this new research laboratory in Palo Alto, called PARC. I was actually one of the very first, because the hardware was working, so I didn’t have anything to do at BCC while the software people were finishing up the software. So I went to work for PARC.

Q: You took advantage of Moore’s law even very early. For example, in the Alto you dedicated 3/4 of memory to the bitmapped display, although it was very expensive. However, can you point out some instances where Moore’s law enabled some capabilities which took you by surprise?

A: One of the things I did not anticipate very well was how far one could push CMOS technology. CMOS was great, it was much better than the original technology, which only had one kind of transistor, an n-channel transistor. Until 1990 I thought that other technologies, like Bipolar, or transistors made from materials like [atomic group] 3-5 oddball compounds, which are fundamentally better than Silicon in terms of performance, would still have a place. And today they don’t. The closest thing you have is Silicon on Germanium as an add-on process to Silicon. But CMOS has conquered the world. I did not anticipate that.

I was a fan of ECL [emitter-coupled] logic much longer than I should have been. If I had really figured it out… I didn’t talk to enough good material scientists or physicists to believe. A lot of the original papers about the limitations [of CMOS], which were written by the masters, like Carver Mead, just turned out to be wrong. They didn’t understand that the high level of investment in CMOS would pay off. I did somethings which were pretty nutty, when viewed even from 20 years after. But by and large, I did internalize Moore’s law.

Most people don’t understand what Moore said. He didn’t say computers will run faster, or they will become cheaper, or anything like that. He said you can put twice as many transistors on this chip every 18 to 24 months. And he was dead-on right. I imagine he’s astonished it has managed to hold on for 35 years, because mostly technology curves are S curves. And there is going to be an S curve in silicon. Things are getting down to atomic dimensions. As they do, other technologies, like electron-spin devices, and carbon-nanotube devices are trying to move from laboratory to reality. Recently Freescale introduced MRAM, which is based on electron spins to store the bits. It is nonvolatile, and very fast. Not big yet, but it may be the replacement for memory in the CMOS area. Nonvolatility would be wonderful: it would take us back where we were with core memory: you can take away the power and the machine does not lose its brain.

Q: Your older projects are quite well known. Can you talk a little about your more recent projects?

A: After coming to Microsoft, I first spent a couple of years in the UK, helping set-up the research laboratory there. That was a finite period assignment, and when I came back to the US I looked around for something to do. I had worked a bit with the guys doing electronic books. They had just gotten the charter to do some kind of tablet device. I went to them and said hire me, because I know how to do this. And they did. So I went into this product group, and I worked on the Tablet PC for a few years.

The Tablet PC hasn’t been as successful as we hoped it would be, but it hasn’t been the commercial failure that all of its predecessors have been. It’s still growing, it’s still a very viable form factor, and a lot of people just love them.

More recently I’ve been thinking about where can we take computers that they have not gone before? The way I characterize this is: Microsoft, in order to be a big growth company, has to do better than 10% market share. People look at me and say: you’re crazy, MS has 90% market share. I say No, you’re wrong, there are about 600M PCs in the world, and there are 6 billion people. By my measure, that’s 10%. I am very interested to make technology more accessible in developing countries.

More recently, I am also interested in getting back into computer architecture. I had left it about a decade ago, because I decided that the papers I was reading were all boring, and were all very incremental. I now believe there may be some ways to break through that and do the major innovations that we were doing 30 years ago. You can’t do those the same way today, because everybody works with the same silicon processes, and that defines what you can build. A small group cannot design a large chip — unless you do it with an industrial partner, because it takes too long, it is too big an effort. Managing the complexity of a billion transistor object is not something that a group of 10 people can do, no matter how wonderful the tools are.

One approach that I am quite enamored of (I didn’t think of it), is RAMP, Research Accelerator for Multiple Processors. Their idea is to use FPGAs to simulate the future. In the same way we used new hardware by spending a lot of money to simulate the future [in a system like Alto]. But we can’t do that any more, because there isn’t that much money [to simulate a next-generation processor]. I think this RAMP device might improve things. I am helping to make RAMP happen more quickly and on ways to use it to redress some of the mistakes we’ve made in the last 30 years in terms of architecture.

Q: What technologies make a Tablet different from a regular laptop?

A: There aren’t any. The intention was that there would not be any, because if you have a lot of novel technologies, it is unlikely you will get widespread adoption, because people will want to charge money for those things, and that is not what leads to volume. The reason the PC is the dominant platform on the planet is because it is a platform, and the specs are sufficiently open that anybody can make one.

The only thing that’s really new in the Tablet PC is the pen digitizer. A lot of people had built tablets with resistive digitizers. You see resistive digitizers in Pocket PCs. But they don’t work very well if the screen is large and if you want to write on the screen. Because if you rest your hand on the screen, it thinks that your hand is the pen. The digitizer that we chose uses an electromagnetic technique which allows it to locate the pen even when it is above the surface of the tablet. This, to many, looks like magic. It is just electronics.

Tablets come in two forms: a convertible and a slate, which I favor, which is just a solid block. To first order, it is a laptop, but it is the evolution of a laptop. That’s the way many manufacturers use it: they take their laptop designs, stick the digitizers on, and they are done. Microsoft built a Tablet PC not because it included a lot of novel technology, but because there was a lot of resistance from [hardware] manufacturers. They looked at the idea, and they didn’t understand it, and they didn’t know how hard it was going to be. We had to build a prototype to show them how hard it was going to be. When that prototype was built, a lot of them said gee, we can build things like that...  Sure you could. It’s easy.

Q: The magic is in the software…

A: The software is always the magic in the computer. Hardware is really the simple part. The complex part is the software, and that’s the part that’s hard to get right.

Like anything, the Tablet’s initial software was very weak. It would not track the pen very well, handwriting would not work very well for that reason. But it’s gotten quite a bit better lately. Now it’s possible to build very good drawing and handwriting applications on the Tablet. There’s still a lot of innovation to be done on how to use the pen to improve the user experience, that we haven’t even thought of. Like when we had built the Alto: we hadn’t really quite figured out windowing systems. It took quite a while before we had the windowed applications. The Alto screen looked like a glass teletype initially.

Q: In your office there is always some disemboweled electronic device. You always take things apart. Why do you do that?

A: You learn design tricks. You learn how other people did it. I encourage reverse engineering. A lot of people say: you shouldn’t reverse engineer, you should go out and do new things. I don’t believe that. I take things apart. I take toys apart. It’s very interesting to see how people designed a toy. Somebody said: an engineer can build for a dime what anybody can do for a dollar. Engineering is a matter of making choices in order to minimize a cost function and to maximize the outcome. I study other designs. And I see how smart people did it.

Q: It’s easier to do for hardware than for software.

A: That’s one of the things that is very positive about the open-source movement: anybody can look at the source and see how it is done. The problem in software is that you may look at the code and still can’t figure out how it is done, and a lot of code is junk anyway. And it is much harder to understand a software system than to understand a hardware system. If you can look at it and see it working, you know what it does, whereas you can stare at a program listing, and you don’t know what it does. Many people that I know from the Unix community spent a long time looking through the listings of the first one that they actually published, which I think was System VII. And they actually published everything. And people pored over them to figure out how this extremely clever system was built. These days it’s much harder. Either you can’t get the code, or there’s so much of it, you can’t understand it. No single human can understand it anyway. But you still do a pretty good job understanding things by reading the source.

Q: What do you think is your best work?

A: The best well known is the Alto. But some of the later things were more relevant today. I am pretty proud of the Firefly system, which was the first multiprocessor workstation. A lot of the experience from Firefly should be relevant to the upcoming multicore world. We built the computers, and we built a multithreaded operating system to run on them, that looked like Unix. This was in 1983 to 1985. They were still operating Fireflys when I left DEC [in 1997]. Of course, DEC was in the workstation business by then. Some of those things are still relevant, and still cited in papers. Nobody cites Alto, unless they are history papers. But the early things were much more fun, because the field was certainly much more open at the time.

Q: How do you learn good engineering?

A: You generally learn it by doing it and by looking at what other people have done. I’ve had a few very good mentors throughout my career. I’ve probably learned more from them than from books. Books are good for reference, but you need to have a very large bag of tricks. Just as a programmer needs to understand a lot of fundamental algorithms, or he would go off the rails and reinvent everything, usually to the detriment of the end-user. We see that a lot today: people don’t get good grounding in algorithms, so they go and reinvent them, and they are very proud. But by and large, the old ones are better. So: learn how other people do it.

Q: People have expected many times that technology will miraculously help the learning process, but it hasn’t happened. How can computers help?

A: To some extent it has helped. If you think about the way these days teachers work about presenting ideas to students, a lecture room is much different than it was, even ten years ago. Everyone uses PowerPoint, everyone uses laptops. You can argue whether PowerPoint has increased the quality of education or decreased it, but it has certainly changed it a lot. Computer have had an enormous effect on education. By and large, this is not true at the lower grades. My own belief is that’s where the problem is. I was at a meeting in Boston last week, where we were told about some of the wonderful investigations that people have been doing to improve secondary and college education, primarily through the use of technology. Somebody in the audience said: isn’t K2-6 where the problem really is? The speaker said Yes, it is, but it is too hard. So we’ve not gone there.

People who have gone there have been in general unsuccessful. Edutainment is the name. CD educational games were given a lot of trials, and they don’t work. I am pretty interested in that. I don’t have an answer. I do have some ideas, and some of them we’re working on.

The thing I worry most about is literacy. That is a fundamental skill. Our kids are not reading as much as they used to. It’s quite clear why, there are two reasons, everybody understands what they are, but they don’t know what to do about them. The first thing is television: kids spend a lot of time in front of television, which has no value whatsoever. The second thing is that parents no longer read to their kids, because they don’t have enough time. When I grew up, in most families one parent stayed at home. That person took care of the child’s education far more effectively than a teacher did. Parents just don’t have the time now, since they both have jobs. Education takes a back seat: they expect the schools to do that. But you can’t reasonably expect the school to work with a kid in an environment where you need a lot of 1-on-1, because it’s 30-on-1.

Q: What are your hobbies?

A: I don’t really have anything that I would really describe as a hobby. I enjoy my work so much, so when I am home and doing nothing I can always find something to do. I am lucky: I can do a lot of things that I do — design things — without a computer. It’s just me, a piece of paper, and quiet.

posted by Mihai at 5:45 pm  

Powered by WordPress