Web Techniques
November 1996
Volume 1, Issue 8




Special Report

The First Virtual Humans Conference

Evolution in Cyberspace

Sue Wilcox

vrmlpro@inquiry.com

What do worms, robots, and Marilyn Monroe have in common? They were all featured at the first Virtual Humans conference. "Virtual humans" is an emerging specialization in the field of virtual-reality research, focusing on the representation of humans in computer-generated virtual worlds. Virtual humans, or "avatars," are being used in 3D multiuser communities, and in video and computer games that require human models. The speakers represented all parts of the 3D community: academics, commercial animators, and hardware and software developers all united to share techniques and technologies to make virtual humans evolve faster.

Professors Nadia and Daniel Thalmann, notable VR trailblazers from Swiss universities, were two of the key speakers at the conference, held June 19-20 in Anaheim, CA. Nadia Thalmann reviewed the development of Marilyn, at once a virtual human and a new software release: Marilyn Monroe modeled in 3D and given physics-based behavior and clothing. The famous figure gave everyone a chance to compare the model with the memory. Achieving beauty, human movement, and expressive functions is an elusive goal; animators have long known it's easier to make a cartoon than a realistic figure. People are experts at spotting fake humans, even if they've never seen one before. Nevertheless, Professor Thalmann feels she is 90 percent of the way to reality.

Modeling Human Behavior

"Reality" is made up of four layers: skeleton, metaballs or implicit surfaces for muscles, contours, and a spline-based skin. Movement is controlled by four types of motion generators, including inverse kinematics and inverse kinetics (which takes into account the center of mass). Hair is 150,000 individual polyline structures that must be set, rendered, and animated. Clothing involves not only the reflectance and color of fabric, but also response to gravity, friction, drape, and, above all, collision detection. Clothes should stay on the surface of the body, and each fold and wrinkle has to detect collisions with other fabric. Layers of clothes must stay separate and respond to environmental forces. After the literally skin-tight clothes necessary to achieve clothes that move on 3D models, it's amazing to see clothes that can flutter in the breeze-like Marilyn's dress above the hot-air grate.

The Marilyn software was due to go on sale in October for $1900, but only on the SGI platform. This sort of modeling is computationally expensive, and you won't be using it on the World Wide Web anytime soon. To find out more about Marilyn, explore the Thalmanns' Web site at ligwww.epfl.ch/~thalmann/research.html.

Counting polygons is not a concern of the Thalmanns. It is important, however, to Marc Raibert of Boston Dynamics (BDI), because he has a real-time dancer who can respond to how well you play air guitar. The Interactive Dancer performs on an SGI machine, but her 180 polygons gyrate to some fast-moving music. Her movements are the end result of a long research program that started with hopping robots in 1981. The robots graduated from one leg to two legs, then four legs, and even a six-legged scuttler. They can all be seen inside computer animations now, as the motion-analysis algorithms are transferred to all manner of simulated creatures. Raibert's one-legged kangaroos are great, and the hopping behavior is scalable from big, slow kangas to tiny, frantic hoppers. This work has been extended to humans who walk naturally and move seamlessly from one position to another. The US Marines are using BDI's "DI-Guy" characters to train on the virtual battlefield. For more detail, visit Marc's Web site at www.bdi.com.

Several speakers discussed "multimodal communication": the importance of facial expression, gaze tracking, and gestures in maintaining a conversation. Kristinn Thorisson demonstrated J Jr., a simple line drawing reminiscent of Paul Hogan, who tries to keep up a conversation by saying "Yes" in the right places, moving his cartoon eyes, and rotating a propeller on his hat. His next character, Gandalf, had waggly eyebrows, movable eyes and mouth, and a gesturing hand. Despite running on three SGIs, two PCs, two DEC Alphas, and an HP, Gandalf still took a while to answer questions. But at least he could waggle his eyebrows while you were waiting. This sounds silly but is actually very useful: If you know someone is responding or thinking, it is OK to wait for them to speak. Try it yourself; it was good enough for Mr. Spock.

Virtual humans capable of understanding text-based communication were covered by Michael Mauldin, creator of Julia, a Turing-test contender that mimics human typing. Mauldin is currently chief scientist at Lycos, where his language skills are being used to develop the company's search engines. Julia can hold her own in a typed conversation: She will take the lead, add content to the discussion, and rebuff attempts to find out whether she is a machine. She has been tested in TinyMUDs (multiuser domains), where she has satisfied the humans interacting with her that she is just another participant in the shared-text environment.

Web-Based Interaction

So, the work to understand and reproduce human appearance, motion, and expressions continues to approach a semblance of reality. There is still much to be done in the realms of speech understanding and synthesis and artificial intelligence (AI). A quick fix is to use humans to supply speech, understanding, and movement via motion capture, as Linda Jacobson demonstrated using SGI equipment.

But none of this development is ready to be used over the Internet. Human interaction on the Web has evolved from the simplicity of text chat and MUDs, through 2D pictorial communities like Habitat, Worlds Away, and The Palace, to the current stage of wondering whether the interactive 3D of VRML 2.0 will be enough. Mitra, chief network-technology officer at ParaGraph, discussed how to deal with the bandwidth and human problems of interaction over the Internet. He looked at the issues of design flexibility and control over users: In an interactive world, the host of the multiuser environment must decide who builds the world and who can change it. There is also the problem of how to represent yourself online: Do you want your avatar to resemble you closely? Mitra feels these problems are exacerbated by the high expectations of people accustomed to the image quality and visual richness of the real world and of movies, and to the speed of video games. Counterbalancing these desires for verisimilitude are bandwidth limitations: Delivering huge amounts of data plus state updates (to handle position changes and interaction between individuals and objects) can bring a server to its knees.

One of ParaGraph's solutions is to reduce the load on the server by reducing the amount to be downloaded: They plan to place content on CD-ROM. Their Mr. Potato Head-style avatar assembly kits will be available on CD so you can change your appearance every day without having to wait for the entire selection of body parts to download. They also plan to use local behaviors to simplify information sent over the Net and distributed servers to spread the load of supporting a multiuser environment.

Some multiuser sites use polygonal avatars like those seen at Black Sun's site and in Alphaworld. Others reduce loading and movement delays with texture maps or with sprites, 2D images drawn with fast block transfers ("blits"). Mitra detailed VRML's bandwidth and file-size problems, briefly covering VRML file optimization. Because size affects both download time and walkthrough time, one large world may function better if it's split into smaller, linked areas, as in the sketch below. Another way to speed up a scene is to relieve whatever stage of the graphics pipeline is the bottleneck. (Pipeline blockages can be caused by too many polygon vertices in the scene, too many texture maps, too many lights, or too many pixels.) The basic rule is to simplify the scene: Cut out unnecessary detail, large objects, multiple textures, and lights. Mitra advised designing simply from the beginning, though simplicity works against realism in a virtual human. ParaGraph resolves the conflict by using sprites as avatars, so the figures' detail matches the detail of their surroundings.
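
Splitting a world into linked areas is something VRML 2.0 supports directly. As a minimal sketch, assuming the world has already been cut into separate .wrl files (the filenames here are invented), the Inline node fetches each region as its own download, and an LOD node swaps a detailed avatar for a cheap stand-in at a distance, so polygons are spent only where they can be seen:

#VRML V2.0 utf8
# A large world broken into separately downloaded regions.
Inline { url "plaza.wrl" }       # each region lives in its own file
Inline { url "arcade.wrl" }

# Spend polygons only up close: beyond 20 meters, show a flat
# textured stand-in instead of the full avatar model.
LOD {
  range [ 20 ]
  level [
    Inline { url "avatar_full.wrl" }      # near: detailed 3D model
    Inline { url "avatar_sprite.wrl" }    # far: cheap billboard
  ]
}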

Dr. Jon Waldern of the Virtuality Group had differing views on what makes a good avatar. Research led him to concentrate on the face as the most important part of the body for communication, so he developed worms with faces. The bodies wriggle with a peristaltic movement, while the face can appear happy, sad, surprised, or angry. Lip sync is still being worked on, but his theory of simplified expressions has led to flat faces with eyes, mouths, and a faint color blush to indicate emotion. Such an avatar takes only 40 polygons to draw, so many true 3D avatars can share space.
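
To put that 40-polygon budget in perspective, here is a toy VRML 2.0 fragment, not Virtuality's actual format, showing how little geometry a flat face needs. The emotional "blush" is nothing more than a shift in the material's diffuse color:

#VRML V2.0 utf8
# A flat avatar face: a single quad, tinted by its material.
# Raising the red component of diffuseColor produces the blush.
Shape {
  appearance Appearance {
    material Material { diffuseColor 0.9 0.75 0.65 }
  }
  geometry IndexedFaceSet {
    coord Coordinate {
      point [ -0.5 -0.6 0, 0.5 -0.6 0, 0.5 0.6 0, -0.5 0.6 0 ]
    }
    coordIndex [ 0 1 2 3 -1 ]
  }
}
# Eyes and a mouth are just a few more small faces placed in front,
# keeping the whole head well under the 40-polygon budget.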

There is a tremendous interest in and market for multiuser social interaction on the Web. The online services have demonstrated the demand for even 2D chat worlds. Virtual humans are needed to act as seed people, hosts, guides, and instructors in 3D virtual environments. Mitra mentioned a few of the companies with demonstration or commercial software to enable and control multiuser spaces.

Products Showcased

Black Sun's CyberHub is multiuser-environment server software. The price depends on the number of simultaneous users: from $1995 for ten users to $59,995 for 1000. All 1000 users can be handled by a Pentium 90 with 64MB of RAM. By the time you read this, audio capability will have been added to the basic text site. You can visit worlds that use CyberHub from the Black Sun home world at www2.blacksun.com/pointworld.

Worlds Inc.'s Alphaworld has been around for over a year and is constantly growing under the influence of both its users and its developers. Its seamless viewer downloads Alphaworld dynamically, so you can view and interact with the world as it streams in. The world uses background sound, although its inhabitants still have to communicate by typing. The multiuser server software is commercially available for setting up Zones of Alphaworld. Worlds is now working on a development toolkit, code-named "Gamma," for creating multiuser environments. Access to Alphaworld and Worlds Chat, the company's other 3D space, is from www.worlds.com.

OnLive's Utopia is commercially available as the first 3D multiuser environment with live sound. You speak into a microphone on your computer, and your avatar, a disembodied head or a weird object, speaks with good lip sync and facial expressions that you can control. You need OnLive's Traveller browser to enter Utopia. Visit Utopia at www.onlive.com.

ParaGraph's Castle environment still uses text to communicate between avatars, but you can listen to an orchestra play while you socialize, and draw interactively on a shared whiteboard. (At the time of this writing, the Castle was inaccessible from ParaGraph's Web site.) To fully participate in ParaGraph's environments, you must use the company's browser, which reads its internal D96 file format. Try knocking at www.paragraph.com.

Mitra predicts that production of avatars or virtual humans to inhabit these and future worlds will be big business as the WWW expands and 3D is more widely used. (See the fuller list of multiuser worlds in "URL Resources.")

Future Trends

At the end of the conference, the speakers were asked what we can expect next from virtual humans and the World Wide Web. Both Waldern and Sandra Kay Helsel, US editor of VR NEWS, predicted the wide adoption of VR peripherals as they become affordable. Waldern believes dynamic VRML 2.0 will bring a new era of home entertainment to the Internet. AI experts working on natural-language processing and expression recognition predict that virtual humans will understand speech and use it within the context of a conversation with a real human. Cameras attached to PCs should become widespread, enabling humans to interact with multimodal entities and with other humans, whether for teleconferencing or, through an avatar, using telepresence.

According to Mitra, acceptance of the VRML 2.0 standard will stabilize the 3D industry. VRML 3.0 is often touted as the extension of VRML 2.0 that will deal with multiuser interaction, but Mitra believes that VRML 2.0's extensibility should eliminate the need for version 3.0. Professor Paul Rosenbloom of the University of Southern California predicts that as AI improves, virtual humans with their own goals will surprise real humans. Virtual humans will gradually become visually complete people capable of emotional expression and fully expressive behavior. Jacobson is certain that unencumbered body tracking, with no cables or sensors, will become possible, so real humans can easily become virtual. It will be possible to have a full emotional relationship with a virtual human, with all its attendant problems and controversy. And, as a member of the audience suggested, virtual humans will be able to become sexual beings.

I predict we will see all these things on the desktop or in the home, but not until the next century. The transition from SGI RealityEngine to real time on a PC will take a lot of hardware catch-up. The Internet and your connections to it will also need time to increase bandwidth to cope with the graphical demands of high-quality avatar transmission and constant state updates for 3D worlds and games. The information pipeline is growing fatter, but the AI world has some growing to do, too. The driving forces of AI research are the military and games companies. Now that games machines are beginning to have enough spare computing capacity to handle a little thought, as well as the high-speed graphics that users demand, the market is starting to see games with brains. The military needs AI to drive instructors, guides, and participants in its virtual battles. So it's a safe bet that AI will start leaping forward.

Sue is a game-interface designer and coauthor of the game-theory book EZ-GO Oriental Strategy In A Nutshell. Her background is in psychology and computing, but she concentrates on computer journalism and 3D design for the Web. Sue is VRML pro at inquiry.com.

URL Resources

Multiuser Worlds featuring avatars

ParaGraph's Castle

www.paragraph.com

Oz Interactive's Spacestation

www.oz-inc.com

Dimension X

www.dimensionx.com

Intel

its.intel-inside.com

Black Sun and Lycos' PointWorld

www2.blacksun.com/pointworld

Worlds Inc's Alphaworld and Worlds Chat

www.worlds.net/

The following is a list of Web sites for presenters or products shown at the Virtual Humans Conference.

Virtual Jack

www.cis.upenn.edu:80/~hms/jack.html

Virtuality Group plc

www.Virtuality.com/newindex.htm

Boston Dynamics Inc

www.bdi.com

Profs. Thalmann and Marilyn

ligwww.epfl.ch/~thalmann/research.html

SGI's Silicon Studio

www.studio.sgi.com/Features/PerfAnimation

Linda Jacobson's performance animation group D'Cuckoo

www.dcuckoo.com

ParaGraph

www.paragraph.com

Alias/Wavefront

www.alias.com

SoftImage

www.microsoft.com/Softimage/

Talk with Julia

fuzine.mt.cs.cmu.edu/mlm/julia.html

Simulating Human Motion

www.cc.gatech.edu/gvu/animation/Areas/humanMotion/humanMotion.html

