Fundamental Challenges in Modeling, Representation and Synthesis of Gestures

Progress in gesture recognition algorithms, driven mainly by deep learning techniques, is
indisputable. We are witnessing a drastic increase in the accuracy, reliability and speed of
such algorithms, strongly supported by a growing market that is adopting such systems
eagerly and insatiably. Yet the quest for immediate, out-of-the-box solutions has left us
with a poor understanding of the components, factors, embeddings and semantics that make
gestures more or less similar to learned patterns. That is, we do not yet know how to
categorize gestures based on their meaning and their phonological and morphological
attributes. It is becoming apparent that a plain visual representation is not enough to
capture such attributes. This explains why the breakthroughs in gesture recognition are not
yet reflected in problems such as spontaneous gesture recognition (zero-shot learning) or
semantic gesture recognition (inferring the meaning of gestures).

More specifically, we are focusing on challenges associated with the following aspects of gestures:

  1. semantics (the meaning of gestures),
  2. morphology (structure of gestures),
  3. phonology (sonification of gestures),
  4. singularity (zero-shot and one-shot learning),
  5. cognition (gestures as a proxy of human learning),
  6. motor (how the gestures are physically produced),
  7. physiological (how physical effort plays a role in the gesture generation and use), and
  8. affective (are there the equivalent to action units to express emotive themes during gesture

These eight points are emerging topics in the field of gesture interaction. Research papers on them appear sporadically at linguistics, psychology, human factors and computer vision conferences, but no single forum unifies these challenges and questions into one coherent framework. As the challenge of recognition becomes more “tractable” (papers report recognition accuracies of 98 to 99%), the challenges concerning the structure, representation and cognitive mechanisms of gesture generation remain open.

We expect to bring together a diverse community of linguists, psychologists, computer scientists, engineers and roboticists to address these questions and propose solutions to these
challenges. With this session, we expect to gain new understanding of how humans perceive, process and generate gestures. Knowledge gained through the session can shed light on how infants learn to gesture, how to identify spontaneous and/or
uncontrolled gesturing, and how to gain insight into cognitive processes (e.g., learning) through gesturing.


Juan P Wachs, Purdue University
Richard Voyles, Purdue University


  • Full Paper Submission: January 6, 2019 – midnight PST
  • Camera Ready Deadline for Accepted papers: March 13, 2019


The IEEE FG Proceedings will include the accepted papers from the special session that are presented at the conference.

Papers should follow the format of the main conference. Both long and short papers can be submitted.

  • Short paper: 4 pages + 1 page for references
  • Long paper: 8 pages (including references)

Please refer to the main conference website for the specific details about formatting.


Reviews are double-blind.

ANONYMIZATION POLICY: Authors should remove author and institutional identities from the title and header areas of the paper. There should also be no acknowledgments. Authors can leave citations to their previous work unanonymized so that reviewers can ensure that all previous research has been taken into account. However, they should
cite their own work in the third person (e.g., “[22] found that…”).