Towards an Interactive Tool for Music and Dance: Gestures, Laban Movement Analysis and Spectromorphology

sfreire@musica.ufmg.br

Abstract: The paper presents a real-time tool for the segmentation and analysis of body gestures, part of a larger setup for exploring music and dance in the contexts of electroacoustic composition, live-electronics and other interactive performances. The idea of gesture is the foundation of the proposed interactive strategies, and is discussed from different points of view. The current implementation uses the Max/MSP programming language and Kinect sensors. The segmentation of dance gestures is based on the inspection of the zero-crossings of the acceleration curve of each body joint. Concepts from Laban Movement Analysis are used to qualify the extracted gestures. Dance improvisations on Petrushka excerpts are the basis of a case study, where the relations between the music (tempo, pulses, instrumentation, character) and Laban Basic Actions are stressed.


Introduction
Recent technologies have driven a growing body of research and activities dealing with music and movement, a relationship deeply rooted in every human culture. Our research is located at the intersection of three areas: the study of the sound gesture, the study of gesture in the context of new sound interfaces, and the study of movement qualities in dance. We have developed strategies and tools for interactive musical situations that explore dance and music within the context of electroacoustic composition, live-electronics and other interactive performances.
In this paper, the focus is on the analysis of gesture in dance, where digital techniques of motion capture, associated with elements from Laban Movement Analysis, are explored with the aim of developing methods for the real-time segmentation and description of dance gestures. First, we introduce the concept of gesture and how it has been used in areas such as music, dance and HCI (human-computer interaction). The basis and implementation of the algorithms used for the analysis of dance gestures come next. A case study follows: a dancer's improvisation on excerpts from the ballet Petrushka, by Igor Stravinsky, in which we analyze the correlations between musical pulses and characters, body rhythms and movement qualities. Finally, we present the next steps towards completing the planned interactive tool.

Gesture
The term gesture is widely used in academic studies, playing a significant role in arguments presented in quite different areas. Thus, gesture presents itself as a fertile field of study, although the concept is often anchored in vague and loose definitions. In this section, we present the principal concepts and ideas that provide the basis for our strategies for interaction between music and dance. A good starting point is given by Jensenius et al. (2010, p. 13): "When speaking about the musical activity of musicians and dancers, it is tempting to call the involved embodiment "gestures" rather than "movements". The notion of gesture somehow blurs the distinction between movement and meaning." These authors define a general framework based on three different viewpoints: communication, control and metaphor.
From the perspective of communication, gesture would be any bodily action associated with speech, usually hand and facial movements. This fact reinforces the intermodal characteristic of the gesture, since the conceptual system of language is directly integrated with the crossing of different modalities such as the motor system and the senses of the human body (ibid, p. 14-15). The authors' viewpoint of control covers mainly human-computer interaction (HCI), but also includes expressive and manipulative gestures. Gesture then denotes any observable bodily movement, with expressive and extractable potential (ibid, p. 17). This definition also resembles the definitions and uses of the term in the field of new instruments and new interactive musical interfaces. Camurri et al. (2004, p. 359) present the concept of expressive gesture, applicable to movement in any artistic modality that is capable of expressive content. Regarding the third axis, the authors affirm that "(m)etaphor is involved when gestures work as concepts that project physical movement, sound, or other types of perception to cultural topics." (Jensenius et al., 2010, p. 14).
In conclusion, Jensenius et al. (2010, p. 19) propose a definition for gesture within the musical context: "Based on the above viewpoints, it seems straightforward to define musical gesture as an action pattern that produces music, is encoded in music, or is made in response to music." This definition also contributes to a paradigmatic distance from Cartesianism, since it allows conceptual integration of the physical, material and corporeal realms with the mental realm. Although very useful as a "terminological apparatus" (ibidem, p. 19), this definition is not free of criticism from authors searching for a common intermodal basis for the analysis of music and dance. Schacher (2010, p. 2) points out that it "ignores the semantic components that gestures inevitably carry." And, according to Naveda:
These concepts contribute to the panorama of musically related gestures by reuniting several dimensions of musical experience at a structural level of analysis, which allows for better targeting and control of musical devices, analysis and performance of music. However, they contain an inevitable hermeticism or over-specification of the idea of musical gestures as if they were exclusively subordinated to musical functions, which has closed the concept to mutual influences between music and dance. (NAVEDA, 2011, p. 13).
From another perspective, Hatten (2004, p. 93) develops a significant conceptualization of the term gesture. The author focuses on the relationship between gesture and classical compositional styles, but his work can contribute to the understanding of essential aspects and meanings carried by the term. He understands gesture as a significant shaping of energy over time. Thus gestures can be manifested in different types of energy and consequently in multiple sensorial and artistic modalities, such as body movement in dance or sounds in music. The author also presents two aspects of gesture relevant to our work: (1) intermodality and (2) perceptual integration and continuity. Intermodality is the capacity for analogous representation across all the senses and also in the motor system. The ability of each sense to promote a coherent representation of the flow of events in the world and to share it with the other senses is the basis of perceptual integration (HATTEN, 2004, p. 100-101).
Summarizing the concepts presented above, we arrived at a definition of gesture consistent with our objectives: a movement or an energetic shift through time perceived as a whole, a configuration / gestalt that carries expressive contents.

Sound Gesture
Based on the definition just presented, the sound gesture can be understood as a significant modulation of sound energy over time that carries expressive content. Many authors have sought to delineate this process, generally with the focus on musical analysis, but we can also observe in some works the relevance of gesture in the creative and compositional process. In her doctoral thesis, Bachratá (2010) discusses the musical gesture and the gestural interaction between instrumental and electroacoustic sounds. She presents leading musicologists and composers who explore or relate to the concept of sound gesture, such as Pierre Schaeffer, Iannis Xenakis, Michel Chion, David Lidov, Robert Hatten, Trevor Wishart, Denis Smalley and Brian Ferneyhough. In our work we approach the sound gesture with the aid of the authors who are most closely related to the concept of gesture we have adopted and, consequently, to the interaction proposals under development. Since sound gesture can be considered a modulation of the qualities of a sound object / configuration, Pierre Schaeffer's work can serve as a basis for understanding the spectromorphological qualities of a sound, and how they relate to our perception. Schaeffer developed a typo-morphology (and its correlations with hearing) based on the notion of sound object. However, we can apply and extend this detailed morphology to the interpretation of sound gestures, since they are dynamic morphologies perceived as a whole, as a Gestalt, a sound configuration.
The important thing about gesture or dynamic morphology in general, is that it is essentially a time-varying property of a whole sonic object and cannot be atomised in the same way that pitch-lattice components can be separated through their discrete notation. (WISHART, 1996, p. 112)
Schaeffer developed a morphology for sound objects based on seven criteria related to the properties and characteristics of perceived sound:
a. mass: "mode of occupation of the pitch-field by the sound", which may be tonic (when one clearly perceives a defined pitch in the tessitura), complex or variable (CHION, 1995, p. 159);
b. harmonic timbre: "the additional qualities which seem to be associated with mass and enable it to be described" (ibid., p. 168);
c. grain: "microstructure of the matter of the sound, which is more or less fine or coarse and which evokes by analogy the tactile texture of a cloth or a mineral, or the visible grain in a photograph or a surface." (ibid., p. 171);
d. allure: "oscillation, characteristic 'vibrato' of the sustainment of sound" (ibid., p. 159);
e. dynamics: "development of sound in the intensity-field" (ibid., p. 159);
f. melodic profile: "a variation which affects the whole mass of the sound, making it describe a sort of 'trajectory' in the tessitura" (ibid., p. 183);
g. mass profile: "internal variation of the sound mass which is, as it were, 'sculpted' in the course of its development" (ibid., p. 185-186).
Starting from a tripartite structure (onset, continuant and termination), Smalley (1997, p. 113) proposes three archetypes that can be applied to interpret sound gestures, since they are also based on spectromorphological dynamics. The models proposed by Smalley are:
h. attack: an energetic impulse, in which the energy and the focus of listening remain on the attack of the sound. Examples: staccato (no resonance), dry percussive attacks.
i. attack-decay: the initial and final phases are present. Example: pizzicato.
j. graduated continuant: the three phases of sound are present: attack, followed by a sustain phase and a termination phase, such as a fade out. Example: notes of a flute.
These concepts, devoted to the qualification and interpretation of sound gestures, are essential for our idea of an intermodal interaction.

Dance Gesture
Gesture in dance can be understood as the energetic trajectory of body movements over time, perceived as a configuration / gestalt with expressive potential. According to Laban (1966, p. 48), body movements can carry some kind of expressive content.
The will or the decision to move is born from the depth of our being. We not only change the positions of our bodies and change the environment through our activity, but we bring an additional color to our movements of our psyche. We speak of feelings, or thoughts that precede or accompany movements. (LABAN, 1966, p. 48).
Rudolf Laban (1879-1958) developed, in the middle of the 20th century, a theory that sought to clarify the expressive aspects of movement, the Laban Movement Analysis. The LMA, as it is known internationally, is a methodology for observing, describing, interpreting and recording human movement. The LMA has been employed in various areas such as dance, performing arts, sports, physiotherapy, psychology and behavioral sciences. Currently, this methodology consists of four main categories: Body, Effort, Shape and Space. For Fernandes (2006, p. 320), LMA's vital force is the connection of these four categories in an energetic flow. Laban initially divided the study of motion into three areas: Choreutics, Eukinetics and a motion notation system, known as Labanotation. Choreutics is understood as the study of motion in space, and Eukinetics is the study of the expressive qualities of motion. Within Eukinetics, Laban carried out his research on Effort. Fernandes (2001, p. 11) notes that a more adequate translation for this category would be Expressivity, since the word used in German by Laban was Antrieb, which means stimulation, propulsion, impulse, impetus for movement.
According to the Body category, the body can be separated into parts, and each moving part is amenable to an Effort analysis. Laban proposed four factors of Effort: a) flow: related to the degree of movement control; b) space: related to the trajectory of movement in space; c) weight: related to the resistance of the movement to gravity; d) time: related to the duration of the movement. Each factor can oscillate between two extremes. The flow factor ranges between contained and free, the space factor between direct and indirect, the weight factor between strong and light, and the time factor between sustained and sudden. Hence there are two polarities: (1) Indulging, with free Flow, indirect Space, light Weight and sustained Time; (2) Condensing, with contained Flow, direct Space, strong Weight and sudden Time.
The Effort factors are present in all movements, but in some movements some factors are emphasized and others remain latent. Laban (1978, p. 127-131) classifies the possible combinations of the Effort factors:
a. incomplete actions or states of motion, when two factors stand out:
- rhythmic state: combinations of weight and time;
- oneiric state: combinations of weight and flow;
- stable state: combinations of weight and space;
- remote state: combinations of flow and space.
b. motion impetus, when three factors are in evidence:
- impetus of vision: space, flow and time;
- spell impetus: weight, space and flow;
- passion impetus: flow, weight and time;
- impetus of action: weight, space and time.
Within the Impetus of Action, Laban (1978, p. 117-120) classifies eight basic Actions (Table 1). The recognition of the Effort Actions during a dance performance is the main goal of the tool described in the next section.

Multimodal interaction
The advance of digital technology has allowed the appearance of multimodal contexts where sound/music and body gestures may interact in real time, by means of micro-integration. According to Leman (2008, p. 140), "the main novelty of modern digital technology is concerned with the encoding, exchange, and integration of energy, using different levels of description."
In dance, movement has a wide variability and a high level of specialization. Dancers acquire high levels of awareness and sensitivity about body movements and their relation to space. The extraction and interpretation of motion capture data, focused on the expressive potential of these movements, is a great challenge. Another issue is related to the real-time integration of this data into musical compositions and other artistic interactions, such as video processing. In the context of HCI, some researchers (CARAMIAUX, 2012; HASHIM et al., 2009; SCHIPHORST, 2009) have explored the expressive potential of gestures, focusing on movement qualities and, in some cases, applying Laban Movement Analysis (LMA) or similar. Camurri (2000, 2001, 2004), Schacher (2010) and Maranan et al. (2014) have explored the expressive qualities of dance gestures in interactive situations.
Caramiaux (2012, p. 1) developed a model of movement recognition based on some qualities explored in dance: breathing, expansion and reduction. These qualities were defined in collaboration with the Emio Greco dance company, as part of a project to use digital media in dance documentation. They denote inner intentions and may be linked to the other qualities of a movement. Caramiaux captured the movements with a Microsoft Kinect sensor, and from the position data of each joint, the velocity (derived from the position) and the acceleration (derived from velocity) were calculated. Later, he developed an installation that related the qualities of the visitors' hand movements to the modulation of some LED lamps fixed on a wall.
Maranan et al. (2014, p. 991) developed a study using LMA and machine learning algorithms for the recognition and classification of movements. An accelerometer fixed on the dancers' dominant wrist was used to capture the movements. A segmentation method using multiple time windows was employed, and the data was parameterized by six movement descriptors. For the training phase, an LMA-expert dancer was invited to perform the eight Basic Actions; later, in the performance phase, the dancer performed the Basic Actions once more, observed by another LMA expert, for comparison purposes. The authors applied this system to a dance performance, where the recognized basic actions interacted with sound, light and visual effects. The group also developed a system to generate abstract images related to the Basic Actions.
Schacher (2010, p. 250) developed a work on interactive dance (dance and music) in collaboration with a composer and a dancer. He developed a multi-layer mapping methodology, which employs two or more types of motion sensors. In this work, (1) cameras and (2) accelerometers coupled to gyroscopes fixed to the dancer's wrists were used. The video images were processed by computer vision algorithms, which reduced the data to some descriptors and motion qualifiers, such as speed, orientation in space, centroid acceleration, and lateral/vertical expansion (or contraction). Descriptors involving acceleration and rotation were extracted from the movements of the wrists. The author also points out important aspects of a musical piece conceived to be played by a dancer:
The composer has to be aware of the non-linearity of a performer driven musical form. The sonic material is organized in ways, which make recombination in almost all permutations possible. Expressive low-level musical synthesis parameters such as gain, filter frequency or lfo-rates are exposed to be driven by the dancers gestures. Discrete triggering events shape the overall form of the piece; the decisions about selection and removal of sonic materials lie mostly in the performers "hands". (SCHACHER, 2010, p. 251)

A Tool for Segmenting and Describing Body Movements
Our efforts towards a multimodal interactive tool based on micro-integration began with the analysis of body gestures. Inspired by the LMA Effort category, we have developed and implemented in Max/MSP/Jitter a real-time tool to qualify the movements of dancers, captured by a Kinect sensor. We did not adopt a segmentation procedure based on temporal windows, whose lengths may be adjusted manually or automatically (RAN, B et al., 2015, p. 39; MARANAN et al., 2014, p. 3); instead, we opted for a segmentation based on the observation of the zero-crossings of the acceleration curves of each joint. This approach was successfully employed in other works of motion analysis (ZHAO, 2005, p. 84; BINDIGANAVALE, 2000, p. 14), and proved to be quite adequate for our purposes. As can be seen in Figure 1, a simple one-dimensional displacement of an object, from the origin to a new position one meter away, has a positive speed curve with one peak (and the value zero as starting and end points), and an acceleration curve with a positive peak and a negative valley.
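The shape of these curves can be checked numerically. The sketch below is only an illustration: it assumes a minimum-jerk displacement profile (our assumption, since the text does not state the shape of the hypothetical movement), covering one meter in two seconds and sampled at 30 frames per second. The function names are hypothetical.

```python
def min_jerk(t, duration=2.0, distance=1.0):
    """Minimum-jerk displacement profile (an assumed shape, not taken from the paper)."""
    s = t / duration
    return distance * (10 * s**3 - 15 * s**4 + 6 * s**5)


def finite_diff(values, dt):
    """First-order finite difference of a uniformly sampled curve."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


FPS = 30
DT = 1.0 / FPS
times = [i * DT for i in range(2 * FPS + 1)]

position = [min_jerk(t) for t in times]   # (a) displacement, in meters
speed = finite_diff(position, DT)         # (b) one positive peak
accel = finite_diff(speed, DT)            # (c) positive peak, then negative valley

# Count the positive-to-non-positive transitions of the acceleration curve.
zero_crossings = sum(1 for a, b in zip(accel, accel[1:]) if a > 0 and b <= 0)
```

Under these assumptions the acceleration changes sign once, at the speed peak in the middle of the movement, and returns to zero at the end, which is the property the segmentation relies on.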

Realtime implementation
We extract the 3D position data generated by the Kinect for each of the joints of the body. This data is formatted into OSC messages by the Synapse software, and sent to the Max/MSP/Jitter programming environment. Every joint is assigned a label, which may be accessed as a variable within the implemented patches and subpatches. The reference axes for the torso data are given by the position of the Kinect (called the world reference); the remaining joints use the torso coordinates as reference (called the body reference). From these data, we estimate the scalar velocity of each joint by calculating the Euclidean distance between two consecutive points (the first derivative of position). The acceleration curve is estimated by the second derivative. The capture rate is 30 frames per second. In order to smooth out spatial and temporal irregularities of the device, we apply moving average filtering to this data: we use a 2-point filter for the displacement curves, and a 7-point filter for the velocity and acceleration curves. The beginning of each gesture is detected when the acceleration curve exceeds a positive limit (which is adjustable for each joint), and the end of the gesture is detected when the curve returns to zero after having reached a negative limit (Figure 2). The speed and acceleration values are calculated with regard to the sample rate, rather than in the usual units of time and length. For each segmented gesture we estimate the following descriptors:
a. dur: duration of the gesture in ms, or the time elapsed between its beginning and end;
b. the total displacement in mm made by the joint, which is represented by the integration of the Euclidean distances between every pair of adjacent points;
c. the displacement modulus, which is the length of the line segment between the start and end points;
d. the mean speed, calculated as b/a;
e. l_ratio: the ratio between b and c;
f. |accel|: the mean value of the absolute values of every point in the acceleration curve;
g. a_ratio: the absolute value of the ratio between the positive and negative acceleration mean values;
h. direction of the gesture on the x-axis: left or right. It is derived from the difference between the start and end x-values;
i. direction of the gesture on the y-axis: up or down. Calculated as in h;
j. direction of the gesture on the z-axis: front or back.
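The smoothing, differentiation and threshold-based segmentation described above can be sketched in Python as follows. This is not the Max patch itself: the function names are hypothetical, the moving average is assumed to be causal, and the negative limit is assumed to mirror the positive one, which the text does not state.

```python
import math


def moving_average(values, n):
    """Causal n-point moving average (shorter window for the first samples)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def speed_curve(points):
    """Per-frame scalar speed: Euclidean distance between consecutive 3D points."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]


def acceleration_curve(speed):
    """Per-frame acceleration: difference between consecutive speed values."""
    return [b - a for a, b in zip(speed, speed[1:])]


def segment(accel, limit):
    """Return (start, end) frame indices of the gestures found in one curve.

    A gesture starts when the curve exceeds +limit and ends when it comes back
    to zero (or above) after having gone below -limit (ASSUMED symmetric limit).
    """
    gestures = []
    start = None
    reached_negative = False
    for i, a in enumerate(accel):
        if start is None:
            if a > limit:
                start, reached_negative = i, False
        elif a < -limit:
            reached_negative = True
        elif reached_negative and a >= 0:
            gestures.append((start, i))
            start = None
    return gestures
```

In a pipeline following the text, `moving_average` would be applied with `n=2` to each position coordinate and with `n=7` to the speed and acceleration curves before calling `segment`. A gesture that is still open when the data ends is not reported.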
Note that descriptors h and j are not related to the dancer's perspective, but to the Kinect reference axes. Besides, we also extracted the curve of the torso's floor speed, and other qualifiers based on (NAVEDA, 2014, p. 474; CARAMIAUX et al., 2012, p. 764): a contraction/expansion index and the distance between different joints, which are not discussed in this paper. Figure 3 depicts the flowchart of the implemented segmentation procedure.
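As an illustration, descriptors a to j could be computed for one segmented gesture as in the sketch below. The function and key names are hypothetical, and the sign conventions for the three directions (positive x to the right, positive y upwards, positive z away from the sensor) are assumptions about the Kinect axes rather than facts taken from the implementation.

```python
import math


def mean(values):
    """Arithmetic mean, or None for an empty list."""
    return sum(values) / len(values) if values else None


def gesture_descriptors(points, accel, frame_ms=1000.0 / 30):
    """points: 3D positions (mm) from gesture start to end, one per frame.
    accel: acceleration samples inside the gesture."""
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    dur = len(steps) * frame_ms                   # a. duration in ms
    total = sum(steps)                            # b. total displacement
    modulus = math.dist(points[0], points[-1])    # c. displacement modulus
    pos = mean([a for a in accel if a > 0])
    neg = mean([a for a in accel if a < 0])
    dx, dy, dz = (e - s for s, e in zip(points[0], points[-1]))
    return {
        "dur": dur,
        "displacement": total,
        "modulus": modulus,
        "mean_speed": total / dur if dur else 0.0,               # d. b/a
        "l_ratio": total / modulus if modulus else math.inf,     # e. b/c
        "accel_abs": mean([abs(a) for a in accel]),              # f. |accel|
        "a_ratio": abs(pos / neg) if pos is not None and neg is not None else None,  # g.
        "dir_x": "right" if dx > 0 else "left",   # h. ASSUMED sign convention
        "dir_y": "up" if dy > 0 else "down",      # i. ASSUMED sign convention
        "dir_z": "back" if dz > 0 else "front",   # j. ASSUMED sign convention
    }
```

With this definition l_ratio is 1 for a perfectly straight trajectory and grows as the path becomes more winding.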

Estimating Laban Effort Factors
The combination of these descriptors may help to classify the gestures according to the LMA Effort factors. We have implemented algorithms to estimate three of them: space, weight and time. We are still working on a good strategy to cope with the flow factor, for it may be applied either to one or to a possibly large set of gestures. As the effort factors are dynamic qualities, there may be movements in which one or more factors do not have an expressive emphasis, and thus remain latent. Laban (1978, p. 127) considers such movements as incomplete efforts. Based on this conception, we also found it useful to define a neutral region between the two poles of each factor.
Factor Space: this factor is directly related to the descriptor l_ratio. The more it tends to the value 1, the more the gesture is considered direct. Empirical threshold values for the low and high limits of the neutral zone must be defined after much observation, since there is a lot of variation among the different body joints (and also among different individuals).
Condition                              Result
l_ratio < low limit                    direct
low limit < l_ratio < high limit       neutral
l_ratio > high limit                   indirect
Table 2: Conditions for the estimation of the Space Effort factor.
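The conditions of Table 2 translate directly into a small function. The sketch below is in Python rather than Max, the function name is hypothetical, and the limits passed in the usage example are placeholder values, not the empirical thresholds used in the study.

```python
def space_factor(l_ratio, low_limit, high_limit):
    """Classify the Space Effort factor of one gesture from its l_ratio."""
    if l_ratio < low_limit:
        return "direct"
    if l_ratio > high_limit:
        return "indirect"
    return "neutral"
```

For example, with placeholder limits of 1.1 and 1.5, `space_factor(1.02, 1.1, 1.5)` gives "direct" and `space_factor(1.3, 1.1, 1.5)` gives "neutral". In practice each joint would need its own pair of limits.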
Factor Weight: the estimation of this factor poses insurmountable difficulties for a method based on kinematics, such as the estimation of the static forces present in every gesture. Even so, we propose a tentative procedure for measuring this factor with the chosen motion capture device, based on arguable assumptions: (1) a fast, wide and strong gesture tends to oppose gravity; (2) a strong gesture should spend more kinetic energy (related to higher acceleration and force values) than a lighter one. So far, this factor is calculated by means of a logical combination of three descriptors, organized in two conditional queries. The strength factors for each joint were also defined heuristically.
Condition 1: (a_ratio > 1 and direction up) or (a_ratio < 1 and direction down).
Condition 2: |accel| > strength factor.
Factor Time: we use two descriptors to estimate this gesture quality: the duration of the gesture and the mean value of absolute acceleration. Sudden gestures tend to be not only short but also to spend more energy than sustained ones. Once more, useful values for the duration threshold and for the strength factor must be defined after observation and analysis.
The LMA defines eight basic effort actions, as depicted in Table 1, which are combinations of the three effort factors just described, excepting the flow (for isolated gestures, the flow factor is strongly correlated with the weight factor).
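A possible reading of the Weight and Time estimations is sketched below. The two conditions for Weight follow the text, but the way they are combined into strong, neutral and light, and the exact rule for Time, are assumptions made only for illustration, since the text does not spell them out. All names and thresholds are hypothetical.

```python
def weight_factor(a_ratio, direction_y, accel_abs, strength):
    """Tentative Weight estimation from a_ratio, vertical direction and |accel|."""
    cond1 = (a_ratio > 1 and direction_y == "up") or (a_ratio < 1 and direction_y == "down")
    cond2 = accel_abs > strength
    # ASSUMPTION: both conditions -> strong, neither -> light, otherwise neutral.
    if cond1 and cond2:
        return "strong"
    if not cond1 and not cond2:
        return "light"
    return "neutral"


def time_factor(dur_ms, accel_abs, dur_threshold, strength):
    """Tentative Time estimation from the gesture duration and |accel|."""
    short = dur_ms < dur_threshold
    energetic = accel_abs > strength
    # ASSUMPTION: short and energetic -> sudden, long and calm -> sustained.
    if short and energetic:
        return "sudden"
    if not short and not energetic:
        return "sustained"
    return "neutral"
```

Keeping the thresholds as arguments reflects the remark that the duration threshold and the strength factors have to be tuned per joint after observation.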

Case Study: Improvising on Excerpts from the Ballet Petrushka
In order to get some insight into the relationships between the pulse and character of the music and the rhythms and qualities of the gestures of a dancer's body, we have chosen some short excerpts from Stravinsky's Petrushka (none of them exceeding 15 s) which present a clear pulse and also rhythmic and orchestral diversity, as depicted in Table 5. A female dancer, an undergraduate student of our university, was asked to improvise freely on each of the excerpts, not long after getting acquainted with them. She was aware of the limitations imposed by the Kinect, such as distance, rotation and plane constraints. As mentioned above, we used the Synapse software to capture the 3D position data for 15 joints: head, neck, torso, left and right shoulders, elbows, hands, hips, knees and feet. Each rendition was also recorded synchronously in audio and video. One excerpt (strav4) had its data corrupted and could not be used.

Results and Discussion
We analyzed the data generated by the procedures of gesture segmentation and qualification using three complementary strategies: a general view of the articulation of gestures with regard to the musical pulses; a quantitative summary of the basic effort actions in each of the renditions; a qualitative approach to the relationship between the musical pulses and the basic effort actions. We also added a neutral basic effort action, defined as a gesture presenting the neutral quality in all three factors.
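The step from three estimated factors to a basic effort action can be sketched as a simple lookup. The table below uses the standard LMA assignment of the eight actions to the poles of space, weight and time; the function name and the choice of returning None for partially neutral gestures (treated here as rejected) are assumptions for illustration.

```python
# Standard LMA assignment: (space, weight, time) -> basic effort action.
BASIC_ACTIONS = {
    ("direct", "strong", "sudden"): "punch",
    ("indirect", "strong", "sudden"): "slash",
    ("direct", "strong", "sustained"): "press",
    ("indirect", "strong", "sustained"): "wring",
    ("direct", "light", "sudden"): "dab",
    ("indirect", "light", "sudden"): "flick",
    ("direct", "light", "sustained"): "glide",
    ("indirect", "light", "sustained"): "float",
}


def basic_action(space, weight, time):
    """Return the action name, "neutral" when all factors are neutral, else None."""
    if (space, weight, time) == ("neutral", "neutral", "neutral"):
        return "neutral"
    # ASSUMPTION: gestures with only one or two neutral factors are rejected.
    return BASIC_ACTIONS.get((space, weight, time))
```

For instance, `basic_action("indirect", "light", "sustained")` yields "float", while `basic_action("direct", "neutral", "sudden")` yields None.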
During (or after) the process of segmentation, it is possible to plot the initial moment and the duration of each gesture in relation to the pulse of the musical excerpt. This display helps to get a general view of the dancer's strategy for the improvisation, as depicted in Figure 4. We have chosen to discuss the excerpt strav7, because it has an interesting rhythmic structure, beginning with upbeat and finishing with downbeat chords. The pulses of the excerpt were annotated manually (by pressing the space bar), in an attempt to approximate the dancer's perception of pulse. Initially, we can observe a tendency to start a gesture following the pulses, as if occupying the silent downbeat: lower limbs (feet and knees) on the fourth and sixth pulses, upper limb gestures (elbows and hands) on the fifth and seventh pulses. Then we observe that the gestures tend to start more freely around the pulses and, at the end, when sound and pulses come together, we observe a search for synchrony expressed by several joints.
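The kind of inspection described above can also be approximated numerically, by measuring how far each gesture onset lies from the nearest annotated pulse. The sketch below is not part of the implemented tool; the function names and the tolerance parameter are hypothetical, and all times are assumed to be in seconds.

```python
def nearest_pulse_offsets(onsets, pulses):
    """Signed distance from each onset to its nearest pulse (positive = after it)."""
    return [min((onset - pulse for pulse in pulses), key=abs) for onset in onsets]


def share_near_pulse(onsets, pulses, tolerance):
    """Fraction of onsets lying within +/- tolerance of some pulse."""
    offsets = nearest_pulse_offsets(onsets, pulses)
    return sum(1 for o in offsets if abs(o) <= tolerance) / len(offsets)
```

A predominance of small positive offsets would correspond to the tendency, noted above, to start gestures just after the pulses.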
The extraction of the basic effort actions presented instances of only four actions: punch, glide, slash and float. This means that every strong action is also sudden, and that every light action is also sustained. Any of them may be direct or indirect. Table 6 presents the number of actions of each type detected in each rendition. Note that the dancer performed two different renditions of the sixth excerpt. Despite this highly selective process (rejecting 70-85% of the total), there still remain many simultaneous (or quasi-simultaneous) gestures. On average, the number of neutral actions equals that of extreme actions (punch and float), although there exist significant individual deviations. The music in the excerpt strav8 presents only fast gestures on the piano solo, and was performed by the dancer with a majority of light (float and glide) actions. On the other hand, strav7 and strav9 have a very rhythmic character, and were performed with a majority of strong (punch and slash) gestures. Excerpt strav6a presented the smallest percentage of all effort actions taken together (13.8%), and the second highest percentage of neutral gestures (14.8%). This excerpt has a slow pulse and explores very low and very high sounds. Strav5 has a fast pulse, in which the trumpet and the snare drum are prominent, and was performed with a good amount of indirect and strong actions. Only in strav6b could we observe a steady tendency to start (or end) an action around the pulse, as Figure 5 shows. We did not find any strict correlation between the start or end points of the actions (taken individually or as a whole) and the musical pulses.
Although our experiment was held with just one individual, it was essential for the development and implementation of tools for the extraction and description of body gestures. Next, we must extend this experience to different dancers, and also request the performance of strong/sustained and light/sudden actions for finer adjustments. As expected, it is very difficult to systematize the relations between musical pulses and the rhythms performed by the joints. Nevertheless, charts like the ones in Figures 4 and 5 may offer a general view of the dancer's strategies for the choreography.

Future Work
Future work will be directed to the interaction of dance and music gestures. The association of morphological sound criteria with Effort factors seems to be a good starting point. For instance, we may associate the mass and harmonic timbre of a sound with the weight of a body gesture; melodic and mass profile with space; dynamics and allure with time. The union of time and flow factors may be associated with the sonic archetypes, as well as the higher levels of the LMA Effort category, such as gestural phrase, motive, dynamics and recovery/execution of a gesture (Fernandes, 2006, p. 154).
The development of strategies for controlling the different levels of an artistic composition in interactive and non-linear ways is also fundamental. For this, we will test the layered conceptual framework methodology (Camurri et al., 2001, p. 2; Camurri et al., 2004, p. 5) in prospective studies and in complete interactive music/dance works for stage presentation.
With this work, we hope to make a small contribution to shortening the distance between traditional practices of music (and dance) and ideas like multimodality, micro-integration and collective creation based on shared concepts.

Figure 1: Curves of (a) displacement, (b) speed and (c) acceleration of a hypothetical one-dimensional movement, covering the distance of one meter in two seconds. The curves (b) and (c) were downscaled, for comparison purposes.

Figure 2: Segmentation of the Right Hand Acceleration Curve with a threshold value of 2.

Figure 3: Flowchart of the gesture segmentation procedure implemented in real-time.

Figure 5: Effort Actions and Musical Pulses in excerpt strav6b.

Table 1: The Eight Basic Effort Actions in LMA.

Table 3: Conditions for the estimation of the Weight Effort factor.

Table 4: Conditions for the estimation of the Time Effort factor.