Final merge

Overview

In this notebook, we merge all the data we have prepared so far: the timeseries data for acoustics and motion (see the merging script) and the annotations we worked with in the previous notebook.

We will also add annotations of sounding/silence created in Praat (Boersma and Weenink 2025).

Code to prepare the environment
import os
import glob
import xml.etree.ElementTree as ET
import pandas as pd

curfolder = os.getcwd()

# Here we store the merged timeseries data
mergedfolder = curfolder + '\\..\\03_TS_processing\\TS_merged\\'
mergedfiles = glob.glob(mergedfolder + 'merged*.csv')
mergedfiles = [x for x in mergedfiles if 'anno' not in x]

# Here we store the predicted motion annotations
annofolder = curfolder + '\\..\\04_TS_movementAnnotation\\TS_annotated_logreg\\'
annofolders = glob.glob(annofolder + '*0_6\\')

# Here we store the annotations of vocalizations (from AC)
vocannofolder = curfolder + '\\..\\04_TS_movementAnnotation\\ManualAnno\\R1\\'
vocfiles = glob.glob(vocannofolder + '*ELAN_tiers.eaf')

# Create folder for the txt annotations
if not os.path.exists(curfolder + '\\Annotations_txt'):
    os.makedirs(curfolder + '\\Annotations_txt')

txtannofolder = curfolder + '\\Annotations_txt\\'

Getting vocalization annotations from the ELAN files

We used Praat to annotate sounding/silence in the trials and imported the result into the ELAN file that holds the remaining movement annotations, so that everything is stored in a single file. Now we want to extract the sounding/silence annotations from that ELAN file and merge them with the rest of the data.

(Note that we are working on an automatic speech annotator that would allow us to obtain the sounding/silence annotations without external software, similar to our movement annotation pipeline.)

Custom functions
# Function to parse ELAN annotation
def parse_eaf_file(eaf_file, rel_tiers):
    tree = ET.parse(eaf_file)
    root = tree.getroot()

    time_order = root.find('TIME_ORDER')
    time_slots = {time_slot.attrib['TIME_SLOT_ID']: time_slot.attrib['TIME_VALUE'] for time_slot in time_order}

    annotations = []
    # Only keep annotations from the requested tier (rel_tiers is a single tier name)
    relevant_tiers = {rel_tiers}
    for tier in root.findall('TIER'):
        tier_id = tier.attrib['TIER_ID']
        if tier_id in relevant_tiers:
            for annotation in tier.findall('ANNOTATION/ALIGNABLE_ANNOTATION'):
                # Ensure required attributes are present
                if 'TIME_SLOT_REF1' in annotation.attrib and 'TIME_SLOT_REF2' in annotation.attrib:
                    ts_ref1 = annotation.attrib['TIME_SLOT_REF1']
                    ts_ref2 = annotation.attrib['TIME_SLOT_REF2']
                    # Get annotation ID if it exists, otherwise set to None
                    ann_id = annotation.attrib.get('ANNOTATION_ID', None)
                    # Get the annotation value (guard against empty annotations, where .text is None)
                    value_el = annotation.find('ANNOTATION_VALUE')
                    annotation_value = (value_el.text or '').strip() if value_el is not None else ''
                    annotations.append({
                        'tier_id': tier_id,
                        'annotation_id': ann_id,
                        'start_time': time_slots[ts_ref1],
                        'end_time': time_slots[ts_ref2],
                        'annotation_value': annotation_value
                    })

    return annotations
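For illustration, this is how the parser can be called on a single file (a sketch, assuming at least one .eaf file was found; the printed values are made up). Note that the start and end times are ELAN TIME_VALUEs in milliseconds, returned as strings.

example = parse_eaf_file(vocfiles[0], 'vocalization')
print(example[0])
# e.g. {'tier_id': 'vocalization', 'annotation_id': 'a1',
#       'start_time': '1520', 'end_time': '2340', 'annotation_value': 'silent'}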
# Here we store the vocalization annotations
vocal_anno = txtannofolder + 'vocalization_annotations.txt'

with open(vocal_anno, 'w') as f:
    for file in vocfiles:
        print('working on ' + file)
        # Filename
        filename = file.split('\\')[-1]
        filename = filename.replace('_ELAN_tiers.eaf', '')
        # Parse the ELAN file
        annotations = parse_eaf_file(file, 'vocalization')
        # Save it to the file
        for annotation in annotations:
            f.write(f"{annotation['start_time']}\t{annotation['end_time']}\t{annotation['annotation_value']}\t{filename}\n")
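Each line of vocalization_annotations.txt now holds one annotation event in a tab-separated start/end/value/trial format, for example (times illustrative):

1520	2340	silent	0_2_103_p1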

Preparing movement annotations

Similarly, we want to export our movement annotations to simple txt files. The predicted annotations are stored separately per tier and per trial, so we now merge them into a single txt file per tier.

As already mentioned in the previous script, we need to handle flickering annotations, which arise because the classifier's confidence values vary continuously throughout each trial.

Similarly to Pouw et al. (2021), we apply two rules to handle this flickering:

- Rule 1: If there is a nomovement event between two movement events that is shorter than 200 ms, this is considered as part of the movement event.
- Rule 2: If there is a movement event between two nomovement events that is shorter than 200 ms, this is considered as part of the nomovement event.

Afterwards, we take the very first and the very last movement event and treat everything in between as movement, as illustrated below.
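To see what these steps do, here is a purely illustrative trace of one trial's event sequence (durations in seconds; M = movement, N = no movement):

raw:       N(2.0) M(0.9) N(0.1) M(1.2) N(1.5) M(0.1) N(0.8) M(0.7) N(1.0)
rule 1:    N(2.0) M(2.2)               N(1.5) M(0.1) N(0.8) M(0.7) N(1.0)
rule 2:    N(2.0) M(2.2)               N(2.4)               M(0.7) N(1.0)
bridging:  N(2.0) M(5.3)                                           N(1.0)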

Then we write the final movement annotations to a txt file.
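Each line of the resulting per-tier txt file holds one chunk in the same tab-separated start/end/value/trial format as the vocalization file, for example (times illustrative):

0	1480	nomovement	0_2_103_p1
1480	6220	movement	0_2_103_p1
6220	8300	nomovement	0_2_103_p1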

Custom functions
# Function to get chunks of annotations
def get_chunks(anno_df):
    anno_df['chunk'] = (anno_df['anno_values'] != anno_df['anno_values'].shift()).cumsum()
    anno_df['idx'] = anno_df.index

    # Calculate start and end of each chunk, grouped by anno_values, save also the first and last index
    chunks = anno_df.groupby(['anno_values', 'chunk']).agg(
        time_ms_min=('time_ms', 'first'),
        time_ms_max=('time_ms', 'last'),
        idx_min=('idx', 'first'),
        idx_max=('idx', 'last')
    ).reset_index()

    # Order the chunks
    chunks = chunks.sort_values('idx_min').reset_index(drop=True)

    return chunks
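As a quick sanity check, this is what get_chunks returns for a toy annotation frame with the same time_ms and anno_values columns as the logreg output files (note that the function also adds helper chunk and idx columns to the input frame):

toy = pd.DataFrame({
    'time_ms': [0, 100, 200, 300, 400],
    'anno_values': ['no movement', 'movement', 'movement', 'no movement', 'no movement']
})
print(get_chunks(toy))
#    anno_values  chunk  time_ms_min  time_ms_max  idx_min  idx_max
# 0  no movement      1            0            0        0        0
# 1     movement      2          100          200        1        2
# 2  no movement      3          300          400        3        4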
for folder in annofolders:
    # Get the tierID from the folder name and map it to the short names used in the annotations ('head' and 'arms' stay as they are)
    tier = folder.split('\\')[-2].split('_')[0]
    tier = {'upperBody': 'upper', 'lowerBody': 'lower'}.get(tier, tier)

    # This is the file we want to create
    txtfile = txtannofolder + 'movement_' + tier + '.txt'

    # List all files in the folder
    files = glob.glob(folder + '*.csv')

    for file in files:
        print('processing: ' + file)

        # Filename
        filename = file.split('\\')[-1].split('.')[0]
        filename = filename.split('_')[2:6]
        filename = '_'.join(filename)

        # Now we process the annotations made by the logreg model
        anno_df = pd.read_csv(file)

        # Chunk the df to see unique annotated chunks
        chunks = get_chunks(anno_df)

        # Check for fake pauses (i.e., 'no movement' chunks that last less than 200 ms)
        for i in range(1, len(chunks)-1):
            if chunks.loc[i, 'anno_values'] == 'no movement' and chunks.loc[i-1, 'anno_values'] == 'movement' and chunks.loc[i+1, 'anno_values'] == 'movement':
                if chunks.loc[i, 'time_ms_max'] - chunks.loc[i, 'time_ms_min'] < 200:
                    print('found a chunk of no movement between two movement chunks that is shorter than 200 ms')
                    # Change the chunk into movement
                    anno_df.loc[chunks.loc[i, 'idx_min']:chunks.loc[i, 'idx_max'], 'anno_values'] = 'movement'

        # Calculate new chunks
        chunks = get_chunks(anno_df)

        # Now check for fake movement (i.e., movement chunk that is shorter than 200ms)
        for i in range(1, len(chunks)-1):
            if chunks.loc[i, 'anno_values'] == 'movement' and chunks.loc[i-1, 'anno_values'] == 'no movement' and chunks.loc[i+1, 'anno_values'] == 'no movement':
                if chunks.loc[i, 'time_ms_max'] - chunks.loc[i, 'time_ms_min'] < 200:
                    print('found a chunk of movement between two no movement chunks that is shorter than 200 ms')
                    # change the chunk to no movement in the original df
                    anno_df.loc[chunks.loc[i, 'idx_min']:chunks.loc[i, 'idx_max'], 'anno_values'] = 'no movement'

        
        # Now, similarly to our human annotators, we consider movement anything from the very first movement to the very last movement
        if 'movement' in anno_df['anno_values'].unique():
            # Get the first and last index of movement
            first_idx = anno_df[anno_df['anno_values'] == 'movement'].index[0]
            last_idx = anno_df[anno_df['anno_values'] == 'movement'].index[-1]
            # Change all between to movement
            anno_df.loc[first_idx:last_idx, 'anno_values'] = 'movement'

        # Calculate new chunks
        chunks = get_chunks(anno_df)

        # Rewrite "no movement" in anno_values to "nomovement" (to match the manual annotations)
        chunks['anno_values'] = chunks['anno_values'].apply(
            lambda x: 'nomovement' if x == 'no movement' else x
        )

        # TrialID
        chunks['TrialID']  = str(filename)

        # Write to the text file
        with open(txtfile, 'a') as f:
            for _, row in chunks.iterrows():
                f.write(
                    f"{row['time_ms_min']}\t{row['time_ms_max']}\t{row['anno_values']}\t{row['TrialID']}\n")
                

Final merge

Now we take the merged timeseries with acoustic and movement data and add columns for the vocalization and movement annotations. We will also add a tier for general movement, combining the movement annotations from all tiers to mark when movement of any articulator starts and when it ends.

Finally, we save the merged data to a single csv file per trial.

Custom functions
# Function to load annotations from txt file to timeseries
def anno_to_df(df, anno, anno_col):
    # Each annotation row holds start (col 0), end (col 1), value (col 2) and trial ID (col 3)
    for _, row in anno.iterrows():
        start, end, value = row[0], row[1], str(row[2])
        # Fill the annotation value for all samples within the interval (boundaries included)
        df.loc[(df['time'] >= start) & (df['time'] <= end), anno_col] = value
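The fill is inclusive on both boundaries: every sample whose time falls inside an annotated interval receives that interval's value, and later annotation rows overwrite earlier ones where intervals overlap. A minimal sketch with toy data:

toy_ts = pd.DataFrame({'time': [0, 2, 4, 6, 8]})
toy_anno = pd.DataFrame([[2, 6, 'silent', '0_2_103_p1']])  # start, end, value, trial
toy_ts['vocalization'] = ''
anno_to_df(toy_ts, toy_anno, 'vocalization')
print(toy_ts)
#    time vocalization
# 0     0
# 1     2       silent
# 2     4       silent
# 3     6       silent
# 4     8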
# Here we will store the merged timeseries with annotations
TSfinal = curfolder + '\\TS_final\\'

# Create the output folder if it does not exist yet
if not os.path.exists(TSfinal):
    os.makedirs(TSfinal)

# Here we store the annotations of vocalizations (from AC)
voc_anno = txtannofolder + 'vocalization_annotations.txt'
# Here we store the annotations of the movement
head_anno = txtannofolder + 'movement_head.txt'
upper_anno = txtannofolder + 'movement_upper.txt'
lower_anno = txtannofolder + 'movement_lower.txt'
arms_anno = txtannofolder + 'movement_arms.txt'

# Load the annotations
voc_df = pd.read_csv(voc_anno, sep='\t', header=None)
head_df = pd.read_csv(head_anno, sep='\t', header=None)
upper_df = pd.read_csv(upper_anno, sep='\t', header=None)
lower_df = pd.read_csv(lower_anno, sep='\t', header=None)
arms_df = pd.read_csv(arms_anno, sep='\t', header=None)

for file in mergedfiles:
    print('working on ' + file)

    # TrialID
    trialid = file.split('\\')[-1].split('.')[0]
    trialid = trialid.replace('merged_', '')
    
    # Load the file
    merged = pd.read_csv(file)

    # Get the annotations for this trialID
    voc_anno_trial = voc_df[voc_df[3] == trialid]
    #print(voc_anno_trial)
    head_anno_trial = head_df[head_df[3] == trialid]
    upper_anno_trial = upper_df[upper_df[3] == trialid]
    lower_anno_trial = lower_df[lower_df[3] == trialid]
    arms_anno_trial = arms_df[arms_df[3] == trialid]

    # If any of the annotations is empty, we skip this trial and log a message - these should be practice trials in our case
    if any([voc_anno_trial.empty, head_anno_trial.empty, upper_anno_trial.empty, lower_anno_trial.empty, arms_anno_trial.empty]):
        print('no annotations for ' + trialid)
        # Log the skipped trial right away (before 'continue', otherwise the message would never be written)
        with open(TSfinal + 'error_log.txt', 'a') as f:
            f.write('no annotations for ' + trialid + '\n')
        continue

    else:
        merged['vocalization'] = ''
        anno_to_df(merged, voc_anno_trial, 'vocalization')
        merged['head_mov'] = ''
        anno_to_df(merged, head_anno_trial, 'head_mov')
        merged['upper_mov'] = ''
        anno_to_df(merged, upper_anno_trial, 'upper_mov')
        merged['lower_mov'] = ''
        anno_to_df(merged, lower_anno_trial, 'lower_mov')
        merged['arms_mov'] = ''
        anno_to_df(merged, arms_anno_trial, 'arms_mov')

    # Also create a column 'movement_in_trial' that combines all movement annotations
    merged['movement_in_trial'] = None
    # Find all samples where any of the movement columns is 'movement'
    any_movement = merged[(merged['head_mov'] == 'movement') | (merged['upper_mov'] == 'movement') | (merged['lower_mov'] == 'movement') | (merged['arms_mov'] == 'movement')].index
    if len(any_movement) == 0:
        print('no movement annotations for ' + trialid)
        # Fill the whole column with 'nomovement'
        merged['movement_in_trial'] = 'nomovement'
    else:
        # Everything between the first and the last movement sample counts as movement
        merged.loc[any_movement[0]:any_movement[-1], 'movement_in_trial'] = 'movement'
        # Fill the rest with 'nomovement'
        merged['movement_in_trial'] = merged['movement_in_trial'].fillna('nomovement')

    # Save the merged file
    merged.to_csv(TSfinal + 'merged_anno_' + trialid + '.csv', index=False)


This is what our final multimodal dataset with annotations looks like for a single trial.

time left_back right_forward right_back left_forward COPXc COPYc COPc TrialID FileInfo ... lowerbody_power leg_power head_power arm_power vocalization head_mov upper_mov lower_mov arms_mov movement_in_trial
0 0.0 1.086809 0.830746 1.491993 1.384194 0.000019 -0.000184 0.000185 0_2_103_p1 p1_zout_geluiden_c1 ... 23.657219 4.718786 2.372685 19.838907 silent nomovement nomovement nomovement nomovement nomovement
1 2.0 1.087116 0.830946 1.492349 1.384381 0.000043 -0.000173 0.000178 0_2_103_p1 p1_zout_geluiden_c1 ... 23.701134 4.724236 2.376751 19.895821 silent nomovement nomovement nomovement nomovement nomovement
2 4.0 1.087440 0.831186 1.492731 1.384605 0.000065 -0.000165 0.000178 0_2_103_p1 p1_zout_geluiden_c1 ... 23.745049 4.729686 2.380816 19.952735 silent nomovement nomovement nomovement nomovement nomovement
3 6.0 1.087778 0.831459 1.493137 1.384858 0.000085 -0.000160 0.000181 0_2_103_p1 p1_zout_geluiden_c1 ... 23.788964 4.735136 2.384881 20.009650 silent nomovement nomovement nomovement nomovement nomovement
4 8.0 1.088125 0.831761 1.493562 1.385137 0.000102 -0.000157 0.000188 0_2_103_p1 p1_zout_geluiden_c1 ... 23.832878 4.740586 2.388947 20.066564 silent nomovement nomovement nomovement nomovement nomovement
5 10.0 1.088481 0.832086 1.494006 1.385435 0.000118 -0.000156 0.000196 0_2_103_p1 p1_zout_geluiden_c1 ... 23.876793 4.746036 2.393012 20.123478 silent nomovement nomovement nomovement nomovement nomovement
6 12.0 1.088841 0.832430 1.494464 1.385748 0.000132 -0.000156 0.000204 0_2_103_p1 p1_zout_geluiden_c1 ... 23.920708 4.751486 2.397077 20.180392 silent nomovement nomovement nomovement nomovement nomovement
7 14.0 1.089204 0.832789 1.494934 1.386073 0.000145 -0.000156 0.000213 0_2_103_p1 p1_zout_geluiden_c1 ... 23.964623 4.756936 2.401143 20.237306 silent nomovement nomovement nomovement nomovement nomovement
8 16.0 1.089568 0.833159 1.495415 1.386406 0.000157 -0.000156 0.000222 0_2_103_p1 p1_zout_geluiden_c1 ... 24.008538 4.762386 2.405208 20.294220 silent nomovement nomovement nomovement nomovement nomovement
9 18.0 1.089931 0.833538 1.495903 1.386744 0.000169 -0.000156 0.000230 0_2_103_p1 p1_zout_geluiden_c1 ... 24.052452 4.767836 2.409273 20.351134 silent nomovement nomovement nomovement nomovement nomovement
10 20.0 1.090291 0.833922 1.496398 1.387084 0.000179 -0.000156 0.000238 0_2_103_p1 p1_zout_geluiden_c1 ... 24.096367 4.773286 2.413339 20.408049 silent nomovement nomovement nomovement nomovement nomovement
11 22.0 1.090647 0.834309 1.496896 1.387423 0.000189 -0.000155 0.000245 0_2_103_p1 p1_zout_geluiden_c1 ... 24.140282 4.778736 2.417404 20.464963 silent nomovement nomovement nomovement nomovement nomovement
12 24.0 1.090998 0.834698 1.497396 1.387761 0.000199 -0.000153 0.000251 0_2_103_p1 p1_zout_geluiden_c1 ... 24.184197 4.784186 2.421469 20.521877 silent nomovement nomovement nomovement nomovement nomovement
13 26.0 1.091343 0.835085 1.497897 1.388096 0.000207 -0.000150 0.000256 0_2_103_p1 p1_zout_geluiden_c1 ... 24.228112 4.789636 2.425535 20.578791 silent nomovement nomovement nomovement nomovement nomovement
14 28.0 1.091680 0.835471 1.498397 1.388425 0.000216 -0.000146 0.000261 0_2_103_p1 p1_zout_geluiden_c1 ... 24.272027 4.795086 2.429600 20.635705 silent nomovement nomovement nomovement nomovement nomovement

15 rows × 534 columns


Now we are ready to proceed with the analysis.

References

Boersma, Paul, and David Weenink. 2025. “Praat: Doing Phonetics by Computer.” http://www.praat.org/.
Pouw, Wim, Jan de Wit, Sara Bögels, Marlou Rasenberg, Branka Milivojevic, and Asli Ozyurek. 2021. “Semantically Related Gestures Move Alike: Towards a Distributional Semantics of Gesture Kinematics.” In Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Human Body, Motion and Behavior, edited by Vincent G. Duffy, 269–87. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-77817-0_20.