Final merge

Overview

In this notebook, we merge all the data we have prepared so far: the timeseries data for acoustics and motion (see the merging script) and the annotations we worked with in the previous notebook.

We will also add annotations of sounding/silence created in Praat (Boersma and Weenink 2025).

Code to prepare the environment
import os
import glob
import xml.etree.ElementTree as ET
import pandas as pd

curfolder = os.getcwd()

# Here we store the merged timeseries data
mergedfolder = curfolder + '\\..\\03_TS_processing\\TS_merged\\'
mergedfiles = glob.glob(mergedfolder + 'merged*.csv')
mergedfiles = [x for x in mergedfiles if 'anno' not in x]

# Here we store the predicted motion annotations
annofolder = curfolder + '\\..\\04_TS_movementAnnotation\\TS_annotated_logreg\\'
annofolders = glob.glob(annofolder + '*0_6\\')

# Here we store the annotations of vocalizations (from AC)
vocannofolder = curfolder + '\\..\\04_TS_movementAnnotation\\ManualAnno\\R1\\'
vocfiles = glob.glob(vocannofolder + '*ELAN_tiers.eaf')

# Create folder for the txt annotations
if not os.path.exists(curfolder + '\\Annotations_txt'):
    os.makedirs(curfolder + '\\Annotations_txt')

txtannofolder = curfolder + '\\Annotations_txt\\'

Getting vocalization annotations from the ELAN files

We used Praat to annotate sounding/silence in the trials and imported the result into the ELAN file that holds the remaining movement annotations, so that everything is stored in a single file. Now we want to extract the sounding/silence annotations from that ELAN file and merge them with the rest of the data.

(Note that we are working on an automatic speech annotator that would allow us to obtain the sounding/silence annotations without external software, similar to our movement annotation pipeline.)

Custom functions
# Function to parse ELAN annotation
def parse_eaf_file(eaf_file, rel_tiers):
    tree = ET.parse(eaf_file)
    root = tree.getroot()

    time_order = root.find('TIME_ORDER')
    time_slots = {time_slot.attrib['TIME_SLOT_ID']: time_slot.attrib['TIME_VALUE'] for time_slot in time_order}

    annotations = []
    # Only keep annotations from the requested tier (rel_tiers is a single tier name)
    relevant_tiers = {rel_tiers}
    for tier in root.findall('TIER'):
        tier_id = tier.attrib['TIER_ID']
        if tier_id in relevant_tiers:
            for annotation in tier.findall('ANNOTATION/ALIGNABLE_ANNOTATION'):
                # Ensure required attributes are present
                if 'TIME_SLOT_REF1' in annotation.attrib and 'TIME_SLOT_REF2' in annotation.attrib:
                    ts_ref1 = annotation.attrib['TIME_SLOT_REF1']
                    ts_ref2 = annotation.attrib['TIME_SLOT_REF2']
                    # Get annotation ID if it exists, otherwise set to None
                    ann_id = annotation.attrib.get('ANNOTATION_ID', None)
                    # Get the annotation value (guard against empty annotations, where .text is None)
                    value_el = annotation.find('ANNOTATION_VALUE')
                    annotation_value = (value_el.text or '').strip() if value_el is not None else ''
                    annotations.append({
                        'tier_id': tier_id,
                        'annotation_id': ann_id,
                        'start_time': time_slots[ts_ref1],
                        'end_time': time_slots[ts_ref2],
                        'annotation_value': annotation_value
                    })

    return annotations
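For illustration, this is how the parser can be called on a single file (a sketch, assuming at least one .eaf file was found; the printed values are made up). Note that the start and end times are ELAN TIME_VALUEs in milliseconds, returned as strings.

example = parse_eaf_file(vocfiles[0], 'vocalization')
print(example[0])
# e.g. {'tier_id': 'vocalization', 'annotation_id': 'a1',
#       'start_time': '1520', 'end_time': '2340', 'annotation_value': 'silent'}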
# Here we store the vocalization annotations
vocal_anno = txtannofolder + 'vocalization_annotations.txt'

with open(vocal_anno, 'w') as f:
    for file in vocfiles:
        print('working on ' + file)
        # Filename
        filename = file.split('\\')[-1]
        filename = filename.replace('_ELAN_tiers.eaf', '')
        # Parse the ELAN file
        annotations = parse_eaf_file(file, 'vocalization')
        # Save it to the file
        for annotation in annotations:
            f.write(f"{annotation['start_time']}\t{annotation['end_time']}\t{annotation['annotation_value']}\t{filename}\n")
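Each line of vocalization_annotations.txt now holds one annotation event in a tab-separated start/end/value/trial format, for example (times illustrative):

1520	2340	silent	0_2_103_p1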

Preparing movement annotations

Similarly, we want to export our movement annotations to simple txt files. The predicted annotations are stored separately per tier and per trial, so we now merge them into a single txt file per tier.

As already mentioned in the previous script, we need to handle flickering annotations, which arise because the classifier's confidence values vary continuously throughout each trial.

Similarly to Pouw et al. (2021), we apply two rules to handle this flickering:

- Rule 1: If there is a nomovement event between two movement events that is shorter than 200 ms, this is considered as part of the movement event.
- Rule 2: If there is a movement event between two nomovement events that is shorter than 200 ms, this is considered as part of the nomovement event.

Afterwards, we take the very first and the very last movement event and treat everything in between as movement, as illustrated below.
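To see what these steps do, here is a purely illustrative trace of one trial's event sequence (durations in seconds; M = movement, N = no movement):

raw:       N(2.0) M(0.9) N(0.1) M(1.2) N(1.5) M(0.1) N(0.8) M(0.7) N(1.0)
rule 1:    N(2.0) M(2.2)               N(1.5) M(0.1) N(0.8) M(0.7) N(1.0)
rule 2:    N(2.0) M(2.2)               N(2.4)               M(0.7) N(1.0)
bridging:  N(2.0) M(5.3)                                           N(1.0)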

Then we write the final movement annotations to a txt file.
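Each line of the resulting per-tier txt file holds one chunk in the same tab-separated start/end/value/trial format as the vocalization file, for example (times illustrative):

0	1480	nomovement	0_2_103_p1
1480	6220	movement	0_2_103_p1
6220	8300	nomovement	0_2_103_p1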

Custom functions
# Function to get chunks of annotations
def get_chunks(anno_df):
    anno_df['chunk'] = (anno_df['anno_values'] != anno_df['anno_values'].shift()).cumsum()
    anno_df['idx'] = anno_df.index

    # Calculate start and end of each chunk, grouped by anno_values, save also the first and last index
    chunks = anno_df.groupby(['anno_values', 'chunk']).agg(
        time_ms_min=('time_ms', 'first'),
        time_ms_max=('time_ms', 'last'),
        idx_min=('idx', 'first'),
        idx_max=('idx', 'last')
    ).reset_index()

    # Order the chunks
    chunks = chunks.sort_values('idx_min').reset_index(drop=True)

    return chunks
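As a quick sanity check, this is what get_chunks returns for a toy annotation frame with the same time_ms and anno_values columns as the logreg output files (note that the function also adds helper chunk and idx columns to the input frame):

toy = pd.DataFrame({
    'time_ms': [0, 100, 200, 300, 400],
    'anno_values': ['no movement', 'movement', 'movement', 'no movement', 'no movement']
})
print(get_chunks(toy))
#    anno_values  chunk  time_ms_min  time_ms_max  idx_min  idx_max
# 0  no movement      1            0            0        0        0
# 1     movement      2          100          200        1        2
# 2  no movement      3          300          400        3        4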
for folder in annofolders:
    # Get the tierID from the folder name and map it to the short names used in the annotations ('head' and 'arms' stay as they are)
    tier = folder.split('\\')[-2].split('_')[0]
    tier = {'upperBody': 'upper', 'lowerBody': 'lower'}.get(tier, tier)

    # This is the file we want to create
    txtfile = txtannofolder + 'movement_' + tier + '.txt'

    # List all files in the folder
    files = glob.glob(folder + '*.csv')

    for file in files:
        print('processing: ' + file)

        # Filename
        filename = file.split('\\')[-1].split('.')[0]
        filename = filename.split('_')[2:6]
        filename = '_'.join(filename)

        # Now we process the annotations made by the logreg model
        anno_df = pd.read_csv(file)

        # Chunk the df to see unique annotated chunks
        chunks = get_chunks(anno_df)

        # Check for fake pauses (i.e., 'no movement' chunks that last less than 200 ms)
        for i in range(1, len(chunks)-1):
            if chunks.loc[i, 'anno_values'] == 'no movement' and chunks.loc[i-1, 'anno_values'] == 'movement' and chunks.loc[i+1, 'anno_values'] == 'movement':
                if chunks.loc[i, 'time_ms_max'] - chunks.loc[i, 'time_ms_min'] < 200:
                    print('found a chunk of no movement between two movement chunks that is shorter than 200 ms')
                    # Change the chunk into movement
                    anno_df.loc[chunks.loc[i, 'idx_min']:chunks.loc[i, 'idx_max'], 'anno_values'] = 'movement'

        # Calculate new chunks
        chunks = get_chunks(anno_df)

        # Now check for fake movement (i.e., movement chunk that is shorter than 200ms)
        for i in range(1, len(chunks)-1):
            if chunks.loc[i, 'anno_values'] == 'movement' and chunks.loc[i-1, 'anno_values'] == 'no movement' and chunks.loc[i+1, 'anno_values'] == 'no movement':
                if chunks.loc[i, 'time_ms_max'] - chunks.loc[i, 'time_ms_min'] < 200:
                    print('found a chunk of movement between two no movement chunks that is shorter than 200 ms')
                    # change the chunk to no movement in the original df
                    anno_df.loc[chunks.loc[i, 'idx_min']:chunks.loc[i, 'idx_max'], 'anno_values'] = 'no movement'

        
        # Now, similarly to our human annotators, we consider movement anything from the very first movement to the very last movement
        if 'movement' in anno_df['anno_values'].unique():
            # Get the first and last index of movement
            first_idx = anno_df[anno_df['anno_values'] == 'movement'].index[0]
            last_idx = anno_df[anno_df['anno_values'] == 'movement'].index[-1]
            # Change all between to movement
            anno_df.loc[first_idx:last_idx, 'anno_values'] = 'movement'

        # Calculate new chunks
        chunks = get_chunks(anno_df)

        # Rewrite "no movement" in anno_values to "nomovement" (to match the manual annotations)
        chunks['anno_values'] = chunks['anno_values'].apply(
            lambda x: 'nomovement' if x == 'no movement' else x
        )

        # TrialID
        chunks['TrialID']  = str(filename)

        # Write to the text file
        with open(txtfile, 'a') as f:
            for _, row in chunks.iterrows():
                f.write(
                    f"{row['time_ms_min']}\t{row['time_ms_max']}\t{row['anno_values']}\t{row['TrialID']}\n")
                

Final merge

Now we take the merged timeseries with acoustic and movement data and add columns for the vocalization and movement annotations. We will also add a tier for general movement, combining the movement annotations from all tiers to mark when movement of any articulator starts and when it ends.

Finally, we save the merged data to a single csv file per trial.

Custom functions
# Function to load annotations from txt file to timeseries
def anno_to_df(df, anno, anno_col):
    # Each annotation row holds start (col 0), end (col 1), value (col 2) and trial ID (col 3)
    for _, row in anno.iterrows():
        start, end, value = row[0], row[1], str(row[2])
        # Fill the annotation value for all samples within the interval (boundaries included)
        df.loc[(df['time'] >= start) & (df['time'] <= end), anno_col] = value
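The fill is inclusive on both boundaries: every sample whose time falls inside an annotated interval receives that interval's value, and later annotation rows overwrite earlier ones where intervals overlap. A minimal sketch with toy data:

toy_ts = pd.DataFrame({'time': [0, 2, 4, 6, 8]})
toy_anno = pd.DataFrame([[2, 6, 'silent', '0_2_103_p1']])  # start, end, value, trial
toy_ts['vocalization'] = ''
anno_to_df(toy_ts, toy_anno, 'vocalization')
print(toy_ts)
#    time vocalization
# 0     0
# 1     2       silent
# 2     4       silent
# 3     6       silent
# 4     8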
# Here we will store the merged timeseries with annotations
TSfinal = curfolder + '\\TS_final\\'

# Create the output folder if it does not exist yet
if not os.path.exists(TSfinal):
    os.makedirs(TSfinal)

# Here we store the annotations of vocalizations (from AC)
voc_anno = txtannofolder + 'vocalization_annotations.txt'
# Here we store the annotations of the movement
head_anno = txtannofolder + 'movement_head.txt'
upper_anno = txtannofolder + 'movement_upper.txt'
lower_anno = txtannofolder + 'movement_lower.txt'
arms_anno = txtannofolder + 'movement_arms.txt'

# Load the annotations
voc_df = pd.read_csv(voc_anno, sep='\t', header=None)
head_df = pd.read_csv(head_anno, sep='\t', header=None)
upper_df = pd.read_csv(upper_anno, sep='\t', header=None)
lower_df = pd.read_csv(lower_anno, sep='\t', header=None)
arms_df = pd.read_csv(arms_anno, sep='\t', header=None)

for file in mergedfiles:
    print('working on ' + file)

    # TrialID
    trialid = file.split('\\')[-1].split('.')[0]
    trialid = trialid.replace('merged_', '')
    
    # Load the file
    merged = pd.read_csv(file)

    # Get the annotations for this trialID
    voc_anno_trial = voc_df[voc_df[3] == trialid]
    #print(voc_anno_trial)
    head_anno_trial = head_df[head_df[3] == trialid]
    upper_anno_trial = upper_df[upper_df[3] == trialid]
    lower_anno_trial = lower_df[lower_df[3] == trialid]
    arms_anno_trial = arms_df[arms_df[3] == trialid]

    # If any of the annotations is empty, we skip this trial and log a message - these should be practice trials in our case
    if any([voc_anno_trial.empty, head_anno_trial.empty, upper_anno_trial.empty, lower_anno_trial.empty, arms_anno_trial.empty]):
        print('no annotations for ' + trialid)
        # Log the skipped trial right away (before 'continue', otherwise the message would never be written)
        with open(TSfinal + 'error_log.txt', 'a') as f:
            f.write('no annotations for ' + trialid + '\n')
        continue

    else:
        merged['vocalization'] = ''
        anno_to_df(merged, voc_anno_trial, 'vocalization')
        merged['head_mov'] = ''
        anno_to_df(merged, head_anno_trial, 'head_mov')
        merged['upper_mov'] = ''
        anno_to_df(merged, upper_anno_trial, 'upper_mov')
        merged['lower_mov'] = ''
        anno_to_df(merged, lower_anno_trial, 'lower_mov')
        merged['arms_mov'] = ''
        anno_to_df(merged, arms_anno_trial, 'arms_mov')

    # Also create a column 'movement_in_trial' that combines all movement annotations
    merged['movement_in_trial'] = None
    # Find all samples where any of the movement columns is 'movement'
    any_movement = merged[(merged['head_mov'] == 'movement') | (merged['upper_mov'] == 'movement') | (merged['lower_mov'] == 'movement') | (merged['arms_mov'] == 'movement')].index
    if len(any_movement) == 0:
        print('no movement annotations for ' + trialid)
        # Fill the whole column with 'nomovement'
        merged['movement_in_trial'] = 'nomovement'
    else:
        # Everything between the first and the last movement sample counts as movement
        merged.loc[any_movement[0]:any_movement[-1], 'movement_in_trial'] = 'movement'
        # Fill the rest with 'nomovement'
        merged['movement_in_trial'] = merged['movement_in_trial'].fillna('nomovement')

    # Save the merged file
    merged.to_csv(TSfinal + 'merged_anno_' + trialid + '.csv', index=False)


This is what our final multimodal dataset with annotations looks like for a single trial.

time left_back right_forward right_back left_forward COPXc COPYc COPc TrialID FileInfo ... lowerbody_power leg_power head_power arm_power vocalization head_mov upper_mov lower_mov arms_mov movement_in_trial
0 0.0 1.086809 0.830746 1.491993 1.384194 0.000019 -0.000184 0.000185 0_2_103_p1 p1_zout_geluiden_c1 ... 23.657219 4.718786 2.372685 19.838907 silent nomovement nomovement nomovement nomovement nomovement
1 2.0 1.087116 0.830946 1.492349 1.384381 0.000043 -0.000173 0.000178 0_2_103_p1 p1_zout_geluiden_c1 ... 23.701134 4.724236 2.376751 19.895821 silent nomovement nomovement nomovement nomovement nomovement
2 4.0 1.087440 0.831186 1.492731 1.384605 0.000065 -0.000165 0.000178 0_2_103_p1 p1_zout_geluiden_c1 ... 23.745049 4.729686 2.380816 19.952735 silent nomovement nomovement nomovement nomovement nomovement
3 6.0 1.087778 0.831459 1.493137 1.384858 0.000085 -0.000160 0.000181 0_2_103_p1 p1_zout_geluiden_c1 ... 23.788964 4.735136 2.384881 20.009650 silent nomovement nomovement nomovement nomovement nomovement
4 8.0 1.088125 0.831761 1.493562 1.385137 0.000102 -0.000157 0.000188 0_2_103_p1 p1_zout_geluiden_c1 ... 23.832878 4.740586 2.388947 20.066564 silent nomovement nomovement nomovement nomovement nomovement
5 10.0 1.088481 0.832086 1.494006 1.385435 0.000118 -0.000156 0.000196 0_2_103_p1 p1_zout_geluiden_c1 ... 23.876793 4.746036 2.393012 20.123478 silent nomovement nomovement nomovement nomovement nomovement
6 12.0 1.088841 0.832430 1.494464 1.385748 0.000132 -0.000156 0.000204 0_2_103_p1 p1_zout_geluiden_c1 ... 23.920708 4.751486 2.397077 20.180392 silent nomovement nomovement nomovement nomovement nomovement
7 14.0 1.089204 0.832789 1.494934 1.386073 0.000145 -0.000156 0.000213 0_2_103_p1 p1_zout_geluiden_c1 ... 23.964623 4.756936 2.401143 20.237306 silent nomovement nomovement nomovement nomovement nomovement
8 16.0 1.089568 0.833159 1.495415 1.386406 0.000157 -0.000156 0.000222 0_2_103_p1 p1_zout_geluiden_c1 ... 24.008538 4.762386 2.405208 20.294220 silent nomovement nomovement nomovement nomovement nomovement
9 18.0 1.089931 0.833538 1.495903 1.386744 0.000169 -0.000156 0.000230 0_2_103_p1 p1_zout_geluiden_c1 ... 24.052452 4.767836 2.409273 20.351134 silent nomovement nomovement nomovement nomovement nomovement
10 20.0 1.090291 0.833922 1.496398 1.387084 0.000179 -0.000156 0.000238 0_2_103_p1 p1_zout_geluiden_c1 ... 24.096367 4.773286 2.413339 20.408049 silent nomovement nomovement nomovement nomovement nomovement
11 22.0 1.090647 0.834309 1.496896 1.387423 0.000189 -0.000155 0.000245 0_2_103_p1 p1_zout_geluiden_c1 ... 24.140282 4.778736 2.417404 20.464963 silent nomovement nomovement nomovement nomovement nomovement
12 24.0 1.090998 0.834698 1.497396 1.387761 0.000199 -0.000153 0.000251 0_2_103_p1 p1_zout_geluiden_c1 ... 24.184197 4.784186 2.421469 20.521877 silent nomovement nomovement nomovement nomovement nomovement
13 26.0 1.091343 0.835085 1.497897 1.388096 0.000207 -0.000150 0.000256 0_2_103_p1 p1_zout_geluiden_c1 ... 24.228112 4.789636 2.425535 20.578791 silent nomovement nomovement nomovement nomovement nomovement
14 28.0 1.091680 0.835471 1.498397 1.388425 0.000216 -0.000146 0.000261 0_2_103_p1 p1_zout_geluiden_c1 ... 24.272027 4.795086 2.429600 20.635705 silent nomovement nomovement nomovement nomovement nomovement

15 rows × 534 columns


Now we are ready to proceed with the analysis.

References

Boersma, Paul, and David Weenink. 2025. “Praat: Doing Phonetics by Computer.” http://www.praat.org/.
Pouw, Wim, Jan de Wit, Sara Bögels, Marlou Rasenberg, Branka Milivojevic, and Asli Ozyurek. 2021. “Semantically Related Gestures Move Alike: Towards a Distributional Semantics of Gesture Kinematics.” In Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Human Body, Motion and Behavior, edited by Vincent G. Duffy, 269–87. Cham: Springer International Publishing. https://doi.org/10.1007/978-3-030-77817-0_20.