Physical activity monitoring devices: energy expenditure comparison in a setting of free-living activities

The aim of this study was to evaluate the validity of the energy expenditure (EE) estimates provided by 3 wearable devices [Fitbit One (FO), SenseWear Armband (AR) and Actiheart (AC)] in a setting of free-living activities. Forty-three participants (24 females; 23.4 ± 4.5 years) performed 9 activities: sedentary (watching a video, reading), walking (on a treadmill and outdoors), running (on a treadmill and outdoors) and moderate-to-vigorous activities (Wii gaming, taking the stairs and playing football). Mean absolute percentage error (MAPE) and Pearson's correlation were calculated to assess the validity of each instrument in comparison to a portable metabolic analyser (PMA). In the overall comparison, MAPEs were 7.7% for AR (r = .86; p < .0001), 8.6% for FO (r = .69; p < .001) and 11.6% for AC (r = .81; p < .0001). These findings support the accuracy of the wearables over the whole protocol, with AR being the most accurate. However, the MAPE results suggest that device algorithms should be improved to better measure EE during moderate-to-vigorous activities.


Introduction
Physical activity (PA) plays a fundamental role in human health and helps to prevent cardiovascular disease, some cancers, osteoporosis, type 2 diabetes, anxiety and depression [1,2]. However, adults tend to be less active than guidelines prescribe: in Italy, only 1 in 3 adults practises PA or sport during leisure time, and 42% of the total population is completely inactive.
Wearable devices for PA are light and affordable, and their commercial availability has increased considerably in recent years [3]. Through immediate on-screen feedback or mobile and internet applications, they give users information such as step count, calories burned, time spent in active or sedentary activities and distance covered. The use of objective methods to quantify PA is increasing worldwide, both as a research tool and as a consumer-based tool, and it could encourage people to practise more PA [4,5]. In fact, according to Giroir et al. [6] and the PA Guidelines for Americans [7], the use of wearable devices is a new strategy to enhance PA levels. In research and field-based studies, wearables have been tested in free-living conditions and in sedentary activities by comparing a selection of consumer-level devices with commonly used research-grade accelerometers [8].
However, little is known about the accuracy of these devices during individual tasks at different intensities. Since accuracy measures are usually derived from the mean absolute error over the whole protocol, device outputs may well follow discordant trends across tasks of different intensity, in particular across different types of activity such as walking and running, both indoors and outdoors, which could yield informative data.
In outdoor conditions especially, devices should be preferred to subjective methods such as questionnaires. In fact, it has been suggested that questionnaires underestimate actual sedentary time and that the degree of underestimation is extremely variable between subjects [9].
Wearable devices also provide information about energy expenditure (EE). In research, EE is usually measured with a gas analyser to avoid the possible lack of accuracy of wearable devices.
In a best-practice statement on the use of wearables, Freedson et al. recommended calibrating and validating research devices with appropriate protocols and identified the correct use of consumer-based sensors as a challenge for the future of PA studies.
Following these recommendations, Lee, Kim and Welk [10] verified the validity of eight consumer-level devices for assessing EE in healthy young adults. Participants performed a laboratory set of activities lasting 69 minutes, and the consumer-level devices were compared against a metabolic analyser for oxygen consumption, used as the gold standard tool for indirect calorimetry. The researchers ranked the devices by percentage accuracy, which ranged from 76.5% to 90.7%. To classify the instruments, EE over the whole protocol, which included sedentary, moderate and vigorous intensity activities, was considered [10].
Thus, the aim of this study was to evaluate the validity of the EE estimates provided by 3 wearable devices in a setting of both laboratory and free-living activities, in comparison to a portable metabolic analyser (PMA), during tasks at different intensities.

Participants
Forty-three subjects (24 females; 23.4 ± 4.5 years; mean ± SD) voluntarily took part in the study. Participants did not have diseases or illnesses and did not use drugs that would affect their body weight or metabolism. Subjects were recruited among University of Pavia students through internet announcements.
Approval from the academic review board of the Kinesiology Course was obtained before beginning this study. Participants were informed of the procedures and purpose of the study before they signed the informed consent document.

Measures
Many wearable devices have been produced and used to provide an objective indicator of PA such as EE, which is a fundamental factor in quantifying PA. This study aimed to examine the validity of EE as estimated by devices in different types of activity (sedentary, walking, running and moderate-to-vigorous) that could represent a setting of real-life PA.
Moreover, the accuracy of accelerometers in specific conditions was examined in more depth by comparing their EE output against a PMA (K4b2, COSMED, Rome, Italy), which represented the gold standard of the study.
Before testing, anthropometric data (height, weight and BMI), age and basal metabolism of each participant were measured (Table 2). Finally, all devices were set up with the personal data of each participant to make the software of each instrument more accurate.

Procedures
The protocol consisted of 9 different activities and lasted 64 min in total (Fig. 1). The activities were ordered by increasing intensity to avoid the effects of fatigue and to ease the transition from indoor to outdoor trials: all subjects began with indoor trials (reading a newspaper and watching a video), then performed both walking and running on a treadmill, and afterwards went up and down the stairs and played Nintendo Wii games. Finally, all subjects continued with outdoor activities: walking, running and a small-sided football match. While the intensity of work on the treadmill was the same for all subjects, outdoor walking and running were performed at a self-selected pace. Every activity was performed for 5 minutes. There was a 1-minute rest between activities to facilitate transitions and the tracking of data. All activities were classified into four distinct PA intensities: 1) Sedentary (reading a newspaper, watching a video), 2) Walking (treadmill walking at 4 km/h, self-paced overground walking), 3) Running (treadmill running at 10 km/h, self-paced overground running) and 4) Moderate-to-vigorous activities (going up and down the stairs, Wii dance play and playing football with the researchers). Most of the wearables did not provide direct access to the raw data; therefore, estimates of EE were obtained directly from the software associated with each device.
The participants were fitted with the PMA and three different types of wearables. The PMA was worn on the chest and the back, whereas the SenseWear Armband (AR) (Bodymedia, Pittsburgh, USA) was worn on the non-dominant arm. The Fitbit One (FO) (Fitbit Inc., San Francisco, CA) was worn on the hip and the Actiheart (AC) (CamNtech Inc, England) was worn on the chest. All instruments were synchronized and initialized using the participant's personal information (age, gender, height, weight, handedness and smoker/non-smoker status) before every measurement.
The characteristics of each instrument are described in Table 1.

Analysis
Data are shown as mean ± standard deviation; minimum and maximum values are reported as range. Breath-by-breath values from the PMA were aggregated into minute-by-minute means to simplify comparison with the accelerometer data. EE from both the wearable devices and the PMA was calculated for the entire monitoring period and for each single task. Rest periods between activities were not evaluated. The aim of the statistical analysis was to compare the EE of each wearable device with the PMA values (criterion measure). Pearson's correlations were calculated to analyse associations both at the overall group level and for single tasks.
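The two computational steps described above can be sketched as follows. This is a minimal illustration in Python, not the software used in the study: the function names, the (time in seconds, EE value) input layout and the example numbers are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def minute_means(breaths):
    """Average breath-by-breath samples into minute-by-minute means.

    `breaths` is a list of (time_in_seconds, ee_value) pairs; this layout
    is an assumption for illustration. Returns {minute_index: mean_ee}.
    """
    buckets = defaultdict(list)
    for t, ee in breaths:
        buckets[int(t // 60)].append(ee)  # minute 0 covers 0-59 s, and so on
    return {minute: sum(v) / len(v) for minute, v in sorted(buckets.items())}


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical samples, not study data.
    pma_breaths = [(5, 1.0), (30, 3.0), (65, 4.0), (110, 6.0), (130, 7.0)]
    pma_per_minute = list(minute_means(pma_breaths).values())
    device_per_minute = [2.5, 4.0, 8.0]  # hypothetical device output
    print(pma_per_minute)
    print(pearson_r(device_per_minute, pma_per_minute))
```

Aggregating first puts the criterion measure on the same one-minute time base as the device output, so that the two series can be paired sample by sample before any correlation is computed.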
Finally, mean absolute percentage error (MAPE) was calculated as the average of the absolute differences between wearable device and PMA values, divided by the PMA value and multiplied by 100.
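As a minimal sketch of this definition, the Python function below computes MAPE for paired device and criterion values. The function name and the example numbers are hypothetical and are not taken from the study data. The example also shows, under the assumption that a whole-protocol error is computed on summed EE, how overestimates and underestimates on single tasks can cancel out in the total.

```python
def mape(device_values, criterion_values):
    """Mean absolute percentage error of device values against a criterion.

    Each absolute difference is divided by the criterion (PMA) value and
    multiplied by 100; the result is the average over all pairs.
    """
    if len(device_values) != len(criterion_values) or not device_values:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [
        100.0 * abs(d - c) / c
        for d, c in zip(device_values, criterion_values)
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical EE (kcal) for one participant over two tasks.
    device = [40.0, 60.0]
    pma = [50.0, 50.0]
    per_task = mape(device, pma)                      # 20% error on each task
    whole_protocol = mape([sum(device)], [sum(pma)])  # totals agree exactly
    print(per_task, whole_protocol)
```

In this made-up case the per-task MAPE is 20% while the error on the summed EE is 0%, which is why a low whole-protocol error does not by itself guarantee accuracy at each intensity.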

Results
This discrepancy is reduced in running activities, where MAPEs range between 13.0% (AR) and 18.4% (AC). FO is the most accurate device in moderate-to-vigorous activities, with an error of 16.4%; the MAPE is 26.4% for AR and 41.2% for AC. Table 4 and Table 5 show the correlation coefficients (r) between the criterion values (PMA) and the wearable devices at different exercise intensities. Correlation results are divided into the same categories as the MAPEs. AR appears to be the device with the strongest correlation with the PMA in every kind of activity and also shows notable correlation values with AC. The other values range from 0.448 to 0.865.

Discussion
The aim of this study was to evaluate the validity of the EE estimates provided by 3 wearable PA devices in healthy adults, in a setting of both laboratory and free-living activities, in comparison to the PMA during tasks at different intensities. Although the accuracy of the EE measures of various wearables has already been analysed [10,11], little information is available about the precision of these devices during activities at different intensities (from sedentary to high intensity).
All wearable devices showed acceptable results for the whole-protocol evaluation (MAPE ranging approximately from 8% to 12%). The error rates in the present study (< 12%) were comparable to results reported in other studies [10,12], suggesting that the wearable devices estimate whole-protocol EE in reasonable agreement with the PMA. In particular, AR showed the best correlation with the PMA over the whole protocol (r = 0.86) [10] and the best MAPE (7.7%). FO and AC also showed good results, with MAPE values of 8.6% and 11.6% and correlations of 0.69 and 0.81, respectively, over the whole protocol.
All devices showed a similar MAPE in sedentary activities, between 12.0% and 13.0%. This result could be mainly attributed to the algorithm used to estimate EE in basal conditions. In walking activities, AR had the best performance with a MAPE of 13.3%, followed by AC (19.7%) and FO (50.0%). We believe that FO had a worse MAPE because of its placement on the belt; this position may not be optimal for evaluating walking.
Walking was the condition in which we observed the largest discrepancy among devices. The best correlation coefficient in walking activities was found for AR (r = 0.72). This difference decreased in running activities, where MAPEs ranged between 13.0% (AR) and 18.4% (AC).
Concerning moderate-to-vigorous activities, FO was the most accurate device, with an error of 16.4%, while AR had an error of 26.4%. AC showed a high MAPE because it did not record continuously during vigorous activities.
In light of these results, AR is probably the best device for measuring activity across different tasks, whereas FO is preferable when only vigorous activity is to be measured.
The strength of this study was the comparison of EE from the devices during tasks at different intensities; the protocol was designed to include typical activities that reflect normal adult behaviour. Previous research [13] showed higher correlations with O2 (r = 0.85 to 0.93) under laboratory conditions, in contrast to lower correlations in free-living conditions (r = 0.48 to 0.59). Daily activities include a considerable amount of upper-body movement that may not be captured by accelerometer-based devices, so it is possible that the devices overestimated some activities and underestimated others [3,10]. However, the EE estimation provided by the 3 wearable PA devices used in this study was reasonable. In fact, all the devices had high correlation values over the total activity protocol (FO r = 0.70; AC r = 0.81; AR r = 0.86), in accordance with the results of Lee et al. [10].
Wearable PA devices placed on the arm probably avoid this problem during free-living activities; in fact, AR was the most accurate device in this study.

Conclusion
In conclusion, the present study supports the accuracy of the three devices used to estimate EE in healthy adults, especially considering the different intensities included in the protocol. In particular, AR proved to be the most accurate over the whole protocol, including walking and running activities. However, the MAPE findings suggest that the devices' internal algorithms should be improved to better measure EE during different tasks, in particular during moderate-to-vigorous activities. Considering that some bias already exists in EE estimation, these results add new knowledge for the evaluation of specific activities and will help researchers choose the right device for the particular setting of their study.