Introduction

According to previous literature, the prevalence of olfactory dysfunction ranges from 1.4 to 29% [1, 2]. The COVID-19 pandemic has further increased this proportion to some degree [3]. However, many individuals are unaware of their olfactory dysfunction. Wehling et al. [4] reported that the proportion of participants unaware of their olfactory dysfunction was 86% among middle-aged participants and 78% among older people. Some people with congenital anosmia may not even realize the absence of the sense of smell for most of their lives [5, 6]. Although the COVID-19 pandemic has brought widespread awareness of olfactory dysfunction [7], many people only notice an olfactory loss when olfaction is severely or even totally impaired. Unlike metacognitive awareness of vision or audition, self-rated olfactory function is unreliable and correlates poorly with validated psychophysical tests. People regularly report having a poor sense of smell despite performing well in psychophysical olfactory tests. Conversely, people claiming to have a sensitive sense of smell often have average or poor scores when tested [8].

Currently, there are few tests available that would allow people to quantitatively and repeatedly examine their sense of smell accurately in a home environment. Li and his colleagues [9] introduced a chemosensory home test including tests for smell, taste, and trigeminal functions, but no odor memory tests were included. Therefore, the development of simple odor memory tests allowing for self-administration appears to be of importance.

Odor memory is a complex and advanced cognitive function [10]. The successful completion of an odor recognition memory task requires not only odor detection, discrimination and naming, but also the encoding, storage and retrieval of odor information [10]. Therefore, odor recognition memory could serve as a comprehensive indicator of one’s olfactory function in general.

Several well-validated, easy and quick tests are available to examine odor memory. For example, the Sniffin’ TOM [11] asks participants to identify 8 targets from 16 odors, or 16 targets from 32 odors in an extended version (TOM-32) [12]; the Olfactory Memory Test Battery (OMTB) uses delayed matching-to-sample and n-back paradigms to assess odor recognition and working memory separately [13]; the Odor Memory Test (OMT) is conducted in a forced-choice procedure with 4 odors repeatedly used in 12 trials, requiring participants to identify (microencapsulated) odors [14]. Despite the contributions that these tests have made to clinical diagnosis and scientific research, such approaches are of limited value for home use because they are designed as screening tests or because they are difficult to self-administer.

The aim of the present study was to investigate a self-administered olfactory memory test, the Novel Olfactory Sorting Task (NOST). The test is based on a prototype that one of the authors designed as a game; it serves as an odor matching memory task and is intended to allow individuals to repeatedly self-test their olfactory function.

Method

Participants and materials

One hundred and ten participants were recruited (30 men, age = 50.1 ± 9.9 years; 80 women, age = 47.1 ± 11.5 years) for the study; half of them were healthy volunteers, the other half were patients with olfactory dysfunction who presented themselves to the Smell & Taste Clinic, Department of Otorhinolaryngology, Technical University of Dresden (Table 1). Fifty-one healthy participants volunteered for a retest within an interval of a maximum of 14 days. The study design had been approved by the ethics committee at the University Clinic of the Technische Universität Dresden (application number EK378082019). All participants provided written informed consent.

Table 1 Demographics and descriptive statistics

The final set of 12 odors was selected from a pool of 36 odors through 4 rounds of evaluation by 3 experts (see supplementary material). The odorants were rated on a 0–10 scale along several dimensions: chemical complexity, valence, familiarity, chemical stability, and perceptual stability. First, we excluded odorants which did not possess a single-molecule structure. Second, unpleasant odorants with a valence score below 4 were excluded to ensure a favorable user experience and to prevent odors from being matched on the basis of the emotions they elicit; odorants with neutral or pleasant characteristics were retained. Third, the familiarity of each odor was assessed: readily recognizable scents such as rose were excluded, while odors of moderate to low familiarity (scoring below 7) were retained for the next stage. Fourth, odorants exhibiting unstable chemical properties (oxidation, discoloration) were excluded to guarantee that the odors would remain consistent and recognizable for at least several months. For example, limonene tends to oxidize, resulting in a different smell after a few months, and discoloration of vanillin could serve as a visual cue for participants. In addition, because trigeminal function is normal in many patients with olfactory dysfunction, the expert panel agreed to include a pair of trigeminal-related odorants. Finally, odorants with similar smells were compared with each other, and only the most suitable one was retained while the other similar odorants were removed.

Ethyl maltol and propylene glycol were mixed in a ratio of 1:10, while all other odor materials were kept pure. Odor intensity was evaluated by 5 experts, who confirmed that the final odorants exhibited approximately equal intensities. We prepared 24 glass jars to contain the final 12 pairs of odorants (see Table 2). The task of the participants was to arrange them in matching pairs. Each jar was filled with 0.5 ml of the fragrance, and a sling gauze pad was then placed inside to prevent loss of liquid in case of spillage. Each jar had a volume of 40 ml and an opening diameter of 35 mm.

Table 2 Twelve components of NOST

Procedure

All participants were informed about the aim, possible risks and overall outline of the study. Following written consent, the test started. Participants always had opportunities to ask questions and to quit the measurements without providing reasons.

First, participants completed questionnaires on medical history, their health condition and individual olfactory perception. Then, a validated and reliable olfactory test, the “Sniffin’ Sticks” test (SST) [15], was used to assess general olfactory function and to divide the participants into groups with or without olfactory disorder. After a break of several minutes, the odor memory test started. Participants received all 24 jars at once. They were allowed to unscrew the lids and had to close them again after smelling. Within a maximum of 15 min, the jars could be sniffed as often as needed. Participants were not allowed to make notes. All jars could be arranged or rearranged throughout the 15 min period. Each jar was labelled with a specific code at its bottom, enabling a post-test review and scoring. Patients could independently complete the entire test. Nevertheless, to ensure accuracy, all tests in the present study were conducted under the supervision and guidance of the same trained experimenter (AB).

After the odor memory test, participants answered 4 questions on 11-point Likert-type scales to assess their smell ability and their performance in the tests. The questions related (1) to the participants’ confidence in their performance in the tests (0 = not satisfied at all, 10 = very satisfied), (2) to the ease of use (0 = very difficult to use, 10 = very easy to use), (3) to the effort they needed to finish the test (0 = easy, 10 = difficult), and (4) to the overall intensity of the odorants in the test (0 = barely perceptible, 10 = very strong).

Statistical analysis

SPSS 29.0 was used for statistical analysis. The NOST score was the number of correctly matched pairs (range 0–12). Test validity was assessed by comparing NOST scores between participants with and without olfactory dysfunction. Pearson correlations were used to assess test-retest reliability and the correlations between NOST and SST scores. Behavioral differences between NOST and SST, and sex differences in NOST, were examined by F-tests with age as a covariate. We plotted ROC curves to determine the NOST cutoff scores for distinguishing hyposmia from normosmia.
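The scoring rule described above (one point per correctly matched pair) can be illustrated with a minimal sketch. The jar codes `J01` to `J24`, the mapping `odor_of_jar`, and the function name `nost_score` are hypothetical and chosen for illustration only; the actual codes printed on the jars are not given in the text.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical coding scheme (an assumption, not the study's actual codes):
# jars J01..J24, where odor k (1..12) is contained in jars J(2k-1) and J(2k).
odor_of_jar: Dict[str, int] = {f"J{i:02d}": (i + 1) // 2 for i in range(1, 25)}


def nost_score(pairs: Iterable[Tuple[str, str]], key: Dict[str, int]) -> int:
    """Return the number of arranged pairs whose two jars hold the same odorant."""
    return sum(1 for a, b in pairs if a != b and key[a] == key[b])


# A participant who matches every pair correctly scores 12.
perfect = [(f"J{2 * k - 1:02d}", f"J{2 * k:02d}") for k in range(1, 13)]
print(nost_score(perfect, odor_of_jar))

# Confusing two odors with each other costs two points.
swapped = [("J01", "J04"), ("J03", "J02")] + perfect[2:]
print(nost_score(swapped, odor_of_jar))
```

Because every mismatch necessarily involves two odors, a single confusion between two odorants lowers the score by two points in this scheme.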

Results

Table 1 shows demographics and descriptive statistics of the test scores. Half of the 110 participants were patients with olfactory dysfunction and, as expected, they showed significantly poorer olfactory performance than the control group in the odor identification, threshold and discrimination tests (all p < 0.01). This is further reflected in the performance for each pair (see Table 3). Patients with olfactory dysfunction had significantly lower accuracy for each pair.

Table 3 Accuracy of the NOST items in healthy and patient groups

Moreover, there were no pairs of overly similar odors that were frequently confused with each other, although patients tended to mismatch odor 2 with 8, odor 4 with 6, 7 and 9, odor 6 with 11, and odor 7 with 9 (see Fig. 1). The level of confidence in the test and the ease of use of the NOST were similar to those of the validated SST (see Table 4). However, participants felt they had to put in a little more effort when conducting the odor memory test, and they rated the odorants in the NOST as stronger than those in the SST.

Table 4 Ratings of NOST and SST

We found a significant age effect (F = 4.50, p = 0.04) but no gender effect even with age as a covariate (F = 0.02, p = 0.90). Also, no interaction between factors “age” and “gender” was observed (F = 0.38, p = 0.54, see Table 5).

Table 5 Age and gender effect

The NOST score showed low to moderate but significant correlations with the SST (threshold test: r = 0.27, p = 0.01; discrimination test: r = 0.51, p < 0.01; identification test: r = 0.42, p < 0.01; TDI score: r = 0.49, p < 0.01; Fig. 2). Hyposmic patients showed worse performance in the NOST than normosmic individuals even after introducing age as a covariate (controls: 7.55 ± 2.64; patients: 4.86 ± 2.76; F = 23.32, p < 0.01), suggesting its applicability for differentiating hyposmia from normosmia.
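For readers who wish to compute such coefficients outside SPSS, the Pearson correlation can be sketched in a few lines of standard-library Python. The function name `pearson_r` is an illustrative choice, and no study data are reproduced here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

The same function applies to the test-retest analysis, with the first and second NOST scores of each participant as the paired samples.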

The ROC analysis revealed an area under the curve (AUC) of 0.815 (SE = 0.055, asymptotic significance < 0.001, asymptotic 95% CI = 0.707 to 0.924). The NOST cutoff score with the maximal Youden’s index was 5.5, yielding a sensitivity of 76.2% and a specificity of 77.6% for distinguishing severely hyposmic patients (TDI < 24) from normosmic people (TDI > 31) (Fig. 3; Table 6).
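Youden's index is defined as J = sensitivity + specificity − 1; for the values reported above this gives 0.762 + 0.776 − 1 = 0.538. The following sketch shows how a cutoff maximizing J can be found from two groups of scores. It assumes that lower scores indicate hyposmia and uses midpoints between adjacent observed scores as candidate cutoffs; the function name `youden_cutoff` and the toy scores are hypothetical and are not the study data.

```python
from typing import Optional, Sequence, Tuple


def youden_cutoff(
    patients: Sequence[float], controls: Sequence[float]
) -> Optional[Tuple[float, float, float, float]]:
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    Scores below the cutoff are classified as hyposmic. Returns None when
    all observed scores are identical, because no cutoff can separate them.
    """
    values = sorted(set(patients) | set(controls))
    # Candidate cutoffs: midpoints between adjacent observed scores
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for c in candidates:
        sensitivity = sum(s < c for s in patients) / len(patients)
        specificity = sum(s > c for s in controls) / len(controls)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (c, j, sensitivity, specificity)
    return best


# Toy scores constructed for illustration only (NOT the study data)
toy_patients = [2, 3, 4, 5, 5, 6, 7, 8]
toy_controls = [4, 6, 6, 7, 8, 9, 10, 11]
print(youden_cutoff(toy_patients, toy_controls))
```

Since NOST scores are integers, a midpoint cutoff such as 5.5 means in practice that scores of 5 or below fall on the hyposmic side and scores of 6 or above on the normosmic side.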

Table 6 ROC analysis of NOST score to detect hyposmia

Fig. 1 Mismatch rate of the 12 odorants. Note: Purple means a higher mismatch rate, yellow represents a lower rate, and green indicates successful matching

Fig. 2 Correlations between NOST and subtests of SST

Fig. 3 ROC curve of NOST

Discussion

Despite various possibilities to assess odor memory, currently none of the available tests can be performed by individuals themselves in an unassisted manner. Hence, this study aimed to develop an objective, self-administered assessment based on an odor memory task. To make it easy to use and comprehend, we designed the test as a memory matching game in which 12 pairs have to be matched from 24 jars.

A relatively stable test performance was observed after 14 days, with a significant correlation coefficient of 0.45. However, the retest reliability was lower than expected, which may result from the limited number of items (12 pairs of odorants). It has been reported that the test-retest correlation coefficients decreased from 0.93 to 0.73 and then to 0.60 when the number of olfactory items was reduced from 32 to 16 and then to 12 (e.g., [16, 17]). In addition, it has to be kept in mind that the subgroup invited for the test-retest analysis was relatively homogeneous in terms of olfactory function. If the variance between tested individuals had been larger, it is conceivable that the test-retest reliability would have been higher. The self-ratings of NOST and SST showed that, compared with the SST, participants could properly finish the NOST and were satisfied with their own test performance.
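The argument about sample homogeneity can be made concrete with a standard result from classical test theory, offered here as an illustrative sketch rather than an analysis performed in the study. If the error variance of a test is assumed to be the same in both groups, reliability in a more heterogeneous group follows from rel_new = 1 − (1 − rel_old) / k², where k is the ratio of the new to the old standard deviation of observed scores. The function name and the example ratio of 1.5 are hypothetical.

```python
def reliability_after_variance_change(rel_old: float, sd_ratio: float) -> float:
    """Projected reliability when the observed-score SD changes by sd_ratio.

    Assumes classical test theory with constant error variance:
    reliability = 1 - error_variance / observed_variance.
    sd_ratio = SD in the new group / SD in the original group.
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    return 1 - (1 - rel_old) / sd_ratio ** 2


# Hypothetical scenario: the observed retest correlation of 0.45, projected
# onto a group whose score SD is 1.5 times larger (assumed value).
print(round(reliability_after_variance_change(0.45, 1.5), 3))
```

Under these assumptions, a group with greater spread in olfactory function would yield a higher coefficient even though the measurement error of the test itself is unchanged.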

The present NOST showed good validity in relation to the SST, a standardized clinical test. It is well established that olfactory impairments in neurodegenerative diseases can be detected through olfactory identification, odor sensitivity and other dimensions of olfactory function [18]. Cognitive deficits such as memory loss and dementia are important early signs of prodromal neurodegenerative disease [19]. Therefore, an olfactory memory test, as a combined index of olfactory and cognitive function, could be a promising method for the diagnosis of neurodegenerative disease [20].

As a home test, the NOST offers some advantages that traditional olfactory tests do not provide. First, it can be conducted by individuals themselves, so that patients can quantify their olfactory dysfunction without the immediate need to present themselves repeatedly at specialized centers. This may be of particular significance in rural areas at some distance from larger cities, or in clinics without an established smell and taste department. That said, the test is not meant to replace professional diagnosis and counseling.

Home tests are often available online or in pharmacies, making them accessible to a wide range of people. This removes the necessity of making appointments and waiting in clinics, thereby saving patients’ time. In addition, the NOST appears to be useful for the rehabilitation of patients with olfactory dysfunction. Given the widespread olfactory dysfunction caused by COVID-19, the NOST can serve as a quick, convenient home test that allows patients to track their olfactory function. This distinguishes it from olfactory screening tests based on odor identification, where the correct odors are quickly learned and memorized, which strongly biases consecutive tests. Furthermore, the frequent odor exposure that the NOST may provide appears to make it a well-suited companion for olfactory training [21]. The present test could enable individuals to take control of their olfactory health by giving them a tool to monitor and manage it independently. This could be another promising direction for future research and application.

Moreover, severely hyposmic patients performed more poorly in the NOST than the normosmic group, further suggesting its ability to differentiate between severe hyposmia and normosmia. With a cutoff score of 5.5, the present test identified severe hyposmia with a sensitivity of 76.2% and a specificity of 77.6%.

In accordance with previous findings, the present results show that older people perform worse in the NOST, suggesting a significant age-related olfactory and cognitive decline [22, 23]. As for the gender effect on olfactory function, many studies have reported that women outperform men [11, 12, 14], while some found no significant differences [24]. These somewhat controversial results can be explained by the weak effect size for the factor “gender”, ranging from 0.08 to 0.30 [25]. In the current study, no gender effect was found.

Despite these promising results, several limitations remain. First, the current test was performed with small glass containers, which are very practical but bulky. Because of that, the test may not be very convenient to use when space is limited. The NOST could be improved for self-administration if it were presented in a more portable and convenient form, such as smaller bottles or other delivery devices. Furthermore, the NOST should also be applied in anosmic patients in future studies to evaluate whether it can distinguish anosmia from hyposmia/normosmia. In addition, because the NOST is intended as an auxiliary diagnostic tool, the selection of norms and cutoff values is crucial; future studies with larger sample sizes are expected to yield more accurate values.

In conclusion, the present study supports the reliability, validity and possible clinical use of the NOST. Compared with existing tools, it can be comprehended and conducted easily and without any help from others, which may provide a quick and simple approach to obtain a global estimate of one’s olfactory and cognitive condition. Among other uses, it may help not only to facilitate the early diagnosis of neurodegenerative diseases, but also to recognize olfactory dysfunction as well as recovery from olfactory loss.