Abstract. We present the first study of the multilingual transfer ability of SignCLIP for Italian Sign Language (LIS) across zero-shot, few-shot, and fine-tuning. We find that its pretraining induces negative zero-shot transfer. In contrast, few-shot results confirm robust sign embeddings. We find monolingual fine-tuning highly effective on small datasets, achieving top results with Global Noise-Contrastive Estimation (GlobalNCE) and parameter-efficient ProLIP, compared to InfoNCE.
Sign languages are the primary means of communication for millions of deaf individuals worldwide [1], [2]. Isolated Sign Language Recognition (ISLR) remains an open research area at the intersection of computer vision (CV) and natural language processing (NLP) [1], [3]. As in spoken language research, there is a large discrepancy in the efficacy of state-of-the-art solutions between high-resource and low-resource languages. Unlike the widely studied American (ASL), British (BSL), and Chinese (CSL) Sign Languages, Italian Sign Language (LIS) remains under-resourced, lacking the large-scale annotated corpora required to train deep neural networks that recognise it effectively [2].
To overcome this limitation, recent research has pivoted toward transfer learning and few-shot recognition, leveraging models pre-trained on large multilingual datasets [4]. One of these models is SignCLIP, which utilises contrastive learning to project spoken language text and sign language videos into a shared embedding space. It is pre-trained on Spreadthesign, a dataset containing approximately 500,000 video clips in up to 44 different sign languages [1], including LIS. However, downstream evaluations for LIS recognition and text-video retrieval tasks were entirely omitted in its original benchmarks [1].
Consequently, the applicability of these multilingual priors to low-resource LIS datasets remains unexamined. To address this research gap, we present the first investigation for LIS that explores the performance of zero-shot, few-shot, and fine-tuning paradigms using SignCLIP as a foundation model, evaluating both cross-modal video-text retrieval and ISLR. For this evaluation, we use two datasets: A3LIS-147, introduced in [5], and SignIT [2]. These datasets enable complementary evaluations by contrasting a controlled, balanced multi-signer environment with domain-specific vocabulary against naturalistic, unbalanced, core-vocabulary signs, respectively.
Our main findings are as follows:
LIS ISLR research has largely targeted small-scale, controlled settings, utilising the A3LIS-147 dataset as the primary benchmark [6], [7]. Early approaches with Hidden Markov Models (HMMs) have been improved upon by more recent work reaching an accuracy of 80.4% with fully-supervised CNN models (Inception3D and SlowFast) [6].
The above work suffers from several structural limitations in the context of scalable SLR. The architectures employed are incapable of adapting to out-of-dictionary vocabulary without retraining. Furthermore, they are optimised for clean artifact-free datasets, potentially suffering in performance with out-of-distribution noisy data seen during real-world deployment [8].
To address the latter, the SignIT dataset was recently introduced to benchmark LIS ISLR on real-world data. Baseline evaluations of the SignIT dataset demonstrate that current state-of-the-art approaches struggle to effectively classify LIS signs at the gloss level, as opposed to the categorical level [2].
Early Zero-Shot SLR attempts struggled due to cross-lingual complexities between signs and natural language, as well as high variation in sign execution [10], [11], resulting in a pivot towards few-shot, visual retrieval paradigms [4].
Bilge et al. introduced Few-Shot Sign Language Recognition (FSSLR) via a meta-learning framework across sign languages, proving sparse source examples can generalise to unseen target languages. They discovered “synonym” subsets between languages failed to yield higher performance, suggesting signs are heavily diversified rather than net-iconic [4].
Similarly, Vandendriessche et al. (2025) embedded pose keypoints for distance-based visual retrieval, enabling one-shot ISLR that generalises to out-of-domain vocabularies without any retraining. Both frameworks operate entirely within the visual domain: they achieve high cross-lingual transferability but lack any inherent coupling to natural language text or semantic meaning [8].
In contrast, Cheng et al. utilise contrastive learning in CiCo to model retrieval as a cross-lingual problem, successfully aligning a single sign language video modality directly to a spoken language text space (e.g., ASL to English). It trains a domain-agnostic sign encoder before the domain-aware retrieval [12].
SignCLIP aligns multilingual signs to a single text space in English (as a matter of efficiency). The work relies on the ‘Iconicity Hypothesis’, which holds that universal motion primitives are semantically shared across sign languages, and adapts the distributional hypothesis to sign language. The model captures the core meaning of a sign as a ‘cluster centre’ in the embedding space, preserving the individual variance of different signers. However, mapping these clusters is made difficult by the Spreadthesign corpus, which is skewed to only one video per sign per language [1].
SignCLIP uses cross-lingual contrastive learning with prefixed language-identifying tokens, e.g., ‘<en> <ase> {word}’ for ASL. Ultimately, the authors note that the model’s zero-shot performance on out-of-domain data is deficient, and they posit that few-shot learning or fine-tuning is necessary to achieve noticeable performance [1].
The authors do not investigate the underlying architectural or semantic mechanisms that cause this failure, leaving the specific limitations of their cross-modal alignment unexamined.
We follow the same pipeline used for the training of the frozen SignCLIP backbone [1], including:
We investigate whether SignCLIP’s multilingual pretraining generalises to LIS, a language present in the pretraining corpus but excluded from the original evaluation. Our approach tests this through three phases: zero-shot evaluation, to assess the frozen multilingual prior’s native LIS structure; few-shot adaptation, to evaluate whether this structure supports recognition from a minimal number of examples using a frozen backbone; and a fine-tuning ablation, to determine the performance ceiling of lightweight fine-tuning whilst preserving cross-modal alignment.
A3LIS-147 and SignIT together evaluate model transfer across three axes:
The zero-shot evaluation applies the frozen SignCLIP checkpoint directly to both datasets. Predictions are generated by computing the cosine similarity between the video embedding and the text embedding of the English gloss (prompted as <en> <lis> [gloss]).
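The retrieval step described above can be illustrated with a minimal sketch. The three-dimensional vectors below are hypothetical stand-ins for the outputs of the frozen video and text encoders, and the helper names (`build_prompt`, `rank_glosses`) are illustrative rather than part of the SignCLIP codebase.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def build_prompt(gloss, lang_tag="<lis>"):
    """Prefix the English gloss with the language tokens: '<en> <lis> gloss'."""
    return f"<en> {lang_tag} {gloss}"

def rank_glosses(video_emb, text_embs):
    """Return the prompts sorted by descending cosine similarity to the video."""
    scored = [(cosine(video_emb, emb), prompt) for prompt, emb in text_embs.items()]
    return [prompt for _, prompt in sorted(scored, reverse=True)]

# Hypothetical toy embeddings; real ones come from the frozen encoders.
text_embs = {
    build_prompt("water"): [1.0, 0.0, 0.0],
    build_prompt("train"): [0.0, 1.0, 0.0],
    build_prompt("school"): [0.0, 0.0, 1.0],
}
video_emb = [0.2, 0.9, 0.1]

ranking = rank_glosses(video_emb, text_embs)
print(ranking[0])  # top-1 prediction for this video
```

The position of the correct prompt in `ranking` is the rank from which the retrieval metrics are derived.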
We report Recall@1, 5, 10, and Median Rank. To better investigate the “Iconicity Hypothesis” and transfer ability, we perform a per-class analysis stratified by Category, Median Rank, qualitative ASL/BSL similarity (iconicity proxy), and Spreadthesign presence.
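For reference, these metrics can be computed from the 1-based rank of the correct item for each query, as in the sketch below. The rank values are hypothetical and only illustrate the arithmetic.

```python
from statistics import median

def recall_at_k(ranks, k):
    """Fraction of queries whose correct item is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

def median_rank(ranks):
    """Median of the 1-based ranks of the correct items (lower is better)."""
    return median(ranks)

# Hypothetical ranks of the correct gloss for eight query videos.
ranks = [1, 3, 7, 12, 40, 2, 55, 9]

metrics = {f"R@{k}": recall_at_k(ranks, k) for k in (1, 5, 10)}
metrics["MedR"] = median_rank(ranks)
print(metrics)
```

Per-class variants follow by grouping the ranks by gloss or category before applying the same functions.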
Translation. We manually translated A3LIS-147 using Spreadthesign. Remaining out-of-vocabulary (OOV) terms were translated as accurately as possible. We also recreated the unavailable categories. Both are listed in Appendix D.
We evaluate few-shot ISLR to determine whether the solid results reported in the SignCLIP paper generalise to LIS.
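One common few-shot protocol on a frozen backbone, and a plausible reading of the "Proto" mode reported in the result tables, is prototypical retrieval: each class is represented by the mean of its support embeddings, and a query is assigned to the nearest prototype. The sketch below assumes L2-normalised embeddings and cosine similarity; these choices and the toy vectors are assumptions, not details confirmed by the source.

```python
import math
from collections import defaultdict

def l2_normalise(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def build_prototypes(support):
    """support: list of (label, embedding). Returns label -> unit-length class mean."""
    grouped = defaultdict(list)
    for label, emb in support:
        grouped[label].append(l2_normalise(emb))
    protos = {}
    for label, embs in grouped.items():
        mean = [sum(col) / len(embs) for col in zip(*embs)]
        protos[label] = l2_normalise(mean)
    return protos

def classify(query, protos):
    """Assign the query to the prototype with the highest cosine similarity."""
    q = l2_normalise(query)
    return max(protos, key=lambda label: sum(a * b for a, b in zip(q, protos[label])))

# Hypothetical 2-d video embeddings, two support examples per gloss.
support = [
    ("acqua", [0.9, 0.1]), ("acqua", [0.8, 0.3]),
    ("treno", [0.1, 0.9]), ("treno", [0.2, 0.7]),
]
protos = build_prototypes(support)
pred = classify([0.7, 0.2], protos)
print(pred)
```

Because no parameters are trained, this setting isolates the quality of the frozen video embeddings.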
We initialise from the baseline checkpoint and fine-tune on A3LIS-147 using a 70/10/20 signer-stratified split. This ensures that our evaluation measures generalisation to unseen signers (see Appendix D for the exact partition). Each configuration is trained for 50 epochs and evaluated across zero-shot retrieval, linear probing, and prototypical retrieval. For all the details about the hyperparameters used, see Appendix B.
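A split that measures generalisation to unseen signers has to keep every signer in exactly one partition. The sketch below shows one way to build such a 70/10/20 split; the sample structure, signer identifiers, and seed are hypothetical, and the exact partition used in the paper is the one listed in Appendix D.

```python
import random

def split_by_signer(samples, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split samples into train/val/test so that no signer spans two partitions."""
    signers = sorted({s["signer"] for s in samples})
    random.Random(seed).shuffle(signers)
    n_train = round(ratios[0] * len(signers))
    n_val = round(ratios[1] * len(signers))
    groups = {
        "train": set(signers[:n_train]),
        "val": set(signers[n_train:n_train + n_val]),
        "test": set(signers[n_train + n_val:]),
    }
    return {name: [s for s in samples if s["signer"] in ids]
            for name, ids in groups.items()}

# Hypothetical data: ten signers, three glosses each.
samples = [{"signer": f"s{i}", "gloss": g}
           for i in range(10) for g in ("acqua", "treno", "casa")]
splits = split_by_signer(samples)
print({name: len(items) for name, items in splits.items()})
```

Splitting at the signer level rather than the clip level prevents signer-specific appearance or motion cues from leaking into the test set.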
The text Transformer and CNN backbone are frozen to preserve pre-trained semantic anchors. We unfreeze only the visual adaptation parameters: the video token MLP, the video Transformer encoder, and the logit-scale temperature.
Because contrastive models exhibit high sensitivity to objectives and batch scales on low-resource datasets, we conduct an ablation evaluating the following optimisation regimes:
The single best-performing fine-tuning regime identified on A3LIS-147 is applied to SignIT (details in Appendix B). To address the dataset’s naturalistic acquisition and long-tailed distribution, we apply light spatial augmentation to preserve semantic meaning, and heavier temporal augmentation (aug_sigma_temporal: 0.25, aug_sigma_spatial: 0.15, aug_sigma_noise: 0.002, aug_p_flip: 0.0, aug_strength_max: 3.5). SignIT’s richer macro-categories and the previous literature on the dataset motivated additional experiments on category-level zero-shot and few-shot retrieval. For these experiments, we include recall, precision, and F1 alongside R@1 for better comparison with the original authors.
Baseline zero-shot evaluations in Table 1 and Table 2 show poor overall performance, in line with the findings of Jiang et al. for out-of-domain transfer [1]. However, there is stratification between categories. In SignIT, the ‘Food’ domain achieves the highest exact retrieval (10.96% R@1), while ‘Emotions’ demonstrates superior neighbourhood alignment (51.60% R@10). Similarly, A3LIS-147 exhibits a split between early recall (‘Common Life’, 7.22% R@1) and broader neighbourhood density (‘Public Institute’, MedR 42.5). This variance indicates that while overall cross-lingual transfer is weak, the model successfully transfers universal, cross-lingual iconic primitives from the pretraining distribution for specific semantic clusters. For gloss-level and more category details, see Appendix C.
| Cat. | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|
| Animals | 0.041 | 0.149 | 0.3108 | 23.6 |
| Colors | 0.0572 | 0.2321 | 0.4353 | 18.2 |
| Emotions | 0.04 | 0.3117 | 0.516 | 13.2 |
| Family | 0.0071 | 0.0155 | 0.0496 | 42.1 |
| Food | 0.1096 | 0.3614 | 0.4947 | 13.7 |
| Overall | 0.0506 | 0.1876 | 0.326 | 24.0 |
| Cat. | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|
| Common Life | 0.0722 | 0.2167 | 0.2722 | 48.9 |
| Education | 0.0433 | 0.1233 | 0.17 | 61.0 |
| Highway | 0.025 | 0.125 | 0.25 | 50.1 |
| Hospital | 0.0263 | 0.0895 | 0.1684 | 46.0 |
| Public Institute | 0.0447 | 0.1342 | 0.2 | 42.5 |
| Railway Station | 0.0083 | 0.0333 | 0.0833 | 47.9 |
| Overall | 0.0356 | 0.1114 | 0.1732 | 49.2 |
| Tier | MedR Range | Portion | Cum. MedR |
|---|---|---|---|
| Great | 1–3 | 0.0537 | 1.6 |
| Good | 3.1–15 | 0.1678 | 7.2 |
| Fair | 15.1–40 | 0.2282 | 18.4 |
| Neutral | 40.1–74 | 0.3087 | 33.0 |
| Adverse | 74.1–148 | 0.2416 | 49.2 |
| Tier | MedR Range | Portion | Cum. MedR |
|---|---|---|---|
| Great | 1–3 | 0.043 | 1.9 |
| Good | 3.1–10 | 0.226 | 5.6 |
| Fair | 10.1–25 | 0.366 | 12.3 |
| Neutral | 25.1–47 | 0.226 | 18.5 |
| Adverse | 47.1–93 | 0.140 | 24.0 |
| LIS Sign in STS | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|
| No | 0.0413 | 0.1109 | 0.1543 | 51.1 |
| Yes | 0.0337 | 0.1169 | 0.1888 | 48.4 |
| Yes, but different | 0.0286 | 0.0786 | 0.1357 | 47.9 |
| Iconicity Proxy (UK/US) | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|
| Kind of | 0.0143 | 0.0571 | 0.1214 | 60.1 |
| No | 0.026 | 0.1135 | 0.1698 | 50.5 |
| Yes | 0.0667 | 0.1256 | 0.2 | 42.0 |
SignCLIP’s cross-lingual alignment induces a structurally bimodal transfer effect. We argue that since the pre-trained text encoder operates in an English-centric semantic space, language prefix identifiers provide insufficient separation. Consequently, the objective forces visually disparate sign videos toward a quasi-singular text anchor. This semantic asymmetry creates an optimisation conflict that marginalises low-resource languages, resulting in negative transfer, evidenced by the adverse tiers in A3LIS-147 (24.16%) and SignIT (14.0%) in Table 3 and Table 4.
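The tier portions referenced above follow from binning each gloss by its median rank. The sketch below uses the A3LIS-147 ranges from Table 3; the per-gloss median ranks are hypothetical values chosen only to illustrate the binning, and the helper names are illustrative.

```python
from collections import Counter

# Upper MedR bound of each tier, following the A3LIS-147 ranges in Table 3.
A3LIS_TIERS = [("Great", 3), ("Good", 15), ("Fair", 40), ("Neutral", 74), ("Adverse", 148)]

def tier_of(med_rank, tiers=A3LIS_TIERS):
    """Return the first tier whose upper bound is at least the median rank."""
    for name, upper in tiers:
        if med_rank <= upper:
            return name
    raise ValueError(f"median rank {med_rank} exceeds the last tier bound")

def tier_portions(per_gloss_medr, tiers=A3LIS_TIERS):
    """Share of glosses falling into each tier."""
    counts = Counter(tier_of(m, tiers) for m in per_gloss_medr.values())
    total = len(per_gloss_medr)
    return {name: counts[name] / total for name, _ in tiers}

# Hypothetical per-gloss median ranks.
per_gloss_medr = {"caldo": 2.0, "notte": 9.5, "treno": 61.0, "scuola": 120.0}
portions = tier_portions(per_gloss_medr)
print(portions)
```

Since every gloss lands in exactly one tier, the portions of a table built this way should sum to one, which is a useful sanity check on the reported values.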
For iconic signs, the shared anchor is beneficial (MedR 42.0 in Table 6); for non-iconic and only partially similar signs, it provides a weak or adversarial signal that degrades retrieval (MedR 50.5 and 60.1, respectively). Pre-training exposure does not overcome this issue: Table 5 shows that OOV LIS signs marginally outperform in-vocabulary signs at R@1 (4.13% vs. 3.37%).
We believe data scaling is unlikely to resolve these failures. Shared human articulatory constraints result in heavy overlap in the discriminative features between languages, a problem further complicated by high individual signer-variance (Figure 1 in Appendix). Thus, diversification within synonym classes [4] and cross-lingual “false friends” lead to gradient conflicts. Our findings suggest these factors limit zero-shot performance for any architecture imposing a single joint embedding space without language-gated alignment. These issues can be resolved by monolingual fine-tuning (Table 8), likely at the expense of multilingual understanding, but this remains unexamined.
Linear probing on the frozen backbone achieves 66.78% R@1 on A3LIS-147 in Table 7, confirming that the video encoder learns robust representations.
Table 7 shows that GlobalNCE yields the strongest fine-tuning performance on A3LIS-147 and that its linear probe (80.20% R@1) is on par with the previous SOTA of 80.4% [6]. We attribute this to its global negative sampling across distributed batches, which provides the density of hard negatives required to stabilise contrastive gradients. ProLIP comes within 0.33 percentage points of GlobalNCE in zero-shot R@1 (75.84% vs. 76.17%) while adapting only the final MLP layer and logit scale, making it the preferred regime when compute or overfitting risk is the primary concern.
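To make the role of in-batch negatives concrete, the following is a minimal sketch of the standard symmetric InfoNCE objective over index-aligned video-text pairs. It is not the GlobalNCE implementation: per the description above, GlobalNCE additionally draws negatives across distributed batches, which is not shown here. The temperature of 0.07 and the toy embeddings are assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def log_softmax_at(logits, index):
    """Numerically stable log-softmax evaluated at one index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[index] - log_z

def info_nce(video_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE; pair i is the positive, all other pairs are negatives."""
    n = len(video_embs)
    sim = [[dot(v, t) / temperature for t in text_embs] for v in video_embs]
    # Video-to-text direction: softmax over each row of the similarity matrix.
    v2t = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Text-to-video direction: softmax over each column.
    t2v = -sum(log_softmax_at([sim[j][i] for j in range(n)], i) for i in range(n)) / n
    return 0.5 * (v2t + t2v)

# Hypothetical unit-length embeddings for a batch of two pairs.
videos = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(videos, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce(videos, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

With a batch of size n, each positive competes against only n - 1 negatives, which is one reason small batches on low-resource datasets can benefit from a larger pool of negatives.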
| Method | Mode | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|---|
| Baseline | Zero | 0.0369 | 0.1309 | 0.1946 | 40 |
| | Proto | 0.6477 | 0.9094 | 0.9698 | 1 |
| | LP | 0.6678 | 0.9329 | 0.9698 | 1 |
| GlobalNCE 16 | Zero | 0.7617 | 0.9262 | 0.9698 | 1 |
| | Proto | 0.7886 | 0.9430 | 0.9732 | 1 |
| | LP | 0.8020 | 0.9430 | 0.9732 | 1 |
| ProLIP 16 | Zero | 0.7584 | 0.9161 | 0.953 | 1 |
| | Proto | 0.7718 | 0.9396 | 0.9597 | 1 |
| | LP | 0.7785 | 0.9364 | 0.9564 | 1 |
Table 8 shows that augmentation on SignIT improves generalisation. Our results trail the LLaVA-OneVision video+pose model (Acc 0.238) of the SignIT authors [2]. However, we outperform all non-video baselines they evaluated, including pose-only LLaVA (Acc 0.121), establishing a competitive keypoint-only result.
| Model | Mode | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|---|
| Baseline | Zero | 0.0359 | 0.1692 | 0.2769 | 22.0 |
| | Proto | 0.0974 | 0.3077 | 0.4462 | 13.0 |
| | LP | 0.0923 | 0.3641 | 0.5333 | 10.0 |
| Fine-tune | Zero | 0.1385 | 0.4308 | 0.5538 | 7.0 |
| | Proto | 0.1487 | 0.4256 | 0.5692 | 8.0 |
| | LP | 0.1538 | 0.4308 | 0.5744 | 7.0 |
| Fine-tune + Aug | Zero | 0.1436 | 0.4308 | 0.6154 | 8.0 |
| | Proto | 0.1744 | 0.4103 | 0.6103 | 8.0 |
| | LP | 0.1744 | 0.4462 | 0.5897 | 8.0 |
Zero-shot on categories achieves an F1-score (0.48) that is competitive with some fully supervised video baselines, such as I3D (0.34 F1)[2]. Because this relies on measuring the distance between visual embeddings and the textual embeddings of broad macro-categories, these results highlight an advantage of contrastive pretraining: the latent space is semantically organised, allowing the model to generalise to categorical distributions it never explicitly encountered during pretraining. Our strongest few-shot linear-probe configuration reaches 64.62% R@1, approaching the performance of SignIT’s best fully supervised MLP (0.726 Accuracy) [2].
| Model | Mode | R@1 | Pr | Re | F1 |
|---|---|---|---|---|---|
| Baseline | Zero | 0.3744 | 0.54 | 0.34 | 0.30 |
| | Proto | 0.4103 | 0.3909 | 0.3921 | 0.3844 |
| | LP | 0.5846 | 0.61 | 0.52 | 0.55 |
| Fine-tune | Zero | 0.4872 | 0.48 | 0.55 | 0.48 |
| | Proto | 0.5641 | 0.5219 | 0.5371 | 0.5251 |
| | LP | 0.6462 | 0.64 | 0.59 | 0.61 |
| Fine-tune + Aug | Zero | 0.4974 | 0.49 | 0.52 | 0.48 |
| | Proto | 0.5949 | 0.5561 | 0.5708 | 0.5503 |
| | LP | 0.6103 | 0.68 | 0.57 | 0.59 |
| Random Chance | R@1 | R@2 | MedR |
|---|---|---|---|
| 0.1250 | 0.3510 | 0.6523 | 2.0 / 8 |
False positives: lsf 20, bsl 688, ngt 227, and lse 32.
The sign language identification results in Table 10 complicate our earlier finding that in-vocabulary LIS signs do not outperform OOV signs. This simplified retrieval task suggests that SignCLIP does learn some language separation, as shown by the R@2 (65.23%). However, performance drops sharply at R@1 (35.10%), with substantial confusion between LIS, BSL, and NGT (Appendix A.3). It may be worth investigating whether this is due to higher inter-language iconicity.
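The R@1 figure and the confusion pattern can be recomputed from the per-language counts reported in the appendix, as the short worked example below shows. We read each count as the number of LIS videos for which that language-prefixed prompt ranked first; this interpretation is an assumption based on the table layout.

```python
# Top-1 counts per language prefix, taken from the appendix table.
counts = {"lis": 523, "ase": 0, "dgs": 0, "lsf": 20,
          "bsl": 688, "ngt": 227, "lse": 32, "csl": 0}

total = sum(counts.values())
proportions = {lang: n / total for lang, n in counts.items()}

r_at_1 = proportions["lis"]                    # share of videos correctly identified as LIS
chance = 1 / len(counts)                       # uniform guess over eight candidates
false_positives = total - counts["lis"]        # videos assigned to another language

print(total, round(r_at_1, 4), chance, false_positives)
```

The resulting LIS share reproduces the reported R@1 up to rounding and sits well above the eight-way chance level, while BSL alone accounts for more top-1 predictions than LIS.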
This work demonstrates that SignCLIP’s contrastive alignment induces a structurally bimodal transfer effect on LIS, beneficial for iconic vocabulary, adverse for non-iconic signs, indicating a geometric limitation of the shared embedding space paradigm rather than a data-scaling problem. Few-shot and fine-tuning strategies mitigate these limitations, confirming that the video encoder learns discriminative representations that zero-shot retrieval cannot exploit without fine-tuning in a monolingual context.
We see two promising directions for future research. Since pretraining exposure to LIS signs does not guarantee positive transfer, fine-tuning on the LIS-specific Spreadthesign subset could be adequate for OOD LIS. A more effective multilingual embedding space requires language-conditioned projections that both allow for iconicity transfer and decouple text anchors for non-iconic glosses across sign languages.
| Metric | Mean | Std. Dev. |
|---|---|---|
| R@1 | 0.7148 | 0.0631 |
| R@5 | 0.9403 | 0.0276 |
| R@10 | 0.9725 | 0.0176 |
Signer variability, presented in Figure 1, primarily degrades R@1, as shown by its standard deviation of 6.3 percentage points. Broader retrieval remains robust. This variance underscores cross-signer generalisation as a persistent difficulty.
In Section 4.5, we presented a condensed view of our A3LIS-147 fine-tuning ablation, highlighting the performance of the default SignCLIP objective (NCE) against our best-performing GlobalNCE regime. Table 11 presents the comprehensive results across all evaluated loss functions, batch sizes, and sampling strategies.
| Method | Mode | R@1 | R@5 | R@10 | MedR |
|---|---|---|---|---|---|
| Baseline | Zero | 0.0369 | 0.1309 | 0.1946 | 40 |
| | Proto | 0.6477 | 0.9094 | 0.9698 | 1 |
| | LP | 0.6678 | 0.9329 | 0.9698 | 1 |
| InfoNCE128 | Zero | 0.7248 | 0.906 | 0.9396 | 1 |
| | Proto | 0.7584 | 0.9430 | 0.9765 | 1 |
| | LP | 0.7617 | 0.9597 | 0.9799 | 1 |
| SupCon32x4 | Zero | 0.5912 | 0.8591 | 0.9128 | 1 |
| | Proto | 0.7013 | 0.9128 | 0.9664 | 1 |
| | LP | 0.7785 | 0.9396 | 0.9765 | 1 |
| Cross-Entropy 16 | Zero | 0.0503 | 0.1611 | 0.245 | 33 |
| | Proto | 0.772 | 0.946 | 0.987 | 1 |
| | LP | 0.7651 | 0.9463 | 0.9799 | 1 |
| GlobalNCE 16 | Zero | 0.7617 | 0.9262 | 0.9698 | 1 |
| | Proto | 0.7886 | 0.9430 | 0.9732 | 1 |
| | LP | 0.802 | 0.943 | 0.9732 | 1 |
| ProLIP 16 | Zero | 0.7584 | 0.9161 | 0.953 | 1 |
| | Proto | 0.7718 | 0.9396 | 0.9597 | 1 |
| | LP | 0.7785 | 0.9364 | 0.9564 | 1 |
| DHN-NCE 64 | Zero | 0.7081 | 0.8926 | 0.9295 | 1 |
| | Proto | 0.7651 | 0.9497 | 0.9732 | 1 |
| | LP | 0.7617 | 0.9564 | 0.9765 | 1 |
| Target language | Count | Proportion |
|---|---|---|
| <en> <lis> | 523 | 0.351 |
| <en> <ase> | 0 | 0 |
| <en> <dgs> | 0 | 0 |
| <en> <lsf> | 20 | 0.0134 |
| <en> <bsl> | 688 | 0.4618 |
| <en> <ngt> | 227 | 0.1523 |
| <en> <lse> | 32 | 0.0215 |
| <en> <csl> | 0 | 0 |
| Parameter | Value |
|---|---|
| Base Checkpoint | signclip_v1_1 |
| Model Architecture | MMFusionSeparate |
| Video Encoder | MMBertForEncoder (12 layers, dim: 609) |
| Text Encoder | BertModel (bert-base-cased) |
| Loss Function | GlobalNCE |
| Optimiser | Adam () |
| Base Learning Rate | 5.0e-05 |
| LR Scheduler | Polynomial Decay (122 warmup updates) |
| Weight Decay | 0.02 |
| Gradient Clipping | 2.0 (Max Norm) |
| Max Epochs | 50 |
| Batch Size | 16 |
| Precision | FP16 Mixed Precision |
| Max Sequence Length | Video: 256 frames / Text: 64 tokens |
| Pose Components | reduced_face |
| Data Augmentation | Temporal (), Spatial (), Noise () |
Note that for ProLIP, two additional hyperparameters are set: prolip_lambda: 0.5 and prolip_lambda_mode: inv_n.
| Parameter | Value |
|---|---|
| Base Checkpoint | signclip_v1_1 |
| Model Architecture | MMFusionSeparate |
| Video Encoder | MMBertForEncoder (12 layers, dim: 609) |
| Text Encoder | BertModel (bert-base-cased) |
| Loss Function | (depends on experiment) |
| Video SupCon Weight | 0.5 |
| Optimiser | Adam () |
| Base Learning Rate | 5.0e-05 |
| LR Scheduler | Polynomial Decay (122 warmup updates) |
| Weight Decay | 0.01 |
| Gradient Clipping | 2.0 (Max Norm) |
| Max Epochs | 50 |
| Batch Size | 16 |
| Precision | FP16 Mixed Precision |
| Max Sequence Length | Video: 256 frames / Text: 64 tokens |
| Pose Components | reduced_face |
| Data Augmentation | Temporal Augmentation Enabled |
1. Great (1-3): bear, bread, color, watermelon.
2. Good (3.1-10): anger, brown, cake, chocolate, cow, fear, fuchsia, giraffe, grey, joy, light colors, orange, pizza, relatives, rooster, salt, sheep, snail, tiger, vegetable, wine.
3. Fair (10.1-25): apple, banana, bird, blue, butterfly, candy, cat, dark colors, disgust, donkey, family, fish, frog, fruit, grandfather, green, horse, light blue, lion, meat, monkey, parents, pasta, pear, pig, pineapple, pink, purple, rabbit, rice, spider, turtle, yellow, zebra.
4. Neutral / Random (25.1-47): aunt, black, brother-in-law, bull, cousin, crocodile, dad, daughter-in-law, dog, elephant, goat, goose, grandmother, milk, parrot, red, sadness, sky blue, uncle, water, wolf.
5. Adverse (47.1-93): boyfriend, brother, hen, husband, mom, mouse, nephew, sister, snake, son, son-in-law, white, wife.
1. Great (1-3): caldo, data, falconara, freddo, giudizio, iniezione, scadenza, senigallia.
2. Good (3.1-15): abitare, affitto, ancona, aperto, avviso, consegnare, dirigente, dolore, emergenza, jesi, macerata, modello, modulo, multa, notte, pomeriggio, presente, pubblica, ritirare_il_numero, sciopero, sostegno, traffico, tratta, vacanze, verde.
3. Fair (15.1-40): acqua, allegare, ambulanza, annullato, arrivo, ascoli, banca, binario, cambio, commissione, compilare, costo, cura, domenica, esame, fermo, giallo, giovedì, giorno, infermiere, infezione, istituto, marche, mattina, medico, operazione, partenza, promosso, provincia, ritardo, s.benedetto, tassa, torino, università.
4. Neutral / Random (40.1-74): abbonamento, allergia, amministrazione, andata, andata_e_ritorno, assente, assistente_alla_comunicazione, bidello, biglietto, bocciato, casa, casello, chiuso, cibo, civitanova, comune, diploma, disinfettare, fano, venerdì, giorni, ieri, laurea, litro, lunedì, martedì, mercoledì, mesi, obliterare, ospedale, pesaro-urbino, posta, rallentamenti, regione, ricevuta, ritorno, roma, rosso, segretario, sera, sindaco, stazione, strada, treno.
5. Adverse (74.1-148): asilo_nido, assessore, assistente, autostrada, domani, elementari, ente_pubblico, entro, flebo, impiegato, interprete, lingua_dei_segni, malattia, mangiare, marca_da_bollo, medie, nota, oggi, obliteratrice, orari, preside, professore, pronto_soccorso, registro, sabato, sala_d’attesa, scuola, scuola_materna, sil, superiori, sportello, studente, tecnico, telefono, ufficio_informazioni, voto.
| Category | Great | Good | Fair | Neutral | Adverse |
|---|---|---|---|---|---|
| animals | 1 | 6 | 14 | 8 | 3 |
| colors | 1 | 5 | 7 | 3 | 1 |
| emotions | 0 | 3 | 1 | 1 | 0 |
| family | 0 | 1 | 3 | 7 | 9 |
| food | 0 | 2 | 6 | 9 | 2 |
| Category | Great | Good | Fair | Neutral | Adverse |
|---|---|---|---|---|---|
| common life | 2 | 5 | 2 | 5 | 4 |
| education | 2 | 3 | 6 | 6 | 13 |
| highway | 0 | 3 | 0 | 3 | 2 |
| hospital | 1 | 3 | 7 | 4 | 4 |
| public inst. | 3 | 8 | 8 | 11 | 8 |
| railway station | 0 | 3 | 11 | 17 | 5 |
The following table provides the full mapping used for our A3LIS-147 analysis, including category classification, presence in the SpreadTheSign (STS) corpus, and our qualitative iconicity proxy (visual similarity to English-speaking sign languages).
| Italian | English | Category | In STS? | Iconicity Proxy |
|---|---|---|---|---|
| abbonamento | subscription | railway station | yes but different | no |
| abitare | live | common life | yes | no |
| acqua | water | common life | yes | no |
| affitto | rent | common life | yes | no |
| allegare | attach | education | no | yes |
| allergia | allergy | hospital | yes | no |
| ambulanza | ambulance | hospital | yes | no |
| amministrazione | administration | public institute | yes | yes |
| ancona | ancona | public institute | no | no |
| andata | one way | railway station | no | no |
| andata_e_ritorno | round trip | railway station | no | no |
| annullato | cancelled | railway station | yes | yes |
| aperto | open | common life | yes | yes |
| arrivo | arrival | railway station | yes | no |
| ascoli | ascoli | public institute | no | no |
| asilo_nido | day nursery | education | yes but different | no |
| assente | absent | education | yes | no |
| assessore | assessor | public institute | no | no |
| assistente | assistant | public institute | yes | no |
| assistente_alla_comunicazione | communication assistant | public institute | no | no |
| autostrada | motorway | highway | yes | kind of |
| avviso | notice | education | yes | yes |
| banca | bank | public institute | yes | no |
| bidello | janitor | education | no | no |
| biglietto | ticket | railway station | yes | yes |
| binario | platform | railway station | yes but different | no |
| bocciato | failed | education | yes but different | no |
| caldo | hot | common life | yes but different | no |
| cambio | change | railway station | no | no |
| casa | home | common life | yes | no |
| casello | toll gate | highway | yes | yes |
| chiuso | closed | common life | yes but different | yes |
| cibo | food | common life | yes | yes |
| civitanova | civitanova | public institute | no | no |
| commissione | commission | education | yes but different | no |
| compilare | compile | public institute | yes | no |
| comune | municipality | public institute | yes | no |
| consegnare | deliver | common life | yes | yes |
| costo | cost | common life | yes | kind of |
| cura | care | hospital | yes | yes |
| data | date | public institute | yes | no |
| diploma | diploma | education | yes | yes |
| dirigente | executive | public institute | yes | yes |
| disinfettare | disinfect | hospital | no | no |
| dolore | pain | hospital | yes but different | no |
| domani | tomorrow | railway station | yes but different | kind of |
| domenica | sunday | railway station | yes | no |
| elementari | elementary school | education | no | no |
| emergenza | emergency | hospital | yes | no |
| ente_pubblico | public body | public institute | no | no |
| entro | within | education | no | no |
| esame | exam | education | yes | no |
| falconara | falconara | public institute | no | no |
| fano | fano | public institute | no | no |
| fermo | still | railway station | no | no |
| flebo | intravenous drip | hospital | no | no |
| freddo | cold | common life | yes | yes |
| giallo | yellow | hospital | yes | no |
| giorni | days | railway station | yes | no |
| giorno | day | railway station | yes | no |
| giovedì | thursday | railway station | yes but different | no |
| giudizio | judgement | education | no | yes |
| ieri | yesterday | railway station | yes | yes |
| impiegato | employee | public institute | yes but different | no |
| infermiere | nurse | hospital | yes but different | kind of |
| infezione | infection | hospital | no | no |
| iniezione | injection | hospital | no | yes |
| interprete | interpreter | public institute | yes | no |
| inviare_sms | messaging | common life | no | no |
| istituto | institute | education | yes | no |
| jesi | jesi | public institute | no | no |
| laurea | graduation | education | yes | no |
| lingua_dei_segni | sign language | common life | yes but different | no |
| litro | litre | common life | yes | yes |
| lunedì | monday | railway station | yes | no |
| macerata | macerata | public institute | no | no |
| malattia | illness | hospital | yes | no |
| mangiare | eat | common life | yes | yes |
| marca_da_bollo | revenue stamp | public institute | no | no |
| marche | marche | public institute | no | no |
| martedì | tuesday | railway station | yes | no |
| mattina | morning | railway station | yes | kind of |
| medico | doctor | hospital | yes | yes |
| medie | middle school | education | no | no |
| mercoledì | wednesday | railway station | yes | no |
| mesi | months | railway station | yes | no |
| modello | model | public institute | yes | no |
| modulo | form | public institute | yes | yes |
| multa | fine | highway | yes | yes |
| nota | note | education | yes | kind of |
| notte | night | railway station | yes | yes |
| obliterare | stamp | railway station | no | no |
| obliteratrice | stamping machine | railway station | no | no |
| oggi | today | railway station | yes | no |
| operazione | operation | hospital | no | no |
| orari | times | railway station | no | no |
| ospedale | hospital | hospital | yes | no |
| partenza | departure | railway station | yes | no |
| pesaro-urbino | pesaro-urbino | public institute | no | no |
| pomeriggio | afternoon | railway station | yes | yes |
| posta | | public institute | yes | kind of |
| presente | present | education | yes | no |
| preside | headmaster | education | yes | no |
| professore | professor | education | yes | no |
| promosso | promoted | education | no | yes |
| pronto_soccorso | first aid | hospital | yes | yes |
| provincia | province | public institute | yes but different | no |
| pubblica | public | public institute | yes | yes |
| rallentamenti | slowdowns | highway | no | yes |
| regione | region | public institute | yes | kind of |
| registro | log book | education | yes | yes |
| ricevuta | receipt | public institute | no | no |
| ritardo | delay | railway station | no | no |
| ritirare_il_numero | take the number | public institute | no | no |
| ritorno | return | railway station | no | no |
| roma | rome | public institute | yes | no |
| rosso | red | hospital | yes | kind of |
| s.benedetto | s.benedetto | public institute | no | no |
| sabato | saturday | railway station | yes | no |
| sala_d’attesa | waiting room | hospital | yes | no |
| scadenza | expiration | education | yes | no |
| sciopero | strike | railway station | yes | yes |
| scontrino | receipt | public institute | yes | kind of |
| scuola | school | education | yes | no |
| scuola_materna | nursery school | education | yes | no |
| segretario | secretary | education | yes | no |
| senigallia | senigallia | public institute | no | no |
| sera | evening | railway station | yes | kind of |
| sil | silence sign | common life | no | no |
| sindaco | mayor | public institute | yes | no |
| sostegno | aid | education | yes | kind of |
| sportello | reception window | public institute | yes | yes |
| stazione | station | railway station | yes | no |
| strada | street | highway | yes | yes |
| studente | student | education | yes | no |
| superiori | high school | education | yes | yes |
| tassa | fee | public institute | yes | kind of |
| tecnico | technician | highway | yes | yes |
| telefono | telephone | common life | yes | yes |
| torino | turin | public institute | no | no |
| traffico | traffic | highway | yes | no |
| tratta | section | highway | no | yes |
| treno | train | railway station | yes | kind of |
| ufficio_informazioni | information office | public institute | no | no |
| università | university | education | yes | no |
| vacanze | vacation | common life | yes | yes |
| venerdì | friday | railway station | yes | no |
| verde | green | hospital | yes | no |
| voto | voting | education | yes | yes |