The current protocol has been registered in the PROSPERO database (ID number: CRD42023394729).
Language and date
Only papers whose full text is available in French or English will be included. Since our review focuses on mNCD, which was introduced into the literature with the publication of the DSM-5 (2013), searches will be limited to the last ten years (2012-2023).
We will select studies that include healthy control subjects and patients with mild neurocognitive disorders (i.e., mild cognitive impairment (MCI), early-stage Alzheimer's Disease, Primary Progressive Aphasia, Parkinson's Disease, Multiple Sclerosis, Subjective Cognitive Decline, Right Hemispheric Stroke, Cortico Basal Syndrome, Behavioural Fronto Temporal Dementia, Lewy Body Dementia, Minor Stroke, Mild Traumatic Brain Injury).
We will exclude studies targeting individuals with major motor speech impairment (e.g., apraxia of speech) and/or individuals younger than 18 or older than 75. Studies with fewer than 10 participants in the patient group, with participants who score below the cut-off for mild cognitive impairment in screening tests (i.e., 21/30 for the Mini-Mental State Examination (MMSE) or 22/30 for the Montreal Cognitive Assessment (MoCA)), or without any control group will be excluded.
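As an illustration only, the population criteria above could be encoded as a simple screening check. The sketch below is not part of the protocol: the `StudyRecord` class and its field names are hypothetical, and it assumes that age and screening scores are available as group means for the patient group.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the exclusion criteria stated above.
MIN_AGE, MAX_AGE = 18, 75
MIN_PATIENTS = 10
MMSE_CUTOFF = 21  # out of 30
MOCA_CUTOFF = 22  # out of 30


@dataclass
class StudyRecord:
    """Hypothetical record; field names are illustrative assumptions."""
    n_patients: int
    mean_age: float
    has_control_group: bool
    major_motor_speech_impairment: bool
    mmse_mean: Optional[float] = None  # None when the test was not used
    moca_mean: Optional[float] = None


def exclusion_reasons(study: StudyRecord) -> List[str]:
    """Return every exclusion criterion the study meets (empty list = eligible)."""
    reasons = []
    if study.major_motor_speech_impairment:
        reasons.append("major motor speech impairment")
    if not MIN_AGE <= study.mean_age <= MAX_AGE:
        reasons.append("age outside 18-75")
    if study.n_patients < MIN_PATIENTS:
        reasons.append("fewer than 10 patients")
    if study.mmse_mean is not None and study.mmse_mean < MMSE_CUTOFF:
        reasons.append("MMSE below cut-off")
    if study.moca_mean is not None and study.moca_mean < MOCA_CUTOFF:
        reasons.append("MoCA below cut-off")
    if not study.has_control_group:
        reasons.append("no control group")
    return reasons
```

Returning the list of reasons, rather than a single yes/no value, makes it straightforward to report why each study was excluded.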
We will include studies that use recorded and transcribed connected speech tasks, such as picture-based narratives, storytelling, or conversations. Studies that use reading, writing, verbal fluency, repetition, or naming tasks without any connected speech task will be excluded.
All studies must include at least one standardised neuropsychological test for screening cognition. Other neuropsychological tests will be reported if they are standardised and report normalised scores.
In this section, we will scope the analysed linguistic features and their linguistic domain (e.g., lexicon, phonetics), as well as the statistical methods used to compare cases and controls and/or the machine learning models used to classify populations, together with their outcomes (e.g., p-values).
Studies published in peer-reviewed journals will be eligible for inclusion. Grey literature (e.g., theses, preprints) and proceedings will not be analysed.
Randomized Controlled Trials and case-control studies will be included. Meta-analyses, reviews or systematic reviews, case studies, qualitative studies, and methodological articles will be excluded.
We will search for published studies using electronic bibliographical databases. We will hand-search the reference lists of included studies, and during the selection phase we will add any relevant articles identified by our search alert.
The search equations will be re-run just before the final analyses and any newly available studies will be retrieved for inclusion.
We will search the following databases: PubMed, ScienceDirect, Embase, Web of Science, Google Scholar.
To save time, we will import the results directly from the online databases into Zotero and verify the metadata (e.g., whether the authors’ names, publication date, etc. are correctly reported).
We will develop a search strategy based on keywords related to our review question. We will identify main keywords for the case population, connected speech tasks and methods. We will add synonyms to the search terms to extend our results. We will also add exclusion criteria such as “intervention” to avoid unrelated studies. With the help of Boolean operators, our prototypical syntax will be “case population” AND “connected speech task” AND “methods” NOT “exclusion criteria” (e.g., “early-stage alzheimer’s disease” AND “picture-based description” NOT “therapy”).
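The prototypical syntax described above can be sketched as a small query builder. This is a minimal illustration, not the actual search equation: the function names are hypothetical, the term lists are placeholders drawn partly from the examples in this protocol, and each database may require its own adaptation of the Boolean syntax.

```python
from typing import Iterable, Sequence


def or_group(terms: Iterable[str]) -> str:
    """Join synonyms with OR, quoting each term as an exact phrase."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(
    population: Sequence[str],
    tasks: Sequence[str],
    methods: Sequence[str],
    exclusions: Sequence[str] = (),
) -> str:
    """Combine keyword groups as: population AND tasks AND methods NOT exclusions."""
    query = " AND ".join(or_group(group) for group in (population, tasks, methods))
    if exclusions:
        query += " NOT " + or_group(exclusions)
    return query


# Placeholder term lists (illustrative only).
query = build_query(
    population=["early-stage Alzheimer's disease", "mild cognitive impairment"],
    tasks=["picture-based description", "storytelling"],
    methods=["linguistic features"],
    exclusions=["therapy", "intervention"],
)
print(query)
```

Keeping synonyms in separate lists makes it easy to extend one keyword group without touching the others.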
We will follow the PRISMA statement (Moher et al., 2009) and report our process in a PRISMA flowchart using an adaptation of the screening and selection procedure found in Pati and Lorusso (2018) and Mateo (2020). Pati and Lorusso (2018) propose a step-by-step method to conduct a PRISMA systematic review with extensive details and examples. Mateo (2020) describes helpful, easy-to-use methodological tools implemented in Excel and Zotero.
We will conduct our article selection following a three-stage procedure:
1. Pre-screening: This will focus on publication language, publication type, duplicates, and incorrect metadata, and will be conducted by one author (AR).
2. Title and abstract screening: Studies that do not focus on connected speech will be excluded; this includes studies that use only naming tasks or verbal fluency tasks. The following exclusion criteria will be applied:
· Studies involving individuals with major motor disorders (e.g., apraxia of speech)
· Studies focusing exclusively on reading and writing tasks. While reading may be treated as spoken language, a strict definition would not classify it as connected speech. Similarly, writing is sometimes mistakenly classified as connected speech.
· Studies addressing unrelated topics (e.g., qualitative studies in sociology focusing on discourse analysis and case studies). We expect to retrieve such studies because some of our keywords are polysemous.
This stage will be conducted by two authors (AR & ML). We will conduct an inter-rater agreement (%) analysis and aim to reach 80%. In the case of disagreement, the two authors will examine the title and abstract a second time, discuss the study's eligibility, and try to reach a consensus. If no consensus is found, a third author (KR) will be asked to decide whether the study should be included.
3. Full-text selection: We will assess each article's eligibility using the following exclusion criteria:
· Population: No control group; age below 18 or above 75 (mean); fewer than 10 participants per group.
· Tasks: Studies focusing on reading and writing tasks that do not include at least one connected speech task.
· Analysis: Studies that do not explicitly analyse speech parameters.
· Methods and study design: Descriptive statistics only; meta-analyses; systematic reviews; case studies; qualitative studies; theoretical and methodological articles.
This part of the process will be carried out by two authors (AR & ML). We will conduct an inter-rater agreement analysis (% and Cohen’s κ) and aim for near-perfect agreement (κ > .81). Any remaining disagreement will be discussed with a third author (KR) until agreement is reached.
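The two agreement measures mentioned in the selection procedure can be computed as in the following sketch. It uses the standard definitions of percent agreement and Cohen's κ; the function names and the example decisions are illustrative assumptions, not data from this review.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Proportion of items on which both raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    # Chance agreement from each rater's marginal category proportions.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up decisions for ten articles (illustrative only).
rater_1 = ["include", "include", "exclude", "exclude", "include",
           "exclude", "exclude", "exclude", "include", "exclude"]
rater_2 = ["include", "include", "exclude", "exclude", "include",
           "exclude", "exclude", "exclude", "exclude", "exclude"]

print(f"agreement = {percent_agreement(rater_1, rater_2):.2f}")
print(f"kappa     = {cohens_kappa(rater_1, rater_2):.2f}")
```

In this made-up example the raters agree on 9 of 10 articles (90%), yet κ is about 0.78, below the .81 threshold, which illustrates why reporting both measures is informative.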
Data collection process
Data extraction and inclusion
Data from four categories will be manually extracted:
· Population: Number and age for both case and control populations.
· Tasks: Type involving connected speech.
· Tests: Neuropsychological tests used, cognitive domains tested.
· Methods: Linguistic features, statistical tests, outcomes.
In the case of missing information, we will contact the authors to request the data. Unavailable data will be identified as such in our summary tables. All extracted data will be recorded in an Excel database and summarised in narrative and/or table formats.
We will use the National Institutes of Health Quality Assessment of Case-Control Studies scale to evaluate the quality of each study. Due to the nature of our data, we will also evaluate study quality on the basis of the connected speech tasks, the linguistic variables, and the statistical tests used to compare the case and control populations. Two reviewers (AR & ML) will conduct a quality assessment, giving each article a quality score (Poor, Fair, Good). We will compare scores using an inter-rater agreement (%) analysis and discuss any remaining disagreement with a third author (KR).
This section will provide information on the reliability of linguistic features as markers for subtle cognitive decline. We will incorporate the data in a narrative synthesis organised according to the study population, connected speech tasks, analysed linguistic features, and methods used to compare cases and controls. We will develop descriptive themes to describe the data, look for similarities and differences across studies, and when possible, merge similar data into higher-level categories and themes. For example, we will report the description of linguistic features and the measures for each article. We will then group and label those elements that share similar characteristics. We aim to provide easy-to-use guidelines to harmonise linguistic data across studies.
The reliability of a marker will be evaluated according to (1) the clarity with which it is described in the paper and (2) its ability to discriminate between individuals with mNCD and control subjects.
The data will be summarised in tables and figures depending on their category.
If appropriate, we will group studies according to the cognitive status of the participants, categorising patients based on scores in screening tests.