Effects of generative artificial intelligence on cognitive effort and task performance: study protocol for a randomized controlled experiment among college students | Trials

Intervention description {11a}

In the intervention group, the computer screen will be set up in a split-screen format. On the left side of the screen, the participant will receive instructions on the writing prompt, writing requirements, time requirements, grading feedback, and the grading rubric. The instructions will also highlight that the participant can use ChatGPT in any way they like to assist their writing and that their writing score will not be penalized for how ChatGPT is used. The right side of the screen will display a blank ChatGPT interface where the participant can enter prompts and receive responses.

Explanation for the choice of comparators {6b}

In the control group, as in the intervention group, the computer screen will be set up in a split-screen format. On the left side of the screen, the participant will receive the same instructions on the writing prompt, writing requirements, time requirements, grading feedback, and the grading rubric. Additionally, the instructions will highlight to the participant that they can use a text editor in any way they like to assist their writing. On the right side, instead of ChatGPT, a basic text editor interface will be displayed. In summary, this comparator will keep the split-screen format consistent between the two groups and ensure that participants in the control group will complete the writing task with minimal support.

Criteria for discontinuing or modifying allocated interventions {11b}

This study is of minimal risk, and we do not anticipate needing to discontinue or modify the allocated interventions during the experiment. Participants can withdraw from the study at any time.

Strategies to improve adherence to interventions {11c}

Adherence to the interventions is expected to be high because the procedures are straightforward and will be clearly explained in the step-by-step instructions on the computer screen. The participant will be alone in a noise-canceling room for the entire experiment and can reach the experimenter through an intercom if they need any clarification.

Relevant concomitant care permitted or prohibited during the trial {11d}

Not applicable. This is not a clinical study.

Provisions for post-trial care {30}

Not applicable. This is a minimal-risk study.

Outcomes {12}

The study has two primary outcomes. First, we will measure participants’ writing performance scores on the analytical writing task. The task is adapted from the Analytical Writing section of the GRE, a worldwide standardized computer-based exam developed by the Educational Testing Service (ETS) [27]. Participants’ writing performance will be scored using the GRE 0–6 rubric and an automatic essay-scoring platform called ScoreItNow!, which is powered by ETS’s e-rater engine [32, 33]. We chose to adapt the GRE writing materials for two reasons. First, the GRE writing task and grading rubrics are established materials designed to measure critical thinking and analytical writing skills and have been used in research as practice writing materials (e.g., [34]). Second, OpenAI’s technical report shows that ChatGPT (GPT-4) can score 4 out of 6 (~ 54th percentile) on the GRE analytical writing task [31]. This gives us a benchmark for assessing the potential increase in writing performance when individuals collaborate with generative AI.

Second, we will measure participants’ cognitive effort during the writing process. Cognitive effort will be measured using a psychophysiological proxy, namely changes in pupil size [35, 36]. Pupil diameter and gaze data will be collected using the Tobii Pro Fusion eye tracker at a sampling rate of 120 Hz. During the preparation stage of the study, the room lighting will be adjusted so that the illuminance at the participants’ eyes is held constant at 320 lux. Baseline pupil diameters will be recorded during a resting task in the preparation stage, in which the participant stares at a cross that appears for 10 s each on the left, center, and right sections of the computer screen. Pupil diameter and gaze data will be recorded throughout the writing process.
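
To make the pupillometry measure concrete, the sketch below shows one simple way a baseline-corrected pupil diameter could be computed from the exported eye-tracking data. It is only an illustrative outline under assumed conventions: the column names, file layout, and subtractive baseline correction are hypothetical and are not the preprocessing pipeline specified by the protocol.

```python
import pandas as pd

# Hypothetical column names; the actual Tobii Pro Fusion export format may differ.
# Expected columns: "phase" ("baseline" or "writing"),
# "pupil_left_mm", "pupil_right_mm" (NaN during blinks or tracking loss).

def baseline_corrected_pupil(gaze: pd.DataFrame) -> pd.Series:
    """Return pupil diameter during writing relative to the resting baseline."""
    # Average the two eyes; a blink in either eye leaves NaN for that sample.
    pupil = gaze[["pupil_left_mm", "pupil_right_mm"]].mean(axis=1, skipna=False)

    # Baseline: mean diameter over the three 10-s fixation crosses.
    baseline = pupil[gaze["phase"] == "baseline"].mean()

    # Task-evoked change in pupil size throughout the writing process.
    return pupil[gaze["phase"] == "writing"] - baseline

# Example usage with a 120 Hz recording exported to CSV (hypothetical file name):
# gaze = pd.read_csv("participant_001_gaze.csv")
# delta_pupil = baseline_corrected_pupil(gaze)
# print(delta_pupil.mean())
```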

The study has several secondary outcomes. First, to identify the neural substrates of cognitive effort during the writing process, we developed an additional psychophysiological proxy: changes in cortical hemodynamic activity in the frontal lobe of the brain. Specifically, we will examine hemodynamic changes in oxyhemoglobin (HbO). Brain activity will be recorded throughout the writing process using the NIRSport 2 fNIRS device and the Aurora software with a predefined montage (Fig. 2). The montage consists of eight sources, eight detectors, and eight short-distance detectors. The 18 long-distance channels (source-detector distance of 30 mm) and eight short-distance channels (source-detector distance of 8 mm) are located over the prefrontal cortex (PFC) and the supplementary motor area (SMA) (Fig. 2). The PFC is often involved in executive function (e.g., cognitive control, cognitive effort, inhibition) [37, 38], and the SMA is associated with cognitive effort [39, 40]. The sampling rate of the fNIRS device is 10.2 Hz. Available fNIRS cap sizes are 54 cm, 56 cm, and 58 cm; the selected cap size will always be rounded down to the nearest available size based on the participant’s head measurement. The cap is placed on the center of the participant’s head based on the Cz point of the 10–20 system.

Fig. 2

Design of the fNIRS montage
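
The short-distance channels in this montage primarily capture scalp and skin hemodynamics, so one common way to isolate cortical HbO is to regress the nearest short channel out of each long channel. The sketch below illustrates that idea only; it assumes HbO time series have already been extracted (e.g., via the modified Beer-Lambert law), and the channel pairing and synthetic data are illustrative assumptions rather than the protocol's specified analysis.

```python
import numpy as np

def short_channel_regression(long_hbo: np.ndarray, short_hbo: np.ndarray) -> np.ndarray:
    """Remove superficial (scalp) signal from one long-channel HbO time series.

    long_hbo:  HbO from a 30 mm source-detector channel, shape (n_samples,)
    short_hbo: HbO from the paired 8 mm short channel, shape (n_samples,)
    """
    # Least-squares fit of the short-channel signal (plus intercept) to the long channel.
    design = np.column_stack([short_hbo, np.ones_like(short_hbo)])
    beta, *_ = np.linalg.lstsq(design, long_hbo, rcond=None)

    # Residual = long-channel signal with the estimated superficial component removed.
    return long_hbo - design @ beta

# Synthetic example: ~5 minutes of recording at the device's 10.2 Hz sampling rate.
n = int(10.2 * 300)
rng = np.random.default_rng(0)
scalp = rng.normal(size=n)                                    # short-channel (scalp) signal
long_channel = 0.6 * scalp + rng.normal(scale=0.5, size=n)    # scalp contamination + stand-in cortical signal
cleaned = short_channel_regression(long_channel, scalp)
print(cleaned.shape)
```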

Third, we will measure participants’ subjective perceptions of the writing task using self-reported measures in the post-survey (Table 1). We will measure participants’ subjective perceptions of the two primary outcomes, that is, their self-perceived writing performance and self-perceived cognitive effort. Self-perceived writing performance will be measured with a one-item scale using the same grading rubric described in the writing task instructions and used in the scoring tool. Self-perceived cognitive effort will be measured with a one-item scale adapted from the National Aeronautics and Space Administration Task Load Index (NASA-TLX) [41, 42]. We will also measure participants’ subjective perceptions of several mental health and learning-related outcomes, including stress, challenge, and self-efficacy in writing. Self-perceived stress and self-perceived challenge will each be measured with a one-item sub-scale adapted from the Primary Appraisal Secondary Appraisal (PASA) scale [43, 44]. Self-efficacy in writing will be measured with a 16-item scale covering three dimensions of writing self-efficacy: ideation, convention, and self-regulation [45]. Furthermore, we will measure participants’ situational interest in analytical writing with a four-item Likert scale adapted from the situational interest scale [46]. Finally, we will measure participants’ behavioral intention to use ChatGPT for essay writing tasks in the future [47].

Table 1 Scales in the post-survey
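
As an illustration of how the multi-item scales in Table 1 could be aggregated, the sketch below averages each multi-item scale into a composite score per participant. The item names and response coding are hypothetical assumptions; the protocol does not prescribe scoring code, and single-item measures would simply be used as recorded.

```python
import pandas as pd

# Hypothetical item naming scheme, e.g., "efficacy_1" ... "efficacy_16", "interest_1" ... "interest_4".
SCALES = {
    "writing_self_efficacy": [f"efficacy_{i}" for i in range(1, 17)],  # 16-item scale [45]
    "situational_interest": [f"interest_{i}" for i in range(1, 5)],    # 4-item scale [46]
}

def score_post_survey(responses: pd.DataFrame) -> pd.DataFrame:
    """Average each multi-item scale into one composite score per participant."""
    scored = pd.DataFrame(index=responses.index)
    for scale, items in SCALES.items():
        scored[scale] = responses[items].mean(axis=1)
    # Single-item measures (e.g., NASA-TLX effort, PASA stress/challenge) pass through unchanged.
    return scored

# Example usage (hypothetical file name):
# post_survey = pd.read_csv("post_survey.csv")
# composites = score_post_survey(post_survey)
```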

Participant timeline {13}

The time schedule is shown in the schematic diagram below (Fig. 3). The entire experiment will last approximately 1–1.5 h per participant.

Fig. 3

Schedule of enrollment, interventions, and assessments of the study

Sample size {14}

To estimate the required sample size, we conducted a simulation analysis of the intervention effect on writing performance using ordinary least squares (OLS) regression. Recent empirical evidence suggests that the effect size of generative AI on writing tasks is around Cohen’s d = 0.4–0.5 (e.g., [1, 48]). Our simulation assumes normally distributed data, equal and standardized standard deviations in the two conditions, and an anticipated effect size of Cohen’s d = 0.45. The analysis indicated that recruiting a minimum of 160 participants would be necessary to achieve statistical power greater than 0.8 at an alpha level of 0.05. The simulation was implemented in R, and the corresponding code is available at the Open Science Framework (OSF) via https://osf.io/9jgme/.
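
The simulation itself was implemented in R and archived on OSF (link above); the sketch below only re-expresses the same logic in Python under the stated assumptions (normally distributed scores, equal standardized SDs, d = 0.45, alpha = 0.05). It is an illustration, not the study’s actual code; OLS regression of the score on a treatment dummy is equivalent to the pooled two-sample t-test used here.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_arm: int, d: float = 0.45, alpha: float = 0.05,
                    n_sims: int = 5000, seed: int = 1) -> float:
    """Estimate power for a two-arm comparison of standardized writing scores by simulation."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_arm)     # standardized control scores
        treatment = rng.normal(d, 1.0, n_per_arm)     # shifted by Cohen's d
        # Pooled two-sample t-test; equivalent to OLS with a treatment dummy.
        _, p_value = stats.ttest_ind(treatment, control)
        rejections += p_value < alpha
    return rejections / n_sims

# With 80 participants per arm (160 total), the estimated power should be
# around or above 0.8 for d = 0.45, consistent with the protocol's calculation.
print(simulated_power(n_per_arm=80))
```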

We base our sample size estimation on writing performance rather than on the other primary outcome, cognitive effort, for two reasons. First, the effect of generative AI on performance outcomes has been studied [1, 48], but we did not find prior evidence on the effect size of generative AI on cognitive effort using physiological measures. Second, our physiological measure of cognitive effort is likely to be adequately powered once the sample size is sufficient for our behavioral measure of writing performance. Pupillometry studies of cognitive effort, such as those using the N-back test, typically recruit 20–50 participants in short, repeated, within-subject trials (e.g., [49]); these studies provide a general estimate of the number of participants needed. Although our study design (i.e., a between-subject RCT) differs from common pupillometry studies, cognitive effort is still a repeated outcome measured from time-series pupil data throughout the entire writing process, and repeated outcome measures generally enhance statistical power by accounting for within-subject variability [50].

Recruitment {15}

Recruitment will follow a convenience sampling strategy. To reach a student population with diverse academic backgrounds, participants will be recruited broadly through social media platforms, email lists, and flyers at the research university where the experiment will be conducted. Because the experiment will start during the summer, the research team can also recruit summer school students, so the study sample will not be limited to students currently enrolled at the university. The recruitment materials include a brief description of the study, the eligibility criteria for participation, and the compensation for participation. Individuals who are interested in participating can sign up on a calendar by selecting from available time slots provided by the experimenters. Participants will receive 30 euros in compensation upon completion of the experiment. Participants who withdraw in the middle of the experiment will receive partial compensation, prorated based on the amount of time spent in the experiment.




China’s Moonshot AI releases open-source model to reclaim market position


BEIJING (Reuters) - Chinese artificial intelligence startup Moonshot AI released a new open-source AI model on Friday, joining a wave of similar releases from local rivals as it seeks to reclaim its position in the competitive domestic market.

The model, called Kimi K2, features enhanced coding capabilities and excels at general agent tasks and tool integration, allowing it to break down complex tasks more effectively, the company said in a statement.

Moonshot claimed the model outperforms mainstream open-source models in some areas, including DeepSeek’s V3, and rivals the capabilities of leading U.S. models, such as those from Anthropic, in certain functions such as coding.

The release follows a trend among Chinese companies toward open-sourcing AI models, contrasting with many U.S. tech giants like OpenAI and Google that keep their most advanced AI models proprietary. Some American firms, including Meta Platforms, have also released open-source models.

Open-sourcing allows developers to showcase their technological capabilities and expand developer communities as well as their global influence, a strategy likely to help China counter U.S. efforts to limit Beijing’s tech progress.

Other Chinese companies that have released open-source models include DeepSeek, Alibaba, Tencent and Baidu.

Founded in 2023 by Tsinghua University graduate Yang Zhilin, Moonshot is among China’s prominent AI startups and is backed by internet giants including Alibaba.

The company gained prominence in 2024 when users flocked to its platform for its long-text analysis capabilities and AI search functions.

However, its standing has declined this year following DeepSeek’s release of low-cost models, including the R1 model launched in January that disrupted the global AI industry.

Moonshot’s Kimi application ranked third in monthly active users last August but dropped to seventh place by June, according to aicpb.com, a Chinese website that tracks AI products.

(Reporting by Liam Mo and Brenda Goh, Editing by Louise Heavens)




AI is rewriting the rules of the insurance industry


Despite its traditionally risk-averse nature, the insurance industry is being fundamentally reshaped by AI.

AI has already become vital for the insurance industry, touching everything from complex risk calculations to the way insurers talk to their customers. However, while nearly eight out of ten companies are dipping their toes in the AI water, a similar number admit it hasn’t actually made them any more money.

Such figures reveal a simple truth: just buying the fancy new tech isn’t enough. The real winners will be the ones who figure out how to weave it into the very fabric of who they are and everything they do.

You can see the most dramatic changes right at the heart of the business: handling claims. That mountain of paperwork and endless phone calls, a process that could drag on for weeks, is finally being bulldozed by AI.

A deployment by New York-based insurer Lemonade back in 2021 resulted in settling over a third of its claims in just three seconds, with no human input. Or look at a major US travel insurer that handles 400,000 claims a year; it went from a completely manual system to one that was 57% automated, cutting down processing times from weeks to just minutes.

However, this isn’t just about moving faster; it’s about getting it right. AI can slash the kind of costly human errors that lead to claims leakage in the insurance industry by as much as 30%. The knock-on effect is a huge productivity leap, with adjusters able to handle 40-50% more cases. This frees up the real experts to stop being paper-pushers and start focusing on the tricky cases where a human touch and genuine empathy make all the difference.

It’s a similar story for the underwriters, the people who calculate the risks. AI is giving them superpowers, letting them analyse colossal amounts of data from all sorts of places – like telematics or credit scores – that a person could never sift through alone. It can even draft an initial risk report with incredible accuracy by looking at past data and policies in the blink of an eye.

In practice, this helps create pricing that is fairer and more accurately reflects a person’s unique situation. Zurich, for example, used a modern platform to build a risk management tool that made their assessments 90% more accurate.

Suddenly, underwriting isn’t about looking in the rearview mirror anymore—it’s a living, breathing process that can adapt on the fly to new, complex threats like cyberattacks or the effects of climate change.

But this isn’t just about back-office wizardry. When deployed in the insurance industry, AI is completely changing the conversation between insurers and the people they serve. It’s allowing a move away from simply reacting to problems to proactively helping customers.

AI chatbots can offer 24/7 support, getting smarter with every question they answer. This lets the human team focus on the more difficult conversations. The real game-changer, though, is making things personal. 

By understanding a customer’s policy and behaviour, AI can gently nudge them with a renewal reminder or suggest a product that actually fits their life, like usage-based car insurance. It’s about showing customers you actually get them, which builds the kind of loyalty that’s been so hard to come by in an industry where over 30% of claimants feel dissatisfied, and 60% blame slow settlements.

This protective instinct also helps the whole system. AI is a brilliant fraud detective for the insurance industry and beyond, spotting weird patterns in data that a person would miss, and has the potential to cut fraud-related losses by up to 40%. It keeps everyone honest and protects the business and its customers.

What’s pouring fuel on this fire of change? A new breed of low-code platforms. They are the accelerators, letting insurers build and launch new apps and services much faster than before. In a world where customer tastes and rules can change overnight, that kind of speed is everything.

The best part of such tools is they democratise access and put the power to innovate into more hands. They allow regular business users – or ‘citizen developers’ – to build the tools they need without having to be coding geniuses. These platforms often come with strong security and controls, meaning this newfound speed doesn’t have to mean sacrificing safety or compliance, which is non-negotiable for an industry like insurance.

When you step back and look at the big picture, it’s clear that getting on board with AI isn’t just a tech project; it’s a make-or-break business strategy. Those who jumped in early are already pulling away from the pack, seeing things like a 14% jump in customer retention and a 48% rise in Net Promoter Scores. 

The market for this technology is set to explode to over $14 billion by 2034, and some believe AI could add $1.1 trillion in value to the industry every year. But the biggest roadblocks aren’t about the technology itself; they’re about people and old habits.

Data, especially in an industry like insurance, is often stuck in old systems, which stops AI from seeing the whole picture. To get past this, you need more than clever software. You need leaders with a clear vision, a willingness to change the company culture, and a commitment to training their people.

The winners in this new era won’t be the ones tinkering with AI in a corner—they’ll be the ones who lead from the top, with a clear plan to make it a part of their DNA. This will require an understanding that it’s not just about doing old things better, but about finding entirely new ways to bring value and build trust.

Learn more about how AI is rewriting the rules of the insurance industry at the upcoming webinar “From Complexity to Clarity: AI + Agility Layer for Intelligent Insurance” on July 16, 2025, at 7PM BST / 2PM ET. Industry experts from Appian and EXL will share real-world examples and practical insights into how leading carriers are implementing these technologies. Registration is available at the webinar link.

Featured speakers include:

  • Vikram Machado, Senior Vice President & Practice Leader – Life, Annuities, Retirements & Group Insurance, EXL
  • Vikrant Saraswat, Vice President – AI Consulting, EXL
  • Jack Moroney, Enterprise Account Executive – Insurance & Financial Services, Appian
  • Andrew Kearns, Insurance Industry Lead, Appian
  • Michaela Morari, Senior Solution Consultant – Insurance & Financial Services, Appian

See also: UK and Singapore form alliance to guide AI in finance




Clarivate Unveils Enhanced 2025 G20 Research, Innovation Scorecard with Expanded Data, AI Insights


Clarivate (NYSE:CLVT) is one of the cheap IT stocks hedge funds are buying. On July 9, Clarivate released its annual 2025 G20 Research and Innovation Scorecard. The scorecard was developed by experts at the Institute for Scientific Information (ISI) at Clarivate and provides a data-driven overview of the research and innovation capabilities of G20 member nations.

The 2025 scorecard now incorporates data from the Emerging Sources Citation Index (ESCI), part of the Web of Science Core Collection, to provide a more comprehensive view of global research. The scorecard has also been refined to better emphasize collaboration and impact, reflecting the Ubuntu philosophy of South Africa, the G20 host for 2025.


Dynamic visualizations are included to showcase each member’s research performance within their economic context and academic priorities. New additions also include OECD field-level breakdowns, insights into open access, and research aligned with Sustainable Development Goals (SDGs), highlighting how G20 nations are collaborating to address global challenges.

Clarivate (NYSE:CLVT) is an information services provider in the Americas, the Middle East, Africa, Europe, and the Asia Pacific.

While we acknowledge the potential of CLVT as an investment, we believe certain AI stocks offer greater upside potential and carry less downside risk. If you’re looking for an extremely undervalued AI stock that also stands to benefit significantly from Trump-era tariffs and the onshoring trend, see our free report on the best short-term AI stock.


Disclosure: None. This article was originally published at Insider Monkey.


