Package 'openscoring'

Title: 'Open Scoring' API Client
Description: Creativity research often requires scoring responses to open-ended problems. Scoring is usually done by human raters, but automatic scoring with AI is becoming increasingly accurate. This package provides a simple interface to the 'Open Scoring' API <https://openscoring.du.edu/docs>, a leading creativity scoring technology by Organisciak et al. (2023) <doi:10.1016/j.tsc.2023.101356>. With it, you can score your own data directly from an R script.
Authors: Jakub Jędrusiak [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-6481-8210>, affiliation: University of Wroclaw), Peter Organisciak [ctb] (ORCID: <https://orcid.org/0000-0002-9058-2280>, affiliation: University of Denver), Selcuk Acar [ctb] (ORCID: <https://orcid.org/0000-0003-4044-985X>, affiliation: University of North Texas), Denis Dumas [ctb] (ORCID: <https://orcid.org/0000-0002-8446-4720>, affiliation: University of Georgia), Pier-Luc de Chantal [ctb] (ORCID: <https://orcid.org/0000-0002-6974-6172>, affiliation: Université du Québec à Montréal), Kelly Berthiaume [ctb] (ORCID: <https://orcid.org/0000-0002-5285-0512>, affiliation: University of North Texas)
Maintainer: Jakub Jędrusiak <[email protected]>
License: MIT + file LICENSE
Version: 1.1.0
Built: 2026-05-23 08:48:41 UTC
Source: https://github.com/jakub-jedrusiak/openscoring


Score with an AI

Description

A basic function for scoring creativity with an AI model. See the OpenScoring site for more information. Requires an internet connection.

Usage

ocsai(
  df,
  item,
  answer,
  model = c("2", "2-xs", "1.6", "1-4o", "davinci3", "chatgpt2", "1.5", "chatgpt",
    "babbage2", "davinci2"),
  language = "English",
  scores_col = ".originality",
  quiet = FALSE,
  chunk_size = 25,
  task = "uses",
  short_prompt = TRUE,
  question = NULL
)

Arguments

df

A data frame.

item

The name of the column containing the items or other kinds of prompts.

answer

The column name of the responses. Commas will be replaced with spaces for scoring.

model

The model to use. Should be one of "2" or "2-xs". Deprecated models are kept for compatibility.

language

The language of the input. Only supported by the 1.5 model and later. Should be one of "Arabic", "Chinese", "Dutch", "English", "French", "German", "Hebrew", "Italian", "Polish", "Russian", "Spanish".

scores_col

The column name to store the scores in. Defaults to ".originality".

quiet

Whether to suppress the citation reminder. If FALSE (the default), the reminder is printed.

chunk_size

The number of rows to send to the API at once. Defaults to 25. If a request is too large, it will be split into 10-row chunks.

task

The name of the task to be scored. Can be "uses" (default), "completion", "consequences", "instances" or "metaphors".

short_prompt

Whether the prompt is a short prompt (TRUE) or a full question (FALSE). Defaults to TRUE.

question

You can set this argument instead of providing the item column.
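
The preprocessing described for answer and chunk_size can be illustrated with a small local sketch. The helpers `clean_answers()` and `split_into_chunks()` are hypothetical names introduced here for illustration; they are not part of the package and may differ from what `ocsai()` does internally. Nothing in this sketch contacts the API.

```r
# Hypothetical helpers that mimic the documented behaviour of `answer`
# and `chunk_size`; the package's own internals may differ.
clean_answers <- function(x) {
  # Replace every comma with a space, as described for `answer`
  gsub(",", " ", x, fixed = TRUE)
}

split_into_chunks <- function(df, chunk_size = 25) {
  # Assign consecutive rows to groups of at most `chunk_size` rows
  group <- ceiling(seq_len(nrow(df)) / chunk_size)
  unname(split(df, group))
}

responses <- data.frame(
  stimulus = rep("brick", 60),
  response = rep("doorstop, paperweight", 60)
)

responses$response <- clean_answers(responses$response)
chunks <- split_into_chunks(responses, chunk_size = 25)

# Number of rows in each chunk that would be sent as one request
chunk_rows <- vapply(chunks, nrow, integer(1))
```

With 60 rows and the default chunk size, this yields two full chunks and one partial chunk. A smaller chunk_size means more requests, each of them smaller.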

Details

Available models:

  • ocsai2: Cross-lingual originality scoring model. Trained with cluster-based deduplication to score semantically equivalent responses the same across languages. GPT-4.1-mini base.

  • ocsai2-xs: Smaller, faster version of Ocsai 2. Same cross-lingual training approach as ocsai2, with GPT-4.1-nano base for lower latency and cost.

  • ocsai-1.6: Update to the multi-lingual, multi-task 1.5 model, trained on GPT-4o instead of GPT-3.5.

  • ocsai1-4o: GPT-4o-based model, trained with more data and supporting multiple tasks. Last update to the Ocsai 1 models (i.e. the original ones).

  • ocsai-chatgpt2: GPT-3.5-size chat-based model, trained with more data and supporting multiple tasks. Scoring is slower, with slightly better performance than ocsai-davinci.

  • ocsai-davinci3: GPT-3 Davinci-size model. Trained with the method from Organisciak et al. 2023, but with the additional tasks (uses, consequences, instances, complete the sentence) from Acar et al. 2023, and trained with more data.

  • ocsai-1.5: Beta version of the new multi-lingual, multi-task model, trained on GPT-3.5.

  • ocsai-chatgpt: GPT-3.5-size chat-based model, trained with the same format and data as the original models. Scoring is slower, with slightly better performance than ocsai-davinci2. For a model that supports more tasks and was trained on more data, use davinci-ocsai2.

  • ocsai-babbage2: GPT-3 Babbage-size model from the paper, retrained with new model API. Deprecated, mainly because other models work better.

  • ocsai-davinci2: GPT-3 Davinci-size model from the paper, retrained with a new model API.
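
The short values accepted by the model argument appear to correspond to the full model names in the list above. The lookup below is a sketch of that correspondence, inferred from the two lists rather than taken from the package source; `model_names`, `current_models`, and `resolve_model()` are hypothetical names used only for illustration.

```r
# Assumed mapping from `model` argument values to the full model names
# listed in Details; inferred from the documentation, not from the source.
model_names <- c(
  "2"        = "ocsai2",
  "2-xs"     = "ocsai2-xs",
  "1.6"      = "ocsai-1.6",
  "1-4o"     = "ocsai1-4o",
  "davinci3" = "ocsai-davinci3",
  "chatgpt2" = "ocsai-chatgpt2",
  "1.5"      = "ocsai-1.5",
  "chatgpt"  = "ocsai-chatgpt",
  "babbage2" = "ocsai-babbage2",
  "davinci2" = "ocsai-davinci2"
)

# The two values the `model` argument recommends; the rest are deprecated
current_models <- c("2", "2-xs")

resolve_model <- function(model = names(model_names)) {
  # match.arg() returns the first choice when the default vector is passed
  model <- match.arg(model, choices = names(model_names))
  if (!model %in% current_models) {
    message("Model '", model, "' is deprecated; consider \"2\" or \"2-xs\".")
  }
  model_names[[model]]
}

default_model <- resolve_model()
legacy_model <- resolve_model("chatgpt")
```

Because `match.arg()` picks the first element of the default vector, calling the function without a model selects "2", which matches the order of the choices shown under Usage.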

Value

The input data frame with the scores added.

Examples

df <- data.frame(
  stimulus = c("brick", "hammer", "sponge"),
  response = c("butter for trolls", "make Thor jealous", "make it play in a kids show")
)

df <- ocsai(df, stimulus, response, model = "davinci3")

# The 1.5 model and later models work for multiple languages
df_polish <- data.frame(
  stimulus = c("cegła", "młotek", "gąbka"),
  response = c("masło dla trolli", "wywoływanie zazdrości u Thora", "postać w programie dla dzieci")
)

df_polish <- ocsai(df_polish, stimulus, response, model = "2", language = "Polish")
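
A further sketch shows how the remaining documented arguments (task, short_prompt, scores_col, quiet) might be combined for a task other than "uses". The prompt and responses are made-up illustration data, and the `run_online` switch is a hypothetical guard added here so that the code runs without contacting the API unless you opt in.

```r
# Illustrative data for a "consequences" task; the prompt is a full question,
# so short_prompt is set to FALSE in the call below.
df_consequences <- data.frame(
  prompt = rep("What would happen if people no longer needed to sleep?", 2),
  response = c(
    "night shifts would become normal",
    "beds would be sold as antiques"
  )
)

# Hypothetical guard: set to TRUE to send the data to the Open Scoring API.
run_online <- FALSE

if (run_online && requireNamespace("openscoring", quietly = TRUE)) {
  df_consequences <- openscoring::ocsai(
    df_consequences, prompt, response,
    model = "2",
    task = "consequences",
    short_prompt = FALSE,         # the prompt column holds full questions
    scores_col = "originality",   # custom name for the score column
    quiet = TRUE
  )
}
```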