Estimation Methods – catsim.estimation
Estimators are the objects responsible for estimating examinees' proficiency values, given a dichotomous (binary) response vector and an array of the items answered by the examinee. In the domain of IRT, there are two main families of methods for estimating \(\hat\theta\): Bayesian methods and maximum-likelihood methods.
Maximum-likelihood methods choose the \(\hat\theta\) value that maximizes the likelihood (see catsim.irt.log_likelihood()) of an examinee having a certain response vector, given the corresponding item parameters.
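As a rough illustration of the idea, the sketch below evaluates a log-likelihood over a grid of candidate \(\theta\) values and keeps the best one. It assumes a three-parameter logistic (3PL) model with items given as (a, b, c) tuples; the helper names `p_correct` and `log_likelihood_3pl` are hypothetical and are not part of catsim.

```python
import math


def p_correct(theta, a, b, c):
    """Probability of a correct answer under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood_3pl(theta, items, responses):
    """Log-likelihood of a binary response vector at a given theta."""
    total = 0.0
    for (a, b, c), answered_correctly in zip(items, responses):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if answered_correctly else math.log(1.0 - p)
    return total


# Hypothetical items as (discrimination, difficulty, pseudo-guessing).
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
responses = [True, True, False]

# Brute-force search over a grid of theta values in [-4, 4].
grid = [i / 100.0 for i in range(-400, 401)]
theta_hat = max(grid, key=lambda t: log_likelihood_3pl(t, items, responses))
print(theta_hat)
```

A grid search is used here only because it is easy to read; the estimators documented below replace it with optimizers that need far fewer function evaluations.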
Bayesian methods use a priori information (usually assumed proficiency and parameter distributions) to make new estimations. The knowledge gained from new estimations is then used to update the assumptions about the parameter distributions, refining future estimations.
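One common Bayesian estimator is the expected a posteriori (EAP) estimate, which averages \(\theta\) over the posterior distribution. The sketch below is an illustration only: it assumes a 3PL model and a normal prior, approximates the posterior on a grid, and uses hypothetical helper names (`p_correct`, `likelihood`, `eap`) that do not come from catsim.

```python
import math


def p_correct(theta, a, b, c):
    """Probability of a correct answer under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def likelihood(theta, items, responses):
    """Likelihood of a binary response vector at a given theta."""
    value = 1.0
    for (a, b, c), answered_correctly in zip(items, responses):
        p = p_correct(theta, a, b, c)
        value *= p if answered_correctly else (1.0 - p)
    return value


def eap(items, responses, prior_mean=0.0, prior_sd=1.0, lo=-4.0, hi=4.0, n=161):
    """Posterior mean of theta, approximated on an evenly spaced grid."""
    step = (hi - lo) / (n - 1)
    numerator = 0.0
    denominator = 0.0
    for k in range(n):
        theta = lo + k * step
        # Unnormalized normal prior; the constant cancels in the ratio.
        prior = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        weight = prior * likelihood(theta, items, responses)
        numerator += theta * weight
        denominator += weight
    return numerator / denominator


# Hypothetical items as (discrimination, difficulty, pseudo-guessing).
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
print(eap(items, [True, True, True]))
print(eap(items, [False, False, False]))
```

Unlike a maximum-likelihood estimate, the posterior mean stays finite even when every answer is correct or every answer is wrong, because the prior pulls the estimate toward its mean.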
All implemented classes in this module inherit from a base abstract class, Estimator. Simulator allows a custom estimator to be used during the simulation, as long as it also inherits from Estimator.
catsim implements a few types of maximum-likelihood estimators.

class catsim.estimation.DifferentialEvolutionEstimator(bounds: tuple)
Bases: catsim.simulation.Estimator
Estimator that uses scipy.optimize.differential_evolution() to minimize the negative log-likelihood function.
Parameters: bounds – a tuple containing both lower and upper bounds for the differential evolution algorithm search space. In theory, it is best if they represent the minimum and maximum possible \(\theta\) values; in practice, one could also use the smallest and largest difficulty parameters in the item bank, in case no better bounds for \(\theta\) exist.
avg_evaluations
Average number of function evaluations across all tests in which the estimator has been used.
Returns: average number of function evaluations

calls
How many times the estimator has been called to maximize/minimize the log-likelihood function.
Returns: number of times the estimator has been called to maximize/minimize the log-likelihood function

estimate(index: int = None, items: numpy.ndarray = None, administered_items: list = None, response_vector: list = None, **kwargs) → float
Uses scipy.optimize.differential_evolution() to return the theta value that minimizes the negative log-likelihood function, given the current state of the test for the given examinee.
Return type: float
Parameters:  index (int) – index of the current examinee in the simulator
 items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
 administered_items (list) – a list containing the indexes of items that were already administered
 response_vector (list) – a boolean list containing the examinee’s answers to the administered items
Returns: the current \(\hat\theta\)

evaluations
Total number of times the estimator has evaluated the log-likelihood function during its existence.
Returns: number of function evaluations
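The sketch below illustrates the bounds advice and the evaluation counting described for this class. To stay within the standard library, it uses a small hand-written one-dimensional differential evolution loop in place of scipy.optimize.differential_evolution(); the 3PL model, the item values and every function name are assumptions made for illustration, not catsim code.

```python
import math
import random


def p_correct(theta, a, b, c):
    """Probability of a correct answer under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def neg_log_likelihood(theta, items, responses):
    """Negative log-likelihood of a binary response vector at theta."""
    total = 0.0
    for (a, b, c), answered_correctly in zip(items, responses):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if answered_correctly else math.log(1.0 - p)
    return -total


def differential_evolution_1d(func, bounds, pop_size=15, f=0.8,
                              generations=60, seed=0):
    """Minimize func on [lo, hi]; returns (best_x, function evaluations)."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    costs = [func(x) for x in population]
    evaluations = pop_size
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.sample(others, 3)
            # Mutation; in one dimension the trial is always the mutant.
            trial = population[r1] + f * (population[r2] - population[r3])
            trial = min(max(trial, lo), hi)  # keep the trial inside bounds
            trial_cost = func(trial)
            evaluations += 1
            if trial_cost <= costs[i]:  # greedy selection
                population[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return population[best], evaluations


# Hypothetical items as (discrimination, difficulty, pseudo-guessing).
items = [(1.0, -1.5, 0.2), (1.2, -0.5, 0.2), (0.9, 0.0, 0.2),
         (1.1, 0.7, 0.2), (1.0, 1.2, 0.2)]
responses = [True, True, True, False, False]

# Bounds taken from the smallest and largest difficulty in the bank.
difficulties = [b for _, b, _ in items]
bounds = (min(difficulties), max(difficulties))

theta_hat, evaluations = differential_evolution_1d(
    lambda t: neg_log_likelihood(t, items, responses), bounds)
print(theta_hat, evaluations)
```

Dividing a running total of evaluations by the number of calls would give a figure comparable to avg_evaluations, assuming that attribute is computed as a simple mean.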


class catsim.estimation.HillClimbingEstimator(precision: int = 6, dodd: bool = False, verbose: bool = False)
Bases: catsim.simulation.Estimator
Estimator that uses a hill-climbing algorithm to maximize the likelihood function.
Parameters:  precision – number of decimal points of precision
 verbose – verbosity level of the maximization method

avg_evaluations
Average number of function evaluations across all tests in which the estimator has been used.
Returns: average number of function evaluations

calls
How many times the estimator has been called to maximize/minimize the log-likelihood function.
Returns: number of times the estimator has been called to maximize/minimize the log-likelihood function

dodd
Whether Dodd’s method will be called by the estimator when the response vector is composed solely of right or solely of wrong answers.
Returns: boolean value indicating whether Dodd’s method will be used
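Dodd’s method is commonly described as moving the current estimate halfway toward the largest item difficulty in the bank after an all-correct response vector, and halfway toward the smallest difficulty after an all-incorrect one. The sketch below assumes that formulation; the function name and signature are hypothetical, and catsim’s own implementation may differ.

```python
def dodd_step(theta, difficulties, all_correct):
    """Move theta halfway toward the extreme difficulty of the item bank."""
    b_max = max(difficulties)
    b_min = min(difficulties)
    if all_correct:
        return theta + (b_max - theta) / 2.0
    return theta - (theta - b_min) / 2.0


# Hypothetical difficulty parameters of an item bank.
difficulties = [-2.0, 1.0, 3.0]
print(dodd_step(0.0, difficulties, all_correct=True))
print(dodd_step(0.0, difficulties, all_correct=False))
```

A rule like this is useful because the likelihood has no finite maximum when all answers are right or all are wrong, so a maximum-likelihood estimator alone would drift toward infinity.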

estimate(index: int = None, items: numpy.ndarray = None, administered_items: list = None, response_vector: list = None, est_theta: float = None, **kwargs) → float
Returns the theta value that minimizes the negative log-likelihood function, given the current state of the test for the given examinee.
Return type: float
Parameters:  index (int) – index of the current examinee in the simulator
 items (ndarray) – a matrix containing item parameters in the format that catsim understands (see: catsim.cat.generate_item_bank())
 administered_items (list) – a list containing the indexes of items that were already administered
 response_vector (list) – a boolean list containing the examinee’s answers to the administered items
 est_theta (float) – a float containing the current estimated proficiency
Returns: the current \(\hat\theta\)

evaluations
Total number of times the estimator has evaluated the log-likelihood function during its existence.
Returns: number of function evaluations
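To make the hill-climbing idea concrete, the sketch below maximizes a log-likelihood by probing one step to each side of the current estimate and halving the step whenever neither side improves. It is a generic illustration under an assumed 3PL model, not the algorithm catsim implements; `precision` is interpreted here as the number of decimal places at which the step size stops shrinking, and `start` plays the role of a current estimate such as est_theta.

```python
import math


def p_correct(theta, a, b, c):
    """Probability of a correct answer under an assumed 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood_3pl(theta, items, responses):
    """Log-likelihood of a binary response vector at a given theta."""
    total = 0.0
    for (a, b, c), answered_correctly in zip(items, responses):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if answered_correctly else math.log(1.0 - p)
    return total


def hill_climb(func, start, lo, hi, precision=6):
    """Maximize func on [lo, hi]; returns (best_x, function evaluations)."""
    best = min(max(start, lo), hi)
    best_value = func(best)
    evaluations = 1
    step = (hi - lo) / 4.0
    tolerance = 10.0 ** (-precision)
    while step > tolerance:
        improved = False
        # Probe one step to the left and right of the current best point.
        for candidate in (best - step, best + step):
            if lo <= candidate <= hi:
                value = func(candidate)
                evaluations += 1
                if value > best_value:
                    best, best_value = candidate, value
                    improved = True
        if not improved:
            step /= 2.0  # refine the search around the current best
    return best, evaluations


# Hypothetical items as (discrimination, difficulty, pseudo-guessing).
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
responses = [True, True, False]

theta_hat, evaluations = hill_climb(
    lambda t: log_likelihood_3pl(t, items, responses),
    start=0.0, lo=-4.0, hi=4.0)
print(theta_hat, evaluations)
```

Because hill climbing only follows local improvements, it can stop at a local maximum if the log-likelihood has more than one peak; a population-based search trades extra function evaluations for a lower risk of that outcome.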
Comparison between estimators
The plots below show a comparison of the different estimators available. Given three dichotomous (binary) response vectors with different numbers of correct answers, all the estimators find values for \(\hat\theta\) that maximize the log-likelihood function. Some estimators evaluate the log-likelihood fewer times than others while reaching similar results, which may (although not necessarily) make them more efficient.