

Data envelopment analysis (DEA): a glossary of terms.

This glossary of data envelopment analysis terms is intended to help new users of Frontier Analyst and DEA to understand data envelopment analysis and the terms associated with it. This glossary is not intended as an exhaustive list, but deals with the key terms used in DEA. The glossary is presented in alphabetical order.

To the best of our knowledge, the information given here is correct, but please let us know if you find anything in this glossary which you think is incorrect. Please email any corrections to us. Thank you.

Aggregate efficiency A term used to describe the measure of efficiency from the CCR model.
Allocative efficiency The efficiency of a production process in converting inputs to outputs, where the cost of production is minimized for a given set of input prices. Allocative efficiency can be calculated by the ratio of cost efficiency to technical efficiency.
BCC The BCC (ratio) model is the DEA model used in Frontier Analyst when a variable returns to scale relationship is assumed between inputs and outputs. It is named BCC after Banker, Charnes and Cooper, who first introduced it in an article published in Management Science (1984, Vol. 30/9, pp. 1078-1092). The BCC model measures technical efficiency. The convexity constraint in the model formulation ensures that the composite unit is of a similar scale size to the unit being measured. The efficiency score obtained from this model is at least equal to the score obtained using the CCR model.
Categorical variable Categorical variables can only assume a predefined set of discrete values. For example, when analyzing a chain of retail outlets, the analyst may want to represent the existence of a particular service, say cash dispenser machines at each outlet. Outlets are assigned a ‘1’ to indicate where the service is available and a ‘0’ where the service is not available. Categorical variables are generally used to indicate the presence or lack of a particular attribute. The use of categorical variables requires modifications to the DEA models.
CCR The CCR (ratio) model is probably the most widely used and best known DEA model. It is the DEA model used in Frontier Analyst when a constant returns to scale relationship is assumed between inputs and outputs. It was the first DEA model to be developed, named CCR after Charnes, Cooper and Rhodes who introduced this model in an article published in the European Journal of Operational Research (1978, Vol. 2 pp. 429-444). This model calculates the overall efficiency for each unit, where both pure technical efficiency and scale efficiency are aggregated into one value.
Composite unit The attributes of a composite unit (which is a hypothetical efficient unit) are determined by the projection of an inefficient unit, through the origin, to the efficiency frontier. The attributes are formed from the unit’s (DMU’s) reference units, in the proportions indicated by the dual weights.
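As a rough illustration of how a composite unit is built from reference units and dual weights, the following Python sketch forms each attribute as a weighted sum. The unit names, factor names and weight values are invented for the example and do not come from Frontier Analyst.

```python
def composite_unit(reference_units, dual_weights):
    """Combine reference units in the proportions given by the dual weights."""
    factors = next(iter(reference_units.values())).keys()
    return {
        factor: sum(dual_weights[name] * values[factor]
                    for name, values in reference_units.items())
        for factor in factors
    }

# Hypothetical reference units and dual weights for one inefficient unit.
reference_units = {
    "U1": {"staff": 10, "sales": 100},
    "U2": {"staff": 20, "sales": 150},
}
dual_weights = {"U1": 0.5, "U2": 0.25}

composite = composite_unit(reference_units, dual_weights)
print(composite)
```

The resulting dictionary holds the input and output levels of the hypothetical efficient unit against which the inefficient unit is compared.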
Constant returns to scale Constant returns to scale may be assumed if an increase in a unit’s inputs leads to a proportionate increase in its outputs i.e. there is a one-to-one, linear relationship between inputs and outputs. For example, if a 10% increase in inputs yields a 10% increase in outputs, the unit is operating at constant returns to scale. This means that no matter what scale the unit operates at, its efficiency will, assuming its current operating practices, remain unchanged.
Controlled (discretionary) inputs A controlled input is one over which the management of the unit has control and, as a result, can alter the amount of it used. (Controlled inputs are also sometimes referred to as discretionary inputs).
Convexity constraint The convexity constraint, which forms part of the formulation of the BCC model, ensures that each composite unit is a convex combination of its reference units. For a full definition of convexity, please refer to a standard non-linear programming text, such as “Nonlinear Programming”, Bazaraa et al. (Wiley, 1993).
Correlation coefficient A measure of the strength of the relationship between two variables. A relationship exists between two variables when a change in the value of one is accompanied by a related change in the value of the other. The value of the correlation coefficient lies between +1 and -1. If larger values of one variable are reflected with larger values of the other, then the value of the coefficient will be positive. The stronger the relationship the closer the coefficient will be to +1. If larger values of one variable are reflected with smaller values of the other variable then the value of the coefficient will be negative. The stronger the relationship the closer the coefficient will be to -1. If there is no relationship between the variables the coefficient will be close to zero.
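To make the behaviour of the correlation coefficient concrete, here is a minimal Python sketch of the usual Pearson formula. The function name and the sample figures (staff numbers and sales) are made up for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

staff = [10, 12, 15, 20, 25]        # hypothetical input variable
sales = [100, 118, 160, 190, 260]   # hypothetical output variable

r = pearson(staff, sales)           # close to +1: strong positive relationship
r_neg = pearson([1, 2, 3], [3, 2, 1])  # perfectly opposed values give -1
print(round(r, 3), r_neg)
```

A value near +1 for an input/output pair, as here, suggests that larger inputs go with larger outputs in the data set.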
Cost efficiency Cost efficiency (which is also known as economic efficiency) is the ratio of the minimum cost to the actual (observed) cost.
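The two ratios given in this glossary for cost efficiency and allocative efficiency can be written out directly. The sketch below is only an arithmetic illustration; the cost figures and the technical efficiency score are invented.

```python
def cost_efficiency(minimum_cost, actual_cost):
    """Ratio of the minimum cost to the actual (observed) cost."""
    return minimum_cost / actual_cost

def allocative_efficiency(cost_eff, technical_eff):
    """Ratio of cost efficiency to technical efficiency."""
    return cost_eff / technical_eff

# Hypothetical unit: could produce its outputs for 600 but actually spends 1000.
ce = cost_efficiency(minimum_cost=600.0, actual_cost=1000.0)   # 0.6
ae = allocative_efficiency(ce, technical_eff=0.8)              # about 0.75
print(ce, round(ae, 4))
```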
Cross efficiency matrix A tool used to help with the identification of efficient operating practices. A unit with a high average efficiency, from a cross efficiency matrix, offers a good comparator for inefficient units to work towards. A cross efficiency matrix consists of rows and columns (i x j) each equal to the number of units in the analysis. The efficiency of unit j, is computed with the optimal weights for unit i. The higher the values given in the column j, the more likely it is that the unit j is an example of good (truly efficient) operating practices.
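The column averages of a cross efficiency matrix can be computed as in the following Python sketch. The matrix values are hypothetical; in a real study each row would come from solving the DEA model for one unit and applying its optimal weights to every unit.

```python
units = ["A", "B", "C"]

# cross[i][j]: efficiency of unit j evaluated with the optimal weights of unit i.
# All values are made up for illustration.
cross = [
    [1.00, 0.80, 0.60],
    [0.90, 1.00, 0.50],
    [0.95, 0.70, 0.75],
]

def column_averages(matrix):
    """Average efficiency of each unit (column) over all sets of weights (rows)."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(n)]

averages = dict(zip(units, column_averages(cross)))
best = max(averages, key=averages.get)   # unit with the highest average
print(averages, best)
```

Here unit A scores well whichever unit's weights are applied, which is the pattern the glossary associates with good operating practice.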
Data Envelopment Analysis. (DEA). Data envelopment analysis is a non-parametric technique, used for performance measurement and benchmarking. It uses linear programming to determine the relative efficiencies of a set of homogeneous (comparable) units. It is a “process based” analysis, in other words, it can be applied to any unit based enterprise, regardless of whether or not a “profit” figure is involved in the evaluation. The use of DEA also overcomes some of the problems with traditional performance measurement methods, such as ratio analysis and regression analysis. (DEA is the core analysis used by Frontier Analyst, to which Banxia Software has added a variety of extra features, such as regression analysis, to make an efficiency study easier and to provide a comprehensive efficiency analysis tool).
Data set The data set is the group of units (DMUs) and the values of their inputs and outputs to be included in the analysis. The data set is usually presented in tabular form (often initially in a spreadsheet), where the unit names constitute the rows and the input and output variables constitute the columns. Zero values are not allowed in DEA and where the value of an input or output is missing, that particular unit may have to be omitted from the data set (unless a substitute value can be agreed upon).
Decision making unit. (DMU). Decision making unit was the name used by Charnes et al (1978) to describe the units being analyzed in DEA. The use of this term is intended to redirect the emphasis of the analysis from profit making businesses to decision making entities. In other words, the analysis which is performed can be applied to any unit based enterprise and need have nothing to do with profit.
Decreasing returns to scale. (DRS). Decreasing returns to scale are operating when an increase in a unit’s inputs results in a less than proportionate increase in its outputs.
Dual model The dual model and the primal (CCR) model provide two ways of looking at the same problem and the efficiency scores calculated are the same with both. Mathematically, the dual model is much faster to solve (although its formulation looks more complex). The difference between the two is that for each unit the dual model (internally) tries to create a hypothetical composite unit, from the existing units, that will out-perform the unit being analyzed. If, within the dual model this composite unit can be created, then the original unit is found to be inefficient, otherwise the unit is efficient.
Dual weights (λ) The dual weights (λ), so called because they are calculated using the dual model and sometimes also called dual multipliers, give an indication of the importance given to a particular unit in determining the input/output mix of the composite unit. In the primal model the weights are associated with the inputs and outputs in the model. In the dual model the weights are associated with the DMUs.
Efficient/ efficiency frontier The efficiency frontier is the frontier (envelope) representing “best performance” and is made up of the units in the data set which are most efficient in transforming their inputs into outputs. The units that determine the frontier are those classified as being 100% efficient. Any unit not on the frontier has an efficiency rating of less than 100%. Empirical production function, empirical production envelope and envelopment surface are all terms which are analogous to efficient frontier.
Efficiency score DEA results in each unit being allocated an efficiency score. This score is between zero (or 0%) and 1 (100%). A unit with a score of 100% is relatively efficient. Any unit with a score of less than 100% is relatively inefficient, e.g. a unit with a score of 60% is only 60% as efficient as the best performing units in the data set analyzed. The efficiency score obtained by a unit will vary depending on the other units and factors included in the analysis. Scores are relative, not absolute – they are relative to the other units in the data set. The web page gives an explanation of how DEA works.
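The relative nature of the score is easiest to see in the simplest possible case. The Python sketch below assumes a single input, a single output and constant returns to scale, where a unit's score reduces to its output/input ratio divided by the best ratio in the data set; it is not the general DEA model, which requires linear programming. The shop names and figures are invented.

```python
def efficiency_scores(data):
    """Relative scores for the one-input, one-output, constant-returns case."""
    ratios = {name: out / inp for name, (inp, out) in data.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}

# Hypothetical data set: name -> (input, output).
shops = {"Shop A": (10, 100), "Shop B": (8, 60), "Shop C": (5, 50)}
scores = efficiency_scores(shops)

# Adding a better performer lowers the scores of units that were 100% efficient.
rescored = efficiency_scores({**shops, "Shop D": (5, 60)})
print(scores)
print(rescored)
```

Shop A is 100% efficient in the first run but drops below 100% once Shop D is added, even though its own data have not changed.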
Efficiency study The process of studying efficiency within an organisation.
Envelopment form This term is used to describe the formulation of a DEA model which involves the concept of composite units.
Epsilon (ε) Epsilon is a very small positive constant (which at the time of writing is taken as 1 x 10^-6 in Frontier Analyst) which stands in for a non-Archimedean quantity, that is, one which is treated as smaller than any positive real number. Epsilon is a theoretical-mathematical device to allow us to drive slack variable values to zero, without adding or subtracting any “real” amount to the objective function. In practice this means that inputs and outputs are not “abused as free commodities” (Miliotis 1992) and avoids a unit being wrongly classified as efficient.
Environmental factor An environmental factor is neither an economic resource nor a product but rather an attribute of the environment in which the units operate. An environmental factor which adds resource can be included as an input, e.g. an analysis of a chain of shops may include a factor which measures the strength of competition a shop faces in its area. Environmental factors may be measured directly or through the use of surrogate measures.
Facet Each of the segments which make up the efficient frontier is known as a facet. Generally, where efficient units make a reference set, they are located on the same facet. Facet and reference set refer to the same concept.
Global leader A global leader will act as a model of good operating practice for inefficient units. Oral and Yolalan (1990) define a global leader as an efficient unit which appears most frequently in the reference set for inefficient units.
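Following the Oral and Yolalan definition quoted above, a global leader can be found by counting how often each efficient unit appears in the reference sets of the inefficient units. The reference sets in this Python sketch are hypothetical.

```python
from collections import Counter

# Hypothetical reference sets: inefficient unit -> efficient units it is compared with.
reference_sets = {
    "U4": ["U1", "U2"],
    "U5": ["U1"],
    "U6": ["U1", "U3"],
}

# Count appearances of each efficient unit across all reference sets.
counts = Counter(peer for peers in reference_sets.values() for peer in peers)
leader, frequency = counts.most_common(1)[0]
print(leader, frequency)
```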
Homogeneous A DEA study requires a set of homogeneous units. Homogeneity refers to the degree of similarity between units. The operational goals of the units should be similar, as should their operational characteristics.
Increasing returns to scale Increasing returns to scale exist when an increase in a unit’s inputs yields a greater than proportionate increase in its outputs.
Inefficient unit An inefficient unit is one which, when compared with the actual performance achieved by other units in the analysis, should be able to produce its current level of outputs with fewer inputs or generate a higher level of outputs given the same inputs.
Inputs An input is any resource used by a unit to produce its outputs (products or services). This can include resources which are not a product but are an attribute of the environment in which the units operate. They can be controlled or uncontrolled.
Input minimization Input minimization is the DEA mode adopted when the analysis tries to minimize the amount of inputs used to produce the specified outputs. (The opposite of input minimization is output maximization).
Input orientated Input orientated is a term used in conjunction with the BCC and CCR ratio models, to indicate that an inefficient unit may be made efficient by reducing the proportions of its inputs but keeping the output proportions constant (see also input minimization and output maximization). (Note: the CCR model will yield the same efficiency score regardless of whether it is input or output orientated. This is not the case with the BCC model).
Input/output mix The term “input/ output mix” refers to the relative proportions of a unit’s inputs and outputs.
Intensity factor. (Z). In the dual model the scalar, Z, is the intensity factor. The intensity factor indicates the proportional reduction in inputs (when using input minimization) or the increase in outputs (if using output maximization) to achieve efficiency.
Local returns to scale Local returns to scale describes what happens to a unit’s outputs when the input levels are changed.
Most productive scale size. (MPSS). The most productive scale size of an efficient unit refers to the point (on the efficient frontier) at which maximum average productivity is achieved for a given input/ output mix. At MPSS constant returns to scale are operating. After reaching MPSS, decreasing returns to scale set in.
Multiplier form The multiplier form is associated with both the BCC and CCR models and is both a primal and a dual formulation. The multiplier form of DEA model formulation involves virtual multipliers (see Ali and Seiford 1993).
Ordinal variable A special type of categorical variable where the factor takes on a predefined set of values ranked in a specific order.
Outlier An outlier (sometimes in statistics referred to as an “obscene outlier”) is a unit whose input/output mix differs significantly from the other units in the data set. Where an outlier is found to be efficient, it may introduce bias into the results.
Output Outputs are the products (goods, services or other outcome) which result from the processing and consumption of inputs (resources). An output may be physical goods or services or a measure of how effectively a unit has achieved its goals.
Output maximization Output maximization is the DEA mode adopted when the analysis tries to maximize the outputs produced for a fixed amount of inputs. (The opposite of output maximization is input minimization).
Output orientated Output orientated is a term used in conjunction with the BCC and CCR ratio models, to indicate that an inefficient unit may be made efficient by increasing the proportions of its outputs while keeping the input proportions constant (see also input minimization and output maximization). (Note: the CCR model will yield the same efficiency score regardless of whether it is input or output orientated. This is not the case with the BCC model).
Peer group Another name for a reference set.
Primal (CCR) model Some authors differ on which model should be referred to as the primal (the first or original) model and which should be referred to as the dual model. Some refer to the dual model as primal because it illustrates better the principles of DEA. Throughout this glossary and all other Frontier Analyst literature, the primal model is that referred to by Charnes et al in their original publication (Charnes et al 1978. See CCR for full reference). The primal model allows a set of optimal weights to be calculated for each variable (input and output) to maximize a unit’s efficiency score. The weights are such that, were they applied to any other unit in the data set, the efficiency score would not exceed 1 (or 100%).
Production function The production function describes the optimal relationship between inputs and outputs with the aim of maximising output for the given inputs. In DEA the equivalent of the production function is the efficiency frontier.
Productive efficiency. (Efficiency). Productive efficiency (often just referred to as efficiency) is a measure of a unit’s ability to produce outputs from a given set of inputs (Norman and Stoker 1991). The efficiency of a DMU is always relative to the other units in the set being analysed, so the efficiency score is always a relative measure. A unit’s efficiency is related to its radial distance from the efficient or efficiency frontier (see radial measure). It is the ratio of the distance from the origin to the inefficient unit, over the distance from the origin to the composite unit on the efficient frontier.
Productivity In the case of a process with a single input and a single output, productivity is the ratio of the unit’s outputs to its inputs. DEA does not measure productivity, it measures the efficiency of the production process. Productivity is a function of production technology, the efficiency of the production process and the production environment.
Radial measure Both the BCC and CCR ratio models use a radial or proportional measure to determine a unit’s efficiency score. A unit’s efficiency is defined by the ratio of the distance from the origin to the inefficient unit, divided by the distance from the origin to the composite unit on the efficient frontier.
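The distance ratio described above can be sketched geometrically. The Python example below assumes a plot whose two axes are outputs per unit of a single input, so that the frontier lies further from the origin than the inefficient unit, and it assumes the relevant facet is the straight segment between two efficient units. The function name and coordinates are invented; real DEA software obtains the same score from the linear program rather than from geometry.

```python
import math

def radial_efficiency(p, a, b):
    """Project point p along a ray from the origin onto the facet a-b.

    Returns (efficiency, composite point), where efficiency is the distance
    from the origin to p divided by the distance from the origin to the
    composite point on the facet.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    det = p[1] * dx - p[0] * dy
    t = (a[1] * dx - a[0] * dy) / det        # composite = t * p
    s = (p[0] * a[1] - p[1] * a[0]) / det    # position along the facet, 0..1
    if not 0.0 <= s <= 1.0:
        raise ValueError("ray from the origin does not cross this facet")
    composite = (t * p[0], t * p[1])
    return math.hypot(*p) / math.hypot(*composite), composite

# Hypothetical efficient units A and B, and inefficient unit P.
efficiency, composite = radial_efficiency(p=(2.0, 2.0), a=(2.0, 6.0), b=(6.0, 2.0))
print(efficiency, composite)
```

In this example the composite unit lies at (4, 4), twice as far from the origin as P, so P is rated 50% efficient.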
Ratio models Both the BCC and CCR models are called ratio models because they define efficiency as the ratio of weighted outputs divided by weighted inputs.
Reference contribution Reference contribution indicates the degree to which a reference unit contributes to the calculation of the efficiency score for a unit.
Reference set The reference set of an inefficient unit is the set of efficient units to which the inefficient unit has been most directly compared when calculating its efficiency rating. It contains the efficient units which have the most similar input/output orientation to the inefficient unit and should therefore provide examples of good operating practice for the inefficient unit to emulate.
Results Having conducted an analysis, the DEA model will produce, for each unit, an efficiency score, virtual multipliers, intensity factors, the dual weights and the slacks. From these are calculated the virtual inputs and virtual outputs, the reference sets and improvement targets for each unit.
Scale efficiency A unit is “scale efficient” when its size of operation is optimal. If its size of operation is either reduced or increased its efficiency will drop. A scale efficient unit is operating at optimal returns to scale. Scale efficiency is calculated by dividing aggregate efficiency (from the CCR model) by technical efficiency (from the BCC model).
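The scale efficiency calculation is a simple division of the two scores. In this Python sketch the scores are hypothetical; the check reflects the point made under BCC that the BCC score is at least equal to the CCR score.

```python
def scale_efficiency(ccr_score, bcc_score):
    """Aggregate (CCR) efficiency divided by technical (BCC) efficiency."""
    if ccr_score > bcc_score:
        raise ValueError("the BCC score should be at least equal to the CCR score")
    return ccr_score / bcc_score

# Hypothetical unit: 72% efficient under CCR, 90% efficient under BCC.
se = scale_efficiency(ccr_score=0.72, bcc_score=0.90)   # about 0.8
print(round(se, 4))
```

A result of 1 would indicate that the unit is scale efficient; here roughly a fifth of the shortfall is attributable to the unit's scale of operation.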
Slack(s) Slack represents the under production of output or the over use of input. It represents the improvements needed to make an inefficient unit become efficient. These improvements are in the form of an increase/decrease in inputs or outputs.
Surrogate measures Where a measure is “intangible”, in the sense that no quantitative data exists for it, then surrogate measures can be used. Surrogate measures are used to represent factors such as environment factors, for example a “score” for the type of neighborhood in which a unit operates, or the achievement of an organizational goal (which does not have a statistically quantifiable outcome) and so on.
Targets The values of the inputs and outputs which would result in an inefficient unit becoming efficient.
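For input minimization, one common way to arrive at targets is to scale each input by the intensity factor and then subtract any remaining slack. The Python sketch below assumes that convention; the function name, input names and values are hypothetical, and Frontier Analyst's own reporting may differ in detail.

```python
def input_targets(inputs, intensity, slacks=None):
    """Target input levels: proportional reduction, then removal of any slack."""
    slacks = slacks or {}
    return {
        name: intensity * value - slacks.get(name, 0.0)
        for name, value in inputs.items()
    }

# Hypothetical inefficient unit with an intensity factor of 0.8
# and some remaining slack on floor space.
targets = input_targets(
    inputs={"staff": 20.0, "floor_space": 500.0},
    intensity=0.8,
    slacks={"floor_space": 25.0},
)
print(targets)
```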
Technical efficiency A unit is said to be technically efficient if it maximizes output per unit of input used. Technical efficiency is the efficiency of the production or conversion process and is calculated independently of prices and costs. Technical efficiency is calculated using the BCC model. The impact of scale size is ignored as DMUs are compared only with units of similar scale sizes.
Uncontrolled (exogenously fixed) inputs/ outputs An uncontrolled or uncontrollable variable (input or output) is one over which the unit’s management does not have control and hence cannot alter its level of use or production. An example of an uncontrolled input for a retail outlet would be the number of competitors it had in its area. Uncontrollable variables are also referred to as exogenously fixed and non-discretionary variables.
Unit A “unit” is simply a shorthand for “decision making unit” or “DMU”. Units may be outlets in a branch network of banks or shops. They may be wards in hospitals or direct labor organizations in a public authority. Data envelopment analysis can be applied to any unit based process.
Variable Variables are the input and output factors identified as being of particular importance to the operation of the units under consideration. For example, number of employees, patients treated (per hour), floor space, sales, rent, number of transactions and so on. Classification as inputs or outputs depends on the process being measured and the goals against which units are being measured. What may be an input when measured against one set of goals, may be an output when considered under another.
Variable returns to scale If an increase in a unit’s inputs does not produce a proportional change in its outputs then the unit exhibits variable returns to scale. This means that as the unit changes its scale of operations its efficiency will either increase or decrease.
Virtual input/output Virtual inputs are calculated by multiplying the value of the input with the corresponding optimal weight for the unit as given by the solution to the primal model. Similarly for virtual outputs. Virtual inputs/ outputs define the level of importance attached to each factor. The sum of the virtual inputs for each unit always equals 1. The sum of the virtual outputs is equal to the unit’s efficiency score.
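The relationship between weights, virtual inputs and virtual outputs can be checked with a few lines of Python. The data values and weights below are invented and chosen so that the virtual inputs sum to 1, as the primal model requires; they are not the output of an actual DEA run.

```python
# Hypothetical data for one unit and hypothetical optimal weights for it.
inputs = {"staff": 20.0, "rent": 50.0}
outputs = {"sales": 400.0}
input_weights = {"staff": 0.03, "rent": 0.008}
output_weights = {"sales": 0.002}

# Virtual value = observed value multiplied by the corresponding weight.
virtual_inputs = {name: inputs[name] * input_weights[name] for name in inputs}
virtual_outputs = {name: outputs[name] * output_weights[name] for name in outputs}

total_virtual_input = sum(virtual_inputs.values())     # about 1
efficiency_score = sum(virtual_outputs.values())       # about 0.8
print(virtual_inputs, virtual_outputs)
```

In this made-up case staff accounts for about 60% of the virtual input and rent for about 40%, which is how the virtual values indicate the importance attached to each factor.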
Virtual multipliers Another term used to describe weights.
Weight flexibility. (Weighting/ User defined weights). The CCR (primal) model does not place any restrictions on the weights in the model, other than a lower bound of epsilon. As a result, it is possible for units to be rated as efficient through a very uneven distribution of weights. This can mean that some or most of the variables have been virtually ignored. The Wong and Beasley (1990) weighting method (implemented in Frontier Analyst Professional) can be used to add weight restrictions to the model, if it is observed that the kind of bias described above is occurring.
Weights Within DEA models weights are the ‘unknowns’ which are calculated to determine the efficiency of the units. The efficiency score is the weighted sum of outputs divided by the weighted sum of inputs for each unit. The weights are calculated to solve the linear program, in such a way that each unit is shown in the best possible light. Weights indicate the importance attached to each factor (input/ output) in the analysis.
Window analysis Window analysis is a tabular method which allows an analysis of efficiency changes over time. The user chooses a set of time periods and then calculates the efficiency of each unit for each time period. Each unit in each time period is treated as a separate unit in the analysis.
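One way to organise the data for a window analysis is sketched below in Python. It assumes overlapping windows of a fixed width, which is a common arrangement but is an assumption here rather than something this glossary specifies; the helper names, unit names and periods are hypothetical. Each unit-period combination becomes a separate pseudo-unit, and the DEA model would then be run once per window.

```python
def windows(periods, width):
    """Overlapping windows of consecutive periods, each of the given width."""
    return [periods[i:i + width] for i in range(len(periods) - width + 1)]

def window_units(units, window):
    """Treat each unit in each period of the window as a separate unit."""
    return [f"{unit}@{period}" for unit in units for period in window]

periods = ["Q1", "Q2", "Q3", "Q4"]
units = ["A", "B"]

all_windows = windows(periods, width=3)
first_window_units = window_units(units, all_windows[0])
print(all_windows)
print(first_window_units)
```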